[Step 3] Database structure

The database structure is organized to simplify navigation and the selection of cell type-specific (categories of) regulatory datasets.

The ENCODE data is organized using the [category]/([subcategory])/[tier] schema, where the subcategory level is optional.

The DNase, Histone, and TFBS_cellspecific categories contain the corresponding cell type-specific regulatory data. The TFBS_combined category contains a non-cell type-specific summary of the binding of 161 transcription factors. The [chromStates](ENCODE chromStates) category contains cell type-specific chromatin states obtained using different methods.
The tier system reflects [cell type specificity and quality](ENCODE cell types) of the data.
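
The ENCODE schema above can be illustrated with a short Python sketch that splits a relative path into its levels. The function name `parse_encode_path` and the example paths (`DNase/Tier1`, `chromStates/Method/Tier2`) are hypothetical and used only for illustration; the actual directory names in the database may differ.

```python
from typing import NamedTuple, Optional


class EncodePath(NamedTuple):
    category: str
    subcategory: Optional[str]  # optional level in the schema
    tier: str


def parse_encode_path(rel_path: str) -> EncodePath:
    """Split a path following [category]/([subcategory])/[tier].

    Assumes the path is relative to the ENCODE root and uses "/" separators.
    """
    parts = [p for p in rel_path.strip("/").split("/") if p]
    if len(parts) == 2:
        # No subcategory: [category]/[tier]
        return EncodePath(parts[0], None, parts[1])
    if len(parts) == 3:
        # With subcategory: [category]/[subcategory]/[tier]
        return EncodePath(parts[0], parts[1], parts[2])
    raise ValueError(f"Path does not match the schema: {rel_path!r}")


# Hypothetical example paths
print(parse_encode_path("DNase/Tier1"))
print(parse_encode_path("chromStates/Method/Tier2"))
```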

The Roadmap Epigenomics data follows the [category]/[cell/tissue type] schema.

The DNase/Histone categories contain the corresponding cell type-specific regulatory data. The _bPk/_gPk/_nPk suffixes correspond to peaks called using the broad/gapped/narrow peak settings, respectively. See the c. Peak Calling section for more details. We recommend using the _bPk data. The processed/imputed suffixes correspond to experimentally obtained/computationally imputed regulatory data, respectively. See Imputed signal tracks for more details. We recommend using the processed data.
The cell/tissue type system organizes data derived from [general anatomical categories](Roadmap cell types).
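
As a rough sketch, the Roadmap suffixes described above can be decoded programmatically. The example name `Histone_bPk-processed` is an assumption about how the suffixes are combined in a category name, and both helper functions are hypothetical; adjust them to the names actually present in the database.

```python
# Suffix meanings as described in the text above
PEAK_TYPES = {"_bPk": "broad", "_gPk": "gapped", "_nPk": "narrow"}
DATA_TYPES = ("processed", "imputed")


def describe_roadmap_category(name):
    """Return (peak_type, data_type) found in a category name, or None for each
    part that is not present."""
    peak = next((label for suffix, label in PEAK_TYPES.items() if suffix in name), None)
    source = next((d for d in DATA_TYPES if d in name), None)
    return peak, source


def is_recommended(name):
    """True for the recommended combination: broad peaks, processed data."""
    return describe_roadmap_category(name) == ("broad", "processed")


# Hypothetical category names
for name in ["Histone_bPk-processed", "Histone_nPk-imputed"]:
    print(name, describe_roadmap_category(name), is_recommended(name))
```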

The file names generally follow the [cell]-[factor]-[category] schema, so that regulatory datasets can be identified quickly without consulting a detailed description.
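
A minimal Python sketch of reading this naming schema follows. The file names shown are hypothetical, and the sketch assumes that everything after the first dot is a file extension and that any extra hyphens belong to the cell name. Because the schema holds only "generally", names that do not fit return `None`.

```python
def parse_dataset_name(file_name):
    """Split a file name of the form [cell]-[factor]-[category].

    Returns (cell, factor, category), or None if the name does not fit.
    """
    # Assumption: drop everything after the first dot (e.g. ".bed", ".bed.gz")
    stem = file_name.split(".", 1)[0]
    # Split from the right so hyphens inside the cell name are preserved
    parts = stem.rsplit("-", 2)
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


# Hypothetical file names
print(parse_dataset_name("Gm12878-CTCF-TFBS.bed"))
print(parse_dataset_name("HeLa-S3-CTCF-TFBS.bed.gz"))
print(parse_dataset_name("README"))
```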
