Merge pull request #7 from EvoAlias/master
Adding Documentation
EvoAlias committed Oct 18, 2018
2 parents 5eea028 + e0832ee commit 2685e85
Showing 6 changed files with 53 additions and 15 deletions.
Binary file added docs/_static/Annotation_Tracks.png
Binary file added docs/_static/Annotation_Tracks_Expanded.png
Binary file added docs/_static/More_Information.png
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -8,7 +8,7 @@ WashU Epigenome Browser

.. figure:: _static/eg.png

The gateway to epigenomes. (Art by **Ting Wang**)
Gateway to the epigenome. (Art by **Ting Wang**)

.. toctree::
:maxdepth: 2
42 changes: 31 additions & 11 deletions docs/tracks.rst
@@ -119,22 +119,42 @@ This format needs to be compressed by bgzip and indexed by tabix for submission
refbed
~~~~~~

``refbed`` format files allow you to upload custom gene annotation track. It's like
refGene bed like file downloaded from UCSC but with slightly modifications. Each of
The ``refbed`` format allows you to upload a custom gene annotation track. It is similar to the
refGene bed-like file downloaded from UCSC, but with slight modifications. Each file of
this format contains (each column is separated by *Tab*):

chr, transript_start, transript_stop, translation_start,stranslation_stop, strand, gene_name, transcript_id, type, exons(including UTRs) start, exons(including UTRs) stops, additional gene info
chr, transcript_start, transcript_stop, translation_start, translation_stop, strand, gene_name, transcript_id, type, exon (including UTR bases) starts, exon (including UTR bases) stops, and additional gene info (*optional*)

This format needs to be compressed by bgzip and indexed by tabix for submission as a track. See `Prepare track files`_.

.. hint:: The 9th column contains gene type, but is simplified from the Gencode/Ensembl annotations to coding, pseudo, nonCoding, problem, and other. These classes of gene type are colored differently when the track is displayed on the browser.

.. hint:: The 10th and 11th columns contain exon starts and stops, respectively. Each start or stop is separated by a comma. For example::

start1,start2,start3,start4 stop1,stop2,stop3,stop4
100,120,140,160 110,130,150,170

.. hint:: The 12th column contains extra information. You can annotate this information manually, or use `Ensembl Biomart`_ to download paired Transcript stable IDs and Gene descriptions. The information in this column must be separated by *spaces*, not tabs. Any of the lines below will work as additional information in the 12th column::

Example lines::
Gene ID:ENSMUSG00000103482.1 Gene Type:TEC Transcript Type:TEC Additional Info:predicted gene, 37999 [Source:MGI Symbol;Acc:MGI:5611227]
Gene ID:ENSMUSG00000103482.1 Gene Type:TEC Transcript Type:TEC
ENSMUSG00000103482.1 TEC
Additional Info:predicted gene, 37999 [Source:MGI Symbol;Acc:MGI:5611227]
My Favorite Gene

.. _`Ensembl Biomart`: http://useast.ensembl.org/biomart/martview/

chr1 3073253 3074322 3073253 3074322 + RP23-271O17.1 ENSMUST00000193812.1 TEC 3073253, 3074322,
chr1 3102016 3102125 3102016 3102125 + Gm26206 ENSMUST00000082908.1 nonCoding 3102016, 3102125,
chr1 3214482 3671498 3216024 3671349 - Xkr4 ENSMUST00000070533.4 coding 3670552,3421702,3214482, 3671498,3421901,3216968, Mus musculus X-linked Kx blood group related 4 (Xkr4), mRNA.
chr1 3252757 3253236 3252757 3253236 + RP23-317L18.1 ENSMUST00000192857.1 pseudo 3252757, 3253236,
Here are a few example lines in refbed format from gencode.vM17.annotation.gtf (mouse mm10 format)::
chr1 24910461 24911659 24910461 24911659 - RP23-109H7.1 ENSMUST00000187022.1 pseudo 24911220,24910461 24911659,24910681 Gene ID:ENSMUSG00000100808.1 Gene Type:processed_pseudogene Transcript Type:processed_pseudogene Additional Info:predicted gene 28594 [Source:MGI Symbol;Acc:MGI:5579300]
chr1 25203443 25205696 25203443 25205696 - Adgrb3 ENSMUST00000190202.1 coding 25203443 25205696 Gene ID:ENSMUSG00000033569.17 Gene Type:protein_coding Transcript Type:retained_intron Additional Info:adhesion G protein-coupled receptor B3 [Source:MGI Symbol;Acc:MGI:2441837]
chr1 25276404 25277954 25276404 25277954 - RP23-21P2.4 ENSMUST00000193138.1 problem 25276404 25277954 Gene ID:ENSMUSG00000104257.1 Gene Type:TEC Transcript Type:TEC Additional Info:predicted gene, 20172 [Source:MGI Symbol;Acc:MGI:5012357]
chr1 26566833 26566938 26566833 26566938 + Gm24064 ENSMUST00000157486.1 nonCoding 26566833 26566938 Gene ID:ENSMUSG00000088111.1 Gene Type:snoRNA Transcript Type:snoRNA Additional Info:predicted gene, 24064 [Source:MGI Symbol;Acc:MGI:5453841]
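
The column layout described above can be checked with a short script before compressing and indexing a file. The following is a minimal sketch in Python, not part of the browser or its scripts: the names ``RefBedRecord`` and ``parse_refbed_line`` are hypothetical, and the checks (11 or 12 tab-separated columns, matching exon start and stop counts, a ``+`` or ``-`` strand) are assumptions drawn from the format description rather than the browser's own validation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RefBedRecord:
    """One refbed line, following the 12-column layout described in the docs."""
    chrom: str
    transcript_start: int
    transcript_stop: int
    translation_start: int
    translation_stop: int
    strand: str
    gene_name: str
    transcript_id: str
    gene_type: str
    exon_starts: List[int]
    exon_stops: List[int]
    extra: Optional[str] = None  # optional 12th column, space-separated text


def _parse_positions(field: str) -> List[int]:
    # Comma-separated coordinates; a trailing comma (e.g. "3073253,") is tolerated.
    return [int(p) for p in field.split(",") if p.strip()]


def parse_refbed_line(line: str) -> RefBedRecord:
    """Parse one tab-separated refbed line (hypothetical helper)."""
    cols = line.rstrip("\n").split("\t")
    # A tab inside the 12th column would show up here as extra columns.
    if len(cols) not in (11, 12):
        raise ValueError(f"expected 11 or 12 tab-separated columns, got {len(cols)}")
    if cols[5] not in ("+", "-"):
        raise ValueError(f"unexpected strand: {cols[5]!r}")
    exon_starts = _parse_positions(cols[9])
    exon_stops = _parse_positions(cols[10])
    if len(exon_starts) != len(exon_stops):
        raise ValueError("exon starts and stops differ in count")
    extra = cols[11] if len(cols) == 12 and cols[11] else None
    return RefBedRecord(
        chrom=cols[0],
        transcript_start=int(cols[1]),
        transcript_stop=int(cols[2]),
        translation_start=int(cols[3]),
        translation_stop=int(cols[4]),
        strand=cols[5],
        gene_name=cols[6],
        transcript_id=cols[7],
        gene_type=cols[8],
        exon_starts=exon_starts,
        exon_stops=exon_stops,
        extra=extra,
    )


if __name__ == "__main__":
    example = "\t".join([
        "chr1", "24910461", "24911659", "24910461", "24911659", "-",
        "RP23-109H7.1", "ENSMUST00000187022.1", "pseudo",
        "24911220,24910461", "24911659,24910681",
        "Gene ID:ENSMUSG00000100808.1 Gene Type:processed_pseudogene",
    ])
    print(parse_refbed_line(example))
```

Because the line is split on tabs, a tab inside the 12th column produces more than 12 columns and is rejected, which matches the requirement that the additional information be separated by spaces.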

.. note:: Last column is optional, dislayed as gene description when you click a gene on the track.
This format can be easily obtain from refGene.bed file downloaded from UCSC, or converted from
a GTF or GFF3 format file. Check out our scripts_ for help on converting file to this format.
.. note:: The last, optional column is displayed as a gene description when a gene is clicked on the browser.
Our modified format can be easily obtained from the refGene.bed file downloads available from UCSC. A Gencode GTF format file can be converted to this format using the Converting_Gencode_GTF_to_refBed.bash script in scripts_. By default, the script puts "Gene ID:", "Gene Type:", and "Transcript Type:" in the additional information column. When run with an annotation file containing the columns Transcript_ID and Description (separated by a tab), the script also adds "Additional Info:" to the 12th column. The script depends on bedtools, bgzip, and tabix. Lastly, the script uses an ``awk`` array to reclassify gene type, which can easily be modified for additional gene types. The script is run as follows::
bash Converting_Gencode_GTF_to_refBed.bash my.gtf my_optional_annotation.txt
bash Converting_Gencode_GTF_to_refBed.bash gencode.vM17.annotation.gtf
bash Converting_Gencode_GTF_to_refBed.bash gencode.vM17.annotation.gtf biomart_2col.txt
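
The gene type reclassification mentioned above amounts to a lookup table. The sketch below illustrates the idea in Python rather than ``awk``. It is not the script's actual table: ``SIMPLIFIED_TYPE`` and ``simplify_gene_type`` are hypothetical names, the table contains only the four mappings visible in the example lines, and keying on the Gencode gene type (rather than the transcript type) is an assumption.

```python
# Hypothetical lookup table: only the mappings that can be read off the
# example refbed lines are listed. The real awk array in the conversion
# script may cover more types or differ in detail.
SIMPLIFIED_TYPE = {
    "protein_coding": "coding",
    "processed_pseudogene": "pseudo",
    "snoRNA": "nonCoding",
    "TEC": "problem",
}


def simplify_gene_type(gencode_type: str) -> str:
    """Map a Gencode gene type to one of the simplified refbed classes.

    Types missing from the table fall back to "other" (an assumption).
    """
    return SIMPLIFIED_TYPE.get(gencode_type, "other")


if __name__ == "__main__":
    for gencode_type in ("protein_coding", "snoRNA", "TEC", "unlisted_type"):
        print(gencode_type, "->", simplify_gene_type(gencode_type))
```

Supporting an additional gene type then means adding one entry to the table, which mirrors how the ``awk`` array is described as being extended.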

.. _scripts: https://github.com/lidaof/eg-react/tree/master/backend/scripts

24 changes: 21 additions & 3 deletions docs/usage.rst
@@ -248,8 +248,8 @@ From the ``Tracks`` menu choose **Public Data Hubs**. This will display all of t

.. image:: _static/mm10_4dn.png

After a hub is added, a facet table containing all tracks will pop up. This allows you to choose
any tracks you are interested in:
After a hub is added, a ``facet table`` containing all tracks will pop up. This allows you to choose
any tracks you are interested in. The ``facet table`` can also be revisited from the menu by choosing **Track Facet Table**:

.. image:: _static/mm10_4dn_facet.png

@@ -265,7 +265,18 @@ Click the *Add* button to add the track(s) you want. You can then view tracks in

.. image:: _static/mm10_4dn_track_added.png

Adding custom tracks or data hub
Adding annotation tracks
~~~~~~~~~~~~~~~~~~~~~~~~

Users can add numerous annotation tracks from the ``Tracks`` menu by choosing **Annotation Tracks**.

.. image:: _static/Annotation_Tracks.png

Each header can be expanded to one or more submenus that display tracks that can be added to the browser. The tracks include CpG island information, repeat information, G/C content information, and conservation information to name a few.

.. image:: _static/Annotation_Tracks_Expanded.png

Adding a custom track or data hub
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Users can also submit their own track as a custom track. For example, say we have a bigWig track located at
@@ -322,3 +333,10 @@ Track Y-axis Scale
~~~~~~~~~~~~~~~~~~

For each ``numerical`` track the y-axis can be displayed in ``AUTO`` or ``FIXED`` mode by right clicking on the track. The ``AUTO`` mode will scale the axis based on numerical values in the immediate area of the view range. The ``FIXED`` mode allows the user to select a ``Y-Axis min`` or ``Y-axis max``. For values above the set maximum, the ``Primary color above max`` can be set for easy viewing. For values below the set minimum, the ``Primary color below min`` can be set.

Track Information
~~~~~~~~~~~~~~~~~

If ``details`` were specified for a track in the data hub file, these can be viewed by right clicking on the sample and clicking on the arrow to the right. An easy access ``copy`` button is also available to copy the ``URL`` for the track.

.. image:: _static/More_Information.png
