Deploying Community Tracks on the Araport JBrowse using the CyVerse Discovery Environment

This tutorial explains how to upload genomic data files (GFF3, BED, or VCF) to the CyVerse Data Store and run an app in the CyVerse Discovery Environment (DE) that processes and prepares them for visualization in the Araport genome browser (JBrowse) instance.

Background Information

  • GFF3: Generic Feature Format Version 3 is a 9-column TAB-delimited text file format used to represent genomic features, allowing for hierarchical grouping of features, with feature types based on a controlled vocabulary called Sequence Ontology (SO). The coordinate system is 1-based.

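  • BED: Browser Extensible Data format is a TAB-delimited text file format developed by UCSC to represent genomic features. It has 3 required columns and up to 9 optional ones, for 12 columns in total. The coordinate system is 0-based and half-open: the start is 0-based and the end position is excluded from the feature.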

  • VCF: Variant Call Format is a TAB-delimited text file format (often stored compressed) that contains meta-information lines, a header line, and then data lines, each describing a position in the genome. The coordinate system is 1-based.

  • tabix: tabix is a generic indexing tool for TAB-delimited genomic feature files like GFF3, VCF, BED, etc.

    • Depends on the bgzip block compression/decompression utility: a file must be sorted by position and bgzip-compressed before tabix can index it
  • CyVerse: An NSF-funded, community-driven, cyber-infrastructure initiative providing access to powerful computational infrastructure to scientists in the form of high performance computing and storage systems. The CyVerse Data Store is a cloud-based storage system that enables researchers to store and share data related to their research.

  • Araport: Araport is a one-stop-shop for Arabidopsis thaliana genomics. Araport offers gene and protein reports with orthology, expression, interactions and the latest annotation, plus analysis tools, community apps, and web services. Araport is 100% free and open-source.

  • JBrowse: Fast, scalable, customizable, client-side genome browser with a fully dynamic AJAX interface.
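
The differing coordinate conventions listed above are a common source of off-by-one errors when moving data between formats. As a minimal sketch (the function names are hypothetical and not part of any Araport or CyVerse tool), a BED interval, which is 0-based with an excluded end, maps to a GFF3 interval, which is 1-based with an included end, by adding 1 to the start and leaving the end unchanged:

```python
from typing import Tuple


def bed_to_gff3(start: int, end: int) -> Tuple[int, int]:
    """Convert a 0-based, end-exclusive BED interval to a 1-based, end-inclusive one."""
    # Zero-length BED features (start == end) are deliberately not handled
    # in this sketch; they need format-specific treatment.
    if start < 0 or end <= start:
        raise ValueError(f"empty or invalid BED interval: {start}-{end}")
    return start + 1, end


def gff3_to_bed(start: int, end: int) -> Tuple[int, int]:
    """Convert a 1-based, end-inclusive GFF3 interval to a 0-based, end-exclusive one."""
    if start < 1 or end < start:
        raise ValueError(f"invalid GFF3 interval: {start}-{end}")
    return start - 1, end


# The first 100 bases of a chromosome in each convention.
print(bed_to_gff3(0, 100))   # (1, 100)
print(gff3_to_bed(1, 100))   # (0, 100)
```

In both conventions the feature length is easy to recover: `end - start` for BED and `end - start + 1` for GFF3.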

Step-by-step guide

  1. Log in to the CyVerse Discovery Environment at https://de.cyverse.org/de/ with your CyVerse ID.

  2. Upload the genomic data file (GFF3, BED, or VCF): In the DE, click the Data button in the upper left corner to open the Data window. From there, select the Upload menu and then Simple Upload from Desktop.

    File upload 1

  3. In the Upload dialog box, click the Browse button to locate the file on your system and then click the Upload button.

    File upload 2

  4. Verify that the file has been uploaded successfully by checking the Data window (click the Refresh button if needed).

    File uploaded

  5. Run the Publish App: Click the Apps button in the upper left corner to open up the Apps window. Find the app called "Publish Community Tracks to Araport JBrowse 1.0.0". The easiest way is to type "araport" into the Search Apps box.

    Search Apps

  6. Click on the app to use it.

    Launch App

  7. Change the output folder by clicking the Browse button and navigating to Community Data -> araport -> community-tracks -> staging (/iplant/home/shared/araport/community-tracks/staging), then click the Inputs tab.

    Select input

  8. Click the Browse button to locate the file that was uploaded earlier and then click on the Parameters tab.

    Enter description

  9. Enter a short description for the track that will appear in the JBrowse track selector and then click the Launch Analysis button.

  10. (Optional) Click the Analyses button in the upper left corner to open up the Analyses window. From here the job status can be monitored.

    Monitor job

When the job completes, the output files will have been uploaded to the Araport community data directory. At this point, the Araport administrators will review the track, move it to the Community Data -> araport -> community-tracks -> shared directory, and ultimately point to it from the Araport JBrowse instance.
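
Because each track is reviewed before it is published, a quick local check of the file before uploading it in step 2 can save a round trip. The sketch below is not part of the Publish app and makes no claim about the checks that app performs. `check_gff3_lines` is a hypothetical helper that tests only the basics described in the Background Information section: nine TAB-delimited columns and 1-based coordinates with the start no greater than the end.

```python
from typing import Iterable, List


def check_gff3_lines(lines: Iterable[str]) -> List[str]:
    """Return human-readable problems found in GFF3 feature lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        # Anything after a ##FASTA directive is sequence data, not features.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, comments, and directives such as "##gff-version 3".
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != 9:
            problems.append(
                f"line {number}: expected 9 columns, found {len(columns)}"
            )
            continue
        start, end = columns[3], columns[4]
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {number}: start and end must be integers")
        elif int(start) < 1 or int(start) > int(end):
            problems.append(
                f"line {number}: need 1 <= start <= end, got {start}-{end}"
            )
    return problems


# Made-up example records, for illustration only.
example = [
    "##gff-version 3\n",
    "Chr1\tdemo\tgene\t3631\t5899\t.\t+\t.\tID=gene1\n",  # well-formed
    "Chr1\tdemo\tgene\t0\t10\t.\t+\t.\tID=gene2\n",       # start is 0
    "Chr1\tdemo\tgene\t10\t20\n",                         # only 5 columns
]
for problem in check_gff3_lines(example):
    print(problem)
```

To check a real file, pass an open file handle instead of the example list, for instance `check_gff3_lines(open("my_track.gff3"))`, where `my_track.gff3` stands in for your own file name.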

For any questions or comments, please email araport@jcvi.org.