Commit

Update deepsea.rst
aaronkw committed Dec 2, 2019
1 parent 5627b90 commit 92b8c38
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions docs/deepsea.rst
@@ -20,11 +20,11 @@ Input

DeepSEA predicts genomic variant effects on a wide range of chromatin features at the variant position (transcription factor binding, DNase I hypersensitive sites, and histone marks in multiple human cell types). DeepSEA can also be utilized for predicting chromatin features for any DNA sequence.

-We support three types of input: vcf, fasta, bed. If you want to predict effects of noncoding variants, use vcf format input. If you want to predict chromatin feature probabilities for DNA sequences, use fasta format. If you want to specify sequences from the human reference genome (GRCh37/hg19), you can use bed format. See below for a quick introduction, and we provide detailed description of format requirement on the DeepSEA input page:
+We support three types of input: vcf, fasta, bed. If you want to predict effects of noncoding variants, use vcf format input. If you want to predict chromatin feature probabilities for DNA sequences, use fasta format. If you want to specify sequences from the human reference genome (GRCh37/hg19), you can use bed format. See below for a quick introduction:

-VCF format is used for specifying a genomic variant. A minimal example is chr1 109817590 - G T (if you want to copy this text as input, you need to change spaces to tabs since an html webpage cannot display tabs). The five columns are chromosome, position, name, reference allele, and alternative allele.
+**VCF format** is used for specifying a genomic variant. A minimal example is ``chr1 109817590 - G T`` (if you want to copy this text as input, you will need to change spaces to tabs). The five columns are chromosome, position, name, reference allele, and alternative allele.
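
The space-to-tab conversion mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name `to_tab_separated` is hypothetical and not part of DeepSEA, and it assumes that no column contains internal whitespace.

```python
def to_tab_separated(line: str, n_columns: int = 5) -> str:
    """Rejoin a whitespace-separated variant line with tabs.

    Hypothetical helper, not part of DeepSEA. Assumes the five columns
    (chromosome, position, name, reference allele, alternative allele)
    contain no internal whitespace.
    """
    fields = line.split()  # splits on any run of spaces or tabs
    if len(fields) != n_columns:
        raise ValueError(f"expected {n_columns} columns, got {len(fields)}")
    return "\t".join(fields)


# The minimal example from the text, copied with spaces.
print(repr(to_tab_separated("chr1 109817590 - G T")))
```

Raising on a wrong column count is a design choice of this sketch, so that a malformed line is caught before it is submitted.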

-Fasta format input should include sequences of 1000bp length each. If a sequence is longer than 1000bp, only the center 1000bp will be used. A minimal example is ::
+**Fasta format** input should include sequences of 1000bp length each. If a sequence is longer than 1000bp, only the center 1000bp will be used. A minimal example is ::

>TestSequence
TATCTCTCATGTTTCTGGTATAGATGGTATATATGTTAATCTTGTTCCTGAGGTCTGTTTTTTATTTTTGTCATTAAAGT
@@ -42,9 +42,9 @@ Fasta format input should include sequences of 1000bp length each. If a sequence
AATATCATCCTATATCAACTATAGAGAGAAGATCGCAAGA
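
To make the "center 1000bp" rule concrete, here is a minimal Python sketch. The function name `center_window` is hypothetical. The text does not say how DeepSEA rounds when the excess length is odd, nor what happens to sequences shorter than 1000bp, so the floor division and the error below are assumptions of this sketch.

```python
def center_window(seq: str, width: int = 1000) -> str:
    """Return the central `width` bases of `seq`.

    Hypothetical helper, not DeepSEA's own code. Assumption: when the
    excess length is odd, the extra base is dropped from the right end.
    """
    if len(seq) < width:
        # Assumption: the text does not describe shorter sequences,
        # so this sketch refuses them instead of padding.
        raise ValueError(f"sequence is shorter than {width}bp")
    start = (len(seq) - width) // 2
    return seq[start:start + width]


# A 1200bp toy sequence: 100 A's, 1000 C's, 100 G's.
toy = "A" * 100 + "C" * 1000 + "G" * 100
print(len(center_window(toy)))
```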


-Bed format provides another way to specify sequences in the human reference genome (hg19). The bed input should specify 1000bp-length regions. A minimal example is chr1 109817091 109818090. The three columns are chromosome, start position, and end position.
+**Bed format** provides another way to specify sequences in the human reference genome (hg19). The bed input should specify 1000bp-length regions. A minimal example is ``chr1 109817091 109818090``. The three columns are chromosome, start position, and end position.
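
A quick way to check region lengths before submitting is sketched below in Python. The helper name `bed_region_lengths` is hypothetical. Note that the example region spans 1000 positions only when both the start and end are counted inclusively; under the standard BED convention (0-based start, end excluded) the same line describes 999bp. Which convention the server applies is not stated here, so the sketch reports both.

```python
from typing import Tuple


def bed_region_lengths(line: str) -> Tuple[int, int]:
    """Return (inclusive_length, half_open_length) for one bed line.

    Hypothetical helper, not part of DeepSEA. Only the first three
    whitespace-separated columns (chromosome, start, end) are read.
    """
    _chrom, start_text, end_text = line.split()[:3]
    start, end = int(start_text), int(end_text)
    inclusive = end - start + 1  # both endpoints counted
    half_open = end - start      # standard BED: end excluded
    return inclusive, half_open


inclusive, half_open = bed_region_lengths("chr1 109817091 109818090")
print(inclusive, half_open)  # 1000 999
```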

-We recommend using the server if you have <50,000 variants or sequences. For larger set, you may run the standalone version on your local machine, or contact our group directly.
+We recommend using the web server if you have <10,000 variants or sequences. You will experience degraded performance when submitting a larger set of sequences. In those instances, you may run the standalone version on your local machine, or contact our group directly.
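
As a rough pre-submission check against the 10,000 guideline, the following Python sketch counts records in line-based input (vcf or bed). The names `count_records`, `suggest_mode`, and `WEB_SERVER_LIMIT` are hypothetical, and skipping lines that start with `#` assumes any header lines follow the usual VCF convention.

```python
from typing import Iterable

WEB_SERVER_LIMIT = 10_000  # taken from the recommendation in the text


def count_records(lines: Iterable[str]) -> int:
    """Count non-blank lines that are not '#' header lines."""
    return sum(
        1 for line in lines
        if line.strip() and not line.startswith("#")
    )


def suggest_mode(n_records: int) -> str:
    """Suggest 'web server' below the limit, else 'standalone'."""
    return "web server" if n_records < WEB_SERVER_LIMIT else "standalone"


example = ["#header", "chr1\t109817590\t-\tG\tT", ""]
print(suggest_mode(count_records(example)))
```

For fasta input the same idea applies, but records would be counted by lines starting with `>` instead.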

We support only GRCh37/hg19 genome coordinates. You can use LiftOver to convert your coordinates to the correct version.
