Remove FTP download docs - now in eg.org

nicklangridge committed Mar 5, 2014 · 1 parent 275a8d1 · commit ad0385f2a54d43f35ac0db2003a3e901fb7bc746
Showing with 16 additions and 145 deletions.
  1. +2 −129 htdocs/info/data/ftp/index.html
  2. +14 −16 modules/EnsEMBL/Web/Document/HTML/FTPtable.pm
@@ -8,138 +8,11 @@
<h1>The Ensembl Genomes FTP Server</h1>
-<p>If required, entire databases can be downloaded from our FTP site
-in a variety of formats, from flat files to MySQL dumps. Please be aware that these
-files can run to many gigabytes of data.</p>
-
-<p>The data for this release is available at <a href="[[SPECIESDEFS::ENSEMBL_FTP_URL]]/current">[[SPECIESDEFS::ENSEMBL_FTP_URL]]/current</a>. The data from previous releases are available at <a href="[[SPECIESDEFS::ENSEMBL_FTP_URL]]">[[SPECIESDEFS::ENSEMBL_FTP_URL]]</a>.</p>
-
-<p>To facilitate storage and download all databases are
-<a href="http://directory.fsf.org/gzip.html" rel="external">GNU Zip</a>
-(gzip, *.gz) compressed.
-</p>
-
-<p>
-<strong>Please note:</strong>
-Ensembl Genomes supports downloading of many correlation tables via the highly
-customisable
-<a href="/biomart/martview">BioMart</a> data mining tool.
-You may find exploring this web-based data mining tool easier than extracting
-information from our database dumps.
-</p>
+<p>Detailed information about the available data and file formats can be found <a href="http://dev.ensemblgenomes.org/info/access/ftp
+">here</a>.</p>
[[SCRIPT::EnsEMBL::Web::Document::HTML::FTPtable]]
-<h2>About the data</h2>
-
-<p>
-The following types of data dumps are available on the FTP site.
-</p>
-<dl>
-<dt>FASTA</dt>
-<dd>FASTA sequence databases of Ensembl gene, transcript and protein model
-predictions.
-Since the
-<a href="http://www.ebi.ac.uk/help/formats.html#fasta" rel="external">FASTA format</a>
-does not permit sequence annotation, these database files are mainly intended
-for use with local sequence similarity search algorithms.
-Each directory has a README file with a detailed description of the header line format and the file naming conventions.
-
- <dl>
- <dt>DNA</dt>
- <dd><a href="http://www.repeatmasker.org/" rel="external">Masked</a> and unmasked genome sequences associated with the assembly (contigs, chromosomes etc.).</dd>
- <dt>cDNA</dt>
- <dd>cDNA sequences for protein-coding genes</dd>
- <dt>Peptides</dt>
- <dd>Protein sequences for protein-coding genes.</dd>
- <dt>RNA</dt>
- <dd>Non-coding RNA gene preditions.</dd>
- </dl>
-</dd>
-
-<dt>Flatfile</dt>
-<dd>Flat files allow more extensive sequence annotation by means of feature tables. The currently annotation displayed in Ensembl Genomes consists of protein calling genes derived from one of the following data resources:
-<ul>
-<li>EMBL Nucleotide Sequence Database (used for Ensembl Bacteria)
-<li>VectorBase (used for insect vectors of human diseases)
-<li>WormBase (used for <i>Caenorhaditis elegans</i>)
-<li>FlyBase (used for <i>Drosophila melanogaster</i>)
-<li>PlasmoDB (used for <i>Plasmodium falciparum</i>
-</ul>
-along with the results of additional analyses performed by ourselves and others. See <a href="/info/docs/genebuild/index.html">this page</a> for more information.
-
-<p>Each nucleotide sequence record in a flat file represents a 1Mb slice of the
-genome sequence.
-Flat files are broken into chunks of 1000 sequence records for easier
-downloading.
- <dl>
- <dt>EMBL</dt>
- <dd>Ensembl Genomes database dumps in <a href="http://www.ebi.ac.uk/embl/" rel="external">EMBL</a> nucleotide
-sequence <a href="http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html" rel="external">database format</a>
-</dd>
- <dt>GenBank</dt>
- <dd>Ensembl Genomes database dumps in <a href="http://www.ncbi.nlm.nih.gov/Genbank/" rel="external">GenBank</a>
-nucleotide sequence
-<a href="http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html" rel="external">database format</a></dd>
- </dl>
-</dd>
-
-<dt>MySQL</dt>
-<dd>All Ensembl Genomes <a href="http://www.mysql.com/" rel="external">MySQL</a> databases are
-available in text format as are the SQL table definition files. These can be imported into to any SQL database for a local
-<a href="/info/docs/webcode/install/">installation</a> of a mirror site.
-Generally, the FTP directory tree contains one one directory per database.
-For more information about these databases and their Application
-Programming Interfaces (or APIs) see the
-<a href="/info/docs/api/">API</a> section.
-</dd>
-
-<dt>GTF</dt>
-<dd>Gene sets for each species. These files include annotations of both coding
-and non-coding genes. This file format is described <a href="http://mblab.wustl.edu/GTF2.html">here</a>.</dd>
-
-<dt>GFF3</dt>
-<dd>Gene sets for each species. These files include annotations of both coding
-and non-coding genes. This file format is described <a href="http://www.sequenceontology.org/gff3.shtml">here</a>.</dd>
-
-<dt>EMF flatfile dumps (variation and comparative data)</dt>
-<dd>Alignments of resequencing data are available for several species as Ensembl
-Multi Format (EMF) flatfile dumps. The accompanying README file describes the file
-format.
-<p>
-Also, the same format is used to dump whole-genome multiple alignments (where available) as
-well as gene-based multiple alignments and phylogentic trees used to infer
-Ensembl orthologues and paralogues. These files are available in the ensembl_compara
-database which will be found in the <a href="[[SPECIESDEFS::SITE_FTP]]/release-[[SPECIESDEFS::SITE_RELEASE_VERSION]]/emf/compara/">compara directory</a>.
-</p>
-<p>
-<b>Note:</b> the EMF directories for compara also contain trees in phyloxml, nh and newick format.
-</p>
-</dd>
-
-<dt>TSV</dt>
-<dd>
-Tab separated files containing selected data for individual species and from comparative genomics provided for convenience. All files are gzipped and contain header lines detailing the contents of each file. The files available are:</br></br>
-Per species:</br>
-&lt;species_name&gt;.&lt;assembly&gt;.&lt;release&gt;.uniprot.tsv - Provides mappings from Gene, Transcript and Translation stable identifiers to
-UniProtKB accessions with reports as to the % identity of the hit where applicable.</br></br>
-Per compara:</br>
-Compara.homologies.&lt;release&gt;.tsv - Homologies and identity between proteins in different species retrieved from Compara.</br>
-Compara.stableid_to_genetreeid.&lt;release&gt;.tsv - Provides mappings from compara gene tree stable identifiers to the component gene and translation stable identifiers.</br>
-</dd>
-
-</dl>
-
-<p>
-Each directory on
-<a href="ftp://ftp.ensemblgenomes.org/pub/" rel="external">ftp.ensemblengenomes.org</a> contains a
-<a href="ftp://ftp.ensemblgenomes.org/pub/current_README">README</a> file.
- This additional document explains the FTP directory
-<a href="index.html">structure</a>.
-</p>
-
-<br />
-
[[SCRIPT::EnsEMBL::Web::Document::HTML::FTPMetadata]]
<p>&nbsp;</p>
modules/EnsEMBL/Web/Document/HTML/FTPtable.pm
@@ -27,19 +27,19 @@ use EnsEMBL::Web::Hub;
use EnsEMBL::Web::Document::Table;
use base qw(EnsEMBL::Web::Document::HTML);
-use JSON::Parse qw /json_to_perl valid_json/;
-use List::MoreUtils qw /first_index/;
-
-sub get_resources {
- open FILE, "<".$SiteDefs::ENSEMBL_SERVERROOT."/eg-plugins/common/htdocs/species_metadata.json";
- my $file_contents = do { local $/; <FILE> };
- close FILE;
-
- if (valid_json ($file_contents)) {
- my $data = json_to_perl ($file_contents);
- return $data->{genome};
- }
-}
+#use JSON::Parse qw /json_to_perl valid_json/;
+#use List::MoreUtils qw /first_index/;
+#
+#sub get_resources {
+# open FILE, "<".$SiteDefs::ENSEMBL_SERVERROOT."/eg-plugins/common/htdocs/species_metadata.json";
+# my $file_contents = do { local $/; <FILE> };
+# close FILE;
+#
+# if (valid_json ($file_contents)) {
+# my $data = json_to_perl ($file_contents);
+# return $data->{genome};
+# }
+#}
sub render {
my $self = shift;
@@ -50,7 +50,6 @@ sub render {
my $rel = $species_defs->SITE_RELEASE_VERSION;
my @species = $species_defs->valid_species;
- my $species_resources = get_resources();
my %title = (
dna => 'Masked and unmasked genome sequences associated with the assembly (contigs, chromosomes etc.)',
cdna => 'cDNA sequences for protein-coding genes',
@@ -119,8 +118,7 @@ tsv => qq{<a rel="external" title="$title{'tsv'}" href="$ftp_base_path_s
vep => qq{<a rel="external" title="$title{'vep'}" href="$ftp_base_path_stub/vep/$sp_dir">VEP</a>},
};
- my $index = first_index { $_->{species} eq lc($spp) } @$species_resources;
- if (keys %{$$species_resources[$index]->{variation}}) {
+ if ($hub->databases_species($spp, 'variation')) {
$data->{'gvf'} = qq{<a rel="external" title="$title{'gvf'}" href="$ftp_base_path_stub/gvf/$sp_dir">GVF</a>};
$data->{'vcf'} = qq{<a rel="external" title="$title{'vcf'}" href="$ftp_base_path_stub/vcf/$sp_dir">VCF</a>};
}
