Add Genomes

drbecavin edited this page May 16, 2020 · 12 revisions

The first and most important section covers genomic datasets. You must provide a simple tab-delimited text file, genomics.txt, containing genome names, metadata, and, most importantly, the RefSeq FTP link for each genome. Using the Genomic panel of bacnet.e4.rap.setup, you can download each genome (.fna and .gtf files) and serialize it so that the Bacnet platform can load all genomes quickly.
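As a rough illustration of the kind of table described above, the following Python sketch parses a tab-delimited genome list and reports rows that lack a usable link. It is not part of Bacnet: the `Name` column, the sample rows, and the `example.org` links are hypothetical, and `RefSeq.FTP` is the link column name this page uses when describing the download step.

```python
import csv
import io

# Hypothetical sample table; a real file may carry more metadata columns.
SAMPLE = (
    "Name\tRefSeq.FTP\n"
    "Genome A\tftp://example.org/genomes/A\n"
    "Genome B\t\n"
)

def rows_missing_link(text, link_column="RefSeq.FTP"):
    """Return 1-based data-row numbers whose link is empty or not an FTP/HTTPS URL."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if reader.fieldnames is None or link_column not in reader.fieldnames:
        raise ValueError(f"missing column: {link_column}")
    bad = []
    for number, row in enumerate(reader, start=1):
        link = (row.get(link_column) or "").strip()
        if not link.startswith(("ftp://", "https://")):
            bad.append(number)
    return bad

print(rows_missing_link(SAMPLE))  # Genome B has no link, so row 2 is reported
```

Running a check like this before opening the setup panel can catch empty or mistyped links early; the set of accepted URL schemes is an assumption and may need adjusting.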

If genomics.txt is found in your database folder, you should see all your genomes listed in the table with the Validated box unchecked. To validate them, click Validate Genomics database. This can take a few minutes because all sequences must be loaded for each genome.

In the console you can see the number of chromosomes and genes found for each validated genome.
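To cross-check the counts printed in the console against the raw files, a minimal Python sketch like the one below can help. It assumes plain-text FASTA and GFF/GTF content whose third tab-separated column holds the feature type; the toy sequences and annotations are made up, and Bacnet's own counting may differ (for example, in how plasmids are treated).

```python
def count_fasta_records(fasta_text):
    """Count FASTA records; each record starts with a '>' header line."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))

def count_features(annotation_text, feature_type="gene"):
    """Count annotation rows whose third column equals feature_type."""
    count = 0
    for line in annotation_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/directive lines
        columns = line.split("\t")
        if len(columns) >= 3 and columns[2] == feature_type:
            count += 1
    return count

# Toy inputs, for illustration only.
FASTA = ">chr1 toy\nACGT\n>plasmid1 toy\nGGCC\n"
GFF = (
    "##gff-version 3\n"
    "chr1\ttoy\tgene\t1\t4\t.\t+\t.\tID=gene0\n"
    "chr1\ttoy\tCDS\t1\t4\t.\t+\t0\tID=cds0;Parent=gene0\n"
)

print(count_fasta_records(FASTA), count_features(GFF))
```

Note that the FASTA count covers every sequence record in the file, so it only matches a chromosome count if the tool counts plasmids the same way.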

Check that every Validated box is ticked. Any genome whose box remains unchecked still needs to be added to the database.

First, download the genomes from RefSeq. This requires a RefSeq.FTP column in Genomes.txt containing the FTP links. Click Download genomes from RefSeq. For each genome, this downloads the sequence file (.fna), the annotation file (.gff), and the protein sequences (.faa).
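For orientation, RefSeq assembly directories on the NCBI FTP site name their files after the directory itself. The Python sketch below derives the three file URLs that correspond to the extensions listed above. Whether the setup tool builds its URLs exactly this way is an assumption, and the link used here is a placeholder, not a real assembly.

```python
def expected_refseq_files(ftp_dir):
    """Build the .fna, .gff and .faa URLs for a RefSeq assembly directory link."""
    base = ftp_dir.rstrip("/")
    prefix = base.rsplit("/", 1)[-1]  # the directory name doubles as the file prefix
    suffixes = ("_genomic.fna.gz", "_genomic.gff.gz", "_protein.faa.gz")
    return [f"{base}/{prefix}{suffix}" for suffix in suffixes]

# Placeholder link for illustration; substitute a value from your RefSeq.FTP column.
LINK = "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/000/000/GCF_000000000.1_EXAMPLE"

for url in expected_refseq_files(LINK):
    print(url)
```

This can be handy for checking by hand that a link in the RefSeq.FTP column points at an assembly directory rather than at a single file.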

After downloading, click Add unvalidated Genomes to the database. Once every genome is validated, create the Genomes_summary.txt table by clicking the Create Genome summary table button. You can use Genomes_summary.txt to check that all genomes and genes were integrated into the database, and you can add its summary columns to Genomes.txt.
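The comparison between the two tables can also be scripted. The following Python sketch lists genomes that appear in the genome list but not in the summary. It assumes both files are tab-delimited and share a name column; the column names `Name` and `Genes` and the sample rows are hypothetical, so adjust them to the headers in your own files.

```python
import csv
import io

def names_in_table(text, name_column="Name"):
    """Collect the non-empty values of name_column from a tab-delimited table."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[name_column].strip() for row in reader if row.get(name_column)}

def missing_from_summary(genomes_text, summary_text, name_column="Name"):
    """Return genome names listed in the genome table but absent from the summary."""
    listed = names_in_table(genomes_text, name_column)
    summarized = names_in_table(summary_text, name_column)
    return sorted(listed - summarized)

# Hypothetical contents standing in for Genomes.txt and Genomes_summary.txt.
GENOMES = (
    "Name\tRefSeq.FTP\n"
    "Genome A\tftp://example.org/A\n"
    "Genome B\tftp://example.org/B\n"
)
SUMMARY = "Name\tGenes\nGenome A\t10\n"

print(missing_from_summary(GENOMES, SUMMARY))  # Genome B was not integrated
```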

If everything went well, you can now deploy "bacnet.e4.rap" by running it in Eclipse. Open the Genomics tool; it should list the genomes you integrated.

Genomic tool

Clicking a genome, for example Listeria monocytogenes EGD-e, should open the Gene panel with all of its gene information.

Gene tool

Next, continue with the Phylogeny data creation panel in "bacnet.e4.rap.setup".