dezordi/ncbi-acess

ncbi-acess

This script is a toolbox for automatically retrieving information from NCBI.

Dependencies

This script was built with Python 3.6.5+ and has only two dependencies:

Recommended reading

Usage

  • To retrieve GenBank information for nucleotide sequences:

python ncbi_seq_retrieve.py -in file_with_access_ids.txt -db nucleotide -ot gb

To retrieve the records in XML format instead, add the parameter -tf xml.

  • To retrieve CDS translated to amino acids from nucleotide sequences:

python ncbi_seq_retrieve.py -in file_with_acess_ids.txt -db nucleotide -ot fasta_cds_aa

To retrieve untranslated CDS instead, replace fasta_cds_aa with fasta_cds_na.

  • To retrieve nucleotide or amino acid sequences:

python ncbi_seq_retrieve.py -in file_with_acess_ids.txt -db (nucleotide or protein) -ot fasta

To retrieve the sequences in XML format instead, add the parameter -tf xml.

  • To retrieve taxonomy information for NCBI accession IDs:

python ncbi_seq_retrieve.py -in file_with_acess_ids.txt -db (nucleotide or protein) -ot gb -tx True

  • To retrieve taxonomy information for the hosts of NCBI accession IDs (ideal for viruses):

python ncbi_seq_retrieve.py -in file_with_acess_ids.txt -db (nucleotide or protein) -ot gb -tx True -th True
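
The commands above all take a file of accession IDs, a database, and an output type. This README does not show how the script works internally; the following is only a rough sketch of how such options could map onto an NCBI E-utilities efetch request. It assumes that -db, -ot, and -tf correspond to the efetch parameters db, rettype, and retmode, and that the input file holds one ID per line. The function names read_ids and build_efetch_url are hypothetical, and the accession IDs are only examples.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_ids(lines):
    """Return stripped, non-empty accession IDs (assumes one ID per line)."""
    return [line.strip() for line in lines if line.strip()]


def build_efetch_url(ids, db="nucleotide", rettype="gb", retmode="text"):
    """Build an efetch URL; assumes -db/-ot/-tf map to db/rettype/retmode."""
    params = {
        "db": db,
        "id": ",".join(ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    return EFETCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Example lines as they might be read from file_with_access_ids.txt.
    ids = read_ids(["NC_045512.2\n", "\n", "MN908947.3\n"])
    print(build_efetch_url(ids, rettype="fasta_cds_aa"))
```

The sketch only builds the request URL, so it can be inspected without contacting NCBI.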

Some considerations

A file with IDs from nucleotide sequences cannot be used with the protein database, and vice versa. Calling the help function shows a table of which text formats are allowed per output type and which output types are allowed per database.
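
As an illustration of that kind of compatibility check, here is a minimal sketch. The mapping below is an assumption: it lists only the database and output-type combinations that appear in the Usage section, not the script's full table, which is available through the help function. The names ALLOWED_OUTPUT_TYPES and is_allowed are hypothetical.

```python
# Assumed mapping, taken only from the combinations shown under Usage;
# the script's own help table is the authoritative source.
ALLOWED_OUTPUT_TYPES = {
    "nucleotide": {"gb", "fasta", "fasta_cds_aa", "fasta_cds_na"},
    "protein": {"gb", "fasta"},
}


def is_allowed(db, output_type):
    """Return True if output_type is listed for db in the assumed mapping."""
    return output_type in ALLOWED_OUTPUT_TYPES.get(db, set())


if __name__ == "__main__":
    for db, ot in [("nucleotide", "fasta_cds_aa"), ("protein", "fasta_cds_na")]:
        status = "allowed" if is_allowed(db, ot) else "not listed"
        print(f"-db {db} -ot {ot}: {status}")
```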

Disclaimer

  • This script will continue to be developed to cover other functions, such as retrieving sequence features.
