Issue 111 + 112 uniprot #115
retry failed batches and parse ids individually when they may be unlisted in NCBI
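The retry-then-fall-back behaviour described in this commit could look roughly like the sketch below. It is not the code from the pull request: `fetch_with_fallback`, its parameters, and the `fake_fetch` stand-in for a remote NCBI query are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def fetch_with_fallback(
    ids: Iterable[str],
    fetch: Callable[[List[str]], Dict[str, str]],
    batch_size: int = 3,
    retries: int = 2,
) -> Tuple[Dict[str, str], List[str]]:
    """Fetch ids in batches; retry failures, then query ids one at a time."""
    id_list = list(ids)
    records: Dict[str, str] = {}
    unlisted: List[str] = []

    for start in range(0, len(id_list), batch_size):
        batch = id_list[start:start + batch_size]

        for _ in range(retries):
            try:
                records.update(fetch(batch))
                break  # batch succeeded, stop retrying
            except ValueError:
                continue
        else:
            # Every attempt failed: one id may be unlisted, so try each alone.
            for single in batch:
                try:
                    records.update(fetch([single]))
                except ValueError:
                    unlisted.append(single)

    return records, unlisted


def fake_fetch(batch: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in for a remote query; fails if 'BAD' is in the batch."""
    if "BAD" in batch:
        raise ValueError("unlisted id in batch")
    return {acc: f"record-{acc}" for acc in batch}


records, unlisted = fetch_with_fallback(["A", "B", "BAD", "C"], fake_fetch)
```

In this sketch a single unlisted id costs only its own record; the other ids in the failed batch are still recovered by the individual queries.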
remove unused sections of code, and use new code in ncbi.tax.linege
Do not rely on entrez.elink to retrieve the correct protein record id for a given protein accession. Batch fetch protein records to pair the protein id and accession correctly.
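A minimal sketch of the pairing step, assuming the batch-fetched records have already been parsed into dictionaries. The keys `id` and `accession`, the function name, and the sample values are hypothetical and are not taken from the pull request or from the NCBI response format.

```python
from typing import Dict, Iterable, List, Tuple


def pair_ids_with_accessions(
    fetched: Iterable[Dict[str, str]],
    requested: Iterable[str],
) -> Tuple[Dict[str, str], List[str]]:
    """Map each requested accession to the id reported by its own record."""
    wanted = set(requested)
    paired: Dict[str, str] = {}

    for record in fetched:
        accession = record.get("accession")
        record_id = record.get("id")
        # Trust only what the record itself reports, never its position
        # in the response, which may differ from the order of the request.
        if accession in wanted and record_id is not None:
            paired[accession] = record_id

    missing = sorted(wanted - set(paired))
    return paired, missing


# Hand-written placeholder records, returned in a different order than requested.
fetched_records = [
    {"id": "1002", "accession": "ACC_B.1"},
    {"id": "1001", "accession": "ACC_A.1"},
    {"id": "1003", "accession": "ACC_X.1"},
]
paired, missing = pair_ids_with_accessions(
    fetched_records, ["ACC_A.1", "ACC_B.1", "ACC_C.1"]
)
```

Accessions that never appear in a fetched record end up in `missing`, so they can be retried or reported instead of being silently mispaired.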
Skip downloading the latest taxonomic information from NCBI where multiple taxonomic classifications are retrieved from CAZy for a protein. The first taxonomy retrieved from CAZy will be added to the local CAZyme database.
No longer retrieve tax data from NCBI by default. Will use the first taxon listed in the CAZy db dump. If the new flag is used, retrieve the latest taxonomic classifications from NCBI for proteins listed with multiple taxa in CAZy
retrieve taxonomic classifications from uniprot and add to the local db
associate genus and species
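One simple way to associate a genus with its species is to split the organism name on the first space. This is only a sketch under that assumption: the function name is hypothetical, and the pull request may handle strains or unusual names differently.

```python
from typing import Tuple


def split_genus_species(organism: str) -> Tuple[str, str]:
    """Treat the first word as the genus and the remainder as the species."""
    genus, _, species = organism.strip().partition(" ")
    return genus, species.strip()


with_strain = split_genus_species("Aspergillus niger CBS 513.88")
genus_only = split_genus_species("Trichoderma")
```

Under this rule any strain designation stays attached to the species part, and a name with no space yields an empty species string.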
Codecov Report
@@            Coverage Diff             @@
##           master     #115      +/-   ##
==========================================
- Coverage   56.22%   53.10%   -3.13%
==========================================
  Files          61       69       +8
  Lines        5576     6011     +435
==========================================
+ Hits         3135     3192      +57
- Misses       2441     2819     +378
all retrieved taxa are placed under the key 'taxonomy'; the finally selected organism is placed under 'organism'
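The two keys named in this commit could be populated as in the sketch below. Only the key names 'taxonomy' and 'organism' come from the commit message; the function name, the use of a list for 'taxonomy', and the optional NCBI override are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Union

TaxRecord = Dict[str, Union[List[str], str]]


def build_tax_record(
    cazy_taxa: List[str],
    ncbi_latest: Optional[str] = None,
) -> TaxRecord:
    """Keep every retrieved taxon and record which organism was selected."""
    if not cazy_taxa:
        raise ValueError("at least one taxon is required")

    # Prefer the latest NCBI classification when one was retrieved,
    # otherwise fall back to the first taxon listed by CAZy.
    selected = ncbi_latest if ncbi_latest is not None else cazy_taxa[0]
    return {"taxonomy": list(cazy_taxa), "organism": selected}


first_listed = build_tax_record(["Genus alpha", "Genus beta"])
from_ncbi = build_tax_record(["Genus alpha", "Genus beta"], ncbi_latest="Genus beta")
```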
return to the default of retrieving the latest tax classifications from NCBI for those proteins listed under multiple taxa. CAZy seems to have reduced the number of proteins listed under multiple taxa
add missing 'db_id' key to extract the UniProt db id from the uniprot table dict
New in version 2.3.0
Migrate from bioservices.UniProt().get_df() to bioservices.UniProt().mapping(): bioservices get_df method
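After such a migration, the code has to read organism data from the mapping result and no longer from a dataframe. The sketch below assumes the result is a dictionary shaped like the UniProt ID-mapping JSON: a `results` list of `from`/`to` pairs plus a `failedIds` list. That shape, the function name, and the placeholder values are assumptions, not details confirmed by this pull request, and the sketch does not call bioservices itself.

```python
from typing import Any, Dict, List, Tuple


def parse_mapping_result(result: Dict[str, Any]) -> Tuple[Dict[str, str], List[str]]:
    """Extract the organism name per queried accession, plus the failed ids."""
    organisms: Dict[str, str] = {}

    for entry in result.get("results", []):
        target = entry.get("to")
        # Some mapping targets return a plain string, not a full entry.
        if not isinstance(target, dict):
            continue
        name = target.get("organism", {}).get("scientificName")
        if name:
            organisms[entry["from"]] = name

    return organisms, list(result.get("failedIds", []))


# Hand-written placeholder shaped like the assumed response.
sample_result = {
    "results": [
        {"from": "ACC1", "to": {"organism": {"scientificName": "Genus alpha"}}},
        {"from": "ACC2", "to": "plain-string-target"},
    ],
    "failedIds": ["ACC3"],
}
organisms, failed = parse_mapping_result(sample_result)
```

Keeping the parsing separate from the network call makes it possible to test this step without contacting UniProt.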