I was wondering if it would be possible to provide a filtering option based on assembly (species/assigned) name? I often want to pull a group of microbes that share a general metabolic capability (say, methanogenesis), but currently I have to pick out the TaxIDs manually to do so. Not a major problem, but the feature might be useful for other people too!
Hi, thanks for the suggestion. genome_updater selects and filters data based on the assembly_summary.txt file provided by NCBI (more info: https://ftp.ncbi.nlm.nih.gov/genomes/README_assembly_summary.txt). Besides the filter parameters, the -F option allows custom filtering for data selection. However, I'm not sure the information you refer to is contained in that file.
Column 8 would be the target, I think. I believe right now the -F option is an exact match though, so I am thinking of another flag that basically uses grep behind the scenes to implement the matching. I'd basically want to grab all the assemblies with an organism name matching "methano*", if that makes sense. Obviously would not be perfect, but could be handy if you have a specific enough search string.
Partial matching should be doable; I will mark it as an enhancement. For now, one can download the full assembly_summary.txt from GenBank or RefSeq, apply the filter/grep manually, and use the resulting file as an external assembly_summary.txt (param. -e).
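In case it helps anyone landing here, the manual workaround could look roughly like the Python sketch below. It is not part of genome_updater; the function names `filter_by_organism_name` and `filter_file` are made up for illustration. It assumes a tab-separated assembly_summary.txt whose header lines start with `#` and whose column 8 holds the organism name, as discussed above, and it does a case-insensitive regular-expression search rather than an exact match.

```python
import re
from typing import Iterable, Iterator

# 1-based index of the organism name column, as discussed in this thread.
ORGANISM_NAME_COLUMN = 8


def filter_by_organism_name(
    lines: Iterable[str],
    pattern: str,
    column: int = ORGANISM_NAME_COLUMN,
) -> Iterator[str]:
    """Yield header lines plus rows whose organism name matches `pattern`.

    Hypothetical helper: `pattern` is a regular expression searched
    case-insensitively in the given 1-based column.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        # Keep the "#" header lines so the output still looks like
        # a regular assembly_summary.txt.
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip rows that are too short to contain the target column.
        if len(fields) >= column and regex.search(fields[column - 1]):
            yield line


def filter_file(src_path: str, dst_path: str, pattern: str) -> None:
    """Write the filtered copy of `src_path` to `dst_path`."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(filter_by_organism_name(src, pattern))
```

Calling `filter_file("assembly_summary.txt", "methano_summary.txt", r"^methano")` would keep only organisms whose name starts with "methano" (in any letter case), and the resulting file can then be passed as the external assembly_summary.txt with -e. As noted above, name matching is only a rough proxy for a metabolic capability, so the result is worth a quick look before downloading.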