allow gzipped lineage files #214

Open
taylorreiter opened this issue Jun 1, 2022 · 0 comments

The GTDB lineage files are distributed as gzip-compressed files. The rule make_contigs_search_taxonomy_wc fails when given a gzipped lineage file:

Traceback (most recent call last):
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 151, in <module>
    returncode = cmdline(sys.argv[1:])
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 146, in cmdline
    return main(args)
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/charcoal/contigs_search_taxonomy.py", line 27, in main
    tax_assign, _ = load_taxonomy_assignments(args.lineages_csv,
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/site-packages/sourmash/lca/command_index.py", line 39, in load_taxonomy_assignments
    first_row = next(iter(r))
  File "/home/tereiter/github/2022-dominating-set-differential-abundance-example/.snakemake/conda/df16191f60f78adeb9f40112bb67409b/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

It would be super convenient to allow for gzipped lineage csv files.
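
For reference, the byte 0x8b at position 1 in the error above is the second byte of the gzip magic number (0x1f 0x8b), so the file is being read as plain UTF-8 text without being decompressed. Below is a rough sketch of one way the lineage file could be opened transparently. The helper names `open_maybe_gzipped` and `read_first_row` are made up for illustration and are not part of charcoal or sourmash; where such a check would best live in either codebase is an open question.

```python
import csv
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzipped(path, encoding="utf-8"):
    """Hypothetical helper: open `path` as text, decompressing gzip data.

    Detection uses the file's magic bytes rather than its extension, so a
    gzipped file without a `.gz` suffix is still handled.
    """
    with open(path, "rb") as fp:
        is_gzipped = fp.read(2) == GZIP_MAGIC
    if is_gzipped:
        # "rt" makes gzip.open return a decoded text stream.
        return gzip.open(path, "rt", encoding=encoding, newline="")
    return open(path, "rt", encoding=encoding, newline="")


def read_first_row(path):
    """Return the first CSV row, mirroring the `next(iter(r))` call
    that fails in the traceback."""
    with open_maybe_gzipped(path) as fp:
        return next(iter(csv.reader(fp)))
```

With something like this, `read_first_row("lineages.csv")` and `read_first_row("lineages.csv.gz")` would both return the header row (the file names here are placeholders).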
