taxonkit lca gives error `[ERRO] bufio.Scanner: token too long` sometimes #75

Closed
4 tasks done
taylorreiter opened this issue Mar 24, 2023 · 4 comments

@taylorreiter

Prerequisites

  • make sure you're using the latest version (check with taxonkit version): I'm using taxonkit v0.14.1 (installed via conda)
  • read the usage

Describe your issue

  • describe the problem: taxonkit lca gives error [ERRO] bufio.Scanner: token too long on some files. I ran this command on 64k of these files, and it gave this error for 71 of them. I've attached one file below.
  • provide a reproducible example
# install taxonkit
conda install taxonkit
# download taxid -> lineage file. required for taxonkit
wget https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
tar xf taxdump.tar.gz
taxonkit lca --data-dir . -i 2 -s ";" -o test_lca.txt test.txt

test.txt: A TSV-formatted input file

@shenwei356 shenwei356 added the bug label Mar 24, 2023
@shenwei356
Owner

It's due to some long rows, similar to shenwei356/seqkit#214.

@shenwei356
Owner

shenwei356 commented Mar 26, 2023

Hi Taylor, thanks for reporting this. I've just increased the default size of the line buffer from 4096 to 1M. It works now.

If the error still occurs, you can use a bigger buffer size:

  -b, --buffer-size string   size of line buffer, supported unit: K, M, G. You need to increase the
                             value when "bufio.Scanner: token too long" error occured (default "1M")
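
For anyone unsure how large a buffer to pass, one option is to measure the longest line in the input first. The helper below is only a sketch: `longest_line_bytes` is a made-up name, not part of taxonkit, and the `16M` in the commented usage is an arbitrary illustrative value, not a recommendation from the maintainer. The `-b` flag itself is the one shown in the help text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of taxonkit): print the length in bytes of
# the longest line in a file, excluding the trailing newline.
# LC_ALL=C makes awk count bytes rather than multibyte characters.
longest_line_bytes() {
    LC_ALL=C awk '{ n = length($0); if (n > max) max = n } END { print max + 0 }' "$1"
}

# Example usage with the file from the report (left commented out because it
# needs test.txt and the taxdump files in the current directory):
#   longest_line_bytes test.txt
#   taxonkit lca --data-dir . -i 2 -s ";" -b 16M -o test_lca.txt test.txt
```

If the reported length is near or above the current buffer size, picking a `-b` value comfortably larger than it should be enough to avoid the error.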

shenwei356 added a commit that referenced this issue Mar 27, 2023
@taylorreiter
Author

thank you so much @shenwei356! I've tried this on both Mac and Linux and it solved the problem. Thank you!!!!!

@shenwei356
Owner

Please don't hesitate to let me know if you encounter any additional problems.
