[ERRO] bufio.Scanner: token too long #214

Closed
Axze-rgb opened this issue Jun 3, 2021 · 6 comments

Axze-rgb commented Jun 3, 2021

Hello, I have a tab-delimited file like this:

head tmp.tab
dummy_TRINITY_DN39498_c0_g1_i2 len=256 path=[0:0-255]	ATGCCGCAAGAAAGACCTCAGAGGTTTATCTGAACAGTGTTTGATCCAGCAGGTGAAAATAGAAGTGTTCAGATAAACCTCTGTCCTTCAGCTATGTTGTTGTTCATTCCCTGGCTGAACACTGGGGATCGCCCTGTCTATGCATCCTCGCGGGCGCTTGCGGGGTCGCTGGGTGCGGTTCTTGGGGTGAGAGGATTGCGTGAGTGGCTGTTTTTTGTTGCCTCGTCACATGGGCAGCCGCACTCCATCTTCAGAG	
dummy__TRINITY_DN39438_c0_g1_i1 len=717 path=[0:0-274 2:275-607 4:608-716]	CGATCTGTTTTCCGGACACTGAGTGGGCCCCGGCTTATTAGGATTTTAAAGCACGACAATAGAGAGGGCAGAGACTATGTGCCAAACATACTGGAATCTAGTGGGGACAAGTTTTTACAAGTGATTCATGCATGTTGGTCGATATCTTGGTGGAGCTCCCCTGTCATCCTGACCGTGCTGTATCGACGAGGATATTTCAGCCACGAAGGAATGGTGTCCATTAGCAAGTTCCTGTTCTCTGTTGGGGTTGTCTATGTTGGTGCTTACTGTTTAAGAGGTCTGGGTCGGCTGACCAACCCTGACTACATGATGTTCATCAATGTGCTGGTTCAAGCCCTCCACTCTCCCAGCTTTCAGAACAGCAAGAGGCTGCTGCACCACTATGACTTTGAGTTCTGGGCTTCACCAGTAGACTTCCAGTGGAACGAAGGCTCAGAGAGCAAAGGGGAGCGGGTAGGACACAGCAAAGTTCAAGCACCTGTGGAATCCAACTCTTCTTCCGGGATCATGGGACTGCCCTTCAGGCTGCTCAGTTACTTTGCCATTCACACCTTTGGACGACGCATGGTCTATCCCGGGGCCACTTCTTTGATGCAGATGCTTGTAGGTTAGTAACCTTTTTCTCTCTTCTCTTTAAACTTTTTTTGTTAAATTTAGAAATCAATCATTTGTTATATATTTTGAGCATTGTCAGACCTGTTCAGTCTTAAAATTGTG	
dummy__TRINITY_DN39438_c0_g1_i2 len=1791 path=[0:0-274 2:275-607 3:608-1790]	CGATCTGTTTTCCGGACACTGAGTGGGCCCCGGCTTATTAGGATTTTAAAGCACGACAATAGAGAGGGCAGAGACTATGTGCCAAACATACTGGAATCTAGTGGGGACAAGTTTTTACAAGTGATTCATGCATGTTGGTCGATATCTTGGTGGAGCTCCCCTGTCATCCTGACCGTGCTGTATCGACGAGGATATTTCAGCCACGAAGGAATGGTGTCCATTAGCAAGTTCCTGTTCTCTGTTGGGGTTGTCTATGTTGGTGCTTACTGTTTAAGAGGTCTGGGTCGGCTGACCAACCCTGACTACATGATGTTCATCAATGTGCTGGTTCAAGCCCTCCACTCTCCCAGCTTTCAGAACAGCAAGAGGCTGCTGCACCACTATGACTTTGAGTTCTGGGCTTCACCAGTAGACTTCCAGTGGAACGAAGGCTCAGAGAGCAAAGGGGAGCGGGTAGGACACAGCAAAGTTCAAGCACCTGTGGAATCCAACTCTTCTTCCGGGATCATGGGACTGCCCTTCAGGCTGCTCAGTTACTTTGCCATTCACACCTTTGGACGACGCATGGTCTATCCCGGGGCCACTTCTTTGATGCAGATGCTTGTAGGAGGTATGCTGAACCAGGGTCGGGCAAATCTCCTTGAAGAGAAGAAAGGAATACGAGCCAAGCTGCTGACAGAAGACAACAGTGAGATTGACACCATCTTCGTGGACAGGCGCAGAGTTGGCAGCCGCTATGGAAACACACTGGTGGTGTGCTGTGAGGGCAATGCAGGGTTCTATGAGATGGGCTGCTCAGGGACCCCCCTGGAGTGTGGATACTCTGTGCTGGGCTGGAACCACCCAGGCTTTGCCGGCAGCTCTGGGGTACCATTCCCCGATTCAGAACAGAATGCCATTGATGCTGTGATGAAGTACGCCATCTACCGCCTGGGCTTCCAGCCACACACCATCGTTTTGTTTGCCTGGTCCATTGGTGGGTACAGTGCCTCCTGGGCAGCCATGAACTACCCCGATGTCAAATCTGTGATCCTGGATGCCACCTTTGATGATGTTGTTCCTCTGGCCCTGACCCAGATGCCTCAGTCATGGAGTGGCATTGTGACAATGGCGATGAGAAACCACATGAATCTGAACATTGCTGATCAAATTTTAAAATACCCAGGCCCCATCAAGCTCATACGCAGAGTCCGAGATGAAATCATTGCCACAGAGAAGACGGAGGGAATGAACTCAGTATTGTCATCCAACCGTGGCAATTTTCTGCTCATTCGGCTCCTGCAGCATCGGTACCCCAAAATTGTGGATCGTGAGACCACACCTCTCCTGGAAGACTTCCTTAGTGGTACCAGACAACACCAAGACAAAATGCTGATGGAGCACCATGTAGTGCTGGAGGAGTGTGAAGCCAAGCTGCGTAGCTACGCCCAGGACATCTCTTCCTCCTACCCCATGCTGATAGGGGACGGGGAAACACCAGCAGTCAGAAAGCAGCTGACATTATTCCTGGCCTACAAATACATGGAGGACTTTGATTCTACGCACTGCACTCCTCTACCGATGTTCTACCTGTCACCCCCCTGGTCACCAATGGATTGATCCACTTCCACCCAGCGTTTTTATTGAAAAGTTCTGTTGTCTCCAAAAATATTTGTGACTGCTTTTTTTGGTTTCTCATTGTTGCGGTTAATTTTGTTGCTGTTCAGTTTTTGTCAGTAGTGCTTCTTTTTTGTACTGTGTTTGTAGGCGTGACTCTTTGTGTAGTGTTCTGTGTGTATGTGTGTGTGCAAG	
dummy__TRINITY_DN39438_c0_g1_i3 len=1669 path=[1:0-152 2:153-485 3:486-1668]	ATCTCAACACACACACAGGCTATTCTGTTGTTTCTTCAACGTAAGCAGTCAAAAACAAAACAAAACAAAACAAACAAACAAGCAATACAACCATAACGAAACAGGGATAAGCAGCAGAAGAAGTGAGTGAGTGAAGAGCGCGGTATAATGGGCAGGTCTGGGTCGGCTGACCAACCCTGACTACATGATGTTCATCAATGTGCTGGTTCAAGCCCTCCACTCTCCCAGCTTTCAGAACAGCAAGAGGCTGCTGCACCACTATGACTTTGAGTTCTGGGCTTCACCAGTAGACTTCCAGTGGAACGAAGGCTCAGAGAGCAAAGGGGAGCGGGTAGGACACAGCAAAGTTCAAGCACCTGTGGAATCCAACTCTTCTTCCGGGATCATGGGACTGCCCTTCAGGCTGCTCAGTTACTTTGCCATTCACACCTTTGGACGACGCATGGTCTATCCCGGGGCCACTTCTTTGATGCAGATGCTTGTAGGAGGTATGCTGAACCAGGGTCGGGCAAATCTCCTTGAAGAGAAGAAAGGAATACGAGCCAAGCTGCTGACAGAAGACAACAGTGAGATTGACACCATCTTCGTGGACAGGCGCAGAGTTGGCAGCCGCTATGGAAACACACTGGTGGTGTGCTGTGAGGGCAATGCAGGGTTCTATGAGATGGGCTGCTCAGGGACCCCCCTGGAGTGTGGATACTCTGTGCTGGGCTGGAACCACCCAGGCTTTGCCGGCAGCTCTGGGGTACCATTCCCCGATTCAGAACAGAATGCCATTGATGCTGTGATGAAGTACGCCATCTACCGCCTGGGCTTCCAGCCACACACCATCGTTTTGTTTGCCTGGTCCATTGGTGGGTACAGTGCCTCCTGGGCAGCCATGAACTACCCCGATGTCAAATCTGTGATCCTGGATGCCACCTTTGATGATGTTGTTCCTCTGGCCCTGACCCAGATGCCTCAGTCATGGAGTGGCATTGTGACAATGGCGATGAGAAACCACATGAATCTGAACATTGCTGATCAAATTTTAAAATACCCAGGCCCCATCAAGCTCATACGCAGAGTCCGAGATGAAATCATTGCCACAGAGAAGACGGAGGGAATGAACTCAGTATTGTCATCCAACCGTGGCAATTTTCTGCTCATTCGGCTCCTGCAGCATCGGTACCCCAAAATTGTGGATCGTGAGACCACACCTCTCCTGGAAGACTTCCTTAGTGGTACCAGACAACACCAAGACAAAATGCTGATGGAGCACCATGTAGTGCTGGAGGAGTGTGAAGCCAAGCTGCGTAGCTACGCCCAGGACATCTCTTCCTCCTACCCCATGCTGATAGGGGACGGGGAAACACCAGCAGTCAGAAAGCAGCTGACATTATTCCTGGCCTACAAATACATGGAGGACTTTGATTCTACGCACTGCACTCCTCTACCGATGTTCTACCTGTCACCCCCCTGGTCACCAATGGATTGATCCACTTCCACCCAGCGTTTTTATTGAAAAGTTCTGTTGTCTCCAAAAATATTTGTGACTGCTTTTTTTGGTTTCTCATTGTTGCGGTTAATTTTGTTGCTGTTCAGTTTTTGTCAGTAGTGCTTCTTTTTTGTACTGTGTTTGTAGGCGTGACTCTTTGTGTAGTGTTCTGTGTGTATGTGTGTGTGCAAG	
dummy__TRINITY_DN39497_c0_g1_i2 len=204 path=[0:0-203]	CAGGGGGGTGGGGCCTTGTTCTCCAACCTGTGGTTCCCCAACAACCAACGGGCCACCTCCACCGCCATCTCTACCTTCATCGGCTATACCGGCAGTGCTCTGGCCTTTGTTCTGGGGCCCTCTCTTGTTCCCAGTCCTTCCGACGACGTCAACTCCAACCAGACGCTGCCTGACGGGAACGACACAAGCGGAGCCGACCTACAG	
dummy__TRINITY_DN39436_c0_g1_i1 len=269 path=[0:0-85 2:86-268]	TGTTTGTCTGTCACCTTGTCGGCAGGTCTGTCAGTCCTTGCCATTCAGGAGGCCAGAGGAAAAAAAAAAAGCAAAAAAAAAAGCTACAAAGACACCCTGAAAGCGTCACTGAAGGTCTTCAGCATCGATCACGACTCATGGGAGCAGGCAGCTTTGGACAGAGCGAAGTGGCGCTCAGCTGTCTACCATAGCGCGCAGAGCTGTGAATCCAACAGGACAGCGGCAGCTGAGCTGAGCAAACAGGCCAGGAAAGCCCCAGCCAGCGCACC	
dummy__TRINITY_DN39436_c0_g1_i2 len=456 path=[1:0-272 2:273-455]	CGTCAATATGCAATTGCTATAGCAGATTTCATGGTGTGCTGTGGAAGCACTTTTCGTGCATTTCCATATTGTCAACATAAAATTCATACAGTCCCAGACACCGAGGTCCTCATCAGTGCAGACCTGCCCAGTATCCGTACCATCCTTATGCAGTCATAGCTTCGTTGGGCAGGTCATGTAGTTCGCATGCCAGACCTCCGGCTCCCAAAGAAACTCTTATTCTGCAAACTCCAATACGGCAATTGCTCCCAAGAAGGCCAGAACAAAGCGCTTCAAAGACACCCTGAAAGCGTCACTGAAGGTCTTCAGCATCGATCACGACTCATGGGAGCAGGCAGCTTTGGACAGAGCGAAGTGGCGCTCAGCTGTCTACCATAGCGCGCAGAGCTGTGAATCCAACAGGACAGCGGCAGCTGAGCTGAGCAAACAGGCCAGGAAAGCCCCAGCCAGCGCACC	
dummy__TRINITY_DN39429_c0_g1_i1 len=1122 path=[0:0-927 2:928-1121]	CTGTTTCACACCTTCCCCAGCTTTTCCTTCAAGATGTTCTCGGACATGTCCAGCATGTCCGCAGCCACTGATCGGTCCCTACAGGGGGAAAATGCCTACCTGAGGGAGAACATAGAGAAAGAGCGCTATAGGAGAAAGCATTGTGAACAACAGATTCAGAGTCTGAATGCCAAGTTGCTGGAAATACAACAGCAGCTGGCTGTGGCTATCTCCACTGACAAGCGCAAGGACATCATGATTGACCAGCTGGATAAACAACTGGCCAAAGTAGTGGAAGGGTGGAAGAAGCGAGAAGCAGAAAAGGACGAGTACATGTCCTTGGTGATGAAGGAGAAGTCTCAGATTGAAGAAACGTTGCAGAAACAGCAGGCAATGATTGACAGCTTTGAGAAGGAGCTGGCCAACACAGTGGATGAGCTGAAGCAGGAGAAGGAGAACTCTGCAGAGCTGGTGGACCAGATGAAGGCCCAGCTGTTGTCAGCGGTGCATGACCAGCGTCACGCTGAGGAGATGTTGGCAGCAGAGAAGGAGCGGGTAACTCTGATGGAGAGAGAGTGGGATCAGCTAAAGGAGGCACGGGACTTGGCGGAAAAACGAGGACAGCAAGTCCAGGATCGACTTCATCAGGAGCAGGACAGCTGGTACCAGCGGGAACAGGAACTGGTGCACAAGATTGACCAGGTCAAGGAGGCCAACCTCAAGGTCATGCAGATGGAACGGGTGAAGCTAGAGGAGCAGATGAAGAAAGCTGAAGACCTGGAGGAGCAAGCCCACACAGCGAACACAGAGGTCAAGAGGCTGGAGATGGAAGTGGACTCAGCTGTGAGGGAGAAGGAGAGCCTCAAAGTGGAGATGGCTCTGATGGAGGCCAAATTTGAGAGTGCACAGCGCACCCTGGAGGCAGACCTGCGTGGTCAGATGGAGAAAGAGATCTCAGAGCAGGTGGGCGAGGTGCAGAGGCGAATGCGACAGGAGCAAGAGGAGCAGGGGGAGAAGCACCGCCAGTTGGTGTCTGAGCTCCACCAGCGGCACCAGCGTGACCTGGACATGCAGTTGGCCACCCTGCGCCAGGACCTGGGCCGGCGGGAGGATGACCTGAAGGAGCAGTTAGCAGAGATGGAG	
dummy__TRINITY_DN39429_c0_g1_i3 len=276 path=[1:0-81 2:82-275]	CCCCAGTTCCTTCATTTTGACTTCAGGTTGGCAGGTACCTGTAAACATTCTTCACTAACGCACTCCCACCCCTTTTTCTGTCAGATCTCAGAGCAGGTGGGCGAGGTGCAGAGGCGAATGCGACAGGAGCAAGAGGAGCAGGGGGAGAAGCACCGCCAGTTGGTGTCTGAGCTCCACCAGCGGCACCAGCGTGACCTGGACATGCAGTTGGCCACCCTGCGCCAGGACCTGGGCCGGCGGGAGGATGACCTGAAGGAGCAGTTAGCAGAGATGGAG	
dummy__TRINITY_DN39458_c0_g1_i1 len=301 path=[0:0-300]	AATGAATGTGTGTGTGTGTGAATGAATGTGTGTGAATGAAAATGTGCATGTGTGAATGAATGTGCATGTGTGTGAATGAATGTGTGTGTGTGTGAATGAATGTGTGTGAATGAAAATGTGCATGTGTGAATGAATGTGCATGTGTGTGAATGAATGTGTGTGTGTGTGAATAAATTGTGTATGTGTGTGAACGAATGTGTGTGTGAATAAGTTTGTGTGTGTGAATATGTGTGCATGCGCTCGTGTGAATGAATGTGTGTGTTTGTATGTATGTGTGAGAATGAATGTTTGTGTGTGTGTG	

Then I convert it back to FASTA:


seqkit tab2fx tmp.tab > without_conus_all.fa
[ERRO] bufio.Scanner: token too long

What is the "token" here?
Note that the final FASTA file is not empty, but I don't know whether the error is harmless or whether some tab records were not converted to FASTA.

Thank you!

shenwei356 added the bug label Jun 3, 2021

Axze-rgb commented Jun 3, 2021

By the way, I can send you the complete files if needed.

shenwei356 (Owner) commented

Yes, please send the compressed file.

shenwei356 (Owner) commented

I've confirmed it's a bug that occurs with very long sequences, such as human chr1.

shenwei356 (Owner) commented


Axze-rgb commented Jun 3, 2021

Wow, thanks! You rock.


Axze-rgb commented Jun 3, 2021

Indeed, it works perfectly.

Thanks for seqkit, it's a very useful piece of software.
