Error writing output: ValueError: max() arg is an empty sequence #6

Talitrus · 2018-02-12T17:41:07Z

Hi Eddi,

I've been trying to get BLCA to run with a custom database. As far as I can tell, I have everything formatted as specified in the README, but BLCA errors out while trying to write the output files with the following messages:

blastdbcmd is located in your PATH!
muscle is located in your PATH!
>  > Fasta file read in!!
>  > Read in taxonomy information!
blastn is located in your PATH!
> > Running blast!!
> > Blastn Finished!!
>  > Read in blast output!
Traceback (most recent call last):
  File "/groups/cbi/shared/apps/BLCA/BLCA.git/2.blca_main.py", line 352, in <module>
    outout.write(le+":"+max(lexsum,key=lexsum.get)+";"+str(max(lexsum.values()))+";")
ValueError: max() arg is an empty sequence
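
For readers who land here from the same traceback: Python's built-in `max()` raises `ValueError` whenever it is handed an empty container, so the failing line implies that `lexsum` held no entries for that taxonomic level. The sketch below is not BLCA code. `best_label` is a hypothetical helper that shows one way such a lookup could be guarded so that an empty score table is reported instead of crashing.

```python
def best_label(scores):
    """Return (label, score) for the highest-scoring label.

    `scores` maps a taxon name to a numeric score. Returns None when
    the mapping is empty, where a bare max() would raise ValueError.
    """
    if not scores:
        return None
    label = max(scores, key=scores.get)
    return label, scores[label]


# A bare max() on an empty dict fails the same way as the traceback above.
try:
    max({}, key={}.get)
    raised = False
except ValueError:
    raised = True

print(raised)
print(best_label({"Tedania ignis": 100.0, "Clathria abietina": 73.5}))
print(best_label({}))
```

A caller could treat the `None` result as "no taxonomy parsed for this level" and print a message pointing at the taxonomy file, rather than failing while writing output.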

I managed to get it to run successfully once or twice with different sequence IDs, but I haven't been able to replicate that. Even then, it kept identifying sequences as "Unclassified," which is probably a separate issue.

I've uploaded my blastdb, reference FASTA used to generate the blastdb (midori_blca_dedup.fasta), reference taxonomy file (midori_blca_taxonomy.txt), and two test input FASTA files (uniques10_trim.fasta) and (other_BLCA_test_input.fasta) to a Dropbox link here: https://www.dropbox.com/sh/m3hie0b9o8ldc29/AABHeBAiS94lwl_skgtEY-qIa?dl=0

Thanks for your help!

Cheers,
Bryan


yingeddi2008 · 2018-02-12T19:56:47Z

Hi Bryan,

Thanks for using our software. After examining your files, I noticed that the number of sequences in your database FASTA does not match the number of entries in your taxonomy file: midori_blca_taxonomy.txt has 583043 lines of taxonomy information, while midori_blca.dedup.fasta has 582591 reads. I do not think that caused the error, though.

The other thing is that the delimiter in the taxonomy file should be a tab instead of a space. I forgot to emphasize this in the instructions. Each line should be formatted as follows:

DQ523177[\t]species:Gammarus tigrinus;genus:Gammarus;family:Gammaridae;order:Amphipoda;class:Malacostraca;phylum:Arthropoda;superkingdom:Eukaryota;

I fixed the error by changing the space to a tab. The outputs from the two test FASTA files are as follows:

AY937375.1 superkingdom:Eukaryota;100.0;phylum:Cnidaria;100.0;class:Hydrozoa;100.0;order:Siphonophorae;100.0;family:Prayidae;100.0;genus:Praya;100.0;species:Praya dubia;100.0;
DQ133904.1 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Tedaniidae;100.0;genus:Tedania;100.0;species:Tedania ignis;100.0;
1 Unclassified
3 superkingdom:Eukaryota;100.0;phylum:Cnidaria;100.0;class:Anthozoa;100.0;order:Actiniaria;100.0;family:Aiptasiidae;100.0;genus:Aiptasia;100.0;species:Aiptasia pulchella;100.0;
2 Unclassified
5 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Tedaniidae;100.0;genus:Tedania;100.0;species:Tedania ignis;100.0;
4 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Tedaniidae;100.0;genus:Tedania;100.0;species:Tedania ignis;100.0;
other_BLCA_test_input.fasta.blca.out (END)

sq1abc123 Unclassified
sq10abc123 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Microcionidae;100.0;genus:Clathria;100.0;species:Clathria abietina;73.5;
sq3abc123 superkingdom:Eukaryota;100.0;phylum:Cnidaria;100.0;class:Anthozoa;100.0;order:Actiniaria;100.0;family:Aiptasiidae;100.0;genus:Aiptasia;100.0;species:Aiptasia pulchella;100.0;
sq2abc123 Unclassified
sq5abc123 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Tedaniidae;100.0;genus:Tedania;100.0;species:Tedania ignis;100.0;
sq7abc123 superkingdom:Eukaryota;100.0;phylum:Cnidaria;100.0;class:Anthozoa;100.0;order:Scleractinia;100.0;family:Agariciidae;100.0;genus:Agaricia;100.0;species:Agaricia agaricites;65.75;
sq6abc123 Unclassified
sq9abc123 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Microcionidae;100.0;genus:Clathria;100.0;species:Clathria abietina;56.5;
sq4abc123 superkingdom:Eukaryota;100.0;phylum:Porifera;100.0;class:Demospongiae;100.0;order:Poecilosclerida;100.0;family:Tedaniidae;100.0;genus:Tedania;100.0;species:Tedania ignis;100.0;
sq8abc123 superkingdom:Eukaryota;100.0;phylum:Cnidaria;100.0;class:Anthozoa;100.0;order:Scleractinia;100.0;family:Agariciidae;100.0;genus:Agaricia;100.0;species:Agaricia agaricites;66.5;
uniques10_trim.fasta.blca.out (END)

I believe that if you do the same, the error message will disappear and BLCA will output the taxonomy of your reads. I will add a check statement to the taxonomy reading-in function and update the README file on GitHub.

Please let me know if you have any other problems,
Eddi
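
The reply above names a tab between the sequence ID and the taxonomy string as the required format. The check that was promised for BLCA is not shown in this thread, so the following is only a rough sketch of what a pre-flight check could look like. The function names `check_taxonomy_line` and `find_taxonomy_problems` are hypothetical and not part of BLCA, and the `level:name;` field layout is taken from the example line in the reply.

```python
def check_taxonomy_line(line):
    """Return None if the line looks well formed, else a short problem description."""
    line = line.rstrip("\n")
    if "\t" not in line:
        return "no tab between sequence ID and taxonomy"
    seq_id, taxonomy = line.split("\t", 1)
    if not seq_id.strip():
        return "empty sequence ID"
    # Taxonomy is a run of 'level:name;' fields; drop the empty piece after the last ';'.
    fields = [field for field in taxonomy.split(";") if field.strip()]
    if not fields:
        return "no taxonomy fields"
    for field in fields:
        if ":" not in field:
            return "field %r is not in level:name form" % field
    return None


def find_taxonomy_problems(lines):
    """Return a list of (line_number, problem) for every malformed, non-blank line."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        problem = check_taxonomy_line(line)
        if problem is not None:
            problems.append((number, problem))
    return problems
```

Because `find_taxonomy_problems` accepts any iterable of lines, it can be run directly on an open file, for example `with open("midori_blca_taxonomy.txt") as handle: print(find_taxonomy_problems(handle)[:10])`.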


Talitrus · 2018-02-12T20:04:08Z

Fantastic! Yes, the differing number of lines/reads is because I generated the taxonomy file before deduplication, but I was hoping it wouldn't make a difference (just being lazy). The Midori database has some duplicate sequence IDs, which I didn't realize until I tried to generate the BLAST db.
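
The mismatch described above (a taxonomy file generated before deduplication, plus duplicate sequence IDs in the source database) can be caught before building the BLAST database. The sketch below is an illustration rather than anything BLCA ships. It assumes that the ID is the first whitespace-delimited token of each FASTA header and the first tab-delimited column of each taxonomy line, and `compare_ids` is a hypothetical name.

```python
from collections import Counter


def fasta_ids(lines):
    """Yield the first whitespace-delimited token of every FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                yield tokens[0]


def taxonomy_ids(lines):
    """Yield the first tab-delimited column of every non-blank taxonomy line."""
    for line in lines:
        if line.strip():
            yield line.split("\t", 1)[0].strip()


def compare_ids(fasta_lines, taxonomy_lines):
    """Report duplicated IDs and IDs present in only one of the two files."""
    fasta_counts = Counter(fasta_ids(fasta_lines))
    tax_counts = Counter(taxonomy_ids(taxonomy_lines))
    return {
        "duplicate_fasta_ids": sorted(i for i, n in fasta_counts.items() if n > 1),
        "duplicate_taxonomy_ids": sorted(i for i, n in tax_counts.items() if n > 1),
        "missing_taxonomy": sorted(set(fasta_counts) - set(tax_counts)),
        "unused_taxonomy": sorted(set(tax_counts) - set(fasta_counts)),
    }
```

Both arguments can be open file handles. An empty list under every key would mean the two files describe the same set of unique IDs.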

Thanks for the clarification. I'm looking forward to giving BLCA a try. The results you posted look promising.

Cheers,
Bryan

Talitrus closed this as completed Feb 12, 2018

dswan mentioned this issue Jul 4, 2019

ValueError: max() arg is an empty sequence" #19

Closed
