Error in QC join function #25

Closed
mgopez opened this issue Jan 12, 2018 · 1 comment

Comments

@mgopez
Member

mgopez commented Jan 12, 2018

In the current version of bio_hansel, we get an error on Galaxy as follows:

Fatal error: Exit code 1 ()
2018-01-10 10:53:56,700 DEBUG: Namespace(files=[], force=False, input_directory=None, input_fasta_genome_name=None, keep_tmp=False, low_cov_depth_freq=20, max_intermediate_tiles=0.05, max_kmer_freq=1000, max_missing_tiles=0.05, min_ambiguous_tiles=3, min_kmer_freq=8, output_simple_summary='tech_results.tab', output_summary='results.tab', output_tile_results='match_results.tab', paired_reads=[['CE-R-09-0025_EC20081043_S12_L001_001_1', 'CE-R-09-0025_EC20081043_S12_L001_001_2']], scheme='heidelberg', scheme_name=None, slow=False, threads=1, tmp_dir='/tmp', verbose=3) [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/main.py:209]
2018-01-10 10:53:56,712 INFO: Serial single threaded run mode on 1 input genomes [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py:493]
2018-01-10 10:53:56,713 INFO: genome_name CE-R-09-0025_EC20081043_S12_L001_001 [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py:407]
2018-01-10 10:54:32,736 DEBUG: max substype str len: 7 [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py:446]
2018-01-10 10:54:32,740 DEBUG: pos_subtypes: [[2], [2, 2], [2, 2, 2], [2, 2, 2, 2]] [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py:450]
2018-01-10 10:54:32,741 DEBUG: inconsistent_subtypes: [] [in /Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py:452]
Traceback (most recent call last):
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/bin/hansel", line 11, in <module>
    load_entry_point('bio-hansel==1.1.0', 'console_scripts', 'hansel')()
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/main.py", line 259, in main
    n_threads=n_threads)
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py", line 500, in query_reads_ac
    for fastq_files, genome_name in reads]
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py", line 500, in <listcomp>
    for fastq_files, genome_name in reads]
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/subtyper.py", line 474, in subtype_reads_ac
    st.qc_status, st.qc_message = perform_quality_check(st, df, subtyping_params)
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/qc/__init__.py", line 46, in perform_quality_check
    status, message = func(st, df, subtyping_params)
  File "/Warehouse/galaxy/deps/_conda/envs/mulled-v1-ad86f404540f17af24d34154fed6fcb9f38d2ffb41a623e050ebe4e15ee2ad90/lib/python3.6/site-packages/bio_hansel/qc/checks.py", line 92, in is_mixed_subtype
    '; '.join(conflicting_tiles['refposition'].tolist()),
TypeError: sequence item 0: expected str instance, int found

This happens in qc/checks.py at line 92:
'; '.join(conflicting_tiles['refposition'].tolist())
The refposition column holds integers, and str.join accepts only strings, so the join fails on the first conflicting tile's refposition.

This should be changed to:
'; '.join(conflicting_tiles['refposition'].astype(str).tolist())
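
For anyone wanting to reproduce this outside Galaxy, here is a minimal sketch of the failure and the fix using plain Python lists instead of a DataFrame. The `join_positions` helper and the sample position value are hypothetical and are not part of bio_hansel; calling `str()` on each item is assumed to have the same effect as `.astype(str)` on the column.

```python
def join_positions(positions):
    """Join reference positions into a '; '-separated string.

    Each value is converted with str() first, because str.join raises
    TypeError when it meets a non-string item.
    """
    return '; '.join(str(p) for p in positions)


# Hypothetical value standing in for conflicting_tiles['refposition'].tolist()
refpositions = [12345]

error_message = None
try:
    '; '.join(refpositions)  # same failure mode as in the traceback
except TypeError as err:
    error_message = str(err)

joined = join_positions(refpositions)
print(error_message)
print(joined)
```

Converting at the point of the join keeps the column itself numeric, which may matter if other checks compare or sort positions.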

@peterk87
Contributor

Thanks for catching this, @hellothisisMatt! I'll cut a new release with this bug fixed. I've added a test to ensure it doesn't occur again.

peterk87 pushed a commit that referenced this issue Jan 12, 2018
Signed-off-by: pkruczkiewicz <peter.kruczkiewicz@canada.ca>