Broken pipe with samtools sort and multiple selects #92

@unode

Current master dc769d7 fails with:

[Mon 29-10-2018 02:23:33]: Script OK. Starting interpretation...
[Mon 29-10-2018 02:23:33] Line 6: Running garbage collection.
[Mon 29-10-2018 02:23:33] Line 6: Interpreting [interpretIO]: input = samfile("input.bam")
[Mon 29-10-2018 02:23:33] Line 6: Interpreting [assignment]: samfile("input.bam")
[Mon 29-10-2018 02:23:33] Line 6: Interpreting [samfile]: NGOString "input.bam"
[Mon 29-10-2018 02:23:33] Line 8: Running garbage collection.
[Mon 29-10-2018 02:23:33] Line 8: Interpreting [interpretIO]: input = select(Lookup 'input' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 02:23:33] Line 8: Interpreting [assignment]: select(Lookup 'input' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 02:23:33] Line 9: Running garbage collection.
[Mon 29-10-2018 02:23:33] Line 9: Interpreting [interpretIO]: input = select(Lookup 'input' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 02:23:33] Line 9: Interpreting [assignment]: select(Lookup 'input' as NGLMappedReadSet; keep_if=[{mapped}])
[Mon 29-10-2018 02:23:33] Line 11: Running garbage collection.
[Mon 29-10-2018 02:23:33] Line 11: Interpreting [interpretIO]: input = samtools_sort(Lookup 'input' as NGLMappedReadSet; __output_bam=True; by={name})
[Mon 29-10-2018 02:23:33] Line 11: Interpreting [assignment]: samtools_sort(Lookup 'input' as NGLMappedReadSet; __output_bam=True; by={name})
[Mon 29-10-2018 02:23:33] Line 11: Interpreting [executing module function: 'samtools_sort']: NGOMappedReadSet {nglgroupName = "input.bam", nglSamFile = <STREAM>, nglReference = Nothing}
[Mon 29-10-2018 02:23:33] Line 11: Created & opened temporary file /tmp/sorted_selected_selected_input.23787-0.bam
[Mon 29-10-2018 02:23:33] Line 11: Calling binary /home/u/system/apps/ngless/bin/../share/ngless/bin/ngless-0.9.1-samtools with args: sort -n -@ 1 -O bam -T /tmp/samtools_sort_temp.tmp23787/samruntmp
[Mon 29-10-2018 02:23:33] Line 11: Starting samtools view of input.bam
Exiting after internal error. If you can reproduce this issue, please run your script with the --trace flag and report a bug at http://github.com/ngless-toolkit/ngless/issues
fd:13: hPutBuf: resource vanished (Broken pipe)

when using:

ngless "0.8"
import "parallel" version "0.6"
import "mocat" version "0.0"
import "samtools" version "0.0"

input = samfile("input.bam")

input = select(input, keep_if=[{mapped}])
input = select(input, keep_if=[{mapped}])

input = samtools_sort(input, by={name})
write(input, ofile='namesorted.bam')

and a sufficiently large input.bam file.

I tried reproducing this with the .bam files we have in the repository, but none was big enough to trigger the broken pipe.
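
One possible reading of the size dependence, offered as a guess and not as a diagnosis of NGLess itself: a small stream fits entirely in the kernel pipe buffer, so the writer finishes before it can notice that the reading end has gone away, while a larger stream is still being written when that happens. The Python sketch below simulates this with a bare `os.pipe()` on a POSIX system. The function name `stream_in_chunks` and its parameters are hypothetical, and the "reader" is simulated by closing the read end; nothing here calls samtools or NGLess.

```python
import os


def stream_in_chunks(total_bytes, reader_quits_after, chunk_size=4096):
    """Write total_bytes into a pipe whose reader disappears part-way.

    Hypothetical illustration only: the reader is simulated by closing the
    read end once `reader_quits_after` bytes have been written. Keep
    `reader_quits_after` well below the pipe buffer size (commonly 64 KiB on
    Linux), because nothing drains the pipe in this sketch.
    """
    read_fd, write_fd = os.pipe()
    written = 0
    reader_open = True
    try:
        while written < total_bytes:
            if reader_open and written >= reader_quits_after:
                # Simulate the downstream process exiting early.
                os.close(read_fd)
                reader_open = False
            n = min(chunk_size, total_bytes - written)
            # Raises BrokenPipeError (EPIPE) once the read end is closed.
            written += os.write(write_fd, b"\0" * n)
    except BrokenPipeError:
        return f"broken pipe after {written} bytes"
    finally:
        os.close(write_fd)
        if reader_open:
            os.close(read_fd)
    return "ok"


if __name__ == "__main__":
    # Small stream: done before the simulated reader quits.
    print(stream_in_chunks(4096, reader_quits_after=8192))
    # Large stream: still writing when the simulated reader quits.
    print(stream_in_chunks(1 << 20, reader_quits_after=8192))
```

If something similar is happening here, the small test files in the repository would pass simply because the whole stream is written before the consumer stops reading. Whether samtools or the second select is the side that closes early is not something the trace above shows.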

I managed to reproduce it locally with a 13 MB BAM file generated with:

ngless "0.8"
import "parallel" version "0.6"
import "mocat" version "0.0"
import "samtools" version "0.0"

samples = readlines(ARGV[3])
sample = lock1([ARGV[1]])
input = load_mocat_sample(ARGV[2] + '/' + sample)

mapped = map(input, fafile=ARGV[4], mode_all=True)
mapped = select(mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=97, action={unmatch})

write(mapped, ofile='outputs/' + sample + '.test.bam')
