Strange error when QC some samples #91

Closed
AlessioMilanese opened this issue Oct 27, 2018 · 6 comments

Hi,
I'm trying to filter reads based on quality and remove human reads, but I get the following error:

Exiting after internal error. If you can reproduce this issue, please run your script with the --trace flag and report a bug at http://github.com/luispedro/ngless/issues
/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference11873-6..1.fq11873-7.gz: renameFile:renamePath:rename: does not exist (No such file or directory)

Some more information:
I installed NGLess from bioconda (ngless v0.9.1, release date: July 17 2018).
I'm running this script (/scratch/milanese/TEST/QC/filter-human-merged.ngl):

ngless "0.8"
import "parallel" version "0.6"
import "mocat" version "0.0"

samples = readlines(ARGV[2])
sample = lock1(samples)
input = load_mocat_sample(ARGV[1] + '/' + sample)

input = preprocess(input, keep_singles=True) using |read|:
    read = substrim(read, min_quality=25)
    if len(read) < 45:
        discard
mapped = map(input, reference='hg19')

mapped = select(mapped) using |mr|:
    mr = mr.filter(min_match_size=45, min_identity_pc=90, action={unmatch})
    if mr.flag({mapped}):
        discard

write(as_reads(mapped), ofile=sample+'/'+sample+'.filtered.fq.gz')
collect(qcstats({fastq}), ofile='preprocessing_fqstats.txt', current=sample, allneeded=samples)
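
For readers unfamiliar with the script, here is a rough Python sketch of what the two filtering blocks are meant to do. It is not NGLess code: `substrim`, `preprocess_read` and `keep_after_human_filter` are hypothetical helpers, the per-read inputs (a quality list, a match size, an identity percentage) are simplified assumptions, and paired-end handling is ignored. The thresholds (25, 45, 90) are the ones used in the script above.

```python
from typing import List, Optional, Tuple


def substrim(quals: List[int], min_quality: int) -> Tuple[int, int]:
    """Return (start, end) of the longest run where every quality >= min_quality."""
    best_start, best_len = 0, 0
    run_start = 0
    for i, q in enumerate(quals):
        if q < min_quality:
            # A low-quality base ends the current run; the next run starts after it.
            run_start = i + 1
            continue
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return best_start, best_start + best_len


def preprocess_read(seq: str, quals: List[int],
                    min_quality: int = 25, min_len: int = 45) -> Optional[str]:
    """Trim to the best-quality substring; return None if the result is too short."""
    start, end = substrim(quals, min_quality)
    trimmed = seq[start:end]
    return trimmed if len(trimmed) >= min_len else None


def keep_after_human_filter(match_size: Optional[int], identity_pc: Optional[float],
                            min_match_size: int = 45,
                            min_identity_pc: float = 90.0) -> bool:
    """Keep a read unless it has a confident alignment to the human reference.

    Assumption: an alignment below either threshold is treated as unmapped
    (the effect of action={unmatch}), and reads still mapped are discarded.
    """
    if match_size is None or identity_pc is None:
        return True  # unmapped read: kept
    confident = match_size >= min_match_size and identity_pc >= min_identity_pc
    return not confident
```

In other words, a read survives only if its best-quality stretch is at least 45 bases long and it has no alignment to hg19 of at least 45 matched bases at 90% identity or better.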

I'm submitting it to Slurm with this script (/scratch/milanese/TEST/QC/slurm_ex2.sh):

#!/bin/bash
#SBATCH -A zeller
#SBATCH -t 1-00:00
#SBATCH --mem 16G
#SBATCH -n 8
#SBATCH -o /scratch/milanese/TEST/QC/r.out 
#SBATCH -e /scratch/milanese/TEST/QC/r.err

/g/scb2/zeller/milanese/software/ANACONDA/install/bin/ngless /scratch/milanese/TEST/QC/filter-human-merged.ngl /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156 /scratch/milanese/TEST/QC/list2 -j 8

The file /scratch/milanese/TEST/QC/list2 contains:

CCIS12370844ST-4-0
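
One detail that may or may not be related to the error: in the trace further down, `lock1` receives two elements, the sample name and an empty string, so the list file probably ends with a newline or a blank line. The Python sketch below shows how such an empty entry can arise and how to check a sample list for it. The helpers `parse_sample_list` and `find_empty_entries` are hypothetical, and splitting on `"\n"` is only an assumption about how the list was read, not NGLess's actual implementation.

```python
from typing import List


def parse_sample_list(text: str) -> List[str]:
    """Split file content on newlines, keeping empty entries (assumed behaviour)."""
    return text.split("\n")


def find_empty_entries(entries: List[str]) -> List[int]:
    """Return the indices of entries that are empty or whitespace-only."""
    return [i for i, entry in enumerate(entries) if not entry.strip()]


def clean_sample_list(text: str) -> List[str]:
    """Return only the non-empty sample names, stripped of surrounding whitespace."""
    return [line.strip() for line in text.split("\n") if line.strip()]
```

Running `find_empty_entries(parse_sample_list(open(path).read()))` on the list file (where `path` is the list's location) would show whether it contains blank entries.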

The directory /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0 contains:

CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.2.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.singles.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.2.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.singles.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.2.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.singles.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.2.fq.gz
CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.singles.fq.gz
CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.2.fq.gz
CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.singles.fq.gz
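
To make the expected grouping explicit, here is a small Python sketch that sorts such a listing into paired-end and single-end files by file name suffix. It is only an illustration under the assumption that the `.pair.1.fq.gz`, `.pair.2.fq.gz` and `.singles.fq.gz` suffixes decide the grouping; `group_mocat_files` is a hypothetical helper, not the code behind `load_mocat_sample`.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Matches "<prefix>.pair.1.fq.gz" or "<prefix>.pair.2.fq.gz".
PAIR_RE = re.compile(r"(.+)\.pair\.([12])\.fq\.gz$")


def group_mocat_files(filenames: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Group file names into (mate1, mate2) pairs and a list of singles files."""
    mates: Dict[str, Dict[str, str]] = defaultdict(dict)
    singles: List[str] = []
    for name in sorted(filenames):
        match = PAIR_RE.match(name)
        if match:
            prefix, mate = match.group(1), match.group(2)
            mates[prefix][mate] = name
        elif name.endswith(".singles.fq.gz"):
            singles.append(name)
    # Only report complete pairs; a lone mate is left out of the result.
    paired = [(m["1"], m["2"]) for _, m in sorted(mates.items())
              if "1" in m and "2" in m]
    return paired, singles
```

Applied to the 15 files above, this gives five pairs and five singles files, which is the same split that `load_mocat_sample` reports in the trace.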

The output that I get using --trace is:

[Fri 26-10-2018 19:56:09]: # Configuration
[Fri 26-10-2018 19:56:09]: 	download base URL: http://ngless.embl.de/resources/
[Fri 26-10-2018 19:56:09]: 	global data directory: /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data
[Fri 26-10-2018 19:56:09]: 	user directory: /home/milanese/.local/share/ngless
[Fri 26-10-2018 19:56:09]: 	user data directory: /home/milanese/.local/share/ngless/data
[Fri 26-10-2018 19:56:09]: 	temporary directory: /scratch/milanese/NGless_temp_dir/
[Fri 26-10-2018 19:56:09]: 	keep temporary files: False
[Fri 26-10-2018 19:56:09]: 	create report: True
[Fri 26-10-2018 19:56:09]: 	report directory: /scratch/milanese/TEST/QC/filter-human-merged.ngl.output_ngless
[Fri 26-10-2018 19:56:09]: 	color setting: AutoColor
[Fri 26-10-2018 19:56:09]: 	print header: True
[Fri 26-10-2018 19:56:09]: 	subsample: False
[Fri 26-10-2018 19:56:09]: 	verbosity: Normal
[Fri 26-10-2018 19:56:09]: 	search path:
[Fri 26-10-2018 19:56:09]: Loading modules...
[Fri 26-10-2018 19:56:09]: Validating script...
[Fri 26-10-2018 19:56:09]: Transforming script...
[Fri 26-10-2018 19:56:09]: Transformation for QC triggered for variable Variable "input" on line 7.
NGLess v0.9.1 (C) NGLess authors
http://ngless.embl.de/

When publishing results from this script, please cite the following references:

	 - Coelho, L.P., Alves, R., Monteiro, P., Huerta-Cepas, J., Freitas, A.T., and Bork, P., 2018.
	 NG-meta-profiler: fast processing of metagenomes using NGLess, a omain-specific language bioRxiv
	 367755 https://doi.org/10.1101/367755

	 - Li, H., 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv
	 preprint arXiv:1303.3997.

	 - MOCAT2: a metagenomic assembly, annotation and profiling framework. Kultima JR, Coelho LP, Forslund
	 K, Huerta-Cepas J, Li S, Driessen M, et al. (2016) Bioinformatics (2016)
	 doi:10.1093/bioinformatics/btw183

	 - MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit. Kultima JR, Sunagawa S, Li J, Chen W, Chen
	 H, Mende DR, et al. (2012) PLoS ONE 7(10): e47656. doi:10.1371/journal.pone.0047656


[Fri 26-10-2018 19:56:09]: Script OK. Starting interpretation...
[Fri 26-10-2018 19:56:09]: Interpreting [interpretIO]: __check_index_access(Lookup 'ARGV' (type unknown); original_lno=5; index1=2)
[Fri 26-10-2018 19:56:09]: Interpreting [executing module function: '__check_index_access']: NGOList [NGOString "/scratch/milanese/TEST/QC/filter-human-merged.ngl",NGOString "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156",NGOString "/scratch/milanese/TEST/QC/list2"]
[Fri 26-10-2018 19:56:09]: Interpreting [interpretIO]: __check_index_access(Lookup 'ARGV' (type unknown); original_lno=7; index1=1)
[Fri 26-10-2018 19:56:09]: Interpreting [executing module function: '__check_index_access']: NGOList [NGOString "/scratch/milanese/TEST/QC/filter-human-merged.ngl",NGOString "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156",NGOString "/scratch/milanese/TEST/QC/list2"]
[Fri 26-10-2018 19:56:09] Line 5: Interpreting [interpretIO]: samples = readlines(Lookup 'ARGV' as NGList NGLString[IndexOne 2])
[Fri 26-10-2018 19:56:09] Line 5: Interpreting [assignment]: readlines(Lookup 'ARGV' as NGList NGLString[IndexOne 2])
[Fri 26-10-2018 19:56:09] Line 5: Interpreting [executing module function: 'readlines']: NGOString "/scratch/milanese/TEST/QC/list2"
[Fri 26-10-2018 19:56:09] Line 6: Interpreting [interpretIO]: sample = lock1(Lookup 'samples' as NGList NGLString; __hash="8cfe7bf84bb3c439150acb509ecccbb7")
[Fri 26-10-2018 19:56:09] Line 6: Interpreting [assignment]: lock1(Lookup 'samples' as NGList NGLString; __hash="8cfe7bf84bb3c439150acb509ecccbb7")
[Fri 26-10-2018 19:56:09] Line 6: Interpreting [executing module function: 'lock1']: NGOList [NGOString "CCIS12370844ST-4-0",NGOString ""]
[Fri 26-10-2018 19:56:09] Line 6: Looking for a lock in ngless-locks/8cfe7bf8. Total number of elements is 2 (not locked: 2; not finished: 2).
[Fri 26-10-2018 19:56:09] Line 6: Acquired lock file ngless-locks/8cfe7bf8/CCIS12370844ST-4-0.lock
[Fri 26-10-2018 19:56:09] Line 6: lock1: Obtained lock file: 'ngless-locks/8cfe7bf8/CCIS12370844ST-4-0.lock'
[Fri 26-10-2018 19:56:09] Line 6: Writing stats to 'ngless-stats/8cfe7bf8/CCIS12370844ST-4-0'
[Fri 26-10-2018 19:56:09] Line 7: Interpreting [interpretIO]: input = load_mocat_sample(Lookup 'ARGV' as NGList NGLString[IndexOne 1]BOpAdd"/"BOpAddLookup 'sample' as NGLString; __perform_qc=False)
[Fri 26-10-2018 19:56:09] Line 7: Interpreting [assignment]: load_mocat_sample(Lookup 'ARGV' as NGList NGLString[IndexOne 1]BOpAdd"/"BOpAddLookup 'sample' as NGLString; __perform_qc=False)
[Fri 26-10-2018 19:56:09] Line 7: Interpreting [executing module function: 'load_mocat_sample']: NGOString "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0"
[Fri 26-10-2018 19:56:09] Line 7: Executing load_mocat_sample transform
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found paired-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.1.fq.gz' - '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.2.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found paired-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.1.fq.gz' - '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.2.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found paired-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.1.fq.gz' - '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.2.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found paired-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.1.fq.gz' - '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.2.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found paired-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.1.fq.gz' - '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.2.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found single-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.singles.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found single-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.singles.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found single-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.singles.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found single-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.singles.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: load_mocat_sample found single-end sample '/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.singles.fq.gz'
[Fri 26-10-2018 19:56:09] Line 7: Executing paired on "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.1.fq.gz""/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.2.fq.gz"""
[Fri 26-10-2018 19:56:09] Line 7: Executing paired on "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.1.fq.gz""/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.2.fq.gz"""
[Fri 26-10-2018 19:56:09] Line 7: Executing paired on "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.1.fq.gz""/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.2.fq.gz"""
[Fri 26-10-2018 19:56:09] Line 7: Executing paired on "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.1.fq.gz""/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.2.fq.gz"""
[Fri 26-10-2018 19:56:09] Line 7: Executing paired on "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.1.fq.gz""/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.2.fq.gz"""
[Fri 26-10-2018 19:56:09] Line 9: Interpreting [interpretIO]: input = preprocess(Lookup 'input' as NGLReadSet; __input_qc=True; keep_singles=True)using {Block {blockVariable = [Variable "read"], blockBody = Sequence [Optimized (SubstrimReassign (Variable "read") 25),Optimized (LenThresholdDiscard (Variable "read") BOpLT 45)]}}
[Fri 26-10-2018 19:56:09] Line 9: Interpreting [assignment]: preprocess(Lookup 'input' as NGLReadSet; __input_qc=True; keep_singles=True)using {Block {blockVariable = [Variable "read"], blockBody = Sequence [Optimized (SubstrimReassign (Variable "read") 25),Optimized (LenThresholdDiscard (Variable "read") BOpLT 45)]}}
[Fri 26-10-2018 19:56:09] Line 9: Created & opened temporary file /scratch/milanese/NGless_temp_dir/preprocessed.1...fq13665-2.gz
[Fri 26-10-2018 19:56:09] Line 9: Created & opened temporary file /scratch/milanese/NGless_temp_dir/preprocessed.2...fq13665-3.gz
[Fri 26-10-2018 19:56:09] Line 9: Created & opened temporary file /scratch/milanese/NGless_temp_dir/preprocessed.singles...fq13665-4.gz
[Fri 26-10-2018 19:56:13] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.1.fq.gz
[Fri 26-10-2018 19:56:13] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:13] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:13] Line 9: Number of sequences: 629462
[Fri 26-10-2018 19:56:13] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.2.fq.gz
[Fri 26-10-2018 19:56:13] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:13] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:13] Line 9: Number of sequences: 629462
[Fri 26-10-2018 19:56:16] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.1.fq.gz
[Fri 26-10-2018 19:56:16] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:16] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:16] Line 9: Number of sequences: 569180
[Fri 26-10-2018 19:56:16] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.pair.2.fq.gz
[Fri 26-10-2018 19:56:16] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:16] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:16] Line 9: Number of sequences: 569180
[Fri 26-10-2018 19:56:19] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.1.fq.gz
[Fri 26-10-2018 19:56:19] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:19] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:19] Line 9: Number of sequences: 365987
[Fri 26-10-2018 19:56:19] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.pair.2.fq.gz
[Fri 26-10-2018 19:56:19] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:19] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:19] Line 9: Number of sequences: 365987
[Fri 26-10-2018 19:56:21] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.1.fq.gz
[Fri 26-10-2018 19:56:21] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:21] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:21] Line 9: Number of sequences: 353423
[Fri 26-10-2018 19:56:21] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.pair.2.fq.gz
[Fri 26-10-2018 19:56:21] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:21] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:21] Line 9: Number of sequences: 353423
[Fri 26-10-2018 19:56:21] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.1.fq.gz
[Fri 26-10-2018 19:56:21] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:21] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:21] Line 9: Number of sequences: 63941
[Fri 26-10-2018 19:56:21] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.pair.2.fq.gz
[Fri 26-10-2018 19:56:21] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:21] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:21] Line 9: Number of sequences: 63941
[Fri 26-10-2018 19:56:22] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.singles.fq.gz
[Fri 26-10-2018 19:56:22] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:22] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:22] Line 9: Number of sequences: 145122
[Fri 26-10-2018 19:56:22] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-1_lane6.screened.adapter.screened.hg19.singles.fq.gz
[Fri 26-10-2018 19:56:22] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:22] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:22] Line 9: Number of sequences: 139763
[Fri 26-10-2018 19:56:23] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane7.screened.adapter.screened.hg19.singles.fq.gz
[Fri 26-10-2018 19:56:23] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:23] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:23] Line 9: Number of sequences: 270559
[Fri 26-10-2018 19:56:24] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-2_lane8.screened.adapter.screened.hg19.singles.fq.gz
[Fri 26-10-2018 19:56:24] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:24] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:24] Line 9: Number of sequences: 298934
[Fri 26-10-2018 19:56:28] Line 9: Simple Statistics completed for: /g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0/CCIS12370844ST-4-0_11s002754-1-3_lane8.screened.adapter.screened.hg19.singles.fq.gz
[Fri 26-10-2018 19:56:28] Line 9: Number of base pairs: 94
[Fri 26-10-2018 19:56:28] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:28] Line 9: Number of sequences: 965214
[Fri 26-10-2018 19:56:28] Line 9: Preprocess finished
[Fri 26-10-2018 19:56:28] Line 9: Simple Statistics completed for: preproc.lno9.pairs.1
[Fri 26-10-2018 19:56:28] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:28] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:28] Line 9: Number of sequences: 1797171
[Fri 26-10-2018 19:56:28] Line 9: Simple Statistics completed for: preproc.lno9.pairs.2
[Fri 26-10-2018 19:56:28] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:28] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:28] Line 9: Number of sequences: 1797171
[Fri 26-10-2018 19:56:28] Line 9: Simple Statistics completed for: preproc.lno9.singles
[Fri 26-10-2018 19:56:28] Line 9: Number of base pairs: 95
[Fri 26-10-2018 19:56:28] Line 9: Encoding is: SangerEncoding
[Fri 26-10-2018 19:56:28] Line 9: Number of sequences: 1733208
[Fri 26-10-2018 19:56:28] Line 13: Interpreting [interpretIO]: mapped = map(Lookup 'input' as NGLReadSet; reference="hg19")
[Fri 26-10-2018 19:56:28] Line 13: Interpreting [assignment]: map(Lookup 'input' as NGLReadSet; reference="hg19")
[Fri 26-10-2018 19:56:28] Line 13: Looked for hg19 in directory /home/milanese/.local/share/ngless/data/References/hg19 (and did not find it)
[Fri 26-10-2018 19:56:28] Line 13: Looked for hg19 in directory /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19 (and found it)
[Fri 26-10-2018 19:56:28] Line 13: Index for /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz already exists.
[Fri 26-10-2018 19:56:28] Line 13: Created & opened temporary file /scratch/milanese/NGless_temp_dir/mapped_reference.13665-5.sam
[Fri 26-10-2018 19:56:28] Line 13: Starting mapping to /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz
[Fri 26-10-2018 19:56:28] Line 13: Calling: /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/bin/ngless-0.9.1-bwa mem -t 8 /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz /scratch/milanese/NGless_temp_dir/preprocessed.1...fq13665-2.gz /scratch/milanese/NGless_temp_dir/preprocessed.2...fq13665-3.gz
[Fri 26-10-2018 20:04:15] Line 13: BWA info: [M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 892496 sequences (80000139 bp)...
[M::process] read 891218 sequences (80000171 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (522, 76195, 560, 526)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (1961, 2985, 3987)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8039)
[M::mem_pestat] mean and std.dev: (2742.86, 1187.82)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10065)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (68, 71, 75)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (54, 89)
[M::mem_pestat] mean and std.dev: (71.31, 5.06)
[M::mem_pestat] low and high boundaries for proper pairs: (47, 96)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (2, 26, 709)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2123)
[M::mem_pestat] mean and std.dev: (423.81, 614.14)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 2880)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (2068, 3018, 3987)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 7825)
[M::mem_pestat] mean and std.dev: (2867.86, 1092.02)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 9744)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 892496 reads in 911.429 CPU sec, 116.107 real sec
[M::process] read 923126 sequences (80000054 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (519, 77173, 543, 523)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (1914, 2649, 3978)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8106)
[M::mem_pestat] mean and std.dev: (2625.26, 1303.36)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10170)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (68, 71, 75)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (54, 89)
[M::mem_pestat] mean and std.dev: (71.30, 5.06)
[M::mem_pestat] low and high boundaries for proper pairs: (47, 96)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (2, 34, 587)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1757)
[M::mem_pestat] mean and std.dev: (367.36, 546.33)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 2553)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (2067, 3035, 3991)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 7839)
[M::mem_pestat] mean and std.dev: (2893.28, 1093.52)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 9763)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 891218 reads in 909.287 CPU sec, 114.563 real sec
[M::process] read 887502 sequences (72012857 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (477, 90794, 739, 445)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (1892, 2646, 3973)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8135)
[M::mem_pestat] mean and std.dev: (2577.37, 1311.18)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10216)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (67, 70, 74)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (53, 88)
[M::mem_pestat] mean and std.dev: (70.42, 5.44)
[M::mem_pestat] low and high boundaries for proper pairs: (46, 95)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (2, 10, 442)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1322)
[M::mem_pestat] mean and std.dev: (137.12, 290.42)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1762)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (2086, 2915, 3982)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 7774)
[M::mem_pestat] mean and std.dev: (2853.81, 1102.72)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 9670)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 923126 reads in 903.050 CPU sec, 113.579 real sec
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (492, 82359, 982, 577)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (1924, 2694, 3818)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 7606)
[M::mem_pestat] mean and std.dev: (2653.04, 1192.80)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 9500)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (53, 69, 73)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (13, 113)
[M::mem_pestat] mean and std.dev: (62.02, 16.00)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 133)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (3, 11, 327)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 975)
[M::mem_pestat] mean and std.dev: (102.18, 214.13)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1299)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (2092, 3027, 3976)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 7744)
[M::mem_pestat] mean and std.dev: (2865.38, 1089.69)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 9628)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 887502 reads in 783.114 CPU sec, 98.607 real sec
[main] Version: 0.7.17-r1188
[main] CMD: /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/bin/ngless-0.9.1-bwa mem -t 8 /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz /scratch/milanese/NGless_temp_dir/preprocessed.1...fq13665-2.gz /scratch/milanese/NGless_temp_dir/preprocessed.2...fq13665-3.gz
[main] Real time: 466.797 sec; CPU: 3512.062 sec

[Fri 26-10-2018 20:04:15] Line 13: Finished mapping to /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz
[Fri 26-10-2018 20:04:15] Line 13: Starting mapping to /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz
[Fri 26-10-2018 20:04:15] Line 13: Calling: /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/bin/ngless-0.9.1-bwa mem -t 8 /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz /scratch/milanese/NGless_temp_dir/preprocessed.singles...fq13665-4.gz
[Fri 26-10-2018 20:06:32] Line 13: BWA info: [M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1040202 sequences (80000176 bp)...
[M::process] read 693006 sequences (57701577 bp)...
[M::mem_process_seqs] Processed 1040202 reads in 554.027 CPU sec, 70.580 real sec
[M::mem_process_seqs] Processed 693006 reads in 463.502 CPU sec, 59.181 real sec
[main] Version: 0.7.17-r1188
[main] CMD: /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/bin/ngless-0.9.1-bwa mem -t 8 /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz /scratch/milanese/NGless_temp_dir/preprocessed.singles...fq13665-4.gz
[main] Real time: 137.083 sec; CPU: 1022.007 sec

[Fri 26-10-2018 20:06:32] Line 13: Finished mapping to /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz
[Fri 26-10-2018 20:06:32] Line 13: Finished mapping to /g/scb2/zeller/milanese/software/ANACONDA/install/bin/../share/ngless/data/References/hg19/Sequence/BWAIndex/reference.fa.gz
[Fri 26-10-2018 20:06:32] Line 13: Total reads: 3530379
[Fri 26-10-2018 20:06:32] Line 13: Total reads aligned: 2173241 [61.56%]
[Fri 26-10-2018 20:06:32] Line 13: Total reads Unique map: 1795123 [50.85%]
[Fri 26-10-2018 20:06:32] Line 13: Total reads Non-Unique map: 378118 [10.71%]
[Fri 26-10-2018 20:06:32] Line 15: Interpreting [interpretIO]: mapped = select(Lookup 'mapped' as NGLMappedReadSet)using {Block {blockVariable = [Variable "mr"], blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=90; action={unmatch} ),if [(Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} )] then {Sequence [discard]} else {Sequence []}]}}
[Fri 26-10-2018 20:06:32] Line 15: Interpreting [assignment]: select(Lookup 'mapped' as NGLMappedReadSet)using {Block {blockVariable = [Variable "mr"], blockBody = Sequence [mr = (Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "filter"}( Nothing; min_match_size=45; min_identity_pc=90; action={unmatch} ),if [(Lookup 'mr' as NGLMappedRead).MethodName {unwrapMethodName = "flag"}( Just {mapped} )] then {Sequence [discard]} else {Sequence []}]}}
[Fri 26-10-2018 20:06:32] Line 15: Executing blocked select on file /scratch/milanese/NGless_temp_dir/mapped_reference.13665-5.sam
[Fri 26-10-2018 20:06:32] Line 15: Created & opened temporary file /scratch/milanese/NGless_temp_dir/block_selected_mapped_reference13665-6.sam
[Fri 26-10-2018 20:06:42] Line 20: Interpreting [interpretIO]: temp$4 = as_reads(Lookup 'mapped' as NGLMappedReadSet)
[Fri 26-10-2018 20:06:42] Line 20: Interpreting [assignment]: as_reads(Lookup 'mapped' as NGLMappedReadSet)
[Fri 26-10-2018 20:06:42] Line 20: Interpreting [executing module function: 'as_reads']: NGOMappedReadSet {nglgroupName = "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0", nglSamFile = File /scratch/milanese/NGless_temp_dir/block_selected_mapped_reference13665-6.sam, nglReference = Just "hg19"}
[Fri 26-10-2018 20:06:42] Line 20: Created & opened temporary file /scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..1.fq13665-7.gz
[Fri 26-10-2018 20:06:42] Line 20: Created & opened temporary file /scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..2.fq13665-8.gz
[Fri 26-10-2018 20:06:42] Line 20: Created & opened temporary file /scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..singles.fq13665-9.gz
[Fri 26-10-2018 20:06:50] Line 20: Finished as_reads
[Fri 26-10-2018 20:06:51] Line 20: Heuristic for FastQ encoding determination for file "/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..1.fq13665-7.gz" cannot be 100% confident. Guessing 33 offset (Sanger encoding, used by newer Illumina machines).
[Fri 26-10-2018 20:06:51] Line 20: Interpreting [interpretIO]: write(Lookup 'temp$4' as NGLReadSet; __can_move=True; __hash="5ebc69331546eebb8a8b9049e8e4e616"; ofile=Lookup 'sample' as NGLStringBOpAdd"/"BOpAddLookup 'sample' as NGLStringBOpAdd".filtered.fq.gz")
[Fri 26-10-2018 20:06:51] Line 20: Interpreting [write]: NGOReadSet "/g/scb2/zeller/SHARED/DATA/metaG/REAL/CRC-META/FR-CRC_N156/CCIS12370844ST-4-0" (ReadSet {pairedSamples = [(FastQFilePath {fqpathEncoding = SangerEncoding, fqpathFilePath = "/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..1.fq13665-7.gz"},FastQFilePath {fqpathEncoding = SangerEncoding, fqpathFilePath = "/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..2.fq13665-8.gz"})], singleSamples = [FastQFilePath {fqpathEncoding = SangerEncoding, fqpathFilePath = "/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..singles.fq13665-9.gz"}]})
Exiting after internal error. If you can reproduce this issue, please run your script with the --trace flag and report a bug at http://github.com/luispedro/ngless/issues
/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference13665-6..1.fq13665-7.gz: renameFile:renamePath:rename: does not exist (No such file or directory)

@luispedro luispedro added the bug label Oct 27, 2018

@luispedro

Collaborator

commented Oct 27, 2018

Thanks for the excellent report, @AlessioMilanese

@luispedro

Collaborator

commented Oct 27, 2018

I managed to reproduce it locally with v0.9.1, but it works correctly on master. I am still looking into it, but it might be worthwhile just to do a new release.

@luispedro

Collaborator

commented Oct 28, 2018

Could this possibly be a permission error with an erroneous error message? Does your user have permissions to create the output files inside the output directory?

@AlessioMilanese

Author

commented Oct 28, 2018

I have write permission. I can also reproduce the error with just one lane:

CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.1.fq.gz
CCIS12370844ST-4-0_11s002754-1-1_lane5.screened.adapter.screened.hg19.pair.2.fq.gz

and also if I rename the samples to:

a_1.fastq.gz
a_2.fastq.gz

The error says

/scratch/milanese/NGless_temp_dir/reads_block_selected_mapped_reference7827-4..1.fq7827-5.gz: renameFile:renamePath:rename: does not exist (No such file or directory)

but when I check reads_block_selected_mapped_reference7827-4..1.fq7827-5.gz before the script finishes, I can see it:

-bash-4.2$ ls -lah /scratch/milanese/NGless_temp_dir/
total 648M
drwxr-xr-x.  2 milanese zeller    8 Oct 28 22:37 .
drwxr-xr-x. 16 milanese zeller   18 Oct 26 14:14 ..
-rw-r--r--.  1 milanese zeller 142M Oct 28 22:37 block_selected_mapped_reference7827-4.sam
-rw-r--r--.  1 milanese zeller 349M Oct 28 22:37 mapped_reference.7827-3.sam
-rw-r--r--.  1 milanese zeller  46M Oct 28 22:34 preprocessed.1...fq7827-0.gz
-rw-r--r--.  1 milanese zeller  46M Oct 28 22:34 preprocessed.2...fq7827-1.gz
-rw-r--r--.  1 milanese zeller 2.6M Oct 28 22:34 preprocessed.singles...fq7827-2.gz
-rw-r--r--.  1 milanese zeller  22M Oct 28 22:37 reads_block_selected_mapped_reference7827-4..1.fq7827-5.gz
-rw-r--r--.  1 milanese zeller  22M Oct 28 22:37 reads_block_selected_mapped_reference7827-4..2.fq7827-6.gz
-rw-r--r--.  1 milanese zeller    0 Oct 28 22:37 reads_block_selected_mapped_reference7827-4..singles.fq7827-7.gz

It does indeed seem to be a problem with moving the files to the final destination.

@AlessioMilanese

Author

commented Oct 29, 2018

If I change the name of the directory from CCIS12370844ST-4-0 to aa and the file list to:

aa

it seems to work. Can you reproduce it?

@luispedro

Collaborator

commented Nov 9, 2018

Okay, I figured this out. It's a mix of four things: (1) what I think is a bug in your script; (2) NGLess does not correctly check whether the output directory exists; (3) it then outputs a horrible error message; (4) partly because of another bug in the Haskell POSIX library (haskell/unix#60).

You were doing:

samples = readlines(ARGV[2])
sample = lock1(samples)
input = load_mocat_sample(ARGV[1] + '/' + sample)

You load the data from ARGV[1] </> sample, but later write to another directory:

write(as_reads(mapped), ofile=sample+'/'+sample+'.filtered.fq.gz')

That is, just sample, without the ARGV[1] prefix.

If this directory does not exist, you get the stupid behaviour you saw.
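
The failure mode can be reproduced outside NGLess. The sketch below is written in Python rather than NGLess or Haskell, and every file and directory name in it is made up for illustration. It relies only on the POSIX behaviour of rename(), which reports ENOENT both when the source is missing and when a directory in the destination path is missing, so the error code alone does not say which side is at fault.

```python
import errno
import os
import tempfile


def try_rename(src, dst):
    """Attempt os.rename and return the errno on failure, or None on success."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        return exc.errno
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the temporary FastQ file created by the pipeline.
    src = os.path.join(tmp, "reads.1.fq.gz")
    with open(src, "wb") as handle:
        handle.write(b"placeholder")

    # Destination inside a directory that was never created
    # (the analogue of sample + '/' + sample + '.filtered.fq.gz').
    missing_dir_dst = os.path.join(tmp, "sampleA", "sampleA.filtered.fq.gz")
    err_missing_dir = try_rename(src, missing_dir_dst)
    src_still_exists = os.path.exists(src)

    # Creating the directory first lets the same rename succeed.
    os.makedirs(os.path.dirname(missing_dir_dst))
    err_after_mkdir = try_rename(src, missing_dir_dst)
    dst_exists = os.path.exists(missing_dir_dst)
```

In this sketch the source file exists throughout, yet the first rename still fails with ENOENT, which is consistent with the directory listing posted earlier in the thread.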

I am re-classifying this bug as "better-error-message"

@luispedro luispedro added better error message and removed bug labels Nov 9, 2018

luispedro added a commit that referenced this issue Nov 9, 2018

ENH Slightly more flexible output path checking
See #91 for an example where this is relevant.

@luispedro luispedro closed this in 8f2dd8b Nov 9, 2018

luispedro added a commit that referenced this issue Nov 12, 2018

BLD Release 0.10
Many small fixes rather than any large new features.

Full ChangeLog:

    * Fix to lock1's return value when used with paths (#68 - reopen)
    * Support _F/_R suffixes for forward/reverse in load_mocat_sample
    * samtools_sort() now accepts by={name} to sort by read name
    * Fixed bug where header was printed even when STDOUT was used
    * Fixed bug where writing interleaved FastQ to STDOUT did not work as
    expected
    * Indices created by bwa and minimap2 are now versioned
    * arg1 in external modules is no longer always treated as a path
    * Added expand_searchpath to external modules API (closes #56)
    * Fixed bug where detection of Fastq encoding was not performed on the second pair
    * Fix saving fastq sets with --subsample (issue #85)
    * Add __extra_megahit_args to assemble() (issue #86)
    * Better error message when user mis-specifies the ngless version string
    (issue #84)
    * Support NO_COLOR environment variable (issue #83)
    * Garbage collection for temporary files (issue #79)
    * Rename --search-dir to --search-path for consistency with other API
    * Fix corner case with select() producing incorrect CIGAR strings (#92)
    * Always check output file writability (#91)
    * Make paired() accept encoding argument
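
The changelog entry "Always check output file writability (#91)" describes a check that can run before any expensive work starts. The sketch below shows one way such a pre-flight check could look. It is written in Python for illustration and is not NGLess's actual implementation; the function name `check_output_path` and its messages are assumptions.

```python
import os
import tempfile


def check_output_path(ofile):
    """Return None if ofile looks writable, else a human-readable error string.

    Hypothetical helper: it only checks that the parent directory exists and
    that the current user may create entries in it.
    """
    parent = os.path.dirname(os.path.abspath(ofile))
    if not os.path.isdir(parent):
        return f"output directory does not exist: {parent}"
    if not os.access(parent, os.W_OK | os.X_OK):
        return f"output directory is not writable: {parent}"
    return None


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "sampleA", "sampleA.filtered.fq.gz")
    # The directory "sampleA" has not been created yet, so the check fails.
    bad = check_output_path(target)
    # After creating it, the same path passes the check.
    os.mkdir(os.path.join(tmp, "sampleA"))
    good = check_output_path(target)
```

Running a check of this kind when the script is first validated, rather than at the write() step, would surface a missing output directory before the mapping step, which took several minutes in the log above.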