Error executing process > 'downsampleFastq' #84

Closed
erazzolini opened this issue Jan 9, 2024 · 16 comments

@erazzolini

Hello there,

I'm trying to run MetONTIIME on my data set (a few ITS reads), but I have a problem with downsampleFastq (I think).

When I run:

nextflow -c metontiime2.conf run metontiime2.nf --resultsDir=/Users/emanuel/teste_microbioma_nfn/its/completo/results

I received the error:

me2.nf --resultsDir=/Users/emanuel/teste_microbioma_nfn/its/completo/results
N E X T F L O W ~ version 23.10.0
Launching metontiime2.nf [determined_hugle] DSL2 - revision: 24f9bc59a5
executor > local (4)
[16/b07488] process > importDb (1) [ 0%] 0 of 1
[88/5e64f8] process > concatenateFastq [100%] 1 of 1 ✔
[56/11796d] process > filterFastq [100%] 1 of 1 ✔
[50/f9dcff] process > downsampleFastq [ 0%] 0 of 1
[- ] process > importFastq -
[- ] process > derepSeq -
[- ] process > assignTaxonomy -
[- ] process > filterTaxa -
[- ] process > taxonomyVisualization -
[- ] process > collapseTables -
[- ] process > dataQC -
executor > local (4)
[- ] process > importDb (1) -
[88/5e64f8] process > concatenateFastq [100%] 1 of 1 ✔
[56/11796d] process > filterFastq [100%] 1 of 1 ✔
[50/f9dcff] process > downsampleFastq [100%] 1 of 1, failed: 1 ✘
[- ] process > importFastq -
[- ] process > derepSeq -
[- ] process > assignTaxonomy -
[- ] process > filterTaxa -
[- ] process > taxonomyVisualization -
[- ] process > collapseTables -
[- ] process > dataQC -
[- ] process > diversityAnalyses -
ERROR ~ Error executing process > 'downsampleFastq'

Caused by:
Process downsampleFastq terminated with an error exit status (1)

Command executed:

mkdir -p /Users/emanuel/teste_microbioma_nfn/its/completo/results/downsampleFastq
fq=$(find /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq/ | grep ".fastq.gz$");
for f in $fq; do
sn=$(basename $f);
seqtk sample $f 1000 | gzip > /Users/emanuel/teste_microbioma_nfn/its/completo/results/downsampleFastq/$sn
done

Command exit status:
1

Command output:
(empty)

Work dir:
/Users/emanuel/teste_microbioma_nfn/its/completo/MetONTIIME/work/50/f9dcff9acfe9dcd5392ff383b3ef84

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

I tried a few things, such as setting it to false, changing maxNumReads to 10, 100, 1000, 10000 and 100000, changing clusteringIdentity, and putting the files in the right folders, but nothing works.

My .conf is as follows:

   //Path to working directory including fastq.gz files
    workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/barcode53/"
    //Path to sample metadata tsv file; if it doesn't exist yet, it is created at runtime
    sampleMetadata="/Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv"
    //Path to database file with sequences in fasta format
    dbSequencesFasta="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_refs_qiime_ver$
    //Path to database file with sequence id-to-taxonomy correspondence in tsv format
    dbTaxonomyTsv="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_taxonomy_qiime_ve$
    //Name of database file with sequences as QIIME2 artifact (qza); if it is already available, $
    dbSequencesQza="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/unite_ver9_99_class$
    //Name of database file with sequence id-to-taxonomy correspondence as QIIME2 artifact (qza);$
    dbTaxonomyQza="/Users/emanuel/teste_microbioma_nfn/its/completo/unite_db/sh_taxonomy_qiime_ve$
    //Taxonomy classifier, available: VSEARCH, Blast
    classifier="Blast"
    //maxNumReads is the maximum number of reads per sample; if one sample has more than maxNumRe$
    maxNumReads=1000
    //minReadLength is the minimum length (bp) for a read to be retained
    minReadLength=200
    //maxReadLength is the maximum length (bp) for a read to be retained
    maxReadLength=5000
    //minQual is the minimum average PHRED score for a read to be retained
    minQual=10
    //Number of bases to be trimmed at both ends
    extraEndsTrim=0
    //Identity for de novo clustering [0-1]
    clusteringIdentity=0.9
    //Maximum number of candidate hits for each read, to be used for consensus taxonomy assignment
    maxAccepts=3
    //Minimum fraction of assignments must match top hit to be accepted as consensus assignment [$
    minConsensus=0.7
    //Minimum query coverage for an alignment to be considered a candidate hit [0-1]
    minQueryCoverage=0.8
    //Minimum alignment identity for an alignment to be considered a candidate hit [0-1]
    minIdentity=0.9
    //Taxonomy level at which you want to perform non-phylogeny-based diversity analyses
    taxaLevelDiversity=6  
    //Max num. reads for diversity analyses
    numReadsDiversity=500
    //Taxa of interest that you want to retain and to focus the analysis on
    taxaOfInterest=""
    //Minimum number of reads assigned to Taxa of interest to retain a sample
    minNumReadsTaxaOfInterest=1
    //Path to directory containing results
    resultsDir="/path/to/resultsDir"
    
    help=false
    
    // Flags to select which process to run
    concatenateFastq = true
    filterFastq = true    
    downsampleFastq = true
    importFastq = true   
    dataQC = true
    importDb = true
    derepSeq = true
    assignTaxonomy = true
    taxonomyVisualization = true
    collapseTables = true
    filterTaxa = false
    diversityAnalyses = true

I also set up Docker so that it can work with my /Users folder.

Is there anything I can do to fix it?

@MaestSi
Owner

MaestSi commented Jan 10, 2024

Hi,
first, I noticed that the dbSequencesQza and dbTaxonomyQza variables include the full path instead of the file name only. Please leave only the base name there (the full path is needed for the dbSequencesFasta and dbTaxonomyTsv variables instead).
Can you please show the content of /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq/ folder? Does it contain any fastq.gz files?
Thanks,
SM
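
A minimal sketch for answering the question above, assuming only standard `find`, `grep` and `bash`. The helper names `list_fastq_gz` and `assignment_status` are hypothetical and not part of MetONTIIME. The second helper replays the `fq=$(find ... | grep ...)` assignment from the failing step in a shell started with `-e`: when the folder holds no fastq.gz file, `grep` returns 1 and that shell stops with status 1. Whether the pipeline's script really runs with `-e` is an assumption here.

```shell
#!/usr/bin/env bash

# List the fastq.gz files directly inside a directory (prints nothing if there are none).
list_fastq_gz() {
    find "$1" -maxdepth 1 -name '*.fastq.gz' 2>/dev/null
}

# Replay the find | grep assignment of the failing step in a `bash -e` shell
# and print the exit status of that shell.
assignment_status() {
    local status=0
    bash -e -c 'fq=$(find "$1" | grep ".fastq.gz$"); : "loop would start here"' _ "$1" || status=$?
    echo "$status"
}

# Example usage with the folder reported in this thread:
# list_fastq_gz /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq
# assignment_status /Users/emanuel/teste_microbioma_nfn/its/completo/results/filterFastq
```

If `list_fastq_gz` prints nothing and `assignment_status` prints 1, the empty input folder alone would be enough to explain an exit status of 1 in a script run with `-e`.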

@erazzolini
Author

Hello MaestSi, thank you for your time.

I changed the full path in dbSequencesQza and dbTaxonomyQza.

The filterFastq folder is still empty. Even if I copy the fastq.gz files from the concatenatedFastq folder into it and run MetONTIIME again, all the files in the filterFastq folder get removed.

I've tried on both my MacBook and a Linux computer, and the error is still the same.

ERROR ~ Error executing process > 'downsampleFastq'

Caused by:
Process downsampleFastq terminated with an error exit status (1)

Command executed:

mkdir -p /home/emanuelr/teste_microbioma_nfn/its/completo/results/downsampleFastq
fq=$(find /home/emanuelr/teste_microbioma_nfn/its/completo/results/filterFastq/ | grep ".fastq.gz$");
for f in $fq; do
sn=$(basename $f);
seqtk sample $f 1000 | gzip > /home/emanuelr/teste_microbioma_nfn/its/completo/results/downsampleFastq/$sn
done

Command exit status:
1

Command output:
(empty)

Work dir:
/home/emanuelr/teste_microbioma_nfn/its/completo/MetONTIIME/work/d8/fb8dd90e6853aebd2a7674659c5981

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

@MaestSi
Owner

MaestSi commented Jan 10, 2024

Does workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/barcode53/" contain fastq.gz files still to be merged? If yes, it is ok to keep concatenateFastq = true, otherwise it should be set to false. Moreover, I think workDir should end one level before barcode53 folder, in other words you should try with:
workDir="/Users/emanuel/teste_microbioma_nfn/its/completo/"
Before trying this, please remove manifest.txt file and also sample-metadata.tsv file, if it was automatically generated by the pipeline.
As a last point, did you remember to mount /Users directory, so that Docker can access it?
Ciao,
SM

@erazzolini
Author

Hello, It's me again.

I've changed workDir to one level above the barcode53 folder, but it didn't work. I set concatenateFastq to false and did the concatenation myself, but I still get the same error. I also ran a test with the Zyro control, but that didn't work either.

I'm attaching my .conf file. I've run this file on three different computers, one Mac, one Linux and one Windows with WSL (with Docker running), but the downsample error is still the same.

With Docker, I've mounted my /home and /Users folders.

image

@MaestSi
Owner

MaestSi commented Jan 11, 2024

Hi,
if this is the command line you used:
nextflow -c metontiime2.conf run metontiime2.nf --resultsDir=/Users/emanuel/teste_microbioma_nfn/its/completo/results
you are missing -profile docker.
SM

@erazzolini
Author

On one system (the one from the conf file) I don't have permission to run with -profile docker, but on the other ones I used the full command, with -profile docker, and the error is the same.

@MaestSi
Owner

MaestSi commented Jan 11, 2024

If on one system you don't have either docker or singularity, you can't run the pipeline on that system. If you want, please provide the config file you used on the system with Docker and the command line you used.
SM

@erazzolini
Author

I've made a few changes (such as removing the trailing / from the last folder, copying the files and setting a few steps to false), but now I get another error, with importFastq, which I think is a QIIME error?

ERROR ~ Error executing process > 'importFastq'

Caused by:
Process importFastq terminated with an error exit status (132)

Command executed:

mkdir -p /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq

fq=$(realpath $(find /Users/emanuel/teste_microbioma_nfn/its/completo/downsampleFastq/ -maxdepth 1 | grep ".fastq.gz"))
manifestFile=/Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/manifest.txt

if [ ! -f "$manifestFile" ]; then
echo -e sample-id" "absolute-filepath > $manifestFile;
for f in $fq; do
s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
echo -e $s" "$f >> $manifestFile;
done
fi

if [ ! -f /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv ]; then
echo -e sample-id" "sample-name > /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv;
for f in $fq; do
s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
echo -e $s" "$s >> /Users/emanuel/teste_microbioma_nfn/its/completo/sample-metadata.tsv;
done
fi

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

ln -s /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza ./sequences.qza

Command exit status:
132

Command output:
(empty)

Command error:
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
.command.sh: line 23: 13 Illegal instruction qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /Users/emanuel/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

Work dir:
/Users/emanuel/teste_microbioma_nfn/its/completo/MetONTIIME/work/4d/9c06d0a952dbadf1ac3d3cc40a543f

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

-- Check '.nextflow.log' file for details

@MaestSi
Owner

MaestSi commented Jan 11, 2024

Hi,
I don't think it's a QIIME error. It seems Docker is complaining that the MetONTIIME image was built for Linux/amd64, while you are running on a Linux/arm64/v8 platform. Do you have access to any other server with Linux/amd64 architecture with Docker or Singularity available?
SM
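
A quick way to check which architecture a machine has before trying the pipeline on it is sketched below. It only uses `uname -m`; the function name `arch_to_docker_platform` and the mapping table are illustrative assumptions, not something provided by MetONTIIME or Docker.

```shell
#!/usr/bin/env bash

# Map the output of `uname -m` to the Docker platform string it usually corresponds to.
arch_to_docker_platform() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "unknown" ;;
    esac
}

host_platform=$(arch_to_docker_platform "$(uname -m)")
echo "Host platform: $host_platform"

# The warning in this thread says the image was built for linux/amd64.
if [ "$host_platform" != "linux/amd64" ]; then
    echo "Note: an image built for linux/amd64 does not match this host." >&2
fi
```

On an Apple Silicon Mac, `uname -m` reports `arm64`, which is consistent with the platform mismatch warning shown in the error above.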

@MaestSi
Owner

MaestSi commented Jan 19, 2024

Closing due to inactivity.
SM

@MaestSi MaestSi closed this as completed Jan 19, 2024
@erazzolini
Author

Dear MaestSi, sorry for the delayed reply; it took me some time to get the service admin to allow me to use Docker. Now all the permissions on folders and files are OK, and I have a new error:

Caused by:
Process importFastq terminated with an error exit status (2)

Command executed:

mkdir -p /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq

fq=$(realpath $(find /home/emanuelr/teste_microbioma_nfn/its/completo/downsampleFastq/ -maxdepth 1 | grep ".fastq.gz"))
manifestFile=/home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/manifest.txt

if [ ! -f "$manifestFile" ]; then
echo -e sample-id" "absolute-filepath > $manifestFile;
for f in $fq; do
s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
echo -e $s" "$f >> $manifestFile;
done
fi

if [ ! -f ]; then
echo -e sample-id" "sample-name > ;
for f in $fq; do
s=$(echo $(basename $f) | sed 's/.fastq.gz//g');
echo -e $s" "$s >> ;
done
fi

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path $manifestFile --input-format 'SingleEndFastqManifestPhred33V2' --output-path /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/sequences.qza

ln -s /home/emanuelr/teste_microbioma_nfn/its/completo/importFastq/sequences.qza ./sequences.qza

Command exit status:
2

Command output:
(empty)

Command error:
.command.sh: line 16: syntax error near unexpected token `;'
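
In the command pasted above, the second `if` block reads `if [ ! -f ]; then` and redirects to `> ;`, so the path that should follow `-f` and `>` appears to have expanded to an empty string. A redirection with no target is a syntax error in bash, which matches the message. The sketch below reproduces just that, using `bash -n` (parse without executing); the helper name `snippet_parses` is hypothetical, and the idea that the sample metadata path was empty in this run is a reading of the pasted command, not something confirmed in the thread.

```shell
#!/usr/bin/env bash

# Return 0 if the given snippet passes bash's syntax check.
# `bash -n` reads the commands and checks syntax without running them.
snippet_parses() {
    printf '%s\n' "$1" | bash -n 2>/dev/null
}

# With a path after the redirection, the line parses.
if snippet_parses 'echo -e sample-id" "sample-name > /tmp/sample-metadata.tsv;'; then
    echo "non-empty path: parses"
fi

# With nothing between '>' and ';', bash rejects the line.
if ! snippet_parses 'echo -e sample-id" "sample-name > ;'; then
    echo "empty path: syntax error"
fi
```

If this reading is right, checking that the sample metadata path is actually set in the config used for this run would be the first thing to try.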

@MaestSi
Owner

MaestSi commented Jan 27, 2024

Hi, I need the command you ran and the config file you used, together with the content of the directory containing fastq.gz files, thanks.
SM

@erazzolini
Author

Hello,

I ran: nextflow -c metontiime2.conf run metontiime2.nf -profile docker

In my original folder I have a single file called barcode53.fastq.

In the other folders that Nextflow creates I have another file with the same name, and in the importFastq folder I have only a manifest.txt file.

The log of my run:

nextflow.log

@MaestSi
Owner

MaestSi commented Jan 29, 2024

Hi, the fastq file should be gzip compressed, you should only have fastq.gz files.
SM
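
A minimal sketch of compressing the input, assuming the standard `gzip` tool is available. The helper name `gzip_fastq_in_dir` is hypothetical; it compresses every uncompressed .fastq file in a folder and leaves existing .fastq.gz files alone.

```shell
#!/usr/bin/env bash

# Gzip-compress every uncompressed .fastq file found directly in a directory.
gzip_fastq_in_dir() {
    local dir="$1" f
    for f in "$dir"/*.fastq; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        if [ ! -f "$f" ]; then
            continue
        fi
        # Do not overwrite an archive that already exists.
        if [ -e "$f.gz" ]; then
            continue
        fi
        gzip "$f"   # replaces reads.fastq with reads.fastq.gz
    done
}

# Example usage (the folder is the one named in this thread's config):
# gzip_fastq_in_dir /Users/emanuel/teste_microbioma_nfn/its/completo/barcode53
```

Running `gzip -t` on the resulting file is a cheap way to confirm that it really is gzip data and not a plain fastq file with a .gz suffix.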

@erazzolini
Author

Sorry, the file in the first folder, where the fastq file for the analysis comes from, is in fastq.gz format. In fact I tried both ways, fastq and fastq.gz, but the error remains the same.

Although the command creates the folders, I believe Docker creates them with root permissions, so I do not have access to them and cannot make changes. Could this be the problem?

@MaestSi
Owner

MaestSi commented Jan 29, 2024

It looks like Docker is not configured to run without root privileges, could this be the case?
Edit: Docker usually creates files and folders with root privileges, but you must be sure you can run docker run hello-world without sudo.
SM
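
The check suggested above can be scripted roughly as follows. This is a sketch, not part of MetONTIIME: the helper name `docker_without_sudo` is hypothetical, and it uses `docker info` (which needs a reachable daemon) as a lighter stand-in for `docker run hello-world`.

```shell
#!/usr/bin/env bash

# Return 0 if the given Docker client (default: docker) exists and can
# reach the daemon as the current user, without sudo.
docker_without_sudo() {
    local bin="${1:-docker}"
    command -v "$bin" >/dev/null 2>&1 && "$bin" info >/dev/null 2>&1
}

if docker_without_sudo; then
    echo "Docker is usable without sudo"
else
    echo "Docker is missing, not running, or needs sudo for this user"
fi
```

If the second message appears on a machine where `sudo docker info` works, the usual cause is that the user lacks permission to access the Docker daemon.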
