Error at read-correction step #1
Hi! Thank you for contacting us. We've updated some Docker images on Dockerhub and also the repository Dockerfiles under conda_envs/ (including the read_correction module) due to problems with the environment path. The pipeline has now been tested on Ubuntu 18.04 using both conda (v4.8.3) and docker (v19.03.9) with Nextflow v20.01.0 and the test profile. If the problem persists, feel free to contact us again and include the executed command and any information about the configuration used.
A new push has been made with the latest updates.
Thanks for your reply. Still having the same issue though - I ran the command for the test data:
On the first occasion it generated the .fasta.gz file but still gave the same error. I have just run it again with the same error; this time no file was generated.
If the Nextflow, Python/pip and conda configuration is OK, it seems that the read_clustering conda environment is not working properly. Try removing this env (under the work/conda directory) and running the pipeline again to reinstall the environment and retry the process. If that doesn't work, running the pipeline with '-profile test,docker' will automatically use Docker images pulled from Dockerhub that are also tested. Please let us know if the problem persists with the conda and docker profiles.
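The reset suggested above can be sketched as follows. This is a sketch, not an official procedure: it assumes the conda environments are cached under work/conda (Nextflow's default location when conda is enabled), and simply removes the whole cache so every environment, including read_clustering, is rebuilt from its environment YAML on the next run.

```shell
# Assumption: conda env cache lives under work/conda (Nextflow default).
ls work/conda/                        # inspect the cached environments
rm -rf work/conda                     # drop the cache; envs rebuild on next run
nextflow run main.nf -profile test,conda
```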
Could this pipeline run on macOS?
Nextflow and both conda and docker are compatible with macOS. We haven't tested on a Mac machine, but it may run with the docker profile to avoid compatibility errors. We've updated the pipeline and now include the exact version tags in the environment.yml files used for the conda envs. This should fix some errors with conda environments that arise on some machines. The Docker images also include the correct versions of the environments.
Having more luck with the docker option, which ran fine with the test data. However, I still hit a problem at the read correction step with my own dataset, even with the docker option:

Error executing process > 'read_correction (15)'
Command executed:
head -n$(( null*4 )) 20.fastq > subset.fastq

Apologies if this is an obvious error on my part!
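For context on why that command fails: when a Nextflow parameter is unset, the literal string null gets interpolated into the script, and in bash arithmetic an unset name evaluates to 0, so head extracts zero lines and the subset file comes out empty. A minimal reproduction with a stand-in file (the file contents here are fabricated for illustration):

```shell
# Stand-in input: 2 records x 4 lines each, mimicking a FASTQ file.
printf 'line %d\n' 1 2 3 4 5 6 7 8 > 20.fastq
# "null" is not a defined shell variable, so it evaluates to 0 here:
echo "lines requested: $(( null*4 ))"     # prints: lines requested: 0
head -n$(( null*4 )) 20.fastq > subset.fastq
wc -l < subset.fastq                      # subset is empty, so the later
                                          # canu/gzip steps find no reads
```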
Also, the test data run errors at the classification step when specifying --db and --tax, as I had originally downloaded these to a separate volume. I haven't got to this step on my own data yet. Is there an option to change the working directory?
The first problem you report is due to a typo in the assignment of the default value of --polishing_reads when it is not set in the command. You can check in conf/test.conf that this value is set to 20 when using the test profile. We've updated the pipeline to fix the typo and to set the default value of --polishing_reads to 100 when no profile confs are used at all. If you are running the pipeline with your own data, we strongly recommend manually setting the --polishing_reads and --min_cluster_size parameters so you can compare pipeline outputs, especially at low taxonomic levels such as species. For the db and taxdb parameters, you should write the full path using double quotes. According to the Nextflow documentation, you can use -w to set the working directory.
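A hypothetical invocation combining these points (every path below is a placeholder, not a path from this thread): quote the absolute --db/--tax paths, set the two clustering parameters explicitly, and use Nextflow's -w to relocate the work directory onto a volume with enough space.

```shell
# Hypothetical command; substitute your own paths and parameter values.
nextflow run main.nf -profile docker \
    --reads 'reads/*.fastq' \
    --polishing_reads 100 \
    --min_cluster_size 50 \
    --db "/mnt/volume/db/16S_ribosomal_RNA" \
    --tax "/mnt/volume/db/taxdb/" \
    -w /mnt/volume/nanoclust_work
```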
Thank you for your time and feedback! We've also modified the documentation to make those issues with paths and parameters clearer to users. Hope you can run NanoCLUST with no issues using your own data.
Yes, Docker is OK when running the test data. When I use my own data, the same problem as Fahadkhokhar's comes up. I will try it right now.
Thank you.
Hi, are you running the pipeline on macOS?

nextflow run main.nf -profile test,docker

If you are running Docker on Mac OS X, make sure you are mounting your local /Users directory into the Docker VM, as explained in this excellent tutorial: How to use Docker on OSX.

PS: We updated the pipeline to avoid Fahadkhokhar's problem with read_correction when using your own data.
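A sketch of that suggestion (the project path is an assumption): Docker Desktop on macOS file-shares directories under /Users with the Docker VM by default, so launching the pipeline from inside /Users lets the bind mounts Nextflow creates resolve inside the containers.

```shell
# Assumed layout: NanoCLUST cloned somewhere under /Users, so Docker's
# default macOS file sharing covers the work and data directories.
cd "/Users/$USER/NanoCLUST"
nextflow run main.nf -profile test,docker
```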
Yes, with --polishing_reads 60 and --min_cluster_size 50 the problem is solved.
Many thanks for the reply. I can now proceed to the classification step, but there is an error at classification using both the test and my own data set, even without specifying the --db or --tax paths with the test data:

Error executing process > 'consensus_classification (1)'
Hi. I don't know exactly what's happening with the classification. The pipeline works for me on a clean Ubuntu 18.04 VM with the minimum dependencies, just downloading the db using the exact script inside the NanoCLUST dir:

mkdir db db/taxdb
wget https://ftp.ncbi.nlm.nih.gov/blast/db/16S_ribosomal_RNA.tar.gz && tar -xzvf 16S_ribosomal_RNA.tar.gz -C db
wget https://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz && tar -xzvf taxdb.tar.gz -C db/taxdb

After that, you should have the right directory tree with the db and the taxonomy. Then I manually set those in the command:

--db "/home/nanoclust_vm/NanoCLUST/db/16S_ribosomal_RNA" --tax "/home/nanoclust_vm/NanoCLUST/db/taxdb/"

It seems you may have downloaded the db in a different way (resulting in a different dir structure), or to a location other than the NanoCLUST dir? I will try using BLAST databases on different systems and paths to make this more flexible. Thanks again.
Hello, I want to compare the data to get the different species, plus alpha and beta diversity analysis. Where can I get an abundance table like "otutab.txt" (not the rel_abundance)?
Hi, HaiyangDu. At this time, we do not have an option to get an OTU table like the otutab command produces. However, the nanoclust_out.txt file includes the number of reads assigned to each taxonomic ID, so it should not be hard to build an otutab.txt file for alpha and beta diversity analysis. We will work on an option to output the exact otutab format, to make it easier for users to use NanoCLUST output in downstream analyses that require that file structure. Thank you for your time and feedback.
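As a rough illustration of that workaround: the exact column layout of nanoclust_out.txt is not shown in this thread, so the input below is a fabricated stand-in assuming tab-separated sample, taxon, and read-count columns, pivoted into one otutab-style row per taxon with awk.

```shell
# Fabricated stand-in for nanoclust_out.txt: sample <TAB> taxon <TAB> reads.
printf 'sample1\tEscherichia_coli\t120\nsample1\tStaphylococcus_aureus\t80\nsample2\tEscherichia_coli\t90\n' > nanoclust_out.txt
# Pivot: key on taxon + sample, emit one count row per pair, sorted by taxon.
awk -F'\t' '{ counts[$2 "\t" $1] = $3 }
            END { for (k in counts) print k "\t" counts[k] }' nanoclust_out.txt | sort
```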
OK, thanks for your reply.
Hi, the problem is that running the pipeline with 1 sample works perfectly, but my data has 50 samples and it always errors when I run the 50 samples with the parameter --reads 'my path/*.fastq'.
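One thing worth checking with many samples: the glob must reach Nextflow unexpanded, which requires the single quotes around the pattern (the closing quote is easy to drop). A quick demonstration with stand-in files (the demo_samples directory and echo are only a way to see what the shell would pass to Nextflow):

```shell
mkdir -p demo_samples && touch demo_samples/a.fastq demo_samples/b.fastq
# Unquoted, the shell expands the glob, so Nextflow sees only the first
# file as the --reads value and the rest as stray arguments:
echo --reads demo_samples/*.fastq
# prints: --reads demo_samples/a.fastq demo_samples/b.fastq
# Quoted, the pattern reaches Nextflow intact and matches every sample:
echo --reads 'demo_samples/*.fastq'
# prints: --reads demo_samples/*.fastq
```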
Running the script with the test data provided, it errors at the read-correction stage:
Starting command on Wed May 20 12:05:03 2020 with 39.037 GB free disk space
Finished on Wed May 20 12:05:03 2020 (like a bat out of hell) with 39.037 GB free disk space
gzip: corrected_reads.correctedReads.fasta.gz: No such file or directory