kraken_file error when running MetaPhage (docker) #44
Comments
I'm using the Docker image btw. |
Hi, thanks for reporting. Are you using the 0.3 release or did you clone the repository? |
I cloned the repository |
For |
Did you create a "configuration file" using the provided "newProject.py" script? Can you please upload the configuration file used? |
Dear Mattia,
I've managed to get the program to run using the Docker image, but now it
gets stuck at the "violin_plots" stage with the error:
```
Error executing process > 'violin_plots (megahit)'
Caused by:
Process `violin_plots (megahit)` terminated with an error exit status (1)
Command executed:
Rscript
/home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/bin/Rscript/violin_plot.R
count_table.csv taxonomy_table.csv metadata.csv library
Command exit status:
1
Command output:
(empty)
Command error:
Error in validObject(.Object) : invalid class “phyloseq” object:
Component sample names do not match.
Try sample_names()
Calls: phyloseq ... do.call -> new -> initialize -> initialize ->
validObject
Execution halted
Work dir:
/home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/work/ed/c65aacbf9a9ffc654e64e9665a9017
Tip: you can replicate the issue by changing to the process work dir and
entering the command `bash .command.run`
```
I've attached my metadata file and config file. The config was initially created
with the Python script, but later manually edited for the machine I'm
running on (Ubuntu 20.04, 16 core I7 11900, 128 GB RAM). The metadata file
might be the issue (phyloseq object; a mismatch between the rows or columns of
count_table.csv, taxonomy_table.csv and metadata.csv).
Regards
Lonnie
--
Dr. Lonnie van Zyl
Chief Officer
Institute for Microbial Biotechnology and Metagenomics (IMBM)
University of the Western Cape
Bellville
South Africa
|
Hi Lonnie, could you please provide your metadata and config file? I fail to find them in the mail attachments. |
Dear Mattia,
I've attached them again (metadat.csv and nextflow.conf). If you don't see
them, I'll upload to Google drive and share the folder with you.
Regards
Lonnie
|
Hi again, for some reason the mail has no attachments. Please upload the files to Google Drive and send me the folder link, thank you. Regards |
See https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/attaching-files to upload directly on github issue (not via mail). |
Here are the config and metadata files: |
Hi Lonnie, your Sample names in metadata.csv are numbers, which causes the sample_data() function to prepend the string "sa" to them. This function is used to create the phyloseq object, hence the error: phyloseq tries to match the sample names in your count table (1, 2, 3, 4) to these modified names (sa1, sa2, sa3, sa4) and fails. I suggest renaming the samples by adding some characters (e.g. sample1, sample2, etc.). Regards |
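The check described above can be sketched outside R before launching the pipeline. This is only an illustrative sketch, not part of MetaPhage: it assumes that metadata.csv has a column named `Sample` and that the header of count_table.csv lists the sample names after a first feature-ID column. Both layouts, and the function names, are assumptions made for the example.

```python
import csv
import io


def metadata_samples(metadata_csv, sample_column="Sample"):
    """Return the sample names listed in the metadata table (assumed column name)."""
    reader = csv.DictReader(io.StringIO(metadata_csv))
    return [row[sample_column].strip() for row in reader]


def count_table_samples(count_csv):
    """Return sample names from the count table header (assumed: all columns but the first)."""
    header = next(csv.reader(io.StringIO(count_csv)))
    return [name.strip() for name in header[1:]]


def check_sample_names(metadata_csv, count_csv, sample_column="Sample"):
    """Report purely numeric names and names present in only one of the two tables."""
    meta = metadata_samples(metadata_csv, sample_column)
    counts = count_table_samples(count_csv)
    return {
        "numeric_only": [name for name in meta if name.isdigit()],
        "missing_from_counts": sorted(set(meta) - set(counts)),
        "missing_from_metadata": sorted(set(counts) - set(meta)),
    }


# Hypothetical miniature tables, for illustration only.
metadata = "Sample,Site\n1,A\n2,B\n"
counts = "vOTU,1,2\nvotu_1,10,0\n"
report = check_sample_names(metadata, counts)
print(report)
```

Any name listed under `numeric_only` is a candidate for renaming (e.g. to sample1, sample2), and the rename has to be applied consistently to every table that carries the sample names.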
Great, thank you Mattia,
I appreciate your help in sorting this out. I'll let you know if this sorts
the issue.
Regards
Lonnie
|
Hi Mattia,
Unfortunately the change to the metadata file did not resolve the issue. Phyloseq still won't run. Please see the error below:
[a5/df1fca] process > summary (megahit) [100%] 1 of 1 ✔
Caused by:
Command executed:
Rscript /home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/bin/Rscript/heatmap.R count_table.csv taxonomy_table.csv metadata.csv library
Command exit status:
Command output:
Command error:
Work dir:
Tip: you can replicate the issue by changing to the process work dir and entering the command
Here are the new metadata and config files as well as the taxonomy and count tables: [nextflow.txt](https://github.com/MattiaPandolfoVR/MetaPhage/files/ |
Hi Mattia,
Not sure if you saw my last post. Phyloseq still does not seem to be working
(possibly a formatting issue with the metadata, count or taxonomy files). Have you had
any issues reported relating to the OS? I'm on Ubuntu 20.04. I might try
installing it on 18.04 to see if it works there.
Regards
Lonnie
|
Hi Lonnie, sorry for the late reply. There is no need to install a different distro: the problem was related to the relative abundance filter applied to the phyloseq object, which I have fixed. I have already updated the code, so download only the bin/Rscript folder again and launch MetaPhage (use -resume when you launch it to start from where it stopped). I hope this fixes the issue! Regards |
Hi Mattia,
I eventually came to the same conclusion looking at the R script, but I
don't know what I'm doing when it comes to scripting. Glad the problem was
resolved.
Thanks for your help and persistence.
Keep well.
Lonnie
|
Hi Mattia,
Alpha diversity runs fine, however the heatmap and beta diversity still have issues. The beta diversity error is as follows:
Error executing process > 'betadiversity (megahit)'
Caused by:
Command executed:
Rscript /home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/bin/Rscript/beta_diversity.R count_table.csv taxonomy_table.csv metadata.csv Site
Command exit status:
Command output:
Command error:
Whereas the heatmap error is:
Error executing process > 'heatmap (megahit)'
Caused by:
Command executed:
Rscript /home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/bin/Rscript/heatmap.R count_table.csv taxonomy_table.csv metadata.csv Site
Command exit status:
Command output:
Command error:
The violin_plots error is:
Error executing process > 'violin_plots (megahit)'
Caused by:
Command executed:
Rscript /home/imbm-bioinformatics/Downloads/MetaPhage-0.3.0/bin/Rscript/violin_plot.R count_table.csv taxonomy_table.csv metadata.csv Site
Command exit status:
Command output:
Command error:
Regards |
Hi Lonnie, your problem may be related to the error "cannot open file 'bin/Rscript/filter&CSSnormalize.R': No such file or directory". Regards, |
Hi Mattia, yes, that is the issue for the violin plots, and it's easy enough to fix by editing the R script, but the other two are my main concern. Regards |
Hi there, the problem was related to the sourcing of the filtering function. I updated the R scripts that use it and tested everything again. It seems to work flawlessly, so download the whole bin/Rscript folder again to get the updated version. Let me know if the issue persists. Regards, |
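For anyone hitting a similar "cannot open file" error, one way to spot the problem up front is to list the `source()` calls in an R script and check which targets exist relative to the directory the script will run from. The sketch below is a hypothetical diagnostic helper, not part of MetaPhage. It assumes the failure comes from a relative path that does not resolve in the working directory (one plausible reading of the error quoted above, since each process runs in its own work dir), and it only recognises `source()` calls that take a quoted literal path.

```python
import re
import tempfile
from pathlib import Path

# Matches source("path") or source('path') with a literal string argument only.
SOURCE_CALL = re.compile(r"""\bsource\(\s*["']([^"']+)["']""")


def unresolved_sources(r_code, working_dir):
    """Return source() targets that do not exist when resolved from working_dir."""
    missing = []
    for target in SOURCE_CALL.findall(r_code):
        path = Path(target)
        if not path.is_absolute():
            path = Path(working_dir) / path
        if not path.exists():
            missing.append(target)
    return missing


# Illustration with a throwaway directory; helper.R is a made-up file name.
with tempfile.TemporaryDirectory() as work_dir:
    (Path(work_dir) / "helper.R").write_text("# placeholder\n")
    r_code = 'source("helper.R")\nsource("bin/Rscript/filter&CSSnormalize.R")\n'
    missing = unresolved_sources(r_code, work_dir)
    print(missing)
```

Running it against the process work dir (the one printed after "Work dir:" in the error) rather than the project root is what makes the check meaningful under the stated assumption.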
Hi Mattia,
I really appreciate your efforts in fixing these issues! I'll give it
a go and let you know.
Enjoy the evening.
Cheers
Lonnie
|
Hi There,
I get the following error when running MetaPhage even though the kraken2 and krona output files (*_output and *_report as well as *_krak_krona_abundancies.html) are created and the directory structure looks correct. Any idea where it might be going wrong?
Tip: you can replicate the issue by changing to the process work dir and entering the command
bash .command.run
Thank you in advance.
Regards
Lonnie