traitar phenotype #71

Open
lanlan0210 opened this issue Jul 24, 2023 · 8 comments

Comments

@lanlan0210

Dear Aaron Weimann,

Thank you for developing Traitar. I got an error message when I ran the software on the test data you provided.
command: traitar phenotype test samples.txt from_genes output_dir
error:
usage: traitar phenotype [-h] [-c CPUS] [-x PARALLEL] [-o]
                         [-g {genbank,refseq,img,prodigal,metagenemark}]
                         [-p PRIMARY_MODELS] [-s SECONDARY_MODELS]
                         [-r REARRANGE_HEATMAP]
                         [--no_heatmap_sample_clustering]
                         [--no_heatmap_phenotype_clustering]
                         [-f {png,pdf,svg,jpg}]
                         pfam_dir input_dir sample2file
                         {from_genes,from_nucleotides,from_annotation_summary}
                         output_dir
traitar phenotype: error: argument mode: invalid choice: 'output_dir' (choose from 'from_genes', 'from_nucleotides', 'from_annotation_summary').
Can you help me see how to solve it? Looking forward to your reply.

@aweimann
Owner

Dear Lan,

Thanks for your interest in Traitar. The command to run Traitar using the test data should be something like:
traitar phenotype pfam/ traitar3/tests/data/ traitar3/tests/data/samples.txt from_nucleotides traitar_test -c 2 --overwrite
You need to run traitar pfam first.
Please let me know if this works.

Best regards,

Aaron
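
For readers who hit the same usage error: the original command passed four positional arguments where the usage text lists five (pfam_dir, input_dir, sample2file, mode, output_dir), so every value shifted one slot to the left and 'output_dir' landed in the mode position. The sketch below reproduces this with Python's argparse. The parser is a hypothetical stand-in that mirrors only the positional arguments shown in the usage text; it is not Traitar's actual code.

```python
import argparse

MODES = ["from_genes", "from_nucleotides", "from_annotation_summary"]


def build_parser():
    # Hypothetical stand-in: only the positionals from the usage text above.
    parser = argparse.ArgumentParser(prog="traitar phenotype")
    for name in ("pfam_dir", "input_dir", "sample2file"):
        parser.add_argument(name)
    parser.add_argument("mode", choices=MODES)
    parser.add_argument("output_dir")
    return parser


def try_parse(argv):
    """Return the parsed arguments as a dict, or None if argparse rejects argv."""
    try:
        return vars(build_parser().parse_args(argv))
    except SystemExit:
        # argparse prints its usage/error text to stderr and then exits.
        return None


if __name__ == "__main__":
    # Four positionals: 'output_dir' is matched against the mode choices.
    print(try_parse(["test", "samples.txt", "from_genes", "output_dir"]))
    # Five positionals: the pfam directory comes first, then the rest line up.
    print(try_parse(["pfam/", "test", "test/samples.txt", "from_genes", "output_dir"]))
```

The first call is rejected by argparse and returns None; the second parses, with mode set to from_genes.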

@lanlan0210
Author

Thank you for your timely reply. The previous problem seems to have been solved, but a new error has appeared.
Uploading image.png…
Looking forward to your reply

@lanlan0210
Author

I'm sorry to bother you again. Following your suggestion, I first ran traitar pfam --local pfam/, then ran traitar phenotype pfam/ test1/ test1/samples.txt from_nucleotides traitar_test -c 2 --overwrite. Everything seems to be OK except for the last messages in the log:

traitar phenotype pfam/ test1/ test1/samples.txt from_nucleotides traitar_test -c 2 --overwrite
2023-07-26 13:43:33,040 - Running annotate as part of predict
2023-07-26 13:43:33,040 - Running gene prediction with Prodigal
2023-07-26 13:43:33,041 - CMD: prodigal -a traitar_test/gene_prediction/Listeria_ivanovii_WSLC3009.faa -f gff < test1//1457190.3.RefSeq.faa > traitar_test/gene_prediction/Listeria_ivanovii_WSLC3009.gff
2023-07-26 13:43:33,182 - CMD: prodigal -a traitar_test/gene_prediction/Listeria_grayi_DSM_20601.faa -f gff < test1//525367.9.RefSeq.faa > traitar_test/gene_prediction/Listeria_grayi_DSM_20601.gff
2023-07-26 13:43:33,323 - Running annotation with hmmer. This step can take a while. A rough estimate for sequential Pfam annotation of genome samples of ~3 Mbs is 10 min per genome.
2023-07-26 13:43:33,324 - Computing hmms
2023-07-26 13:43:33,325 - File exists, but re-creating due to --overwrite: traitar_test/gene_prediction/Listeria_ivanovii_WSLC3009.faa
2023-07-26 13:43:33,325 - File exists, but re-creating due to --overwrite: traitar_test/gene_prediction/Listeria_grayi_DSM_20601.faa
2023-07-26 13:43:33,325 - No. of hmmer commands to run: 2
2023-07-26 13:43:33,325 - CMD: hmmsearch --notextw --cpu 2 --cut_ga --domtblout traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_domtblout.dat /home/zhouguilan/temp/aa/test/pfam/Pfam-A.hmm traitar_test/gene_prediction/Listeria_ivanovii_WSLC3009.faa
2023-07-26 13:43:57,340 - CMD: hmmsearch --notextw --cpu 2 --cut_ga --domtblout traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_domtblout.dat /home/zhouguilan/temp/aa/test/pfam/Pfam-A.hmm traitar_test/gene_prediction/Listeria_grayi_DSM_20601.faa
2023-07-26 13:44:50,067 - Filtering hmm hits
2023-07-26 13:44:50,067 - File doesn't exist, so creating: traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_filtered_best.dat
2023-07-26 13:44:50,067 - File doesn't exist, so creating: traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_filtered_best.dat
2023-07-26 13:44:50,067 - No. of hmmer filter commands to run: 2
2023-07-26 13:44:50,077 - File written: traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_filtered_best.dat
2023-07-26 13:44:50,082 - File written: traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_filtered_best.dat
2023-07-26 13:44:50,082 - Creating a summary matrix
2023-07-26 13:44:50,208 - File written: traitar_test/annotation/pfam/summary.dat
2023-07-26 13:44:50,208 - Running phenotype prediction
2023-07-26 13:44:50,208 - Predicting with primary models
2023-07-26 13:44:51,172 - Running 67 predictions with 2 threads
2023-07-26 13:44:53,161 - File written: traitar_test/predictions_raw.txt
2023-07-26 13:44:53,248 - File written: traitar_test/predictions_single-votes.txt
2023-07-26 13:44:53,251 - File written: traitar_test/predictions_majority-vote.txt
2023-07-26 13:44:53,255 - File written: traitar_test/predictions_conservative-vote.txt
2023-07-26 13:44:53,280 - Predicting with secondary models
2023-07-26 13:44:54,175 - Running 67 predictions with 2 threads
2023-07-26 13:44:56,046 - File written: traitar_test/predictions_raw.txt
2023-07-26 13:44:56,117 - File written: traitar_test/predictions_single-votes.txt
2023-07-26 13:44:56,123 - File written: traitar_test/predictions_majority-vote.txt
2023-07-26 13:44:56,128 - File written: traitar_test/predictions_conservative-vote.txt
2023-07-26 13:44:56,152 - Merging primary & secondary predictions
2023-07-26 13:44:56,192 - File written: traitar_test/phenotype_prediction/predictions_single-votes_combined.txt
2023-07-26 13:44:56,192 - File written: traitar_test/phenotype_prediction/predictions_majority-vote_combined.txt
2023-07-26 13:44:56,196 - File written: traitar_test/phenotype_prediction/predictions_flat_majority-votes_combined.txt
2023-07-26 13:44:56,201 - File written: traitar_test/phenotype_prediction/predictions_flat_single-votes_combined.txt
2023-07-26 13:44:56,201 - Running feature track generation
2023-07-26 13:44:56,201 - hmm2gff currently not supported. SKIPPING!
2023-07-26 13:44:56,201 - Running heatmap generation
2023-07-26 13:44:56,201 - Heatmap currently not supported. SKIPPING!

The resulting files appear to be faulty: when I checked the phenotype prediction files, all the prediction values were zero.
These results seem similar to those in #67, so I ran traitar phenotype pfam/ test1/ test1/samples.txt from_genes traitar_test -c 2 --overwrite, as you suggested in #67.
traitar phenotype pfam/ test1/ test1/samples.txt from_genes traitar_test -c 2 --overwrite
2023-07-26 13:51:09,467 - Running annotate as part of predict
2023-07-26 13:51:09,467 - Running annotation with hmmer. This step can take a while. A rough estimate for sequential Pfam annotation of genome samples of ~3 Mbs is 10 min per genome.
2023-07-26 13:51:09,467 - Computing hmms
2023-07-26 13:51:09,468 - File exists, but re-creating due to --overwrite: test1/1457190.3.RefSeq.faa
2023-07-26 13:51:09,468 - File exists, but re-creating due to --overwrite: test1/525367.9.RefSeq.faa
2023-07-26 13:51:09,468 - No. of hmmer commands to run: 2
2023-07-26 13:51:09,468 - CMD: hmmsearch --notextw --cpu 2 --cut_ga --domtblout traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_domtblout.dat /home/zhouguilan/temp/aa/test/pfam/Pfam-A.hmm test1/1457190.3.RefSeq.faa
2023-07-26 13:58:56,925 - CMD: hmmsearch --notextw --cpu 2 --cut_ga --domtblout traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_domtblout.dat /home/zhouguilan/temp/aa/test/pfam/Pfam-A.hmm test1/525367.9.RefSeq.faa
2023-07-26 14:06:26,191 - Filtering hmm hits
2023-07-26 14:06:26,191 - File doesn't exist, so creating: traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_filtered_best.dat
2023-07-26 14:06:26,191 - File doesn't exist, so creating: traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_filtered_best.dat
2023-07-26 14:06:26,191 - No. of hmmer filter commands to run: 2
2023-07-26 14:06:29,892 - File written: traitar_test/annotation/pfam/Listeria_ivanovii_WSLC3009_filtered_best.dat
2023-07-26 14:06:32,771 - File written: traitar_test/annotation/pfam/Listeria_grayi_DSM_20601_filtered_best.dat
2023-07-26 14:06:32,771 - Creating a summary matrix
PF17762
PF16403
PF17802
....
PF17757
PF17862
2023-07-26 14:06:35,091 - File written: traitar_test/annotation/pfam/summary.dat
2023-07-26 14:06:35,091 - Running phenotype prediction
2023-07-26 14:06:35,091 - Predicting with primary models
2023-07-26 14:06:35,754 - Running 67 predictions with 2 threads
2023-07-26 14:06:37,688 - File written: traitar_test/predictions_raw.txt
2023-07-26 14:06:37,755 - File written: traitar_test/predictions_single-votes.txt
2023-07-26 14:06:37,758 - File written: traitar_test/predictions_majority-vote.txt
2023-07-26 14:06:37,761 - File written: traitar_test/predictions_conservative-vote.txt
2023-07-26 14:06:37,784 - Predicting with secondary models
2023-07-26 14:06:38,663 - Running 67 predictions with 2 threads
2023-07-26 14:06:40,506 - File written: traitar_test/predictions_raw.txt
2023-07-26 14:06:40,575 - File written: traitar_test/predictions_single-votes.txt
2023-07-26 14:06:40,581 - File written: traitar_test/predictions_majority-vote.txt
2023-07-26 14:06:40,584 - File written: traitar_test/predictions_conservative-vote.txt
2023-07-26 14:06:40,616 - Merging primary & secondary predictions
2023-07-26 14:06:40,643 - File written: traitar_test/phenotype_prediction/predictions_single-votes_combined.txt
2023-07-26 14:06:40,644 - File written: traitar_test/phenotype_prediction/predictions_majority-vote_combined.txt
2023-07-26 14:06:40,648 - File written: traitar_test/phenotype_prediction/predictions_flat_majority-votes_combined.txt
2023-07-26 14:06:40,653 - File written: traitar_test/phenotype_prediction/predictions_flat_single-votes_combined.txt
2023-07-26 14:06:40,653 - Running feature track generation
2023-07-26 14:06:40,653 - hmm2gff currently not supported. SKIPPING!
2023-07-26 14:06:40,653 - Running heatmap generation
2023-07-26 14:06:40,653 - Heatmap currently not supported. SKIPPING!

There is still no heatmap in the final output, but the prediction values are no longer all zero. I don't know where the problem is. I hope you can help me solve it. Thank you.

@aweimann
Owner

Generation of the heat map is not supported at the moment unfortunately. You can use the result tables to make your own version. Sorry about that.
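
One way to build such a heatmap from a result table, as a minimal sketch. It assumes the combined table (for example traitar_test/phenotype_prediction/predictions_majority-vote_combined.txt from the log above) is tab-separated, with phenotype names in the header row and one sample per row; this thread does not confirm that layout. load_table and plot_heatmap are hypothetical helper names, and matplotlib is assumed to be installed.

```python
import csv


def load_table(lines):
    """Parse a tab-separated prediction table from an iterable of lines.

    Assumed layout: header row with phenotype names after a corner cell,
    then one row per sample (sample name followed by numeric codes).
    """
    rows = [row for row in csv.reader(lines, delimiter="\t") if row]
    phenotypes = rows[0][1:]
    samples = [row[0] for row in rows[1:]]
    # int(float(...)) accepts both "3" and "3.0".
    matrix = [[int(float(value)) for value in row[1:]] for row in rows[1:]]
    return samples, phenotypes, matrix


def plot_heatmap(samples, phenotypes, matrix, out_path):
    """Draw the matrix as a heatmap and save it to out_path (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display required
    import matplotlib.pyplot as plt

    width = max(6, 0.3 * len(phenotypes))
    height = max(2, 0.5 * len(samples))
    fig, ax = plt.subplots(figsize=(width, height))
    # Codes 0 to 3 are the values discussed in this thread.
    image = ax.imshow(matrix, aspect="auto", vmin=0, vmax=3)
    ax.set_xticks(range(len(phenotypes)))
    ax.set_xticklabels(phenotypes, rotation=90)
    ax.set_yticks(range(len(samples)))
    ax.set_yticklabels(samples)
    fig.colorbar(image, ax=ax, label="prediction code")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

To use it, open the table with open(path, newline=""), pass the file handle to load_table, and hand the three return values plus an output filename such as heatmap.png to plot_heatmap.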

@lanlan0210
Author

OK, thank you for your answer. I have one last question. With your example data, the generated table and flat files (predictions_majority-vote_combined.txt) contain only phenotypes marked 0 and 3, never 1 or 2, whereas your own results also contain 1 and 2. I don't know whether this is caused by a different Pfam version (I am currently using Pfam 35.0) or by something else.

image
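
A quick way to check which prediction codes actually occur in a result table, as a minimal sketch. It assumes a tab-separated layout with a header row and the sample name in the first column, which this thread does not confirm; count_codes is a hypothetical helper name.

```python
import csv
from collections import Counter


def count_codes(lines):
    """Count each cell value, skipping the header row and the first column."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header row (assumed to hold phenotype names)
    counts = Counter()
    for row in reader:
        counts.update(row[1:])  # values are kept as strings, e.g. "0" or "3"
    return counts
```

Calling count_codes on an open file handle for predictions_majority-vote_combined.txt would show at a glance whether the codes 1 and 2 are absent.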

@aweimann
Owner

aweimann commented Aug 3, 2023

Thanks for flagging this. That's not expected, and I need to look into it. Traitar should be run on Pfam 27.0, which is what was used to train the models.

@aweimann
Owner

aweimann commented Aug 4, 2023

Hi Lan, I imagine you're using the Traitar3 version through Conda. I found an issue in that version which means that only the results from the phypat+PGL model are produced (rather than both phypat and phypat+PGL). At the moment the only fix is to use the original Traitar version, if you manage to install it. I will see if this can be sorted out, but it may take a while.
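
To illustrate why such an issue would leave only 0 and 3 in the combined table, here is a minimal sketch. It assumes the combined encoding is 0 = predicted by neither model, 1 = phypat only, 2 = phypat+PGL only, 3 = both; that encoding is an assumption not stated in this thread, and combine is a hypothetical helper, not Traitar's merge code. Note that in the logs above both the primary and the secondary step report writing to the same traitar_test/predictions_*.txt paths, which would be consistent with one set of results replacing the other, though the thread does not confirm this.

```python
def combine(phypat, phypat_pgl):
    """Merge two 0/1 prediction lists into the assumed 0-3 encoding.

    Assumed encoding: phypat contributes 1, phypat+PGL contributes 2,
    so agreement on a positive call gives 3.
    """
    if len(phypat) != len(phypat_pgl):
        raise ValueError("prediction lists must have the same length")
    return [p + 2 * q for p, q in zip(phypat, phypat_pgl)]


if __name__ == "__main__":
    phypat = [0, 1, 1, 0]        # made-up example calls
    phypat_pgl = [0, 1, 0, 1]    # made-up example calls

    # Two distinct models: all four codes can occur.
    print(combine(phypat, phypat_pgl))      # [0, 3, 1, 2]
    # Same predictions used for both inputs: only 0 and 3 remain.
    print(combine(phypat_pgl, phypat_pgl))  # [0, 3, 0, 3]
```

Under these assumptions, a table containing only 0 and 3 is what one would expect whenever a single model's predictions are merged with themselves.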

@lanlan0210
Author

Thank you very, very much. Indeed, I use Traitar version 3, and I plan to reinstall the original Traitar version. Thank you very much for your prompt and timely reply, which is of great help to me.
