summarizeFasta cannot find variant #672

Closed
lydiayliu opened this issue Feb 1, 2023 · 2 comments · Fixed by #673
Labels
bug (Something isn't working), priority: now (Issue to be fixed immediately)

Comments

@lydiayliu
Collaborator

This is my attempt to run CPCG with parser entry instead of GVF entry (too lazy to reparse alllllll the files manually with 0.11.1). Interesting error that we've never seen lol.

    Command output:
      [ 2023-02-01 07:25:34 ] moPepGen summarizeFasta started
      [ 2023-02-01 07:27:15 ] Reference indices loaded.
    
    Command error:
      [ 2023-02-01 07:25:34 ] moPepGen summarizeFasta started
      [ 2023-02-01 07:27:15 ] Reference indices loaded.
      Traceback (most recent call last):
        File "/usr/local/lib/python3.8/site-packages/moPepGen/aa/VariantPeptideLabel.py", line 323, in get_source
          return self.data[gene_id][var_id]
      KeyError: 'ENSG00000231764.10'
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "/usr/local/bin/moPepGen", line 8, in <module>
          sys.exit(main())
        File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/__main__.py", line 89, in main
          args.func(args)
        File "/usr/local/lib/python3.8/site-packages/moPepGen/cli/summarize_fasta.py", line 123, in summarize_fasta
          summarizer.count_peptide_source(anno, args.cleavage_rule)
        File "/usr/local/lib/python3.8/site-packages/moPepGen/aa/PeptidePoolSummarizer.py", line 194, in count_peptide_source
          self.summary_table.add_entry(seq, self.label_map, anno, enzyme)
        File "/usr/local/lib/python3.8/site-packages/moPepGen/aa/PeptidePoolSummarizer.py", line 86, in add_entry
          peptide_labels = VariantPeptideInfo.from_variant_peptide(
        File "/usr/local/lib/python3.8/site-packages/moPepGen/aa/VariantPeptideLabel.py", line 191, in from_variant_peptide
          source = label_map.get_source(gene_id, var_id)
        File "/usr/local/lib/python3.8/site-packages/moPepGen/aa/VariantPeptideLabel.py", line 325, in get_source
          raise err.VariantSourceNotFoundError(gene_id, var_id) from e
      moPepGen.err.VariantSourceNotFoundError: Variant source not found transcript [ENSG00000231764.10] variant [ENSG00000231764.10]. Please verify all GVF files are imported
    
    Work dir:
      work/6a/4493b9fdd44f3a349d40c4f97535da
    
    Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

I actually think this is because the FASTA header of callNoncoding has changed??? When I ran the first 10 samples without the noncoding_peptides.fasta it was perfectly fine.
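
For readers following the traceback, here is a minimal sketch of the lookup pattern it shows: a nested dictionary lookup whose `KeyError` is re-raised as a `VariantSourceNotFoundError`. Only the `self.data[gene_id][var_id]` lookup and the `raise ... from e` line come from the traceback above. The class name `LabelSourceMap`, the stand-in exception, the variant ID `SNV-100-A-T`, and the source name `gSNP` are assumptions for illustration, not moPepGen's actual code or data.

```python
class VariantSourceNotFoundError(Exception):
    """Stand-in for moPepGen.err.VariantSourceNotFoundError (assumed shape)."""

    def __init__(self, gene_id, var_id):
        super().__init__(
            f"Variant source not found transcript [{gene_id}] variant [{var_id}]."
        )


class LabelSourceMap:
    """Hypothetical holder of {gene_id: {var_id: source}} built from GVF files."""

    def __init__(self, data):
        self.data = data

    def get_source(self, gene_id, var_id):
        try:
            # Same nested lookup as in the traceback.
            return self.data[gene_id][var_id]
        except KeyError as e:
            # Chain the KeyError so both tracebacks are reported.
            raise VariantSourceNotFoundError(gene_id, var_id) from e


# Hypothetical data: one gene with one known variant.
label_map = LabelSourceMap({"ENSG00000231764.10": {"SNV-100-A-T": "gSNP"}})
ok = label_map.get_source("ENSG00000231764.10", "SNV-100-A-T")

# Passing the gene ID where a variant ID is expected fails the inner lookup.
try:
    label_map.get_source("ENSG00000231764.10", "ENSG00000231764.10")
except VariantSourceNotFoundError as exc:
    caught = exc
```

In this sketch the failure does not mean a GVF file is missing; the lookup fails because the value passed as the variant ID is not a variant ID at all.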

@lydiayliu added the bug and priority: now labels on Feb 1, 2023
@lydiayliu
Collaborator Author

I reran this with a working directory on /hot/, and here it is
/hot/project/method/AlgorithmDevelopment/ALGO-000074-moPepGen/CPCGENE/processed/noncanonical-database/call-nonCanonicalPeptide/work_pipe/eb/809f9aab0542e1201f86114b5720ab

Still same issue!

@zhuchcn
Member

zhuchcn commented Feb 2, 2023

I believe this is because the FASTA header for noncoding peptides was updated in the last PR, so the gene ID is now part of it, and the gene ID is then mistakenly treated as a variant label.
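
A rough sketch of how that kind of mix-up can happen, assuming a pipe-delimited header where every field between the first and the last is read as a variant ID. The header layouts, the transcript ID, and the `parse_label` helper are hypothetical and only illustrate the failure mode; they are not moPepGen's real format or parser, and this is not necessarily how #673 addresses it.

```python
def parse_label(label):
    """Split a hypothetical 'TX_ID|VAR_1|...|INDEX' label.

    Everything between the first and last field is assumed to be a variant ID.
    """
    fields = label.split("|")
    tx_id = fields[0]
    var_ids = fields[1:-1]
    return tx_id, var_ids


# Hypothetical variant peptide header: the middle field really is a variant.
variant_header = "ENST00000000001.1|SNV-100-A-T|1"

# Hypothetical noncoding header after a gene ID field was added.
noncoding_header = "ENST00000000001.1|ENSG00000231764.10|1"

_, variant_ids = parse_label(variant_header)
_, noncoding_ids = parse_label(noncoding_header)

# The parser cannot tell the two apart, so the gene ID is returned as if it
# were a variant ID and would later be looked up as one.
gene_id_misread = "ENSG00000231764.10" in noncoding_ids
```

Under these assumptions the downstream lookup receives a gene ID where it expects a variant ID, which is consistent with the `KeyError: 'ENSG00000231764.10'` in the log.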
