ValueError: Motif GM.5.0.bHLH.0126 does not occur in motif database when running maelstrom #192

Closed
shangguandong1996 opened this issue Jun 11, 2021 · 7 comments

Comments

@shangguandong1996

Dear developer

I am running maelstrom on the test data hg19.blood.most_variable.1k.txt.
My hg19 genome was installed with genomepy:

sgd@localhost ~/reference/genome/hg19
$ genomepy install hg19 --annotation -g .
Downloading genome from UCSC.
Target URL: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz...
Download: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████| 905M/905M [01:54<00:00, 8.30MB/s]
Genome download successful, starting post processing...

name: hg19
local name: hg19
fasta: /data/sgd_data/reference/genome/hg19/hg19/hg19.fa
Downloading annotation from UCSC.
Target URL: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/genes/hg19.knownGene.gtf.gz...
Download: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 17.1M/17.1M [00:03<00:00, 4.53MB/s]
Annotation download successful
sgd@localhost ~/test
$ gimme maelstrom hg19.blood.most_variable.1k.txt ~/reference/genome/hg19/hg19/hg19.fa maelstrom.blood.1k.out -N 60
2021-06-11 15:23:06,610 - INFO - Starting maelstrom
2021-06-11 15:23:06,617 - INFO - Input is not mean-centered, setting the mean of all rows to 0.
2021-06-11 15:23:06,617 - INFO - Use --nocenter if you know what you're doing and want to change this behavior.
2021-06-11 15:23:06,617 - INFO - Note that if you use count data (ChIP-seq, ATAC-seq) we recommend to first transform your data, for instance using log2(), and to normalize between samples. To create a table suitable for maelstrom you can use the coverage_table script included with GimmeMotifs.
2021-06-11 15:23:06,632 - INFO - Counts, using: maelstrom.blood.1k.out/motif.count.txt.gz
2021-06-11 15:23:06,632 - INFO - motif scanning (scores)
2021-06-11 15:23:06,632 - INFO - reading table
2021-06-11 15:23:12,367 - INFO - creating score table (z-score, GC%)
2021-06-11 15:45:27,389 - INFO - done
2021-06-11 15:45:28,815 - INFO - creating dataframe
2021-06-11 15:45:38,405 - INFO - Selecting non-redundant motifs
2021-06-11 15:45:43,852 - INFO - Selected 657 motifs
2021-06-11 15:45:43,852 - INFO - Motifs: maelstrom.blood.1k.out/nonredundant.motifs.pfm
2021-06-11 15:45:43,852 - INFO - Factor mappings: maelstrom.blood.1k.out/nonredundant.motifs.motif2factors.txt
2021-06-11 15:45:44,024 - INFO - Fitting BayesianRidge
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00,  2.46it/s]
2021-06-11 15:45:46,488 - INFO - Done
2021-06-11 15:45:46,757 - INFO - Fitting XGBoostRegression
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:24<00:00,  4.15s/it]
2021-06-11 15:46:11,661 - INFO - Done
2021-06-11 15:46:11,844 - INFO - Fitting MultiTaskLasso
2021-06-11 15:46:14,602 - INFO - Done
2021-06-11 15:46:14,878 - INFO - Fitting SVR
2021-06-11 15:46:22,090 - INFO - Done
2021-06-11 15:46:22,111 - INFO - Rank aggregation
2021-06-11 15:46:23,020 - INFO - Correlation
2021-06-11 15:46:24,321 - INFO - html report
Traceback (most recent call last):
  File "/opt/sysoft/Python-3.7.9/bin/gimme", line 11, in <module>
    cli(sys.argv[1:])
  File "/opt/sysoft/Python-3.7.9/lib/python3.7/site-packages/gimmemotifs/cli.py", line 730, in cli
    args.func(args)
  File "/opt/sysoft/Python-3.7.9/lib/python3.7/site-packages/gimmemotifs/commands/maelstrom.py", line 45, in maelstrom
    aggregation=aggregation,
  File "/opt/sysoft/Python-3.7.9/lib/python3.7/site-packages/gimmemotifs/maelstrom.py", line 546, in run_maelstrom
    maelstrom_html_report(outdir, os.path.join(outdir, "final.out.txt"), pfmfile)
  File "/opt/sysoft/Python-3.7.9/lib/python3.7/site-packages/gimmemotifs/report.py", line 868, in maelstrom_html_report
    motif_to_img_series(df.index, pfmfile=pfmfile, outdir=outdir, subdir="logos"),
  File "/opt/sysoft/Python-3.7.9/lib/python3.7/site-packages/gimmemotifs/report.py", line 837, in motif_to_img_series
    raise ValueError(f"Motif {motif} does not occur in motif database")
ValueError: Motif GM.5.0.bHLH.0126 does not occur in motif database

Best wishes

Guandong Shang
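One way to narrow down an error like this is to list which motif IDs in the results table have no matching entry in a PFM file. The following is a minimal sketch, not part of GimmeMotifs. It assumes that motif IDs appear on `>` header lines of the PFM file, and that `final.out.txt` is tab-separated with one header row and the motif ID in the first column. The helper names (`read_pfm_ids`, `read_table_ids`, `missing_motifs`) are hypothetical, and the traceback does not show which PFM file the report step actually receives, so the file to compare against is also a guess.

```python
def read_pfm_ids(pfm_path):
    """Collect motif IDs from the '>' header lines of a PFM file (assumed layout)."""
    ids = set()
    with open(pfm_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # Keep only the first whitespace-separated token after '>'.
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def read_table_ids(table_path):
    """Collect first-column values of a tab-separated table, skipping '#' lines
    and the first remaining row, which is assumed to be the header."""
    rows = []
    with open(table_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            rows.append(line.rstrip("\n").split("\t")[0])
    return rows[1:]


def missing_motifs(table_path, pfm_path):
    """Return motif IDs that are in the table but absent from the PFM file."""
    known = read_pfm_ids(pfm_path)
    return [motif for motif in read_table_ids(table_path) if motif not in known]


# Example call, using the output directory from the log above:
# print(missing_motifs("maelstrom.blood.1k.out/final.out.txt",
#                      "maelstrom.blood.1k.out/nonredundant.motifs.pfm"))
```

If the returned list is empty for the non-redundant PFM file, the report step is probably looking the motifs up in a different file.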

@simonvh
Member

simonvh commented Jun 17, 2021

Thanks for this bug report! Just to be sure, this is version 0.16? We'll check to see if we can reproduce.

@shangguandong1996
Author

Sorry, I did not post the version :)

sgd@localhost ~
$ gimme
usage: gimme [-h] <subcommand> [options]

    GimmeMotifs v0.16.0
    

positional arguments:
  {motifs,scan,maelstrom,match,logo,cluster,background,threshold,location,diff,prediction,motif2factors}
    motif2factors       Generate a motif2factors file based on orthology for
                        your species of interest.

optional arguments:
  -h, --help            show this help message and exit

    commands:
        motifs          identify enriched motifs (known and/or de novo)
        scan            scan for known motifs
        maelstrom       find differential motifs
        match           find motif matches in database
        logo            create sequence logo(s)
        cluster         cluster similar motifs
        background      create a background file
        threshold       calculate motif scan threshold
        location        motif location histograms
        diff            compare motif frequency and enrichment
                        between fasta files
        motif2factors   generate a motif database based on orthology for any
                        species

    type `gimme <command> -h` for more details
    

@fmarletaz

fmarletaz commented Jun 19, 2021

Hi -
I am encountering the same error with a custom dataset. It occurs with both types of input, peak categories and peak coverage:

2021-06-19 16:19:01,513 - INFO - Rank aggregation
2021-06-19 16:19:01,885 - INFO - Correlation
2021-06-19 16:19:02,304 - INFO - html report
Traceback (most recent call last):
  File "/home/ferdi/miniconda3/envs/gimme/bin/gimme", line 11, in <module>
    cli(sys.argv[1:])
  File "/home/ferdi/miniconda3/envs/gimme/lib/python3.9/site-packages/gimmemotifs/cli.py", line 730, in cli
    args.func(args)
  File "/home/ferdi/miniconda3/envs/gimme/lib/python3.9/site-packages/gimmemotifs/commands/maelstrom.py", line 33, in maelstrom
    run_maelstrom(
  File "/home/ferdi/miniconda3/envs/gimme/lib/python3.9/site-packages/gimmemotifs/maelstrom.py", line 546, in run_maelstrom
    maelstrom_html_report(outdir, os.path.join(outdir, "final.out.txt"), pfmfile)
  File "/home/ferdi/miniconda3/envs/gimme/lib/python3.9/site-packages/gimmemotifs/report.py", line 868, in maelstrom_html_report
    motif_to_img_series(df.index, pfmfile=pfmfile, outdir=outdir, subdir="logos"),
  File "/home/ferdi/miniconda3/envs/gimme/lib/python3.9/site-packages/gimmemotifs/report.py", line 837, in motif_to_img_series
    raise ValueError(f"Motif {motif} does not occur in motif database")
ValueError: Motif GM.5.0.C2H2_ZF.0171 does not occur in motif database

Thanks a lot!!

@simonvh
Member

simonvh commented Jun 21, 2021

Thanks for reporting this @shangguandong1996 and @fmarletaz. This should now be fixed in the develop branch. Until 0.16.1 is released, you can run the following command in your conda environment to install the fix:

pip install git+https://github.com/vanheeringen-lab/gimmemotifs.git@develop
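After installing from the develop branch, it can help to confirm which GimmeMotifs version the active environment actually picks up. The sketch below is one hedged way to do that from Python. `installed_version` is a hypothetical helper, and the version string that a develop-branch install reports is not shown in this thread, so the check only prints what it finds. The fallback to `pkg_resources` is there because `importlib.metadata` needs Python 3.8 or newer, while the first traceback comes from Python 3.7.

```python
def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        # Standard library on Python 3.8+.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python 3.7: fall back to setuptools' pkg_resources.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("gimmemotifs")
if found is None:
    print("gimmemotifs is not installed in this environment")
else:
    print(f"gimmemotifs version: {found}")
```

If this still prints 0.16.0, the `pip install` may have gone into a different environment than the one `gimme` runs from.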

@fmarletaz

fmarletaz commented Jun 24, 2021 via email

@tzhu-bio

Thanks for reporting this @shangguandong1996 and @fmarletaz. This should now be fixed in the develop branch. Until 0.16.1 is released, you can run the following command in your conda environment to install the fix:

pip install git+https://github.com/vanheeringen-lab/gimmemotifs.git@develop

I am encountering the same error.
How can I install version 0.16.1? I tried with conda, but it did not work for me.

@simonvh
Member

simonvh commented Nov 5, 2021

Sorry for getting back to you so late, @tzhu-bio.
Can you open a new issue with the details? We would need to see the exact error message you get, plus any other information that may help us debug your issue.
