Is it possible to use different motif repositories? #111

Closed
josruirod opened this issue Mar 12, 2019 · 18 comments
@josruirod

josruirod commented Mar 12, 2019

Hi,
Thanks for the work. This software seems really useful indeed and I would like to test it fully.
I see your jaspar_vertebrates motifs are included in the rgtdata/motifs folder. My question is: is it possible to add other motif repositories, for example plants, insects or nematodes from JASPAR? I see we could edit the dataconfig and provide more repositories, but I'm unsure how to download them or where to place them.
I can download the individual PFMs from JASPAR (http://jaspar.genereg.net/downloads/), but I'm unsure how to add them to the folder. The downloaded files are in .jaspar format, whereas the files in the folder are .pwm. Should these be converted with the MEME Suite or something similar? (http://meme-suite.org/doc/motif_conversion.html)

Thanks for your support

@fabio-t fabio-t self-assigned this Mar 12, 2019
@fabio-t fabio-t added this to TODO in Motif Analysis via automation Mar 12, 2019
@fabio-t
Member

fabio-t commented Mar 12, 2019

Hi @josruirod, the short answer is that yes, it's possible to use additional repositories. In the currently released version, though, it's a bit tricky.

I could explain how to do this step by step, but we've actually already made the process a bit easier. We will release the next version of RGT in a few days, with these changes included.

So if you can wait, I'll let you know when the next release is out and how to add another motif database.

If instead you need it now, let me know and I'll give you the instructions.

@josruirod
Author

Hi, great news, thanks. I can wait. I'm looking forward to the new release then.

@maltesemike

Hi, any update on when this new motif repository implementation will be released? The tool is great and I am really keen to use rgt-hint differential with my own non-animal ATAC-seq data.

@fabio-t
Member

fabio-t commented Mar 21, 2019

Hi all, the new RGT 0.12.0 version is in preparation and should be complete by tomorrow.

The initial documentation for adding custom repositories has been added here: https://regulatory-genomics.org/motif-analysis/additional-motif-data/

We also have internal tools to automatically fetch JASPAR and Hocomoco files and annotations, and to convert them to our own formats. These are not yet available in a user-friendly form, but will be soon.

The instructions above should be enough to let you run motif analysis on custom repositories.

One further note: we have added the JASPAR plants database to the list of our included repositories, so you will get that "for free".

Feel free to ask any questions in the meantime.

@fabio-t
Member

fabio-t commented Mar 21, 2019

Also please note that, after the 0.12.0 release, there might be some "hiccups" since a lot of internal code has changed. We'll be attentive to bug reports and fix them ASAP.

@fabio-t fabio-t mentioned this issue Mar 21, 2019
@davemcg

davemcg commented Mar 22, 2019

If anyone is confused, it seems you HAVE to give the new motif database right after the rgt-motifanalysis matching call.

For example this works:
rgt-motifanalysis matching --motif-dbs $RGTDATA/motifs/hocomoco --organism=hg19 --input-files INPUT --output-location HINT

This didn't (0.11.8):
rgt-motifanalysis matching --organism=hg19 --input-files INPUT --output-location HINT --motif-dbs $RGTDATA/motifs/hocomoco

@fabio-t
Member

fabio-t commented Mar 22, 2019

That sounds like a bug, and one that doesn't seem to persist in the upcoming version. Example of a command I ran yesterday:

rgt-motifanalysis matching --filter "database:jaspar_plants" --input-files input/regions_K562.bed input/background.bed --filter-type exact --motif-dbs ~/rgtdata/motifs/jaspar_plants/

@davemcg

davemcg commented Mar 22, 2019

Maybe you've fixed it post 0.11.8?

[mcgaugheyd@cn3110 iPSC_RPE_ATAC_Seq]$ rgt-motifanalysis matching --organism=hg19 --motifs-db $RGTDATA/motifs/hocomoco --input-files HINT/GFP_ATAC-Seq.intersect.colors.bed
usage: rgt-motifanalysis [-h] [--version] {matching,enrichment} ...
rgt-motifanalysis: error: unrecognized arguments: --motifs-db /fdb/rgt/rgtdata/motifs/hocomoco

@fabio-t
Member

fabio-t commented Mar 22, 2019

You have a typo: the argument is called --motif-dbs (because it can take multiple dbs in one go) :)

It seems to work fine even in 0.11.8. I've just tested it on the FullSiteTest tutorial data:

rgt-motifanalysis matching --organism hg19 --input-files input/regions_K562.bed input/background.bed --motif-dbs ~/rgtdata/motifs/jaspar_plants
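
For readers puzzled by the two behaviours discussed above, here is a minimal sketch using Python's standard argparse module. It is a toy stand-in, not RGT's actual parser: the program name `toy-motifanalysis` and the database names `db_a` and `db_b` are made up for illustration, and only the flag names are taken from the commands in this thread. It shows that an option declared with `nargs="+"` accepts several values wherever it appears on the command line, and that a misspelled flag is simply not recognized.

```python
import argparse


def build_parser():
    """Toy CLI with a 'matching' subcommand (hypothetical, not RGT's real parser)."""
    parser = argparse.ArgumentParser(prog="toy-motifanalysis")
    subcommands = parser.add_subparsers(dest="command")
    matching = subcommands.add_parser("matching")
    matching.add_argument("--organism")
    # nargs="+" lets a single flag collect one or more values.
    matching.add_argument("--input-files", nargs="+", default=[])
    matching.add_argument("--motif-dbs", nargs="+", default=[])
    return parser


parser = build_parser()

# The position of an optional flag does not change the parsed result.
first = parser.parse_args(
    ["matching", "--motif-dbs", "db_a", "db_b",
     "--organism", "hg19", "--input-files", "in.bed"]
)
last = parser.parse_args(
    ["matching", "--organism", "hg19", "--input-files", "in.bed",
     "--motif-dbs", "db_a", "db_b"]
)
print(first.motif_dbs == last.motif_dbs)

# A misspelled flag is not matched to --motif-dbs.
# parse_known_args returns the leftovers instead of exiting with an error.
_, unknown = parser.parse_known_args(
    ["matching", "--organism", "hg19", "--input-files", "in.bed",
     "--motifs-db", "db_a"]
)
print(unknown)
```

With `parse_args` instead of `parse_known_args`, the last call would exit with an "unrecognized arguments" error like the one quoted earlier.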

@davemcg

davemcg commented Mar 22, 2019

Stupid computers with their literal-ness

@fabio-t
Member

fabio-t commented Mar 22, 2019

RGT 0.12.1 is out.

@fabio-t fabio-t closed this as completed Mar 22, 2019
Motif Analysis automation moved this from TODO to Done Mar 22, 2019
@josruirod
Author

Hi, I have updated RGT and am testing this new feature.
I can download individual PFMs from JASPAR (http://jaspar.genereg.net/downloads/), but these are in .jaspar format. Do you have any advice on how to convert them to .pwm? Is it just a matter of removing the header?
Thanks

@fabio-t
Member

fabio-t commented Apr 4, 2019

From that page, you should download the Single batch file (txt) JASPAR entry. For example, this is the file for all CORE non-redundant PFMs:

http://jaspar.genereg.net/download/CORE/JASPAR2018_CORE_non-redundant_pfms_jaspar.txt

Then you can use the script we provide in the rgtdata folder (you should have it after the upgrade):

~/rgtdata/motifs/createPwm.py

This is the full command that you should run (for brevity, I've renamed the file as jaspar_file.txt):

python createPwm.py -i jaspar_file.txt -f jaspar-2016 -o jaspar_all

This will create a jaspar_all directory that you can generate the logos for (see docs) and use in RGT via the --motif-dbs argument.
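
To make the conversion step less of a black box, here is a rough sketch of what reading a JASPAR batch file involves. It is not the code of createPwm.py. The function names `parse_jaspar` and `to_plain_rows` are hypothetical, the motif `MA0000.0 Toy` and its counts are made up, and the output layout (one unlabeled, space-separated row per base in A, C, G, T order) is an assumption about the target format, so check it against the .pwm files the real script produces.

```python
import re


def parse_jaspar(text):
    """Parse JASPAR-format text into {(matrix_id, name): {base: [counts]}}."""
    motifs = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line looks like ">MA0000.0<whitespace>Toy".
            parts = line[1:].split(None, 1)
            matrix_id = parts[0]
            name = parts[1] if len(parts) > 1 else ""
            current = {}
            motifs[(matrix_id, name)] = current
        elif current is not None:
            # Count line looks like "A  [ 4 19 0 ]".
            base = line[0].upper()
            current[base] = re.findall(r"\d+(?:\.\d+)?", line[1:])
    return motifs


def to_plain_rows(matrix):
    """ASSUMED layout: one unlabeled, space-separated row per base (A, C, G, T)."""
    return "\n".join(" ".join(matrix[base]) for base in "ACGT") + "\n"


# Made-up motif used only to exercise the parser.
example = """>MA0000.0\tToy
A  [     4     19      0 ]
C  [    16      0     20 ]
G  [     0      1      0 ]
T  [     0      0      0 ]
"""

for (matrix_id, name), matrix in parse_jaspar(example).items():
    # File naming follows the <id>.<name>.pwm pattern seen in this thread.
    print(f"{matrix_id}.{name}.pwm")
    print(to_plain_rows(matrix), end="")
```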

@josruirod
Author

josruirod commented Apr 5, 2019

I must be missing something: the following command fails with IOError: No such file or directory

python2 ~/rgtdata/motifs/createPwm.py -i $HOME/Downloads/JASPAR2018_CORE_non-redundant_pfms_jaspar.txt -f jaspar-2016 -o $HOME/Downloads/jaspar_all

Error:
Traceback (most recent call last):
  File "/home/i7_station/rgtdata/motifs/createPwm.py", line 59, in <module>
    with open(outputFileName, "w") as f:
IOError: [Errno 2] No such file or directory: '/home/i7_station/Downloads/jaspar_all/MA0004.1.Arnt.pwm'

Any insights? Thanks for all the help!

EDIT: It seems that, at least in my case, the output folder is not created automatically. If I create it manually, the script works as it should. Thanks

@fabio-t
Member

fabio-t commented Apr 5, 2019

Does the jaspar_all directory exist? I believe we don't create the path. I will check and, if needed, fix that by the next release.

@fabio-t
Member

fabio-t commented Apr 5, 2019

Confirmed and fixed, you can get the fixed script here:

https://raw.githubusercontent.com/CostaLab/reg-gen/develop/data/motifs/createPwm.py

and simply overwrite your copy with it, so you don't have to wait for the next release.
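
For anyone hitting the same error in their own scripts, the underlying behaviour is that Python's open() does not create missing parent directories. Below is a minimal sketch of the usual remedy. It is not the actual patch applied to createPwm.py. The helper name `write_motif` and the placeholder file content are hypothetical, and the code assumes Python 3, where the error surfaces as FileNotFoundError rather than the IOError shown in the Python 2 traceback above.

```python
import os
import tempfile


def write_motif(output_dir, file_name, content):
    """Write content to output_dir/file_name, creating output_dir if needed."""
    # exist_ok=True makes this a no-op when the directory is already there.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, file_name)
    with open(path, "w") as handle:
        handle.write(content)
    return path


with tempfile.TemporaryDirectory() as scratch:
    missing_dir = os.path.join(scratch, "jaspar_all")
    target = os.path.join(missing_dir, "MA0004.1.Arnt.pwm")

    # open() alone fails because the jaspar_all directory does not exist yet.
    try:
        open(target, "w").close()
        failed_without_dir = False
    except FileNotFoundError:
        failed_without_dir = True

    # The helper creates the directory first, so the write succeeds.
    written = write_motif(missing_dir, "MA0004.1.Arnt.pwm", "placeholder\n")
    wrote_file = os.path.isfile(written)

print(failed_without_dir, wrote_file)
```

Creating the output folder by hand before running the script, as described in the EDIT above, achieves the same effect.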

@fabio-t fabio-t reopened this Apr 5, 2019
Motif Analysis automation moved this from Done to In Progress Apr 5, 2019
@fabio-t
Member

fabio-t commented Apr 5, 2019

Re-opening, as this may be useful to other people until the changes have completely "settled in".

@josruirod
Author

Thanks, regards

@fabio-t fabio-t closed this as completed Oct 14, 2019
Motif Analysis automation moved this from In Progress to Done Oct 14, 2019