Screening run tutorial, only one compound predicted #1

Closed
GattiMh opened this issue Jan 27, 2023 · 5 comments

Comments

@GattiMh

GattiMh commented Jan 27, 2023

Hello,

I have installed PyRMD and run both tutorial sets, benchmark and screening. So far so good, but I noticed that the resulting file database_predictions.csv contains only 1 molecule from MCule. Which parameters should I tweak to produce more results?

Many thanks

@gioamendola
Collaborator

Hi,

The file mcule_sample.smi in the screening tutorial was included for illustrative purposes only. It's very small and only contains one known inhibitor.
Screen larger databases to obtain more potentially active compounds.

For the best parameters to tweak to obtain more results, please read the "Optimizing PyRMD Performance" section in the README of the repository.

@GattiMh
Author

GattiMh commented Jan 29, 2023

Thank you.

In the documentation, it mentions that you can provide non-ChEMBL data as well. How does it have to be formatted?
For the ChEMBL data, it seems PyRMD.py is looking for certain fields like 'Standard Value', 'Standard Type', 'Standard Relation', 'Standard Units', etc. I was wondering whether the columns have to be named the same way in the non-ChEMBL data files as well.

Many thanks

@gioamendola
Collaborator

gioamendola commented Jan 29, 2023

PyRMD should be able to read any text-based tabular data file, such as .smi SMILES files and .csv files, as long as it contains a column of SMILES strings. If PyRMD cannot read the file for some reason, make sure that it includes a column named "Title" and another named "Smiles", and that it uses commas as separators.
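
If a file does need reformatting, the fallback layout described above (a "Title" column, a "Smiles" column, comma separators) can be produced with a few lines of Python. This is only a minimal sketch: the helper name `smi_to_csv`, the assumption that each input line is a SMILES string optionally followed by a name, and the generated `mol_N` fallback titles are illustrative choices and are not part of PyRMD.

```python
import csv
import io


def smi_to_csv(smi_text: str) -> str:
    """Convert lines of the form "SMILES [name]" into comma-separated text
    with a "Title" column and a "Smiles" column.

    Hypothetical helper for illustration; not part of PyRMD.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Title", "Smiles"])
    for index, line in enumerate(smi_text.splitlines(), start=1):
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # skip blank lines
        smiles = fields[0]
        # Assumption: when a line has no name, generate one from its line number.
        title = fields[1].strip() if len(fields) > 1 else f"mol_{index}"
        writer.writerow([title, smiles])
    return out.getvalue()


example = "CCO ethanol\nc1ccccc1 benzene\nCC(=O)O\n"
print(smi_to_csv(example))
```

The `csv` module takes care of quoting, so a compound name that itself contains a comma would still end up in a single "Title" field.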

@GattiMh
Author

GattiMh commented Jan 30, 2023

Thank you.

Does it also read an activity column for the non-ChEMBL data files, or just SMILES strings?

@gioamendola
Collaborator

If you tell PyRMD in the configuration file to use non-ChEMBL data, it will look for only two things in the tabular file: a column of SMILES strings and a column with their associated names.

If you want to use activities from a non-ChEMBL file, you'll have to format it manually like a ChEMBL file. Check the SI of the PyRMD manuscript to see exactly what PyRMD looks for in a ChEMBL file.
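
Before pointing PyRMD at a hand-formatted activity file, it may help to check that the header at least contains the column names quoted earlier in this thread. The sketch below does only that. The list is limited to the four names mentioned here and is almost certainly incomplete (the SI has the full set), and the `missing_columns` helper and its `delimiter` parameter are hypothetical, not part of PyRMD.

```python
import csv
import io

# Only the column names quoted in this thread. Assumption: the complete list
# PyRMD expects is longer; see the SI of the PyRMD manuscript.
NAMED_IN_THREAD = [
    "Standard Value",
    "Standard Type",
    "Standard Relation",
    "Standard Units",
]


def missing_columns(csv_text, required=NAMED_IN_THREAD, delimiter=","):
    """Return the required column names that are absent from the header row.

    Hypothetical helper for illustration; not part of PyRMD.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


sample = "Smiles,Standard Type,Standard Relation,Standard Value\nCCO,IC50,=,120\n"
print(missing_columns(sample))
```

An empty list means only that the named columns exist; it says nothing about whether the values in them are in the form PyRMD expects.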
