Oracle issue - Yuchen #238
Comments
@abearab ^ you can have a look at this |
I'm having the same issue with I wonder whether something changed in sklearn's random forests and their formatting. That being said, I'm using |
The culprit for the discrepancy between lists/individual SMILES is the try-except block in L656 of the implementation of oracles. In other words, the loading of the oracle is failing silently, and thus the oracle returns the default value. So we could try to solve two problems:
I'm happy to volunteer on any of those! |
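To illustrate the silent failure described above, here is a minimal Python sketch. It is not the TDC implementation: `load_model`, `score_silent`, `score_strict`, and `DEFAULT_SCORE` are hypothetical names, and the loader simply raises the kind of `ValueError` reported in this issue. The first scorer swallows the error and returns the default, the second logs it and re-raises.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_SCORE = 0.0  # assumed default returned when loading fails


def load_model(path):
    """Hypothetical loader standing in for unpickling a saved classifier."""
    raise ValueError("node array from the pickle has an incompatible dtype")


def score_silent(smiles, path="model.pkl"):
    """Swallows every error, so a broken model looks like a score of 0.0."""
    try:
        model = load_model(path)
        return model.predict(smiles)
    except Exception:
        return DEFAULT_SCORE


def score_strict(smiles, path="model.pkl"):
    """Logs the loading failure and re-raises it so the caller sees the cause."""
    try:
        model = load_model(path)
    except ValueError:
        logger.exception("Could not load oracle model from %s", path)
        raise
    return model.predict(smiles)
```

With the strict variant the caller gets the original traceback instead of a plausible-looking 0.0, which only reveals the underlying loading problem and does not fix it.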
Hi @miguelgondu, thanks for the find! For clarity, changing the try-except block would only reveal the real error, not fix it. What version of the package are you using? Could you try 0.4.1? |
Hi @amva13, Yes! Changing the try-except block only reveals the error. Fixing it would involve checking what changed with the pkl files/their loading, I imagine. I've tried with both 0.4.1 and 0.4.6. Both have the same issue. |
OK. That confirms the error is not due to recent release changes. I'm starting to look into it now. One thing to try in the meantime: there might be something to your claim about sklearn==1.3.0 introducing a breaking change. I would try building package 0.4.1 in a virtual environment (e.g. conda). 0.4.1 does not pin versions in requirements.txt, and this might fix the behavior. |
This error is indeed caused by a mismatch between the format of the pickled object and the format expected by scikit-learn, due in part to a scikit-learn version upgrade. See reverse issue here. I'm evaluating some fixes and will push a new version of the package ASAP. EDIT: Downgrading scikit-learn fixes the dtype issue but does not solve the underlying problem. |
Hi @miguelgondu I believe I've solved it. Would you mind sharing some of the input SMILES strings which produced a 0.0 value for these oracles for you? |
Hi @miguelgondu, I just pushed the fix and will be releasing the new package now. I'll let you know when you can install it. |
Thanks! Looking forward. |
Just FYI: I'm getting a warning on
|
Got it. Thanks for pointing that out. The best solution is to re-pickle these models with a more recent scikit-learn (or invoke the models with a different method entirely to avoid the dependency issues altogether). For now the downgrade seems to work, though that particular classifier came from version 0.23.0, so not great. I'll flag this as a longer-term issue to look at. |
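One way to make this kind of version drift visible is to store the library version next to the pickled model and compare it at load time. The following is a rough sketch under stated assumptions, not what TDC does: `dump_with_version` and `load_with_version_check` are hypothetical helpers, and in practice the version strings would come from something like `sklearn.__version__`. The payload is pickled separately so the version check runs before the step that may fail.

```python
import pickle
import warnings


def dump_with_version(obj, library_version):
    """Serialize obj plus the library version that produced it (hypothetical helper)."""
    return pickle.dumps(
        {"library_version": library_version, "payload": pickle.dumps(obj)}
    )


def load_with_version_check(data, current_version):
    """Warn about a version mismatch before unpickling the payload itself."""
    bundle = pickle.loads(data)
    saved_version = bundle["library_version"]
    if saved_version != current_version:
        warnings.warn(
            f"Object was pickled with version {saved_version} but version "
            f"{current_version} is installed; loading may fail or misbehave.",
            RuntimeWarning,
        )
    # The risky step happens last, after the mismatch has been reported.
    return pickle.loads(bundle["payload"])
```

A plain dict stands in for the classifier here; the check only reports the mismatch and does not make old pickles loadable.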
@miguelgondu It's all fixed. You can install 0.4.7 for the working version. Example: |
Hi @amva13, thanks for the fix! Checking with the other oracles in that specific version, something seems to break in The rest of the oracles seem to work as expected, except for the ones in the issue I raised recently (#244). Thanks again for the hard work. |
ack'd issue opened |
Describe the bug
Dear TDC Team,
I hope this message finds you well. I am writing to report some technical issues I encountered while utilizing the oracle provided by TDC. Below are the details of the problems:
Problem 1: I have encountered an error after downloading the Oracle with the name "JNK3". The error message is as follows:
ValueError: node array from the pickle has an incompatible dtype:
This problem also occurs with the Oracle named "GSK3", but only when I pass a list of SMILES strings. Passing a single SMILES string to GSK3 does not trigger the error.
Problem 2: As mentioned above, passing a single SMILES string to the Oracle "GSK3" does not raise an error. However, I have tried multiple active molecules targeting GSK3beta from ChEMBL, and the oracle's output is consistently 0. This suggests there might be an issue with the "GSK3" oracle that requires your attention.
I hope you can address these issues promptly. Please let me know if you need any further information or details from my end.
Best regards,
Yuchen
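The two symptoms reported above (an error on list input, and a constant 0 on single inputs) can be checked together with a small diagnostic. This is only a sketch: `diagnose_oracle` is a hypothetical helper that assumes the oracle is a callable accepting either one SMILES string or a list of them, and `broken_oracle` is a stand-in that mimics the reported behavior, not the TDC oracle.

```python
def diagnose_oracle(oracle, smiles_list):
    """Score inputs one by one and as a batch, recording any batch failure."""
    single_scores = [oracle(smiles) for smiles in smiles_list]
    try:
        batch_scores = list(oracle(smiles_list))
        batch_error = None
    except Exception as exc:
        batch_scores = None
        batch_error = f"{type(exc).__name__}: {exc}"
    return {
        "single_scores": single_scores,
        "batch_scores": batch_scores,
        "batch_error": batch_error,
        # Identical scores for several different molecules are suspicious.
        "constant_single_output": len(smiles_list) > 1
        and len(set(single_scores)) == 1,
    }


def broken_oracle(smiles):
    """Stand-in mimicking the report: lists raise, single inputs return 0.0."""
    if isinstance(smiles, list):
        raise ValueError("node array from the pickle has an incompatible dtype")
    return 0.0


report = diagnose_oracle(broken_oracle, ["CCO", "c1ccccc1"])
```

Here `report["batch_error"]` carries the exception text and `report["constant_single_output"]` is `True`, which matches the pattern described in Problems 1 and 2.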