different result about dataset query #10

Laser-Cho · 2020-06-15T15:17:07Z

I tried to reproduce the dataset from the same database (ChEMBL22) that the paper says was used, following the code you provided, but the results are different.

I ran the following query:

SELECT DISTINCT canonical_smiles FROM compound_structures WHERE molregno IN ( SELECT DISTINCT molregno FROM activities WHERE standard_type IN ("Kd", "Ki", "Kb", "IC50", "EC50") AND standard_units = "nM" );
The query returns 802320 rows.

The author said: "dataset of 677,044 SMILES strings with annotated nanomolar activities (Kd/i/B, IC/EC50) from ChEMBL22"

So I used ChEMBL22, with
[standard_units = "nM"] for "nanomolar",
and [standard_type IN ("Kd", "Ki", "Kb", "IC50", "EC50")] for "activities (Kd/i/B, IC/EC50)".
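
To make it easier to see how each filter changes the count, the same query can be wrapped in a small Python function. This is only a minimal sketch: the table and column names come from the query above, while `count_distinct_smiles`, `build_toy_db`, and the toy rows are hypothetical and exist only so the example runs without a ChEMBL download. It does not claim to be the query used in the paper.

```python
import sqlite3


def count_distinct_smiles(conn, standard_types, standard_units="nM"):
    """Count distinct canonical SMILES with at least one matching activity.

    Same logic as the query in this issue, with the activity types and the
    unit passed as parameters (hypothetical helper, not from the repository).
    """
    placeholders = ", ".join("?" for _ in standard_types)
    sql = f"""
        SELECT COUNT(DISTINCT canonical_smiles)
        FROM compound_structures
        WHERE molregno IN (
            SELECT DISTINCT molregno
            FROM activities
            WHERE standard_type IN ({placeholders})
              AND standard_units = ?
        )
    """
    params = list(standard_types) + [standard_units]
    return conn.execute(sql, params).fetchone()[0]


def build_toy_db():
    """Build an in-memory stand-in for ChEMBL (invented rows, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compound_structures (
            molregno INTEGER PRIMARY KEY,
            canonical_smiles TEXT
        );
        CREATE TABLE activities (
            activity_id INTEGER PRIMARY KEY,
            molregno INTEGER,
            standard_type TEXT,
            standard_units TEXT
        );
        INSERT INTO compound_structures VALUES
            (1, 'CCO'), (2, 'c1ccccc1'), (3, 'CCN'), (4, 'CCC');
        INSERT INTO activities VALUES
            (1, 1, 'IC50', 'nM'),
            (2, 1, 'Ki', 'nM'),
            (3, 2, 'IC50', 'ug.mL-1'),
            (4, 3, 'Potency', 'nM'),
            (5, 4, 'EC50', 'nM');
    """)
    return conn


if __name__ == "__main__":
    conn = build_toy_db()
    types = ["Kd", "Ki", "Kb", "IC50", "EC50"]
    # Only molregno 1 and 4 have a listed activity type reported in nM.
    print(count_distinct_smiles(conn, types))
```

With a SQLite copy of ChEMBL22, the connection from `build_toy_db()` could be replaced by `sqlite3.connect(...)` on that file, and the type list or unit varied to see which combination, if any, approaches 677,044.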

What did I miss?


topazape · 2020-07-01T03:01:16Z

Hi, @Laser-Cho,

Sorry for the late reply.

You are right. I am aware that the number of molecules used in the paper does not match the number returned by the SQL query described in README.md.
However, I don't know the correct SQL query because the paper doesn't give it.
If you have any good ideas, please let me know.
