Add BCUT 2D descriptors #2957

Merged on Feb 23, 2020 (13 commits)

bp-kelley
Contributor

Preliminary BCUT descriptor implementation.

@greglandrum I'm not quite sure how to test this: there are a few different implementation styles, and we won't have the same diagonal elements. Any ideas?

I'm leaning towards:

  1. atomic mass
  2. Crippen LogP
  3. Gasteiger Charge

And maybe using a lookup table for polarizability.

One question: should we use absolute or carbon-relative values?
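
For readers following along: a BCUT value is the highest or lowest eigenvalue of a Burden-style matrix with an atomic property (such as the candidates listed above) on the diagonal. The following is only a rough sketch of that idea, not the code in this PR. It uses a plain Jacobi iteration instead of Eigen, and `SketchBond`, `buildBurdenSketch`, `extremeEigenvalues`, the bond weights, and the 0.001 filler for non-bonded pairs are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical bond record: two atom indices and an off-diagonal weight.
struct SketchBond {
  std::size_t a;
  std::size_t b;
  double weight;
};

// Symmetric Burden-style matrix: atom property on the diagonal, bond weight
// for bonded pairs, a small constant (assumed value) everywhere else.
Matrix buildBurdenSketch(const std::vector<double> &atomProps,
                         const std::vector<SketchBond> &bonds,
                         double nonBonded = 0.001) {
  const std::size_t n = atomProps.size();
  Matrix m(n, std::vector<double>(n, nonBonded));
  for (std::size_t i = 0; i < n; ++i) m[i][i] = atomProps[i];
  for (const auto &bond : bonds) {
    assert(bond.a < n && bond.b < n && bond.a != bond.b);
    m[bond.a][bond.b] = m[bond.b][bond.a] = bond.weight;
  }
  return m;
}

// Cyclic Jacobi rotations on a real symmetric matrix.
// Returns {highest eigenvalue, lowest eigenvalue}.
std::pair<double, double> extremeEigenvalues(Matrix m, int maxSweeps = 100) {
  const std::size_t n = m.size();
  assert(n > 0);
  for (int sweep = 0; sweep < maxSweeps; ++sweep) {
    // Stop once the off-diagonal part is numerically zero.
    double off = 0.0;
    for (std::size_t p = 0; p < n; ++p)
      for (std::size_t q = p + 1; q < n; ++q) off += m[p][q] * m[p][q];
    if (off < 1e-24) break;
    for (std::size_t p = 0; p < n; ++p) {
      for (std::size_t q = p + 1; q < n; ++q) {
        if (m[p][q] == 0.0) continue;
        // Rotation chosen so that the (p, q) entry becomes zero.
        const double theta = (m[q][q] - m[p][p]) / (2.0 * m[p][q]);
        const double t = (theta >= 0.0 ? 1.0 : -1.0) /
                         (std::fabs(theta) + std::sqrt(theta * theta + 1.0));
        const double c = 1.0 / std::sqrt(t * t + 1.0);
        const double s = t * c;
        for (std::size_t k = 0; k < n; ++k) {  // update columns p and q
          const double kp = m[k][p], kq = m[k][q];
          m[k][p] = c * kp - s * kq;
          m[k][q] = s * kp + c * kq;
        }
        for (std::size_t k = 0; k < n; ++k) {  // update rows p and q
          const double pk = m[p][k], qk = m[q][k];
          m[p][k] = c * pk - s * qk;
          m[q][k] = s * pk + c * qk;
        }
      }
    }
  }
  // After convergence the eigenvalues sit on the diagonal.
  double hi = m[0][0], lo = m[0][0];
  for (std::size_t i = 1; i < n; ++i) {
    hi = std::max(hi, m[i][i]);
    lo = std::min(lo, m[i][i]);
  }
  return {hi, lo};
}
```

For a two-atom toy input with both diagonal entries set to 2.0 and one bond of weight 1.0, `extremeEigenvalues(buildBurdenSketch({2.0, 2.0}, {{0, 1, 1.0}}))` gives approximately (3.0, 1.0). Swapping the diagonal values (mass, LogP contribution, charge) changes only the `atomProps` argument, which is why the choice of property can be discussed separately from the eigenvalue step.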

@bp-kelley
Contributor Author

Pat Walters suggested I just try it on his ML test sets and if it is discriminatory, we call it a day.

// copyright notice, this list of conditions and the following
// disclaimer in the documentation and/or other materials provided
// with the distribution.
// * Neither the name of Institue of Cancer Research.
Member

Maybe you don't mean "ICR" here, right?
Maybe just put the usual boilerplate after the copyright?

//  This file is part of the RDKit.
//  The contents are covered by the terms of the BSD license
//  which is included in the file license.txt, found at the root
//  of the RDKit source tree.

Contributor Author

I really don’t.

@bp-kelley
Contributor Author

@greglandrum It looks like this pulls in Eigen for the 2D descriptors which is failing a couple of builds. Any ideas?

@greglandrum
Member

Yeah, I had pointed out the new dependency as something we should think about in the review I started (but didn't finish) yesterday.
These problems arise because we don't install Eigen as part of the config for the Java or cartridge builds. We will need to do that if we decide to enable these descriptors by default.

Member

@greglandrum left a comment

a round of comments

Code/GraphMol/Descriptors/BCUT.cpp (7 review threads, 6 outdated)
Code/GraphMol/Descriptors/BCUT.h (1 review thread, outdated)
Code/GraphMol/Descriptors/testBCUT.cpp (1 review thread)
@greglandrum
Member

@bp-kelley : when you're ready for the pre-merge review on this please just remove the [WIP] from the title.

python::class_<std::pair<double, double> >("BCUTPair")
.def_readwrite("first", &std::pair<double, double>::first)
.def_readwrite("second", &std::pair<double, double>::second)
.def_readwrite("higest", &std::pair<double, double>::first)
Member

I don't think including "highest" (which is misspelled here) and "lowest" adds anything.

docString.c_str());

python::class_<std::pair<double, double> >("BCUTPair")
.def_readwrite("first", &std::pair<double, double>::first)
Member

Why def_readwrite? Does it make sense to change the values in Python once they've been calculated?

@bp-kelley changed the title from "[WIP] Add BCUT 2D descriptors" to "Add BCUT 2D descriptors" on Feb 21, 2020
@bp-kelley
Contributor Author

@greglandrum I think I'm done here

@@ -396,6 +400,54 @@ RDKit::SparseIntVect<std::uint32_t> *MorganFingerprintHelper(
}
return res;
}

std::pair<double,double> BCUT2D_list(const RDKit::ROMol &m, python::list atomprops)
Member

I believe this block also needs to be in an #ifdef RDK_HAS_EIGEN3 or it won't compile
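
As a rough illustration of the guard being asked for, the sketch below wraps an Eigen-dependent piece in `#ifdef RDK_HAS_EIGEN3` so that the file still compiles when the macro is undefined. Only the macro name comes from the comment above; `kHasEigen`, `bcutWrapperSketch`, `exportedNamesSketch`, and the exported names are hypothetical stand-ins, not the actual wrapper code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// RDK_HAS_EIGEN3 is assumed to be set by the build system when Eigen is
// available; everything inside the guard is skipped otherwise.
#ifdef RDK_HAS_EIGEN3
constexpr bool kHasEigen = true;

// Hypothetical stand-in for a wrapper whose real body would call into Eigen.
double bcutWrapperSketch(const std::vector<double> &atomProps) {
  assert(!atomProps.empty());
  return atomProps.front();
}
#else
constexpr bool kHasEigen = false;
#endif

// Illustrative registration step: the Eigen-backed entry is only exported
// when the guarded code above was compiled in.
std::vector<std::string> exportedNamesSketch() {
  std::vector<std::string> names{"MorganFingerprint"};
#ifdef RDK_HAS_EIGEN3
  names.push_back("BCUT2D");
#endif
  return names;
}
```

The point of guarding both the definition and its registration is that an unguarded reference to a guarded function would fail to compile in builds without Eigen, which matches the failure mode described earlier in this thread.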

@bp-kelley
Contributor Author

@greglandrum the build failure doesn't look real.

@greglandrum
Member

Yep; agreed. Please feel free to merge (it won’t let me merge on my phone if a test has failed).

@greglandrum added this to the 2020_03_1 milestone on Feb 23, 2020
@greglandrum merged commit 96353c3 into rdkit:master on Feb 23, 2020