Skip to content
This repository has been archived by the owner on Nov 28, 2023. It is now read-only.

Update atom densities feature calculation #101

Merged
merged 15 commits into from
Sep 18, 2019

Conversation

CunliangGeng
Copy link
Member

@CunliangGeng CunliangGeng commented Sep 16, 2019

To solve issue #92.
The new code calculates atom densities based on elements. The default is C, N, O and S. Thus both backbone and sidechain are used for feature calculation.

NOTE: Please merge this RP #101 after merging the RP #98

@coveralls
Copy link

coveralls commented Sep 16, 2019

Pull Request Test Coverage Report for Build 570

  • 22 of 25 (88.0%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 74.874%

Changes Missing Coverage Covered Lines Changed/Added Lines %
deeprank/tools/pdb2sql.py 12 13 92.31%
deeprank/generate/GridTools.py 5 7 71.43%
Totals Coverage Status
Change from base Build 566: 0.2%
Covered Lines: 2226
Relevant Lines: 2973

💛 - Coveralls

Copy link
Member

@NicoRenaud NicoRenaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty nice, maybe we can try to use Mendeleev for all the vdw radius


# atom vdw radius
# https://en.wikipedia.org/wiki/Van_der_Waals_radius
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use something else than Wikipedia ?
The python package Mendeleev has vdw radius for all atoms I think

Copy link
Member Author

@CunliangGeng CunliangGeng Sep 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need vdw_radius for only five chemical elements, maybe too heavy to import a package to do that.
Changed the reference to a book.

@@ -802,7 +803,7 @@ def map_features(self, grid_info={},
>>> grid_info = {
>>> 'number_of_points' : [30,30,30],
>>> 'resolution' : [1.,1.,1.],
>>> 'atomic_densities' : {'CA':3.5,'N':3.5,'O':3.5,'C':3.5},
>>> 'atomic_densities' : {'C':1.7, 'N':1.55, 'O':1.52, 'S':1.8},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why half of the radius ? Did I use the diameter ? There also we could use Mendeleev ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Mendeleev has the same VDW values as what I'm using.

@@ -35,8 +35,8 @@ def __init__(self, molgrp,
number_of_points(int, optional): number of points we want in
each direction of the grid.
resolution(float, optional): distance(in Angs) between two points.
atomic_densities(dict, optional): dictionary of atom types with
their vdw radius, e.g. {'CA':1.7, 'C':1.7, 'N':1.55, 'O':1.52}
atomic_densities(dict, optional): dictionary of element types with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

atomType is the default name in the PDB file format no ? I have nothing against elementtype but I don't get why changing it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code calculates atomic densities based on atom types, e.g. CA, CB, C, N and O. It is not a problem for protein backbone which has four atom types (CA, C, N and O). But it will be a problem when considering side chain that has about 40 atom types, which means we have to provide a dictionary for all these atom types with their VDW radius as input if using the old code. Different force fields may use slightly different atom types, which make the situation worse.

To solve this problem, the new code calculates atomic densities based on chemical element types. The 20 amino acids have only five element types (C, N, O, S, H).

So we have to change the name to distinguish the difference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@@ -113,7 +113,8 @@ def _create_sql(self):
'y': 'REAL',
'z': 'REAL',
'occ': 'REAL',
'temp': 'REAL'
'temp': 'REAL',
'element': 'TEXT'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what this element is but we should also use it in pdb2sql then ... and also use pdb2sql in deeprank :)

Copy link
Member Author

@CunliangGeng CunliangGeng Sep 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The element here is chemical element.
OK, will update pdb2sql and use it in deeprank.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using pdb2sql will maybe not work out of the box as some functionalities ma be slightly different ... so be careful there

@NicoRenaud
Copy link
Member

If you fix the conflicts it's all good for me and we can merge

@CunliangGeng CunliangGeng merged commit e53f60b into development Sep 18, 2019
@CunliangGeng CunliangGeng deleted the issue92_atomdensity branch September 18, 2019 08:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants