-
Notifications
You must be signed in to change notification settings - Fork 27
Update atom densities feature calculation #101
Conversation
Pull Request Test Coverage Report for Build 570
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty nice, maybe we can try to use Mendeleev for all the vdw radius
deeprank/config/chemicals.py
Outdated
|
||
# atom vdw radius | ||
# https://en.wikipedia.org/wiki/Van_der_Waals_radius |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use something else than Wikipedia ?
The python package Mendeleev has vdw radius for all atoms I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need vdw_radius for only five chemical elements, maybe too heavy to import a package to do that.
Changed the reference to a book.
@@ -802,7 +803,7 @@ def map_features(self, grid_info={}, | |||
>>> grid_info = { | |||
>>> 'number_of_points' : [30,30,30], | |||
>>> 'resolution' : [1.,1.,1.], | |||
>>> 'atomic_densities' : {'CA':3.5,'N':3.5,'O':3.5,'C':3.5}, | |||
>>> 'atomic_densities' : {'C':1.7, 'N':1.55, 'O':1.52, 'S':1.8}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why half of the radius ? Did I use the diameter ? There also we could use Mendeleev ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Mendeleev has the same VDW values as what I'm using.
@@ -35,8 +35,8 @@ def __init__(self, molgrp, | |||
number_of_points(int, optional): number of points we want in | |||
each direction of the grid. | |||
resolution(float, optional): distance(in Angs) between two points. | |||
atomic_densities(dict, optional): dictionary of atom types with | |||
their vdw radius, e.g. {'CA':1.7, 'C':1.7, 'N':1.55, 'O':1.52} | |||
atomic_densities(dict, optional): dictionary of element types with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atomType is the default name in the PDB file format no ? I have nothing against elementtype but I don't get why changing it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old code calculates atomic densities based on atom types, e.g. CA, CB, C, N and O. It is not a problem for protein backbone which has four atom types (CA, C, N and O). But it will be a problem when considering side chain that has about 40 atom types, which means we have to provide a dictionary for all these atom types with their VDW radius as input if using the old code. Different force fields may use slightly different atom types, which make the situation worse.
To solve this problem, the new code calculates atomic densities based on chemical element types. The 20 amino acids have only five element types (C, N, O, S, H).
So we have to change the name to distinguish the difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
@@ -113,7 +113,8 @@ def _create_sql(self): | |||
'y': 'REAL', | |||
'z': 'REAL', | |||
'occ': 'REAL', | |||
'temp': 'REAL' | |||
'temp': 'REAL', | |||
'element': 'TEXT' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what this element is but we should also use it in pdb2sql then ... and also use pdb2sql in deeprank :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The element here is chemical element.
OK, will update pdb2sql and use it in deeprank.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using pdb2sql will maybe not work out of the box as some functionalities ma be slightly different ... so be careful there
If you fix the conflicts it's all good for me and we can merge |
To solve issue #92.
The new code calculates atom densities based on elements. The default is C, N, O and S. Thus both backbone and sidechain are used for feature calculation.
NOTE: Please merge this RP #101 after merging the RP #98