Feature request: Residue groups #178

frodofine · 2018-06-28T14:46:59Z

I believe there is a need for residue groups in the Frame class to store data such as chain IDs and other related information in Biologic systems (I'm sure other applications could benefit here, but I do not know them). Such an implementation could look like:

class ResidueGroup {
    sorted_set<size_t> residue_ids_
    property_map prop_map_
public:
    set / get - property
    check if a residue is in the group
    add / residues from the group
}

Thoughts?

The text was updated successfully, but these errors were encountered:

Luthaf · 2018-07-01T12:34:50Z

I am a bit worried about proliferation of ad-hoc structures here, with both Residue and ResidueGroup being loosely integrated with the Frame (both only store index of atoms/residues in a separated frame).

What is made available by ResidueGroup that could not be done with a chain_id string property on atoms/residues? If one need the ability to get all atoms with a given chain_id, maybe extending selections to work with properties would be another solution.

If there is a real need for ResidueGroup as a separated class, maybe we can re-organize the code to make the whole ResiudeGroup -> Residue -> Atom clearer.

On this matter, see https://khinsen.wordpress.com/2013/04/10/lessons-from-sixteen-years-of-molecular-simulation-in-python/, at the end he says he would prefer having only Atom and Group instead of the whole Atoms, Groups, Molecules, Complexes, AminoAcidResidue, PeptideChain, and Protein. He does not go on to say why, but I would guess less types of groups make the code more general and easier to understand.

frodofine · 2018-07-01T16:08:53Z

I really like the selection based idea, although I would have no idea how to implement it...

Luthaf · 2018-07-02T12:26:26Z

Just to be sure, you mean that a Selection based solution would fulfil your needs here?

On the implementation side, the selection engine is basically a simple interpreter for a small language.

The main matching happens in one of the Selector subclass

chemfiles/include/chemfiles/selections/expr.hpp

Lines 20 to 30 in 26ea194

    
           /// Abstract base class for selectors in the selection AST 
        
           class Selector { 
        
           public: 
        
               /// Pretty-printing of this selector. The output should use a shift 
        
               /// of `delta` spaces in case of multilines output. 
        
               virtual std::string print(unsigned delta = 0) const = 0; 
        
               /// Check if the `match` is valid in the given `frame`. 
        
               virtual bool is_match(const Frame& frame, const Match& match) const = 0; 
        
               /// Optimize the AST corresponding to this Selector. Currently, this only 
        
               /// perform constant propgations in mathematical expressions. 
        
               virtual void optimize() {}

The main function here is is_match, that check if a given Match is valid for the current selection in a specific frame.

Creating the Selector from the selection string is the job of the parser (src/selections/parser.cpp), which is a recursive descent parser. To add support for properties in selections, one would need to add a new case in Parser::selector().

Adding support for generic property would be hard, so we could start with only string properties.

On the Selector side, this would mean adding a new StringSelector implementation that check if the property exist, if it is a string and if it matches the required value.

On the parser side, the code would depend on the desired syntax. We could add support for chain_id specifically by adding a new chain_id selector (chain_id == A, chain_id(#2) != A) which we could just add in the list of string selectors:

chemfiles/src/selections/parser.cpp

Lines 19 to 23 in 26ea194

    
           static std::map<std::string, string_prop_creator_t> STRING_PROPERTIES = { 
        
               {"name", [](std::string value, bool equals, Variable var){ return Ast(new Name(value, equals, var));}}, 
        
               {"type", [](std::string value, bool equals, Variable var){ return Ast(new Type(value, equals, var));}}, 
        
               {"resname", [](std::string value, bool equals, Variable var){ return Ast(new Resname(value, equals, var));}}, 
        
           };

A more generic solution for arbitrary property names and types could be nice, but is much harder to achieve. Syntax wise, I would go for something like property("long name", #1) == 45.0 or property("name", #3) == "foo" or property("bool"). An alternative syntax like [long name](#1) == 45.0 or [name](#3) == "foo" or [bool] could work too.

frodofine · 2018-07-02T16:01:07Z

Yes, a selection mechanism would satisfy my needs (in my own project I build the residue group myself anyway, I was really just suggesting to move that upstream).

I could add the selector part my self as it looks easy enough, but I need to finish up some other projects before I can look at this more in detail.

frodofine · 2018-11-11T17:33:02Z

This issue is closed, with the above PRs.

frodofine self-assigned this Jun 28, 2018

Luthaf mentioned this issue Jul 25, 2018

Improved PDB Support #185

Merged

Luthaf mentioned this issue Aug 29, 2018

Use atomic properties in selections #189

Merged

4 tasks

frodofine mentioned this issue Sep 15, 2018

Iterate over properties #191

Merged

frodofine closed this as completed Nov 11, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature request: Residue groups #178

Feature request: Residue groups #178

frodofine commented Jun 28, 2018

Luthaf commented Jul 1, 2018

frodofine commented Jul 1, 2018

Luthaf commented Jul 2, 2018

frodofine commented Jul 2, 2018

frodofine commented Nov 11, 2018

Feature request: Residue groups #178

Feature request: Residue groups #178

Comments

frodofine commented Jun 28, 2018

Luthaf commented Jul 1, 2018

frodofine commented Jul 1, 2018

Luthaf commented Jul 2, 2018

frodofine commented Jul 2, 2018

frodofine commented Nov 11, 2018