clean up extraneous edges in BioPAX import #14

goodb · 2018-09-04T16:47:06Z

examples in the pathway: http://noctua-dev.berkeleybop.org/editor/graph/gomodel:-80976963

1. Only make an inference about which has_input edge should be enabled_by when there’s no controller. For example, the reaction “phosphorylation of LRP5/6 cytoplasmic domain by CSNKI” currently has multiple enabled_by edges, and it should only have one, pointing to CSNKI (P78368).
2. There are extraneous “enabled_by (protein-containing complex)” triplets (same example as 1 above). I’m guessing these come from the reasoner. They should be removed.
3. When there’s a GO term from Reactome, you can remove the placeholder root molecular function (GO:0003674). Right now there are two function terms for the same activity in these cases (same example as 1 above).
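
The controller rule and the root-function cleanup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dict shape, the function names, the input identifiers, and the pluggable `guess_from_inputs` heuristic are all hypothetical, and the converter's real logic (a query, per the commit message) is not shown in this issue.

```python
ROOT_MF = "GO:0003674"  # placeholder root molecular function named in this issue


def enablers_for(reaction, guess_from_inputs):
    """Return the entities that should receive enabled_by edges.

    `reaction` is a plain dict with "controllers" and "inputs" lists
    (a hypothetical shape, not the converter's data model).
    `guess_from_inputs` is a caller-supplied heuristic; the issue does not
    say how an input is chosen, so it is left pluggable here.
    """
    controllers = reaction.get("controllers", [])
    if controllers:
        # An explicit controller exists: use it and never promote an input.
        return list(controllers)
    # No controller: only now fall back to inferring from has_input targets.
    return list(guess_from_inputs(reaction.get("inputs", [])))


def prune_root_function(function_types):
    """Drop the root placeholder when a more specific GO term is present."""
    specific = [t for t in function_types if t != ROOT_MF]
    return specific or list(function_types)


phosphorylation = {
    "controllers": ["P78368"],         # CSNKI, as given in the issue
    "inputs": ["input-1", "input-2"],  # hypothetical input identifiers
}


def first_input(inputs):
    """Stand-in heuristic for the demo only."""
    return inputs[:1]


print(enablers_for(phosphorylation, first_input))
print(prune_root_function([ROOT_MF, "GO:0004672"]))  # example specific GO term
```

With a controller present, the inputs are never consulted, which is what keeps the reaction down to a single enabled_by edge.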

goodb · 2018-09-05T18:42:38Z

@thomaspd The extraneous “enabled_by (protein-containing complex)” statements you see in Noctua aren't actually asserted anywhere or inferred by a reasoner. They are an artifact of how the Noctua display works and of the lack of a reference ontology for protein complexes. Noctua really needs the nodes in the models to be annotated with OWL classes, and we have no classes for most complexes.

To make this work for new complexes, the converter currently adds two rdf:types to each new complex node. It first adds the generic type 'protein-containing complex', which is not incorrect and is needed for downstream reasoning. Next it adds a kind of fake class (in that it's not represented in any ontology and not useful for inference) with a name like "WNT:FZD:p5S/T-LRP5/6:DVL:AXIN:GSK3B". The only purpose of this one at the moment is to show a label in Noctua. Noctua shows all direct types in the display, hence you see both.
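
To make the double typing concrete, here is a minimal sketch using plain tuples as triples. The node id `complex-1` and both function names are made up for illustration; this is not the converter's code or Noctua's display logic, just the shape of the problem.

```python
RDF_TYPE = "rdf:type"
GENERIC_COMPLEX = "protein-containing complex"


def type_new_complex(triples, node, display_name):
    """Mimic the behaviour described above: two types per new complex."""
    triples.append((node, RDF_TYPE, GENERIC_COMPLEX))  # real ontology class
    triples.append((node, RDF_TYPE, display_name))     # label-only "fake" class


def direct_types(triples, node):
    """Every direct type of a node; a display that lists all of them shows both."""
    return [o for s, p, o in triples if s == node and p == RDF_TYPE]


model = []
type_new_complex(model, "complex-1", "WNT:FZD:p5S/T-LRP5/6:DVL:AXIN:GSK3B")
print(direct_types(model, "complex-1"))
```

Any view that renders every direct type will show the generic class next to the name-only class, which is where the apparent extra “enabled_by (protein-containing complex)” statement comes from.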

For proteins, Noctua loads the Neo ontology, which instantiates a class for each protein it knows about and allows the display to function. If we want to use Noctua for dealing with complexes, we either need to have them represented in something like Neo before they go into the editor, or we need a way to create the needed classes on the fly or in the models that get imported. Right now any new classes (additions to the 'tbox') in new models are ignored.

goodb · 2018-09-05T22:29:00Z

Moved last point to new issue #20

goodb · 2018-09-06T21:06:27Z

@cmungall what is your take on how to represent complexes in GO-CAMs? This needs a decision so that my code, curators entering new complexes, and the UI can move forward effectively. Specifically, when a new complex comes into a model that does not have an associated class in Neo (e.g. WNT:FZD:p5S/T-LRP5/6:DVL:AXIN:GSK3B above), do we need to generate a new class? If so, how should we approach that? If not, how should we give the entity a visible name in the editor and ensure that the system knows what it is for reasoning purposes?

For the short term, I suggest the latter approach: tag the entity with a generic 'protein complex' class, do not create a new class, have the UI use the rdfs:label to accept and display an informative name, and add the parts of the complex using the has_part relationship, so that if and when there are logically defined complex classes in GO or elsewhere, a more specific classification can be inferred automatically. The only code required would be logic in the Noctua UI to show the label instead of 'protein complex' for the nodes in question.
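
A rough sketch of that proposal, again with tuples standing in for triples. The node and part identifiers and both function names are hypothetical, and the label-first display rule is only my reading of what the UI change would need to do.

```python
GENERIC_COMPLEX = "protein complex"  # generic class, as named in the proposal


def describe_complex(node, label, parts):
    """One generic type, one rdfs:label, and a has_part edge per component."""
    triples = [
        (node, "rdf:type", GENERIC_COMPLEX),
        (node, "rdfs:label", label),
    ]
    triples.extend((node, "has_part", part) for part in parts)
    return triples


def display_name(triples, node):
    """Prefer the label; fall back to the first type, then to the node id."""
    labels = [o for s, p, o in triples if s == node and p == "rdfs:label"]
    if labels:
        return labels[0]
    types = [o for s, p, o in triples if s == node and p == "rdf:type"]
    return types[0] if types else node


model = describe_complex(
    "complex-1",
    "WNT:FZD:p5S/T-LRP5/6:DVL:AXIN:GSK3B",
    ["part-1", "part-2"],  # hypothetical part nodes
)
print(display_name(model, "complex-1"))
```

No new class is created here; the name lives only on the individual, and the has_part edges are what a later, logically defined complex class could classify against.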

cmungall · 2018-09-06T21:18:20Z

s/protein complex/macromolecular complex/

I agree, but I don't know if there is a need to make an rdfs:label (and this adds complexities, e.g. what if a user wants to change this). I would just do a pure post-composition approach.

I can see a use for a generic name-this-individual function (whether for complexes or any other subgraph), perhaps using DOSDPs. But is the immediate need not somewhat subsumed by having the visuals fold/compact these? (which may not always work, but that is something to be fixed separately)

goodb · 2018-09-06T21:31:04Z

There is still a need to refer to the complex individual in the folded view. See, for example, the reaction/function 'Autoglucosylation of GYG2 complexed with GYS2-b': http://noctua-dev.berkeleybop.org/editor/graph/gomodel:-670700788

It's more informative and natural for a user to see enabled_by "GYG2:GYS2-b tetramer" than it is to see enabled_by 'protein complex'.

(But yes, this also brings up the problem of folding complexes with parts into single function nodes, which I have added to geneontology/noctua#581.)

goodb · 2018-09-06T21:34:45Z

I guess if you really want to avoid the label (which the UI already supports to some extent via geneontology/noctua#536), UI work could be moved over to the folding/expansion problem. Users could either see the generic complex node and click to expand it to see what it is made of, or there could be a way to dynamically compose a name to show based on the names of the parts.
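
The second option, composing a name from the parts, could look something like the sketch below. The function name, the ":" separator, the de-duplication, and the part names "GYG2" and "GYS2-b" (read off the complex name in the earlier example) are all assumptions, not existing Noctua behaviour.

```python
def composed_name(part_names, fallback="protein complex", separator=":"):
    """Join the distinct part names in order; use the generic name if none exist."""
    names = list(dict.fromkeys(name for name in part_names if name))
    return separator.join(names) if names else fallback


print(composed_name(["GYG2", "GYS2-b", "GYG2"]))
print(composed_name([]))
```

A name built this way loses stoichiometry (it cannot say "tetramer"), so it would be a display aid rather than a replacement for a curated label.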

goodb · 2018-09-10T18:46:15Z

Switching off the 'UI optimizations for Noctua 1.0' flag addresses: "There are extraneous “enabled_by (protein-containing complex)” triplets". There will now be only one type shown for these complexes, but it's going to be the generic but ontologically extant 'protein-containing complex'. The linguistically meaningful but logically useless fake classes are out, per the discussion above with @cmungall. Showing a name for these new unidentified complexes is now going to be a UI problem (which can be approached by inspecting their parts, which do have names).

goodb · 2018-09-10T18:50:08Z

Closing - data issues resolved, UI issues e.g. geneontology/noctua#581 (comment) remain.

goodb self-assigned this Sep 4, 2018

goodb pushed a commit that referenced this issue Sep 10, 2018

Adjusted query for inferring enabled_by per first part of #14

66f982c

goodb closed this as completed Sep 10, 2018
