General pattern needed for compound functions #25

Closed
dosumis opened this issue Dec 12, 2016 · 7 comments

Comments

@dosumis
Contributor

dosumis commented Dec 12, 2016

We have many compound functions in GO. Sometimes this is reflected in multiple axes of classification. For example: 'ATPase activity, coupled to transmembrane movement of substances' is classified both under transmembrane transporter activity and under 'ATPase activity'. In other cases one component of the compound function is used for classification while another has a has_part relationship to the compound function.

For automated classification, using multiple has_part relationships would work well. has_part also works well for LEGO templates, as it exposes the individual components so that regulation edges can be linked directly to them. Unfortunately, has_part is useless for grouping annotations (although see this proposal: ). Also, switching entirely to has_part would break some of the existing hierarchy in places where it feels intuitively right. For example, receptor tyrosine kinases are classified as kinases as well as receptors.

Is there a logical way we can get around this conundrum?

Would this GCI be crazy?

'molecular function that has_component some X' SubClassOf X

?

This could be added programmatically for MFs that are components of other MFs. has_component is potentially less damaging, as it is non-transitive.
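To make the "added programmatically" idea concrete, here is a minimal Python sketch that emits one GCI string per component function. The function name `component_gcis`, the Manchester-like output format, and the two has_component edges for the ATPase example are assumptions made for illustration; they are not taken from GO.

```python
from typing import Iterable


def component_gcis(has_component_edges: Iterable[tuple[str, str]]) -> list[str]:
    """Return one GCI string for every distinct component function X."""
    # Only the component side matters: the GCI quantifies over any MF that has X.
    components = sorted({component for _, component in has_component_edges})
    return [
        f"(molecular_function and (has_component some '{x}')) SubClassOf '{x}'"
        for x in components
    ]


# Hypothetical has_component edges (compound function, component function).
edges = [
    ("ATPase activity, coupled to transmembrane movement of substances",
     "ATPase activity"),
    ("ATPase activity, coupled to transmembrane movement of substances",
     "transmembrane transporter activity"),
]

gcis = component_gcis(edges)
for gci in gcis:
    print(gci)
```

Each emitted axiom places any function with that component under the component itself, which is the behaviour the GCI above asks about.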

CC @cmungall

@cmungall
Member

cmungall commented Dec 14, 2016 via email

@dosumis
Contributor Author

dosumis commented Dec 14, 2016

Not too surprised. One big negative is that it screws with our ability to add disjoints.

@dosumis
Contributor Author

dosumis commented May 11, 2017

A new general pattern suggestion

(The proposal at the top of this ticket should be ignored. It's dumb.)

Background:

MF design patterns need to cope with the compound nature of many molecular functions while
(a) keeping classification that is intuitive to biologists;
(b) supporting the curation of LEGO (GO-CAM) models with unbroken chains of causal relations;
(c) being easy for LEGO (GO-CAM) curators to use;
(d) keeping LEGO (GO-CAM) models as simple as possible: curators should be able to choose whether or not to include subfunctions with as little loss of information as possible.

Earlier attempts at defining compound functions used has_part (or some subProperty of it) for all components of a compound function, including the effector function. This approach is particularly bad at supporting unbroken causal chains (see this comment for an illustration: #31 (comment)). It also suffered from somewhat unintuitive classification. Biologists typically expect classification under an effector function: an RTK is_a kinase, PKA is_a kinase (not a transducer with parts cAMP sensor with kinase activity), an ATPase-coupled K+ transporter is_a K+ transporter, and a transcription factor is_a transcription regulator.

Proposal

(In reading this proposal, please bear in mind that all inverses are assumed to be automatically inferred)

  • Compound functions are classified under their effector function.
  • We use specialised subproperties of has_component to record both the presence of a component and its causal relationship to the effector function. Rather than relying on property composition, reasoning relies on the property hierarchy: these new relations live under both has_component and the relevant causal relation.
  • A property chain (or rule) infers annotation of the GP to the subfunction: enables o has_component -> enables
  • A property chain (or rule) infers the input from direct regulation: directly_regulates o enabled_by -> has_input (possibly has_substrate - TBD)
  • (TBD) Additional axioms infer the presence of subcomponent from direct regulation edges:
    • GCI: directly_regulated_by some 'protein binding activity' SubClassOf has_regulatory_component some 'protein binding activity'
    • GCI: directly_regulated_by some kinase activity SubClassOf has_regulatory_component some 'phosphorylation sensor activity'
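The two property chains in the list above can be sketched over plain triples. This is a minimal Python illustration, not reasoner output: the helper `apply_chain`, the node names, and the toy edges are assumptions (the actual model is in an image that is not reproduced here), and the enabled_by edge is written out explicitly because inverses are assumed to be inferred.

```python
Triple = tuple[str, str, str]


def apply_chain(triples: set[Triple], first: str, second: str, result: str) -> set[Triple]:
    """Infer (a, result, c) wherever a -first-> b and b -second-> c both hold."""
    inferred = set()
    for a, p, b in triples:
        if p != first:
            continue
        for b2, q, c in triples:
            if b2 == b and q == second:
                inferred.add((a, result, c))
    return inferred


# Hypothetical toy model; node names are illustrative only.
model: set[Triple] = {
    ("FOXO3", "enables", "tf_activity"),
    ("tf_activity", "has_component", "dna_binding"),
    ("tf_activity", "enabled_by", "FOXO3"),  # inverse of enables, assumed inferred
    ("kinase_activity", "directly_regulates", "tf_activity"),
}

# enables o has_component -> enables
inferred_enables = apply_chain(model, "enables", "has_component", "enables")
# directly_regulates o enabled_by -> has_input
inferred_inputs = apply_chain(model, "directly_regulates", "enabled_by", "has_input")

print(sorted(inferred_enables))
print(sorted(inferred_inputs))
```

In this toy model the first chain annotates the gene product to the subfunction, and the second gives the regulating kinase activity the regulated gene product as its input.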

With this in place, we still get continuous chains of causal relations (and resulting inference) whether the regulatory edge points to the effector function or a subfunction that is causally related to it. We can also eliminate additional regulatory edges internal to compound functions in GO-CAM.
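A small Python sketch of why the property hierarchy keeps causal chains continuous. It assumes, purely for illustration, that has_regulatory_component is a subproperty of both has_component and directly_regulated_by; the actual property sketch is an image and is not reproduced here, so the hierarchy, node names, and helper functions below are all hypothetical.

```python
# Assumed property hierarchy (hypothetical): child -> direct superproperties.
SUPERPROPERTIES = {
    "has_regulatory_component": {"has_component", "directly_regulated_by"},
}


def all_superproperties(prop):
    """Collect direct and indirect superproperties of prop."""
    found, stack = set(), [prop]
    while stack:
        for parent in SUPERPROPERTIES.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def expand(triples):
    """Add every triple entailed by the property hierarchy."""
    expanded = set(triples)
    for s, p, o in triples:
        for parent in all_superproperties(p):
            expanded.add((s, parent, o))
    return expanded


def upstream(triples, start):
    """Nodes reachable from start by following directly_regulated_by edges."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if s == node and p == "directly_regulated_by" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


# The regulatory edge points at a subfunction, not at the effector function.
model = {
    ("tf_activity", "has_regulatory_component", "phospho_sensor"),
    ("phospho_sensor", "directly_regulated_by", "kinase_activity"),
}

print(upstream(model, "tf_activity"))          # chain is broken without the hierarchy
print(upstream(expand(model), "tf_activity"))  # chain is continuous with it
```

Without the expansion the walk from the effector function finds nothing; with it, the component edge doubles as a causal edge and the kinase activity is reached through the sensor component.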

Any additional classification under Paul's new upper level classes (e.g. molecular transducer) should be inferable based on these design patterns.

Sketch:

"molecular transducer activity" EquivalentTo: molecular_function that has_regulatory_component some molecular_function ?

"phosphorylation sensing molecular transducer activity": EquivalentTo: molecular_function that has_regulatory_component some 'phosphorylation sensor activity'

(whether we want such high level classes is something we can decide separately, this just illustrates how we could get inference to them).
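The inference to these defined classes can be imitated with a few lines of Python. This is a hand-rolled sketch of existential-restriction matching, not an OWL reasoner; the class "example TF activity", its restriction, and the tiny subclass hierarchy are hypothetical.

```python
# Hypothetical asserted hierarchy: class -> parent.
SUBCLASS_OF = {
    "phosphorylation sensor activity": "molecular_function",
    "protein kinase activity": "molecular_function",
    "example TF activity": "molecular_function",
}

# Hypothetical existential restrictions: class -> [(property, filler)].
RESTRICTIONS = {
    "example TF activity": [
        ("has_regulatory_component", "phosphorylation sensor activity"),
    ],
}

# Defined classes from the sketch: name -> (property, filler) on genus molecular_function.
DEFINED = {
    "molecular transducer activity":
        ("has_regulatory_component", "molecular_function"),
    "phosphorylation sensing molecular transducer activity":
        ("has_regulatory_component", "phosphorylation sensor activity"),
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or sits below it in SUBCLASS_OF."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def inferred_superclasses(cls):
    """Defined classes whose genus and restriction are both satisfied by cls."""
    hits = []
    for name, (prop, filler) in DEFINED.items():
        matches = any(
            p == prop and is_a(f, filler) for p, f in RESTRICTIONS.get(cls, [])
        )
        if is_a(cls, "molecular_function") and matches:
            hits.append(name)
    return sorted(hits)


print(inferred_superclasses("example TF activity"))
print(inferred_superclasses("protein kinase activity"))
```

A function with a phosphorylation sensor as regulatory component lands under both defined classes, while a function with no regulatory component lands under neither.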

A sketch of potential object properties:

image

Some examples of how this could work (ontology design patterns):

image

image

image

We might want an explicit has_component_function relation, so that domain and range can be restricted to MFs.

We could flesh this out with more relations (has_energy_source, e.g. for the ATPase-coupled transporter example?)

GO-CAM examples:

TF activity - modified from one of Astrid's test models: Note that we get complete regulatory chains whether going via the DNA binding component or the effector (genus).

image

(All inverses are inferred in GO-CAM models, so we could flip has_necessary_component -> necessary_component_of in the opposite direction if that is clearer)

Inferred classic GO annotations:

  • FOXO3 enables "RNA POLII regulatory region DNA sequence-specific binding"
  • ATK1 enables protein serine/threonine kinase activity has_input(FOXO3)
  • Myc enables 'transcription factor binding' has_input(FOXO3) (We should be able to infer TF binding from simple 'protein binding' + enables TF activity)
  • fu enables 'protein binding' has_input(FOXO3)

Can we also get these ? (depends on #49)

  • FOXO3 involved_in positive regulation of transcription
  • ATK1 involved_in negative regulation of transcription
  • Myc involved_in negative regulation of transcription
  • fu involved_in negative regulation of transcription
  • FOXO3 acts_upstream_of "cyclin-dependent ser/thr kinase activity"
  • ATK1 acts_upstream_of "cyclin-dependent ser/thr kinase activity"
  • Myc acts_upstream_of "cyclin-dependent ser/thr kinase activity"
  • fu acts_upstream_of "cyclin-dependent ser/thr kinase activity"

Inferred upper level MFs:

  • FOXO3 enables some 'phosphorylation sensing molecular transducer activity'*

(* Assuming we want this class.)

Use of these patterns would, of course, be eased by the definition of GO-CAM template patterns to go along with the ontology design patterns. In this case a design pattern would drive a table something like this allowing input of DP components:

| DNA binding | transcription type | regulatory effect |
| --- | --- | --- |
| { Sequence specific DNA binding } | { transcription, DNA templated } | { directly_regulates } |
| RNA pol II regulatory region sequence specific DNA binding | transcription from RNA pol II promoter | directly_positively_regulates |

First row shows range class/relation; second row shows example fillers.
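A template like the one above could be checked mechanically: each filler must fall under the range given for its column. The Python sketch below assumes only what the table implies (each example filler sits under its range); the dictionary layout and the `validate` helper are hypothetical and do not describe any existing GO-CAM tooling.

```python
# Parent links implied by the table: each example filler sits under its range.
PARENT = {
    "RNA pol II regulatory region sequence specific DNA binding":
        "Sequence specific DNA binding",
    "transcription from RNA pol II promoter": "transcription, DNA templated",
    "directly_positively_regulates": "directly_regulates",
}

# Template columns and their range class or relation (first table row).
TEMPLATE = {
    "DNA binding": "Sequence specific DNA binding",
    "transcription type": "transcription, DNA templated",
    "regulatory effect": "directly_regulates",
}


def within_range(term, range_term):
    """True if term equals range_term or is a descendant of it in PARENT."""
    while term is not None:
        if term == range_term:
            return True
        term = PARENT.get(term)
    return False


def validate(row):
    """Return the template columns whose filler is missing or out of range."""
    return [
        column
        for column, range_term in TEMPLATE.items()
        if not within_range(row.get(column), range_term)
    ]


example_row = {
    "DNA binding": "RNA pol II regulatory region sequence specific DNA binding",
    "transcription type": "transcription from RNA pol II promoter",
    "regulatory effect": "directly_positively_regulates",
}
bad_row = dict(example_row, **{"regulatory effect": "has_input"})

print(validate(example_row))
print(validate(bad_row))
```

The example fillers from the second table row pass, while a relation outside directly_regulates is reported against its column.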

Possible extensions:

There has been some discussion of how we might represent logic gates in LEGO. This may be beyond the expressiveness of OWL, but we could, at some point in the future, add support for internal component nodes that represent logic gates and sit between the regulatory components and the effector function.

@cmungall @thomaspd @ukemi Comments please.

@cmungall
Member

cmungall commented May 12, 2017 via email

@ukemi

ukemi commented May 12, 2017

@vanaukenk

@dosumis
Contributor Author

dosumis commented May 24, 2017

In the absence of objections, I'm starting implementation.

@pgaudet
Contributor

pgaudet commented Mar 1, 2019

This issue was moved to geneontology/go-ontology#16972

@pgaudet pgaudet closed this as completed Mar 1, 2019