Define what qualifies a GPAD line as a distinct assertion individual in GO-CAM #45
Comments
The with/from field qualifies (or was originally intended to qualify) the evidence, so the individuals should be collapsed, but each ref/evidence_code/with is a separate piece of evidence. The one caveat is that I thought we had decided to express binding annotations in a formally correct way, with the bound entity being an input. Then we would spit them back out as they currently are, in the with field. @vanaukenk is that still the plan? |
@ukemi |
OK, I think we can still work with that protein binding caveat. I now see the protein binding section on the wiki. So basically, DO split out distinct with/from values into multiple assertions (translated with different @ukemi @vanaukenk Sound correct? |
Yes. That sounds correct. |
Cool, thanks! I updated the header/line thingy in my first comment to clarify how |
We will want to split out distinct With/From values into multiple assertions for GO:0005515 and also its children, as we may have annotations to terms like 'protein kinase binding' GO:0019901 that still refer to different entities in the With/From field. I'll update the protein binding section of the wiki to make it clearer which of the options we chose. |
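A minimal sketch of the splitting rule described above, not the importer's actual code. The function name `assertion_keys`, the hard-coded `PROTEIN_BINDING_TERMS` set, and the tuple layout are all hypothetical; a real importer would look up the descendants of GO:0005515 in the ontology. It also assumes that `|` separates independent With/From values, and that for other terms With/From stays with the evidence and does not affect grouping.

```python
# Hypothetical lookup: a real importer would query the ontology for all
# descendants of GO:0005515 instead of hard-coding them.
PROTEIN_BINDING_TERMS = {
    "GO:0005515",  # protein binding
    "GO:0019901",  # protein kinase binding (child term named in this thread)
}


def assertion_keys(gp, term, with_from):
    """Return one grouping key per assertion individual for a GPAD line."""
    if term in PROTEIN_BINDING_TERMS and with_from:
        # Assumption: '|' separates independent With/From values, and each
        # distinct value becomes its own assertion for binding terms.
        return [(gp, term, value) for value in with_from.split("|")]
    # Assumption: for all other terms With/From is kept on the evidence
    # and plays no part in the grouping key.
    return [(gp, term, None)]
```

Under these assumptions, a protein binding line with two With/From values yields two keys, while a line for any other term yields a single key regardless of its With/From content.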
I've updated the import rules section for protein binding. Please let me know if anything is unclear or doesn't look right to you: http://wiki.geneontology.org/index.php/Noctua_MOD_Imports#Protein_Binding_Annotations Thx. |
Thanks @vanaukenk! That definitely is more straightforward. I just wanted to make sure I understand the last point:
So multiple GPAD lines with same GP-term-with/from-etc (the header fields above) values won't be collapsed into the same assertion if their evidence code-references (line fields) vary? More simply, each protein binding GPAD line will have its own assertion individual in GO-CAM? |
Is this true even for annotation lines that have the same value in the 'with' field? MGI MGI:1340046 enables GO:0005515 MGI:MGI:4845793|PMID:21068328 ECO:0000353 UniProtKB:Q8K1S1 20130311 MGI |
@dustine32 |
@vanaukenk Oh ok, that's what I was thinking too. Differing references alone shouldn't require multiple assertion individuals. I also forgot that all protein binding and descendant term annotations should be using the same IPI evidence code, so that removes one variable. Thanks so much for clarifying! |
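To make the conclusion concrete, here is a rough sketch of collapsing protein binding lines that share a With/From value into one assertion individual carrying several pieces of evidence. The `GpadLine` class, its field names, and `collapse` are hypothetical and only illustrate the idea. The first line is the MGI example quoted above; the second is a made-up line with a placeholder reference, added only to show the collapse.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GpadLine:
    """Hypothetical subset of GPAD columns relevant to grouping."""
    gp: str
    relation: str
    term: str
    references: str
    evidence_code: str
    with_from: str


def collapse(lines):
    """Group lines by header fields; collect the rest as evidence."""
    assertions = defaultdict(list)
    for line in lines:
        # Header: fields that identify the assertion individual.
        key = (line.gp, line.relation, line.term, line.with_from)
        # Line: fields that vary per piece of evidence.
        assertions[key].append((line.references, line.evidence_code))
    return dict(assertions)


lines = [
    # Example line quoted in this thread.
    GpadLine("MGI:MGI:1340046", "enables", "GO:0005515",
             "MGI:MGI:4845793|PMID:21068328", "ECO:0000353",
             "UniProtKB:Q8K1S1"),
    # Made-up line: same header fields, placeholder reference.
    GpadLine("MGI:MGI:1340046", "enables", "GO:0005515",
             "PMID:0000000", "ECO:0000353", "UniProtKB:Q8K1S1"),
]

assertions = collapse(lines)
```

Here both lines land under the same key, so `assertions` holds one entry with two evidence tuples. Changing the second line's With/From value would produce a second entry instead.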
@vanaukenk I think those two examples are now collapsing correctly. Here they are on my dev server:
|
@dustine32 - the two CC examples above are indeed now collapsing correctly. Thanks! |
@vanaukenk have you set up a formal testing document? If not, I will have a shot at it. |
Yes, I started with this spreadsheet here: https://docs.google.com/spreadsheets/d/1XFuD6LOyFKXNk94jIK8zv1TrESfwCJo-RnrXQ3tzmJg/edit |
@vanaukenk @ukemi The latest iterations of the WB and MGI models are now up on noctua-dev, so this can be tested there. This won't have the fix for the comma-separated with/from snafu that @ukemi pointed out here, but I've since fixed it on my USC server. Here are some stats from the import attached to the PR. |
So far we've been "collapsing/consolidating" multiple GPAD lines, based on certain criteria (e.g. GP + term + extensions are the same), into distinct assertion individuals containing multiple pieces of evidence. I'd like to get clarification and document these criteria here first, then we can move them to the wiki page.
In my head, this is basically a header vs line situation so I'll present it like so:
Header:
Line:
@vanaukenk @ukemi @thomaspd This mainly came about recently when trying to figure out how to group lines by the with/from field, hence the question mark. Do multiple GPAD lines that differ only in with/from values represent the same assertion individual in GO-CAM or multiple?