EPC Corrections #86
Many of our edges that flow through Automat to Strider and then to Aragorn ultimately end up looking like this:
{
"attributes": [
{
"attribute_source": null,
"attribute_type_id": "biolink:original_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:drugcentral",
"value_type_id": null,
"value_url": null
},
{
"attribute_source": null,
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:aragorn",
"value_type_id": null,
"value_url": null
},
{
"attribute_source": null,
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:automat-robokop",
"value_type_id": null,
"value_url": null
}
]
}
Here is an attribute from BTE:
{
"attribute_source": null,
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": ["infores:translator-biothings-explorer"],
"value_type_id": "biolink:InformationResource",
"value_url": null
}
Here is one from COHD:
{
"attribute_source": "infores:cohd",
"attribute_type_id": "biolink:original_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:cohd",
"value_type_id": "biolink:InformationResource",
"value_url": "http://cohd.io/api/query"
}
I believe that we should be
|
What should |
Technically only
I think for most things where the value is an infores it is typically the same infores. At least that's what COHD does. In reality I think the intention is to create a linked list of who called whom: "I heard it from here". Under this logic, COHD is doing the right thing as an original knowledge source. It's the aggregators that I think need to change things. Modifying the edge from earlier, I think it should be
This maintains the call stack. Aragorn called Robokop, who had info from Drug Central. The order didn't have meaning, which is good because things were out of order anyway. I have never actually seen this in the wild, but it feels right. edit: fixed link. |
Hi guys. The documentation describing the TRAPI Standard for Representing Source Retrieval Provenance should answer your questions. Worth a read top to bottom, but example 1B in the Data Examples section is most pertinent. And below this example, you'll see the following relevant comment: "Note that the attribute_source fields indicate the Information Resource that made the key-value assertion about source provenance that is carried in a given Attribute object (here, an assertion that a particular resource was an original or aggregator source of the knowledge expressed in the Edge)." So, bringing this back to your example, I understand the retrieval path for the Edge to be DrugCentral --> AutomatRobokop --> ARAGORN. The TRAPI message for this edge will include a separate Attribute object for each of these three Information Resources.
This is how I see things working. Does it make sense? I can easily add an updated version of the data example below (once I understand what the policy is on who adds the Attributes declaring translator aggregator sources) |
Thanks @mbrush, this document is super helpful! I think I had my linked list linked backwards. Translating the prose above into JSON:
{
"attributes": [
{
"attribute_source": "infores:automat-robokop",
"attribute_type_id": "biolink:original_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:drugcentral",
"value_type_id": "biolink:InformationResource",
"value_url": null
},
{
"attribute_source": "infores:aragorn",
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:aragorn",
"value_type_id": "biolink:InformationResource",
"value_url": null
},
{
"attribute_source": "infores:aragorn",
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:automat-robokop",
"value_type_id": "biolink:InformationResource",
"value_url": null
}
]
}
This is true because, you are correct, Aragorn adds its own EPC to the message, and the user called Aragorn. This leads me to a follow-up question. How should automat-robokop have responded to Aragorn? I think it might be like this:
{
"attributes": [
{
"attribute_source": "infores:automat-robokop",
"attribute_type_id": "biolink:original_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:drugcentral",
"value_type_id": "biolink:InformationResource",
"value_url": null
},
{
"attribute_source": "infores:automat-robokop",
"attribute_type_id": "biolink:aggregator_knowledge_source",
"attributes": null,
"description": null,
"original_attribute_name": null,
"value": "infores:automat-robokop",
"value_type_id": "biolink:InformationResource",
"value_url": null
}
]
}
Is it then Aragorn's job to change the second item to insert itself as the Thanks for the help! edit: spelling |
Ideally Aragorn would not have to change anything in the retrieval provenance when it gets data from Automat. It should just add one more attribute adding itself to this chain. Since Automat is the one that adds an attribute indicating itself as an aggregator source before sending messages to other systems, it should de facto be the 'source' for this Attribute. There should be no need to change this when the message gets into Aragorn. Re:
. . . this is not currently correct (although it would be useful if it were because it lets you order the retrieval path). Generally, If the attribute holds a publication supporting an Edge, If the attribute holds a confidence score for an Edge, In the case of source retrieval provenance - when an attribute holds an Info Resource from which the knowledge expressed in an edge was retrieved at some point, That said, if we wanted to make the attribute_source field more useful for allowing us to order the retrieval path, I would be fine with defining conventions that make this possible. But I don't think we want a scenario where requesting systems have to overwrite values of data passed to them. @cbizon curious about your take on all this? |
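To make the "add one more attribute, don't overwrite anything" idea concrete, here is a rough Python sketch. It is only an illustration under assumptions: the helper name `add_aggregator_attribute` is hypothetical, the dicts mirror the attribute JSON quoted earlier in this thread (null-valued fields are left out for brevity), and none of this is taken from the actual Aragorn or Automat code.

```python
import copy

AGGREGATOR = "biolink:aggregator_knowledge_source"


def add_aggregator_attribute(edge, infores_id):
    """Return a copy of `edge` with one aggregator attribute appended.

    Hypothetical helper: attributes already on the edge are never modified,
    and the new attribute names `infores_id` as both source and value.
    """
    new_edge = copy.deepcopy(edge)
    attributes = new_edge.get("attributes") or []
    # Avoid adding the same self-declaration twice.
    already_present = any(
        attr.get("attribute_type_id") == AGGREGATOR
        and attr.get("value") == infores_id
        for attr in attributes
    )
    if not already_present:
        attributes.append(
            {
                "attribute_source": infores_id,
                "attribute_type_id": AGGREGATOR,
                "value": infores_id,
                "value_type_id": "biolink:InformationResource",
            }
        )
    new_edge["attributes"] = attributes
    return new_edge


# What automat-robokop might send (fields trimmed for brevity).
automat_edge = {
    "attributes": [
        {
            "attribute_source": "infores:automat-robokop",
            "attribute_type_id": "biolink:original_knowledge_source",
            "value": "infores:drugcentral",
            "value_type_id": "biolink:InformationResource",
        },
        {
            "attribute_source": "infores:automat-robokop",
            "attribute_type_id": AGGREGATOR,
            "value": "infores:automat-robokop",
            "value_type_id": "biolink:InformationResource",
        },
    ]
}

# Aragorn appends its own declaration and leaves Automat's attributes alone.
aragorn_edge = add_aggregator_attribute(automat_edge, "infores:aragorn")
```

Working on a deep copy keeps the message received from the upstream system intact, which matches the goal of not having requesting systems overwrite data passed to them.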
My current understanding:
Automat should have responded to aragorn with what @kennethmorton said, because it added both of these attributes:
Then, when aragorn passed this information back to the ARS it just adds to the chain (not changing any entries) an attribute like
Saying "Aragorn is telling you that Aragorn passed this edge back" For cases where value is a translator tool, you would expect But for cases where value is not a translator tool, then some translator tool (the source) will be making a statement about where the data came from (the value) |
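As I read the convention described above, it could be checked with something like the sketch below. The assumptions are mine: the function name `follows_source_convention` is hypothetical, the set of "translator tools" is a made-up two-entry list for this example, and the rule encoded (a tool declares itself as an aggregator, while any other value must be asserted by a tool) is my interpretation of the thread rather than anything the TRAPI documentation states in this form.

```python
# Assumption: a tiny, hand-written set of Translator tools for illustration.
TRANSLATOR_TOOLS = {"infores:aragorn", "infores:automat-robokop"}


def follows_source_convention(attribute, translator_tools=TRANSLATOR_TOOLS):
    """Check one provenance attribute against the convention sketched above.

    - value is a translator tool: the tool should be declaring itself,
      so attribute_source is expected to equal value.
    - value is anything else: some translator tool should be the source.
    """
    source = attribute.get("attribute_source")
    value = attribute.get("value")
    # Some systems send a list here; only plain strings are treated as tools.
    if isinstance(value, str) and value in translator_tools:
        return source == value
    return source in translator_tools


# Automat asserting that the knowledge came from DrugCentral.
from_drugcentral = {
    "attribute_source": "infores:automat-robokop",
    "attribute_type_id": "biolink:original_knowledge_source",
    "value": "infores:drugcentral",
}
# Aragorn declaring itself as an aggregator.
aragorn_self = {
    "attribute_source": "infores:aragorn",
    "attribute_type_id": "biolink:aggregator_knowledge_source",
    "value": "infores:aragorn",
}
# Aragorn asserting that Automat was an aggregator (the overwrite case).
aragorn_about_automat = {
    "attribute_source": "infores:aragorn",
    "attribute_type_id": "biolink:aggregator_knowledge_source",
    "value": "infores:automat-robokop",
}

results = [
    follows_source_convention(a)
    for a in (from_drugcentral, aragorn_self, aragorn_about_automat)
]
```

Under this reading, the third attribute is the one that does not fit, since Automat rather than Aragorn added the statement that Automat was an aggregator.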
@cbizon After Matt's explanation, this is my understanding as well. It just leaves a little to be desired. Since the EPC group is exploring future modifications, we should take this approach for now and await further direction. |
This sounds like the right approach to me for now. I agree that the current model is not ideal . . . esp if we want to be able to assemble an ordered retrieval chain. We were hamstrung by the need to stuff this into one level of attribute objects in the TRAPI schema. But there is renewed interest in supporting more expressivity when it comes to source retrieval chains – so the modeling may get refactored in the near future. re:
I'll reiterate what I think you are saying here with an example. Consider a scenario where the Automat KP pulls knowledge from DGIdb and codifies it as an Edge in their graph, then sends it on to Aragorn. But DGIdb, as an aggregator, pulled this knowledge from ChEMBL. In this scenario, we would have the following Attribute objects:
Again, not ideal, but where we are at now. |
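Since the current model gives a flat list of attributes with no reliable ordering, a consumer can at best recover which resources were original sources and which were aggregators. A minimal Python sketch of that, assuming the attribute layout shown in the JSON earlier in this thread (the function name `summarize_sources` is hypothetical, and accepting a list-valued `value` is only there because one of the quoted examples sends one):

```python
ORIGINAL = "biolink:original_knowledge_source"
AGGREGATOR = "biolink:aggregator_knowledge_source"


def summarize_sources(attributes):
    """Return (original_sources, aggregator_sources) as unordered sets."""
    original, aggregators = set(), set()
    for attr in attributes or []:
        value = attr.get("value")
        # Accept either a single infores string or a list of them.
        values = value if isinstance(value, list) else [value]
        values = [v for v in values if v]
        type_id = attr.get("attribute_type_id")
        if type_id == ORIGINAL:
            original.update(values)
        elif type_id == AGGREGATOR:
            aggregators.update(values)
    return original, aggregators


# Attributes for the DrugCentral --> Automat --> Aragorn edge (fields trimmed).
edge_attributes = [
    {"attribute_type_id": ORIGINAL, "value": "infores:drugcentral"},
    {"attribute_type_id": AGGREGATOR, "value": "infores:aragorn"},
    {"attribute_type_id": AGGREGATOR, "value": "infores:automat-robokop"},
]

original_sources, aggregator_sources = summarize_sources(edge_attributes)
```

Returning sets rather than lists is deliberate: it avoids implying an order of retrieval that the attributes themselves do not carry.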
Need to ensure this is working as expected.