We need to manage the updates of records (and Named Graphs?) #14

chin-rcip · 2019-10-31T17:11:53Z

Update of records

In v.1.5 of the Target Model, the history of a record (E73 Information Object) is only modeled by the E65 Creation event, and there is no way to document the history of the different versions of this record, which is a problem.

With CIDOC CRM

With CIDOC CRM, there is no way to render those updates, as the E11 Modification class refers only to the modification of E24 Physical Man-Made Thing.

With Prov-O

The PROV-O ontology, used to describe the named graph, can also be used to document the update of the entity. The property prov:wasRevisionOf creates a link between the originally created version and the updated version of the record.
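To make the prov:wasRevisionOf link concrete, here is a minimal sketch in plain Python that builds the triples and prints them as N-Triples. The record IRIs under `https://example.org/` are hypothetical placeholders, not identifiers from the Target Model, and the helper names are invented for illustration.

```python
# Sketch only: the record IRIs below are hypothetical placeholders.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def revision_triples(new_version, old_version):
    """Return triples stating that new_version is a revision of old_version."""
    return [
        (old_version, RDF_TYPE, PROV + "Entity"),
        (new_version, RDF_TYPE, PROV + "Entity"),
        (new_version, PROV + "wasRevisionOf", old_version),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


record_v1 = "https://example.org/record/123/v1"  # hypothetical
record_v2 = "https://example.org/record/123/v2"  # hypothetical
triples = revision_triples(record_v2, record_v1)
print(to_ntriples(triples))
```

Note that this pattern implies one entity per version of the record, which is the same multiplication of entities raised for CRM-Dig below.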

With CIDOC CRM-Dig

With CRM-Dig, if we instantiate the record and named graph as digital objects, we could add the modification event. Nonetheless, that would create two entities for the record and named graph: the original one and the modified one. I'm not sure that would be the best way to model it.

Named Graphs

The generation of Named Graphs is documented with the pattern prov:Activity -> prov:generated -> prov:Entity (Graph). We could also document the creation and modification of the whole graph with a pattern similar to the one used for the record. Do we need to document the updates of the Named Graph, though?
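As an illustration of the prov:Activity -> prov:generated -> prov:Entity pattern, the sketch below emits the corresponding N-Triples lines for one named graph. The graph and activity IRIs and the timestamp are hypothetical, and attaching prov:endedAtTime to the activity is an assumption added for the example, not something the Target Model prescribes.

```python
from datetime import datetime, timezone

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def graph_generation(graph_iri, activity_iri, ended_at):
    """Return N-Triples lines saying that activity_iri generated graph_iri."""
    stamp = ended_at.isoformat()
    return [
        f"<{activity_iri}> <{RDF_TYPE}> <{PROV}Activity> .",
        f"<{graph_iri}> <{RDF_TYPE}> <{PROV}Entity> .",
        f"<{activity_iri}> <{PROV}generated> <{graph_iri}> .",
        # Assumption: the end time of the activity is recorded as well.
        f'<{activity_iri}> <{PROV}endedAtTime> "{stamp}"^^<{XSD_DATETIME}> .',
    ]


lines = graph_generation(
    "https://example.org/graph/museum-a",         # hypothetical graph IRI
    "https://example.org/activity/generation-1",  # hypothetical activity IRI
    datetime(2019, 10, 31, 17, 11, 53, tzinfo=timezone.utc),
)
print("\n".join(lines))
```

An update of the whole graph could be described the same way with a second activity, at the cost of extra triples per update.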


stephenhart8 · 2020-01-15T21:32:44Z

When looking at the AAT data, I found that they also use PROV-O to track the creation and modification of each Concept.

They chose 2 simultaneous patterns, with PROV-O Refinements (https://www.w3.org/TR/2013/NOTE-prov-dc-20130430/#term_modified) and Dublin Core.

Looking at their approach, it seems I made a mistake in my earlier proposal. I linked the creation event to one E73 Information Object and the modification to a different E73, even though it is the same E73 (either Named Graph or Record).

Following what the Getty did with the AAT, I would propose the following:

I am not sure if the property between the prov:Modify and the E73 is prov:wasGeneratedBy. In the documentation, it seems it should be that, but it seems a bit strange to me.

KarineLeonardBrouillet · 2020-02-25T16:00:31Z

Notes on verbal meeting 2020-02-17

It might be sufficient for what we mean to track.

A distinction was made between a persistent resource and a volatile, dynamic resource (e.g. a software program with versions). You want to be able to refer to each version (an E73 Information Object, but one that is not created ex nihilo), while everything is linked back to the volatile object (the software program). Will we keep copies of chunks of metadata, or do we only want to know when the last update was made?

When wanting to have all the information about a specific named graph, do you need a link between all the information objects?

The museums will need the older versions of their data at some point for sure. Whether they have to be in the LOD environment is another question.

The named graph could be online with the modification events while the copies of the data are kept in our repositories. Alternatively, everything could be documented in the model with multiple E73 instances, which would also multiply triples.

Illip: Museums will need to query older versions.

Habennin: Will put links to Parthenos. How do we manage massive integration in the long term? We can create meta-metadata for the data that we are transforming. This would require a separate triplestore with pointers and policies to handle them.

VladimirAlexiev · 2020-03-03T12:46:58Z

In the Getty vocabs we went for simplicity foremost, because PROV is quite complicated, e.g. see http://vocab.getty.edu/doc/#dct_modified

This is a complex topic, so here are just a few considerations:

- I thought from other issues that you intend to have one named graph per museum, but that granularity is too coarse. It is best to have one named graph per record (or unit of work), i.e. the individual pieces of data that are typically moved in an aggregation scenario.
  - Then you can connect the individual entities to the museum dataset using some partOf relation.
- As for keeping older versions, it's not a simple question because of:
  - rapidly increasing volume
  - the need to do queries, faceting and some inferences (e.g. influence, counting) on the latest versions only
- Regarding dedicated implementations:
  - GraphDB has a history and versioning plugin
  - Wikidata has an experimental History_Query_Service that's not guaranteed/operational
  - LDF can easily federate a repo with an older version of the data
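The one-graph-per-record idea can be sketched with a toy quad list in plain Python: each record lives in its own graph, carries a partOf link to the museum dataset, and is updated by replacing that graph as a unit. Everything here is an assumption for illustration: the IRIs, the `/graph` naming convention, and the choice of dct:isPartOf as the "partOf relation" (the comment above does not name a specific property).

```python
# Assumption: dct:isPartOf stands in for "some partOf relation".
DCT_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def graph_for(record_iri):
    """Hypothetical naming convention: one named graph per record."""
    return record_iri + "/graph"


def record_quads(record_iri, dataset_iri, triples):
    """Place a record's triples, plus its partOf link, in the record's own graph."""
    graph_iri = graph_for(record_iri)
    quads = [(s, p, o, graph_iri) for s, p, o in triples]
    quads.append((record_iri, DCT_IS_PART_OF, dataset_iri, graph_iri))
    return quads


def replace_record(store, record_iri, dataset_iri, triples):
    """Update one record by dropping its graph and loading the new triples."""
    graph_iri = graph_for(record_iri)
    kept = [q for q in store if q[3] != graph_iri]
    return kept + record_quads(record_iri, dataset_iri, triples)


dataset = "https://example.org/dataset/museum-a"  # hypothetical
rec1 = "https://example.org/record/1"             # hypothetical
rec2 = "https://example.org/record/2"             # hypothetical

store = record_quads(rec1, dataset, [(rec1, LABEL, "Old title")])
store += record_quads(rec2, dataset, [(rec2, LABEL, "Other record")])
store = replace_record(store, rec1, dataset, [(rec1, LABEL, "New title")])
```

With this granularity, an update touches only the graph of the record that changed, while the dataset stays reachable through the partOf links.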

stephenhart8 · 2020-03-30T22:12:09Z

Thank you @VladimirAlexiev for your input.

PROV-O is indeed a bit complicated and creates a lot of triples, but it's quite similar to CIDOC CRM. Would it be an option to have both PROV-O and dct?

On the question of where to have the named graph, I have created issue #45. I would very much like your input on that important subject.

During our latest discussion (on the 23rd of March), we came to the conclusion that it would be best not to publish the older versions of the datasets, and to store them in a repository at CHIN (available if someone asks for them for historical purposes).

We need to investigate those implementations, thank you for the information!

General question: Does the pattern I proposed on the 15th of January make sense?

illip · 2020-12-17T16:08:09Z

Regarding #45, we have decided to go with a Named Graph per dataset. We now need to clearly identify the updating process.

illip · 2021-01-07T19:39:38Z

During our Semantic Committee meeting on 2021-01-07, while we were discussing Issue #10, the topic of updates came up. In some use cases, keeping track of more than two roles (creator and provider) could be necessary in order to document updates made, for instance, by an artist to his/her data in the museum's dataset.

This highlights the need for having two "categories" of updates:

- A way to take snapshots of the whole dataset to keep track of its evolution. Someone interested in identifying the changes will have to compare these versions.
- A pattern for cases where a stakeholder documents the reason for a data modification.
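For the first category, comparing two snapshots can be as simple as a set difference over their triples. The sketch below assumes each snapshot is available as a collection of (subject, predicate, object) tuples and contains no blank nodes (blank node labels are not stable across exports, so a plain set comparison would not be reliable for them). The IRIs and values are hypothetical.

```python
def diff_snapshots(old, new):
    """Return the triples added and removed between two dataset snapshots."""
    old_set, new_set = set(old), set(new)
    return {"added": new_set - old_set, "removed": old_set - new_set}


LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
rec = "https://example.org/record/1"  # hypothetical

snapshot_2020 = [(rec, LABEL, "Old title")]
snapshot_2021 = [(rec, LABEL, "New title")]

changes = diff_snapshots(snapshot_2020, snapshot_2021)
```

Such a diff shows what changed between versions but not why or by whom, which is what the second category is meant to cover.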

chin-rcip added the modeling This issue concerns how we organize the information semantically label Oct 31, 2019

stephenhart8 added the legal This issue implies legal aspects label Mar 20, 2020

illip added the meeting needed label Mar 22, 2020

illip mentioned this issue Mar 22, 2020

Who are the creators of a record? #10

Open

illip assigned stephenhart8 Mar 23, 2020

illip removed the meeting needed label Mar 31, 2020

stephenhart8 changed the title ~~Issue #14 - We need to manage the updates of records (and Named Graphs?)~~ We need to manage the updates of records (and Named Graphs?) Apr 6, 2020

illip added this to To do in Version 2.3 (December 2021) Jul 8, 2020
