Model links between GSBPM and GSIM #19

Open
FranckCo opened this issue Jan 8, 2020 · 19 comments

@FranckCo
Member

FranckCo commented Jan 8, 2020

The UNECE "Supporting Standards" group lists links between GSBPM sub-processes and GSIM objects. These links could be rendered via an RDF object property. A GSIMObject class could also be created to scope the domain or range of this property.

@FranckCo FranckCo self-assigned this Jan 8, 2020
@FranckCo
Member Author

Several points to discuss here:

  • name of mother class for all GSIM "objects": GSIMObject? GSIMClass?
  • should we create sub-classes for each group (GSIMBaseObject, GSIMExchangeClass)?
  • and/or, continuing to refer to PROV-O, create GSIMAgent, GSIMActivity and GSIMEntity?
  • name of properties linking GSIM and GSBPM: currently, "input" and "output" seem to be used
  • do we define these properties as sub-properties of PROV properties (used, wasGeneratedBy), or even directly use the PROV properties?
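
To make the last point concrete, here is a minimal sketch in plain Python (no RDF library) of what declaring the linking properties as sub-properties of PROV properties would buy us. The names coos:input and coos:output are hypothetical, the ex: identifiers are placeholders, and the sketch assumes both properties point from the sub-process to the GSIM object, which is why the output side maps to prov:generated (the inverse of prov:wasGeneratedBy).

```python
# Illustrative only: coos:input / coos:output are hypothetical property names
# and the ex: identifiers are placeholders, not actual COOS IRIs.
SUBPROPERTY_OF = {
    "coos:input": "prov:used",
    "coos:output": "prov:generated",  # inverse direction of prov:wasGeneratedBy
}


def entail_superproperties(triples):
    """Return the asserted triples plus those entailed by one level of
    rdfs:subPropertyOf, as a reasoner would infer them."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SUBPROPERTY_OF:
            inferred.add((subject, SUBPROPERTY_OF[predicate], obj))
    return inferred


asserted = {
    ("ex:subProcessA", "coos:input", "ex:Variable"),
    ("ex:subProcessA", "coos:output", "ex:Rule"),
}
closed = entail_superproperties(asserted)
```

With sub-properties, PROV-aware tools would still see the prov:used and prov:generated statements, while COOS keeps its own vocabulary. Using the PROV properties directly would skip the inference step but lose the GSBPM/GSIM-specific names.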

@JALinnerud
Collaborator

JALinnerud commented Feb 22, 2021

Could Class and Object be ModelElements?
And GSIMModules, where the modules are Base, Concept, Exchange, Structure and Business?
I agree with using PROV-O and any other international ontologies we can.
GSIM and GSBPM could be linked by input, output, resources and control (cf. IDEF0).
Will we need OWL 2 object properties and datatype properties?

@FranckCo
Member Author

FranckCo commented Feb 28, 2021

Decided during 23/02/21 meeting: create GSIMClass, with equivalent class GSIMObject and daughter GSIMEntity.
Implemented in Turtle by commits 2a811f0, 74b0ae1 and f1515dd

For GAMSO and GSBPM we did not keep the name of the model in the name of the classes (e.g. ActivityArea and not GAMSOActivityArea, Phase and not GSBPMPhase), so why should we for GSIM? But if we drop the GSIM part, we are left with "Class" or "Object", not very informative. So, since GSIM is a statistical information model, why not "StatisticalInformationClass", or "StatisticalInformationObject"?

@FlavioRizzolo
Collaborator

FlavioRizzolo commented Mar 2, 2021

I think StatisticalInformationObject/Class works. The question of whether we need the term "statistical" in the name has been raised by Edgardo:

> I agree with the proposal of removing the “GSIM” prefix and substituting a generic one. However, my hesitation is whether it is necessary to qualify them by “Statistical”. Will there be Objects or Classes other than “statistical” in the scope? If not, and for the sake of shortening the names, I would just say “InformationObject” and “InformationClass”.

Given we included GAMSO, we might want to support information entities that are not statistical. Having said that, GSIM is a statistical model, same as GSBPM, so its scope is the statistical information entities.

I suggest having InformationClass at the top and StatisticalInformationClass/Object as subclasses for the GSIM objects.

@pafrance

pafrance commented Mar 5, 2021

We are fine with a superclass called "InformationClass" as the source node for GSIM subclasses.
This can be linked as a stub to the corresponding GSBPM objects.
After the link has been established we can derive other GSIM classes by further specification, so it is fine if the class hardly looks "informative" at this level.
InformationObject would then be an instance of the class instead. What do you think?
Paolo & Adele

@FranckCo
Member Author

@FlavioRizzolo DDI-CDI uses InformationObject, I think, which is consumed and produced by activities

[image: cdi]

So, following your suggestion, I would go for prov:Entity -> coos:InformationObject -> coos:StatisticalInformationObject
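
As a quick sanity check of that chain, here is a minimal sketch in plain Python (not RDF tooling) that walks the proposed rdfs:subClassOf links. The class names are the ones proposed in this thread and may still change.

```python
# Proposed rdfs:subClassOf links, child -> parent. The coos: names are the
# ones suggested in this discussion, not a settled vocabulary.
SUBCLASS_OF = {
    "coos:StatisticalInformationObject": "coos:InformationObject",
    "coos:InformationObject": "prov:Entity",
}


def superclasses(cls):
    """Follow subClassOf links upward and return the ancestors, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


ancestors = superclasses("coos:StatisticalInformationObject")
```

The point is that anything typed as coos:StatisticalInformationObject would also be a prov:Entity by inference, so it can be the object of prov:used without further declarations.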

@FranckCo
Member Author

@FlavioRizzolo

We previously discussed different possibilities for representing links between GSIM objects and GSBPM sub-processes, in particular the idea of representing individual sub-processes as some kind of "abstract" or "prototype" individuals with class-like features so they can be used as domains or ranges of properties. We mentioned OWL NamedIndividual as a possibility, but I checked the specifications and I understand that OWL named individuals are just individuals that have an IRI (https://www.w3.org/TR/owl2-syntax/#Individuals).

Considering how GSBPM sub-processes are currently declared in COOS, typing them as owl:NamedIndividual might actually restrict the possibilities. Just typing them as coos:SubProcess, as is currently done, does not entail that they are individuals (https://stackoverflow.com/questions/37157883/member-of-an-owlclass-versus-owlnamedindividual), if I understand correctly. You could still add axioms treating, for example, http://id.unece.org/activities/subProcess/7.3 as a class, using metamodeling. However, I'm not sure we want to go that way.

@FlavioRizzolo
Collaborator

Example of InformationObjects being inputs and outputs of Activities:

Consider "Design Frame and Sample". Inputs are "DataSet", "DataStructure", "Variable", and "Population", and outputs are "Process Method" and "Rules", among others. Those are GSIM objects, which seem to be individuals of coos:InformationObject or, to be more precise, coos:StatisticalInformationObject. They seem to me to be "abstract" individuals at the same level as "Design Frame and Sample", which is an "abstract" individual of the subProcess class. That aligns with prov:used/prov:wasGeneratedBy as well, and their inverses.
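
A minimal sketch of this example as triples, in plain Python rather than Turtle, assuming the sub-process and the GSIM objects are all modeled as individuals. The ex: identifiers are placeholders, not real COOS or GSIM IRIs.

```python
from collections import defaultdict

DESIGN = "ex:DesignFrameAndSample"  # placeholder for the sub-process individual

# prov:used points from activity to entity; prov:wasGeneratedBy points from
# entity to activity. All ex: names are illustrative placeholders.
triples = [
    (DESIGN, "prov:used", "ex:DataSet"),
    (DESIGN, "prov:used", "ex:DataStructure"),
    (DESIGN, "prov:used", "ex:Variable"),
    (DESIGN, "prov:used", "ex:Population"),
    ("ex:ProcessMethod", "prov:wasGeneratedBy", DESIGN),
    ("ex:Rule", "prov:wasGeneratedBy", DESIGN),
]


def inputs_and_outputs(triples):
    """Group entities by activity: inputs via prov:used, outputs via
    prov:wasGeneratedBy (read in the inverse direction)."""
    inputs, outputs = defaultdict(set), defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "prov:used":
            inputs[subject].add(obj)
        elif predicate == "prov:wasGeneratedBy":
            outputs[obj].add(subject)
    return inputs, outputs


inputs, outputs = inputs_and_outputs(triples)
```

Reading prov:wasGeneratedBy backwards is what gives the "output" view of the sub-process, so a dedicated output property would only be a convenience on top of PROV.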

@FlavioRizzolo
Collaborator

To discuss, if possible:

image

The current classes are in white and the suggested additions in green. Note also, in grey, the renaming of Information "Object" and Statistical Information "Object": "object" is somewhat controversial, and we are removing the Information Object class from DDI CDI, so I thought the term "entity" might be a better one, especially since it aligns with PROV.

@FlavioRizzolo
Collaborator

Also, a question: currently, InformationObject (or Entity) is defined as "Mother of all classes defined in GSIM" and StatisticalInformationObject (or Entity) as "Information object representing statistical information". Should both be the definition of StatisticalInformationObject (or Entity)? To me, GSIM doesn't apply to other types of information, e.g. HR, Finance, Procurement, etc.

@InKyungChoi

> Also, a question: currently, InformationObject (or Entity) is defined as "Mother of all classes defined in GSIM" and StatisticalInformationObject (or Entity) as "Information object representing statistical information". Should both be the definition of StatisticalInformationObject (or Entity)? To me, GSIM doesn't apply to other types of information, e.g. HR, Finance, Procurement, etc.

I know GSIM is the "Statistical Information Model", but are all GSIM objects "statistical information"? I am thinking of something like "Process Step", "Process Control Design" or "Identifiable Artefact": they might be concepts needed for the statistical production process, but they don't seem as "statistical" as "Population" or "Variable".

@JALinnerud
Copy link
Collaborator

The introduction of Statistical Concept might be confusing for GSIM users that already have Concept (https://statswiki.unece.org/display/clickablegsim/Concept) with subtypes Population, Universe, Unit Type, Variable and Category.

@JALinnerud
Collaborator

Regarding the adjective 'Statistical': when NSIs contribute data sets to national catalogues/portals and European portals, users know the content is statistical through dct:publisher, dct:creator, foaf:Agent, dcat:theme, dct:subject, skos:Concept, etc. I am still worried that by using the adjective Statistical everywhere we might be reducing our interoperability with other international and national groups and organisations, e.g. national mapping agencies. The end users simply want to find data sets and combine them with other data sets. Are our data sets different from other data sets, or is it 'just' that our organisations try to adhere to certain quality criteria? Are we giving ourselves more work by introducing the adjective Statistical and at the same time reducing the quality for our users? How open is our data when we use terms that are not used by other organisations?

@FlavioRizzolo
Collaborator

> I am still worried that by using the adjective Statistical everywhere we might be reducing our interoperability with other international and national groups and organisations, e.g. national mapping agencies. The end users simply want to find data sets and combine them with other data sets. Are our data sets different from other data sets, or is it 'just' that our organisations try to adhere to certain quality criteria? Are we giving ourselves more work by introducing the adjective Statistical and at the same time reducing the quality for our users? How open is our data when we use terms that are not used by other organisations?

This is a good point. Other than statistical products, the rest (i.e. data point, data structure, dataset and all its sub-classes) doesn't really need to be "statistical".

@FlavioRizzolo
Collaborator

> Also, a question: currently, InformationObject (or Entity) is defined as "Mother of all classes defined in GSIM" and StatisticalInformationObject (or Entity) as "Information object representing statistical information". Should both be the definition of StatisticalInformationObject (or Entity)? To me, GSIM doesn't apply to other types of information, e.g. HR, Finance, Procurement, etc.

> I know GSIM is the "Statistical Information Model", but are all GSIM objects "statistical information"? I am thinking of something like "Process Step", "Process Control Design" or "Identifiable Artefact": they might be concepts needed for the statistical production process, but they don't seem as "statistical" as "Population" or "Variable".

I feel the same way. It seems to me that objects in the Concepts and Structures groups are Information Entities whereas objects in the Business group are rather "Business" Entities. Not sure what to think of Exchange though.

@FlavioRizzolo
Collaborator

Based on the recent comments, I have two questions to discuss:

Question 1: Should we add Business Entity as a child of prov:Entity to capture at least the GSIM Business group?

Question 2: Should we remove Statistical Information Entity entirely?

@FlavioRizzolo
Collaborator

Regarding the relationship between Dataset, Data Structure and Data Point, both GSIM and DDI CDI have the same relationships as the proposed diagram above.

From GSIM:

image

From DDI CDI:

image

@FranckCo
Member Author

FranckCo commented Jul 4, 2022

Waiting for the conclusion of the discussion in the GSIM revision Task Team.

@FranckCo FranckCo added this to the Version 2 of COOS milestone Oct 17, 2022
@tfrancart
Collaborator

@FranckCo @flo7894 Please find an analysis of this issue in https://github.com/linked-statistics/COOS/wiki/Issue-19-analysis-note
