Sub-classes of StatisticalProduct #14

Closed
FranckCo opened this issue Jun 6, 2019 · 12 comments

@FranckCo
Member

FranckCo commented Jun 6, 2019

Decided during May 7 meeting: define what sub-classes of StatisticalProduct we want.

Example candidate: StatisticalIndicator.

@FranckCo
Member Author

FranckCo commented Jun 6, 2019

Linked to issue #6

@FlavioRizzolo
Collaborator

We could distinguish between data products, e.g. StatCan's CANSIM tables, and analysis, e.g. research papers providing insights into the data.

@FlavioRizzolo
Collaborator

FlavioRizzolo commented Jun 6, 2019

We could also distinguish between census products, survey products and other statistical programs.

@FlavioRizzolo
Collaborator

StatCan also has interactive products, including visualizations, infographics and thematic maps. If other organizations are producing these too, they could also be sub-classes of StatisticalProduct.

@JALinnerud
Collaborator

Statistics Norway also has interactive products.

@JALinnerud
Collaborator

Could be worth checking that we are consistent with GSIM Product v1.2:
Definition: A package of content that can be disseminated as a whole.
Explanatory text: A Product is a type of Exchange Channel for outgoing information. A Product packages Presentations of Information Sets for an Information Consumer. The Product and its Presentations are generated according to Output Specifications, which define how the information from the Information Sets it consumes are presented to the Information Consumer. The Protocol for a Product determines the mechanism by which the Product is disseminated (e.g website, SDMX web service, paper publication).
A Provision Agreement between the statistical organization and the Information Consumer governs the use of a Product by the Information Consumer. The Provision Agreement, which may be explicitly or implicitly agreed, provides the legal or other basis by which the two parties agree to exchange data. In many cases, dissemination Provision Agreements are implicit in the terms of use published by the statistical organization.
For static Products (e.g. paper publications), specifications are predetermined. For dynamic Products, aspects of specification could be determined by the Information Consumer at run time. Both cases result in Output Specifications specifying Information Set data or referential metadata that will be included in each Presentation within the Product.

@FlavioRizzolo
Collaborator

Good point. I believe it does align with GSIM Product, to some extent.
We also need to discuss whether we want to get into the minutiae of GSIM Presentation, OutputSpecification, etc. (I think we shouldn't) or just consider all of that embedded in COOS:StatisticalProduct.
Whether we need to cover GSIM ExchangeChannel in detail is also a matter for discussion; perhaps a separate issue?

@abrycsaba
Collaborator

We should focus on statistical products (the ones you mentioned before) that are made of statistical data. Other statistical products, e.g. statistical registers, statistical applications and statistical classifications, should be taken into consideration at a later stage.
I tried to find some literature on the classification of statistical products, unfortunately without any result so far.

@zoltanvereczkei
Collaborator

zoltanvereczkei commented Apr 9, 2021

Statistical Product is a prov:Entity. When we describe statistical products in the ontology, we agree that sub-classes are needed.

What do our ModernStats models say about products?

GSBPM

2.1. Design outputs
This sub-process contains the detailed design of the statistical outputs, products and services to be produced, including the related development work and preparation of the systems and tools used in the "Disseminate" phase.
7.2. Produce dissemination products
This sub-process produces the dissemination products, as previously designed in sub-process 2.1 (Design outputs), to meet user needs. They could include printed publications, press releases and websites. The products can take many forms including interactive graphics, tables, maps, public-use microdata sets, linked open data and downloadable files.

GSIM

Product: A package of content that can be disseminated as a whole. A Product is a type of Exchange Channel for outgoing information. A Product packages Presentations of Information Sets for an Information Consumer. The Product and its Presentations are generated according to Output Specifications, which define how the information from the Information Sets it consumes are presented to the Information Consumer. The Protocol for a Product determines the mechanism by which the Product is disseminated (e.g website, SDMX web service, paper publication).
For static Products (e.g. paper publications), specifications are predetermined. For dynamic Products, aspects of specification could be determined by the Information Consumer at run time. Both cases result in Output Specifications specifying Information Set data or referential metadata that will be included in each Presentation within the Product.
Presentation: The way data and referential metadata are presented in a Product. A Product has one or more Presentations, which present data and referential metadata from Information Sets. A Presentation is defined by an Output Specification.

Presentation can be in different forms; e.g. tables, graphs, structured data files.
Examples:
  • A table of data. Based on a Data Set, the related Data Structure is used to label the column and row headings for the table. The Data Set is used to populate the cells in the table. Reference metadata is used to populate footnotes and cell notes on the table. Confidentiality rules are applied to the Data Set to suppress any disclosive cells.
  • A data file based on a standard (e.g. SDMX).
  • A PDF document describing a Statistical Classification.
  • Any structural metadata object expressed in a standard format (e.g. DDI 3.1 XML).
  • A list of Products or services (e.g. a product catalogue or a web services description language (WSDL) file).
  • A web page containing Statistical Classifications, descriptions of Variables, etc.

GAMSO
Manage Consumers: These activities cover the management of communication and exchanges between governmental or international institutions, the public, and other stakeholders in direct or indirect support of organisational services. They therefore deal with the relationships between statistical organisations and the public, including those via the media. This includes general marketing activities and dealing with non-specific consumer feedback. This also includes measures to educate and inform users so that they fully understand statistical outputs, and to promote and improve levels of statistical literacy in society in general. Manage cross-product user support


As can be seen from the concepts listed above, GSIM and GSBPM concentrate only on products derived from statistical data: the GSIM concept ‘Product’ is connected only with Statistical Programme, so products that do not come from a Statistical Programme (e.g. registers, classifications) are excluded.

Phase 7 of the GSBPM likewise highlights products derived from statistical data; other kinds of product are not mentioned.

But if we take GAMSO on board, the situation might be a bit different. GAMSO mentions products only once, in relation to statistical data (see Manage Consumers). Other GAMSO elements also have products, since those ‘activities’ also take inputs and transform them into outputs. These are mainly not made of statistical data, e.g. vision, strategy, financial plan, resources plan. These outputs/products are not fully present in GSIM (understandably, only those in close connection with a statistical business process are), but later on those non-statistical products will also have to be determined and standardised.

Conclusion and suggestion:

The adjective ‘Statistical’ for products should be kept for this project, since non-statistical products (e.g. resource plans, websites, methods) are outputs of a Statistical Support Programme (to use the GSIM object) or of GAMSO ‘activities’. These are probably out of scope for now if we limit the scope of the work to the statistical business process. Basically this means Product has two sub-classes: StatisticalProduct (within scope for us) and NonStatisticalProducts (out of scope for us).
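
To make the proposed split concrete, here is a minimal sketch (not the ontology itself) that records the hierarchy as subject/predicate/object tuples and derives the in-scope test from it. The `ex:` prefix, `ex:Product` and the helper functions are assumptions made for illustration; only StatisticalProduct, NonStatisticalProducts and the StatisticalIndicator candidate come from this thread.

```python
# Hypothetical identifiers: the "ex:" prefix and ex:Product are assumed here;
# they are not taken from the ontology files.
SUBCLASS_OF = "rdfs:subClassOf"

triples = [
    ("ex:StatisticalProduct", SUBCLASS_OF, "ex:Product"),
    ("ex:NonStatisticalProducts", SUBCLASS_OF, "ex:Product"),
    # Candidate sub-class named in the opening comment of this issue.
    ("ex:StatisticalIndicator", SUBCLASS_OF, "ex:StatisticalProduct"),
]


def superclasses(cls, triples):
    """Return every class reachable from cls by following rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBCLASS_OF and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def in_scope(cls, triples):
    """A class is in scope if it is StatisticalProduct or one of its sub-classes."""
    target = "ex:StatisticalProduct"
    return cls == target or target in superclasses(cls, triples)


print(in_scope("ex:StatisticalIndicator", triples))
print(in_scope("ex:NonStatisticalProducts", triples))
```

Keeping the scope rule as a function of the hierarchy means any sub-class added later under StatisticalProduct is in scope without further bookkeeping.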

For the sub-classes of StatisticalProduct, GSIM already distinguishes between static and dynamic products. As the main focus of the work is to support statistical activities, we should define sub-classes that make sense from the statistical perspective and that are easy for the user of the ontology (a statistician) to understand.

As far as we can tell, there is no standard or recommendation for classifying statistical products in official statistics (or we have not found one yet).

If we check NSI webpages, the following items are usually referred to as statistical products (this is not a classification, only a summary of observations made based on a few NSI webpages):

  • Data: statistical databases storing vast amounts of statistical information in table or database formats. Sometimes certain statistical domains, e.g. census, are highlighted as separate items, but in my opinion that is a different classification: one of statistical domains, not of products.
  • Publications
  • Visualisations (maps, infographics, etc.)
  • Data access services

Bonus question: (never enough of these :))
GSBPM mentions outputs, products and services. Does this mean that we need StatisticalOutput, StatisticalProduct and StatisticalService as separate entities? Do we want to deal with StatisticalService as a separate entity? In my opinion, if we want the scope to be in line with the GSBPM as well, then we need to. But while sub-classes are needed for statistical products, we might not need to specify sub-classes for statistical services (or at least not at the moment). Let's just keep in mind that we will need an entity for StatisticalService.

@FlavioRizzolo
Collaborator

FlavioRizzolo commented Apr 9, 2021

Very good points!

I think we should focus on Statistical Products for now.

We could look at a couple of dimensions to characterize types:

  • Type of presentation: Datasets, Publications, Visualizations, Infographics, Thematic Maps, Interactive (given by GSIM Output Specification)
  • Type of content: Data, Metadata (e.g. a statistical classification), Analysis, Models (given by the subtypes of GSIM Information Sets, not included in GSIM)
  • others?
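
A rough sketch of how the two dimensions could be kept independent of each other, with each product carrying one value from each. The enum and class names, and the example instances, are hypothetical; the member values simply mirror the two lists above and are not the concept schemes from the ontology.

```python
from dataclasses import dataclass
from enum import Enum


class PresentationType(Enum):
    """First dimension: how the product is presented."""
    DATASET = "Dataset"
    PUBLICATION = "Publication"
    VISUALIZATION = "Visualization"
    INFOGRAPHIC = "Infographic"
    THEMATIC_MAP = "Thematic Map"
    INTERACTIVE = "Interactive"


class ContentType(Enum):
    """Second dimension: what the product contains."""
    DATA = "Data"
    METADATA = "Metadata"
    ANALYSIS = "Analysis"
    MODEL = "Model"


@dataclass(frozen=True)
class StatisticalProduct:
    name: str
    presentation: PresentationType
    content: ContentType


# Hypothetical instances, for illustration only.
products = [
    StatisticalProduct("Example data table", PresentationType.DATASET, ContentType.DATA),
    StatisticalProduct("Example research paper", PresentationType.PUBLICATION, ContentType.ANALYSIS),
    StatisticalProduct("Example classification page", PresentationType.PUBLICATION, ContentType.METADATA),
]


def by_content(products, content):
    """Names of the products whose content dimension matches."""
    return [p.name for p in products if p.content is content]


print(by_content(products, ContentType.ANALYSIS))
```

Modelling the dimensions as two separate value sets rather than as one sub-class tree avoids having to create a class for every presentation/content combination.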

@FlavioRizzolo
Collaborator

Proposed definition of graph dataset:

"Collection of data where datapoints are nodes and relationships between them are edges in a graph structure"

I couldn't find a definition in our documentation, that's why I came up with one, but if you already have a better one please disregard this one.
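
For illustration, the proposed definition could be read as the following minimal structure, where datapoints are stored as nodes and relationships as labelled edges. The class, its methods and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GraphDataset:
    """Datapoints are nodes; relationships between them are labelled edges."""
    nodes: dict = field(default_factory=dict)  # node id -> datapoint value
    edges: set = field(default_factory=set)    # (source id, relation, target id)

    def add_datapoint(self, node_id, value):
        self.nodes[node_id] = value

    def relate(self, source, relation, target):
        # An edge may only connect datapoints that are already in the dataset.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing datapoints")
        self.edges.add((source, relation, target))

    def neighbours(self, node_id):
        """Ids of the datapoints that node_id points to, in sorted order."""
        return sorted(t for s, _, t in self.edges if s == node_id)


# Hypothetical sample data.
g = GraphDataset()
g.add_datapoint("a", 10)
g.add_datapoint("b", 20)
g.add_datapoint("c", 30)
g.relate("a", "relatedTo", "b")
g.relate("a", "relatedTo", "c")
print(g.neighbours("a"))
```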

@FranckCo
Member Author

Classes, concept schemes and properties added for product presentation and product content in commit 76fc9cc and commit f8067b3 respectively, so closing the issue.
