Manifest payload naming standard #39

Sveino · 2023-05-12T11:37:37Z

CIM in general shall not include any naming standard. The main reason for including the manifest and DCAT is to avoid implementations relying on a naming standard. However, for testing, and wherever user interaction at the technical level is needed, a naming recommendation is required.
The recommendation shall follow the same logic as cim:IdentifiedObject.mRID and cim:IdentifiedObject.name, where mRID is the machine-interpreted identification and name is the user "identification".
A naming standard can also be useful for simple file-based archiving tools that rely on the file name.

The updated naming standard needs to cover the needs of the CGMES profiles in the CGM and TYNDP processes, in addition to the CSA, CCC, OPC and STA processes.

The current CGM naming standard is documented in: Quality of CGMES datasets and calculations v3.3
3.2 FILE NAME AND FILE HEADER
The following mask is to be used to have a valid file name:
(snip)

Example from QoCDC:

20180118T0930Z_1D_APG_SSH_001.xml
20180117T2230Z_1D_APG_EQ_001.xml
20180117T2230Z__APG_EQ_001.xml
20180118T1130Z_1D_TSCNET-EU_SV_001.xml
20180118T1130Z_1D_TSCNET-EU-APG_SSH_001.xml

The items in the naming standard need to be found in the header, so that tools can generate the name from an information model and the name stays consistent with the content of the payload.

<dcat:startTime>_<dcterms:publisher>_<prov:wasGeneratedBy>_[dcat:version]

dcat:startTime:
Taken from the dcat:Dataset. If there are multiple datasets with different startTime values, the prov:generatedAtTime of the manifest (collection) is used.

dcterms:publisher:
Taken from the dcat:Dataset. If there are multiple datasets with different publishers, the publisher of the manifest (collection) is used.

prov:wasGeneratedBy:
Taken from the dcat:Dataset. If there are multiple datasets with different wasGeneratedBy values, the wasGeneratedBy of the manifest (collection) is used.
prov:wasGeneratedBy is an association to the abstract prov:Activity that produced the prov:Entity.
The name includes:

Process Type: CGM, TYNDP etc
Time Horizon: Year-ahead, Month-ahead etc
Run
Iteration
Profile
E.g. for the following instance files, the following activities are relevant:
EQ/RA -> CGM, CGM1Y, TYNDP
SSH/TP/SV -> IN, TYNDP, 1Y, 1M, 1W, 6...1D, ID
RAS -> IN, TYNDP, 1Y, 1M, 1W, 6...1D, ID
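The fallback rule stated above for dcat:startTime, dcterms:publisher and prov:wasGeneratedBy could be sketched as follows. This is only an illustration: the function name `resolve_field` and the plain-string representation of the values are assumptions, not part of any profile or tool.

```python
from typing import Sequence


def resolve_field(dataset_values: Sequence[str], manifest_value: str) -> str:
    """Pick the value that goes into the name for one field.

    If all datasets agree on the value, the dataset value is used.
    If they differ (or there are no datasets), the value from the
    manifest (collection) is used instead.
    """
    distinct = set(dataset_values)
    if len(distinct) == 1:
        return next(iter(distinct))
    return manifest_value


# All datasets share the same publisher: the dataset value is used.
same = resolve_field(["APG", "APG"], manifest_value="TSCNET-EU")

# Datasets have different publishers: the manifest publisher is used.
mixed = resolve_field(["APG", "TSCNET-EU-APG"], manifest_value="TSCNET-EU")
```

For dcat:startTime the manifest value passed in would be its prov:generatedAtTime, as described above.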

_[dcat:version]:
This refers to the dcat:Dataset, where a new dcat:Dataset replaces the previous version, making it no longer valid, with a new version that has the same validity period. The naming should follow semantic versioning, e.g. https://semver.org/, where _[1.0.0] is the default and is optional to use. Any version other than the default must be included in the name.
E.g. the same EQ is exchanged for the TYNDP:

20230101_APG_TYNDP-EQ.xml
20230101_APG_TYNDP-EQ_[1.0.0].xml
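A minimal sketch of how a tool might assemble a name from the three header items and the optional version. The function `build_name` and its parameters are hypothetical, the values are assumed to be already formatted as they should appear in the name, and the square brackets are treated as literal characters because that is how the example above writes them.

```python
DEFAULT_VERSION = "1.0.0"


def build_name(start_time: str, publisher: str, activity: str,
               version: str = DEFAULT_VERSION, extension: str = "xml") -> str:
    """Build <startTime>_<publisher>_<wasGeneratedBy>[_[version]].<ext>.

    The default version is left out of the name (it is optional);
    any other version is appended as _[x.y.z].
    """
    stem = f"{start_time}_{publisher}_{activity}"
    if version != DEFAULT_VERSION:
        stem += f"_[{version}]"
    return f"{stem}.{extension}"


default_name = build_name("20230101", "APG", "TYNDP-EQ")
updated_name = build_name("20230101", "APG", "TYNDP-EQ", version="1.1.0")
```

Here `default_name` is `20230101_APG_TYNDP-EQ.xml`, matching the first example above, and `updated_name` carries the `_[1.1.0]` suffix. The sketch always omits the default version; writing it out, as in the second example above, is equally allowed by the text.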

Example for CGM:

20180118T0930Z_1D_APG_SSH_001.xml -> 20180118T0930Z_APG_CGM-1D-SSH.xml
20180117T2230Z_1D_APG_EQ_001.xml -> 20180117T2230Z_APG_CGM-1D-EQ.xml
20180117T2230Z__APG_EQ_001.xml -> 20180117T2230Z_APG_CGM-EQ.xml
20180118T1130Z_1D_TSCNET-EU_SV_001.xml -> 20180118T1130Z_TSCNET-EU_CGM-1D-SV.xml
20180118T1130Z_1D_TSCNET-EU-APG_SSH_001.xml -> 20180118T1130Z_TSCNET-EU-APG_CGM-1D-SSH.xml
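The mapping in the CGM examples above could be automated roughly as follows. This is a sketch only: the layout of the old name (five underscore-separated parts) is inferred from the QoCDC examples in this issue, not from the full mask, and the function name `legacy_to_proposed` is hypothetical. The old three-digit version is dropped, as in the examples.

```python
def legacy_to_proposed(legacy_name: str, process: str = "CGM") -> str:
    """Convert a QoCDC-style name to the proposed name.

    Assumed legacy layout (inferred from the examples only):
    <time>_<horizon or empty>_<publisher>_<profile>_<version>.<ext>
    """
    stem, _, extension = legacy_name.rpartition(".")
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected legacy name: {legacy_name!r}")
    start_time, horizon, publisher, profile, _legacy_version = parts
    # The activity name joins process, time horizon (if any) and profile.
    activity = "-".join(p for p in (process, horizon, profile) if p)
    return f"{start_time}_{publisher}_{activity}.{extension}"


with_horizon = legacy_to_proposed("20180118T0930Z_1D_APG_SSH_001.xml")
without_horizon = legacy_to_proposed("20180117T2230Z__APG_EQ_001.xml")
```

With these inputs, `with_horizon` is `20180118T0930Z_APG_CGM-1D-SSH.xml` and `without_horizon` is `20180117T2230Z_APG_CGM-EQ.xml`.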

Example for TYNDP:

20230101_APG_TYNDP-EQ.xml

Example for CSA:

20230512T2230Z_APG_CGM-RA.xml
20230512T2230Z_APG_CGM-1D-r1-RAS.xml


Haigutus · 2023-05-12T13:56:55Z

I would propose a rule that the filename can contain only data that can be extracted from the file header.

Reasoning:

Currently some metadata is added to the filename that is not present inside the file, which makes filename parsing a mandatory process. To avoid this in the future we should enforce the rule, and if additional metadata is needed, the file header/manifest needs to be extended first.
The filename can be created automatically at the moment of storage by extracting the relevant metadata from the file header.
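One way such a rule could be checked is sketched below. It assumes the header has already been read into a flat dictionary whose values are formatted exactly as they would appear in the name; the function name `tokens_missing_from_header` and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def tokens_missing_from_header(filename: str, header: Dict[str, str]) -> List[str]:
    """Return the underscore-separated name parts not found in the header.

    An empty list means every part of the filename can be traced back
    to a header value, i.e. the filename carries no extra metadata.
    """
    stem = filename.rsplit(".", 1)[0]
    tokens = [t for t in stem.split("_") if t]
    known = set(header.values())
    # Brackets around an optional version are not part of the value itself.
    return [t for t in tokens if t.strip("[]") not in known]


header = {
    "dcat:startTime": "20230512T2230Z",
    "dcterms:publisher": "APG",
    "prov:wasGeneratedBy": "CGM-1D-r1-RAS",
}

ok = tokens_missing_from_header("20230512T2230Z_APG_CGM-1D-r1-RAS.xml", header)
extra = tokens_missing_from_header("20230512T2230Z_APG_CGM-1D-r1-RAS_001.xml", header)
```

In this example `ok` is empty, while `extra` contains `001`, a part that exists only in the filename and would therefore violate the rule.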

Sveino · 2023-05-12T15:02:36Z

@Haigutus Yes, definitely - I was hoping this would come out clearly from the text above. In the discussion with CSA it is clear that we need to have a name - my proposal above is based on this, making sure that we can cover the current requirement. The next step would be to come up with a proposal that uses our current header data.

Sveino · 2023-05-19T06:32:59Z

Updated above so that the _[dcat:version] refers to dcat:Dataset and not dcat:Distribution. It now refers to the case where a dataset is replaced by a new version with the same metadata, e.g. start and end of the validity period. dcat:version will follow semantic versioning, e.g. https://semver.org/.

Sveino mentioned this issue May 13, 2023

Manifest instance file specification #40

Open
