Semantic validation #51

bittremieux · 2018-10-25T16:59:17Z

bittremieux · 2019-03-12T23:00:04Z

See pymzqc for a semantic validator for mzQC files (WIP).

bittremieux · 2022-05-04T14:53:17Z

Checks from our Slack channel:

So far collected rules for semantic validation:

must: no metric duplicates within a runQuality/setQuality

must: label (metadata) must be unique in the file

must: all columns in tables have same length

~~must: CV in file and OBO must match in id, name, type, ...~~

any 'may' rules??

Non-semantic checks:

~~automatic CV checks (be)for(e) cv integration ???~~

~~test multiple 'root' elements during syntax validation is invalid~~
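The three remaining 'must' rules above could be sketched roughly as follows. This is a minimal illustration, not the pymzqc implementation. It assumes the file content has already been parsed into a plain dict with `runQualities`/`setQualities` lists, each holding a `metadata.label` and a `qualityMetrics` list, and it treats any dict-valued metric as a table. The function name and the accessions in the example are made up.

```python
from collections import Counter


def check_must_rules(mzqc):
    """Return a list of human-readable violations of the three 'must' rules."""
    problems = []
    # Assumption: run and set qualities live under these two keys.
    qualities = mzqc.get("runQualities", []) + mzqc.get("setQualities", [])

    # Rule: label (metadata) must be unique in the file.
    labels = Counter(q.get("metadata", {}).get("label") for q in qualities)
    for label, count in labels.items():
        if count > 1:
            problems.append(f"label {label!r} is not unique in file")

    for q in qualities:
        label = q.get("metadata", {}).get("label")
        metrics = q.get("qualityMetrics", [])

        # Rule: no metric duplicates within a runQuality/setQuality.
        accessions = Counter(m.get("accession") for m in metrics)
        for accession, count in accessions.items():
            if count > 1:
                problems.append(f"duplicate metric {accession} in {label!r}")

        # Rule: all columns in a table have the same length.
        # Assumption: a table metric is a dict of column name -> list.
        for m in metrics:
            value = m.get("value")
            if isinstance(value, dict):
                lengths = {len(col) for col in value.values() if isinstance(col, list)}
                if len(lengths) > 1:
                    problems.append(
                        f"table metric {m.get('accession')} in {label!r} "
                        "has differing column lengths"
                    )
    return problems


# Hypothetical example data; the accessions are placeholders, not real CV terms.
example = {
    "runQualities": [
        {
            "metadata": {"label": "run1"},
            "qualityMetrics": [
                {"accession": "QC:0000001", "value": 10},
                {"accession": "QC:0000001", "value": 12},
                {"accession": "QC:0000002", "value": {"a": [1, 2, 3], "b": [1, 2]}},
            ],
        },
        {"metadata": {"label": "run1"}, "qualityMetrics": []},
    ]
}

problems = check_must_rules(example)
for line in problems:
    print(line)
```

Each rule is checked independently, so one file can yield several findings at once.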

cbielow · 2022-05-04T14:54:59Z

schema validation

The value of the 'schema validation' key is the parsed result of the JSON Schema validation of the given file, using the current schema (unless stated otherwise).

semantic validation

The value of the 'semantic validation' key is an array of checks performed on the deserialised mzQC object according to the latest specification. The checks are the following:

'input files':

Inconsistent input file of severity 4 and message: Inconsistent file name and location: auto_doc
Reused file location of severity 6 and message: Duplicate inputFile locations within a metadata object: accession = auto_doc
Duplicate input files of severity 5 and message: Duplicate input files in a run/set: accession = auto_doc

'metric use':

ID based metric but no ID input file of severity 6 and message: ID based metrics present but no ID input file could be found registered in the mzQC file: accession = auto_doc
Metric uniqueness of severity 6 and message: Duplicate quality metric in a run/set: accession = auto_doc
Metric use of severity 5 and message: Non-metric CV term used in metric context: accession = auto_doc
Metric value non-table of severity 6 and message: Table metric CV term used without being a table: accession = auto_doc
Metric value non-column of severity 6 and message: Table metric CV term used with non-column elements: accession = auto_doc
Metric value disproportional table of severity 9 and message: Table metric CV term used with differing column lengths: accession = auto_doc
Metric value missing table column of severity 8 and message: Table metric CV term used missing required column(s): accession(s) = auto_doc
Metric value undefined table column of severity 5 and message: Table metric CV term used with extra (undefined) columns: accession(s) = auto_doc
Metric value no-unit of severity 3 and message: Metric CV term used without value unit specification. accession(s) = auto_doc

'ontology load errors':

Loading local vocabulary of severity 5 and message: Loading the following local ontology referenced in mzQC file: auto_doc
Loading online vocabulary of severity 5 and message: Error loading the following online ontology referenced in mzQC file: auto_doc

'ontology term errors':

Ambiguous CVTerms of severity 6 and message: term found in multiple vocabularies = auto_doc
Unknown CVTerm of severity 7 and message: CV term used without matching ontology entry: accession = auto_doc
Used CVTerm without definition of severity 4 and message: Term instance used in file missing definition: accession = auto_doc
Used CVTerms definition conflict of severity 5 and message: Term instance used in file with definition different from ontology: accession = auto_doc
Used CVTerms name conflict of severity 6 and message: Term instance used in file with name different from ontology: accession = auto_doc

'label uniqueness':

Metadata labels of severity 6 and message: Run/SetQuality label auto_doc is not unique in file!

API doc

This is the response to the API call for documentation. The API call for status responds with a JSON object summarising the API status and listing the endpoints. The API call for the validator, given a POST of an mzQC JSON object, responds with a JSON object nested by validation mode: semantic validation and schema validation. For each mode, the value is a list of validation items found not to (completely) conform to the standard format.
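To make the nesting described above concrete, here is a rough sketch of how such a response could be assembled. Only the two top-level keys and the category names come from the documentation above. The `Finding` class, its field names, the `build_response` helper and the grouping of semantic findings by category are assumptions for illustration and may differ from what the actual validator returns.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """One validation item (hypothetical structure)."""
    name: str       # short check name, e.g. "Metadata labels"
    severity: int   # numeric severity as used in the list above
    message: str    # human-readable description


def build_response(schema_errors, semantic_findings):
    """Nest validation items per validation mode.

    schema_errors: iterable of schema validation messages (strings).
    semantic_findings: dict mapping a category name to a list of Finding.
    """
    return {
        "schema validation": list(schema_errors),
        # Assumption: semantic items are grouped by their category name.
        "semantic validation": {
            category: [asdict(f) for f in findings]
            for category, findings in semantic_findings.items()
        },
    }


response = build_response(
    schema_errors=[],
    semantic_findings={
        "label uniqueness": [
            Finding(
                name="Metadata labels",
                severity=6,
                message="Run/SetQuality label run1 is not unique in file!",
            )
        ]
    },
)
print(json.dumps(response, indent=2))
```

An empty list under a mode then simply means that no issues were found for that mode.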

bittremieux · 2024-01-25T17:11:48Z

I guess higher is worse for severity? 9 levels is pretty detailed, maybe we could even do with just 3? Warning, error, critical in analogy to the Python logging levels.

mwalzer · 2024-01-25T17:15:03Z

Would work for me
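A possible way to collapse the nine severities into the three levels suggested above, as a sketch only. The cut points (1-3, 4-6, 7-9) are an assumption and were not agreed in this thread, and `to_logging_level` is a hypothetical helper name.

```python
import logging


def to_logging_level(severity):
    """Map a 1-9 severity onto a Python logging level.

    Assumed cut points: 1-3 -> WARNING, 4-6 -> ERROR, 7-9 -> CRITICAL.
    """
    if not 1 <= severity <= 9:
        raise ValueError(f"severity must be between 1 and 9, got {severity}")
    if severity <= 3:
        return logging.WARNING
    if severity <= 6:
        return logging.ERROR
    return logging.CRITICAL


# Example: the "differing column lengths" check is listed with severity 9.
level = to_logging_level(9)
print(logging.getLevelName(level))
```

Reusing the standard `logging` constants would let validation items be passed straight to a logger without a separate level scheme.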

bittremieux mentioned this issue Oct 25, 2018

JSON Schema #47

Merged

mwalzer added the specification document label Oct 26, 2018

mwalzer added the Release v1.0 label Feb 21, 2019

mwalzer assigned mwalzer and bittremieux Feb 21, 2019

bittremieux mentioned this issue Mar 12, 2019

Sync examples and CV, spot in mzQC #56

Closed

bittremieux removed the Release v1.0 label Nov 18, 2020

bittremieux closed this as completed in d8939e3 Jun 30, 2021

bittremieux reopened this May 4, 2022

bittremieux mentioned this issue Nov 29, 2023

Semantic validation MS-Quality-Hub/pymzqc#40

Closed

mwalzer closed this as completed Jan 25, 2024
