JSON Schema base URI collision #902

Closed
davaya opened this issue Apr 19, 2021 · 9 comments

@davaya

davaya commented Apr 19, 2021

Describe the bug

The JSON Schema files for multiple layers all have the same base URI (value of the root $id keyword). This indicates a bug in the schema generation tools, since the URI is intended to (uniquely) identify schema resources.
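One way to confirm the collision is to group the schemas by their root `$id` and report any URI claimed by more than one file. The following is a minimal sketch, not part of the OSCAL tooling: the function name `find_id_collisions` and the labels used as dictionary keys are assumptions made for illustration, and the schema headers are abbreviated from the ones quoted in this report.

```python
from collections import defaultdict


def find_id_collisions(schemas):
    """Map each root $id shared by two or more schemas to the labels that use it.

    `schemas` maps a label (a hypothetical file name) to a parsed schema dict.
    """
    by_id = defaultdict(list)
    for label, schema in schemas.items():
        by_id[schema.get("$id")].append(label)
    return {uri: labels for uri, labels in by_id.items() if len(labels) > 1}


# Abbreviated headers taken from the three schemas quoted in this report.
SHARED_ID = "http://csrc.nist.gov/ns/oscal/1.0-schema.json"
schemas = {
    "catalog": {"$id": SHARED_ID, "$comment": "OSCAL Control Catalog Model: JSON Schema"},
    "profile": {"$id": SHARED_ID, "$comment": "OSCAL Profile Model: JSON Schema"},
    "component": {"$id": SHARED_ID, "$comment": "OSCAL Component Definition Model: JSON Schema"},
}

collisions = find_id_collisions(schemas)
for uri, labels in collisions.items():
    print(f"{uri} is claimed by: {', '.join(labels)}")
```

With one distinct `$id` per schema, the same check returns an empty dictionary.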

How do we replicate the issue?

Examine JSON schemas in:

Observe that they begin with:

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Control Catalog Model: JSON Schema",

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Profile Model: JSON Schema",

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Component Definition Model: JSON Schema",

Expected behavior (i.e. solution)

The base URI of each distinct schema should identify no other schema. For example:

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/catalog.json"

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/profile.json",

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/component.json",
@david-waltermire
Contributor

We are working to produce a single JSON and XML schema for all of OSCAL. This will address this issue once deployed.

@david-waltermire david-waltermire added Scope: CI/CD Enhancements to the project's Continuous Integration and Continuous Delivery pipeline. Scope: Documentation This issue relates to OSCAL documentation. Scope: Metaschema Issues targeted at the metaschema pipeline Scope: Modeling Issues targeted at development of OSCAL formats labels Apr 30, 2021
@davaya
Author

davaya commented May 7, 2021

At first glance restructuring OSCAL from modular to monolithic seems like a step in the wrong direction. Loose coupling using namespaces would be the natural approach - is there a rationale or pros and cons for using a monolithic JSON and XML schema for all of OSCAL?

@GaryGapinski

I thought there was an aversion to using more than one namespace (if that is what @davaya means — i.e., one namespace per sub-schema).¹

At the moment, there are multiple OSCAL schemas each within the same namespace, which makes

thus requiring the use of explicit schema association per instance document using

¹ OVAL made profligate use of namespaces which IMO markedly decreased its usability by increasing its complexity.

@david-waltermire david-waltermire self-assigned this May 18, 2021
wendellpiez added a commit to wendellpiez/metaschema that referenced this issue May 18, 2021
wendellpiez added a commit to wendellpiez/metaschema that referenced this issue May 18, 2021
david-waltermire pushed a commit to usnistgov/metaschema that referenced this issue May 19, 2021
@davaya
Author

davaya commented Jun 3, 2021

BLUF: Schema namespaces yes. Data namespaces no.

After reading the OSCAL metaschema paper https://www.balisage.net/Proceedings/vol23/print/Piez01/BalisageVol23-Piez01.html the motivation for a single namespace becomes clearer. But when discussing the "OSCALizable subset of XML", the distinction between schema and data namespaces is, or appears to be, lost.

JSON data has no namespaces, but JSON Schema does: the root $id of each schema file gives that file's namespace. Namespacing enables reuse of definitions. There is no need for OSCAL to reinvent SI units for length, mass, and temperature, or GPS coordinates; those types can be created by experts and referenced when needed. JSON Schema facilitates cross-namespace referencing using $ref, but the resulting data has no trace of namespacing because the data format explicitly does not support it.
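To make the role of the root `$id` concrete: JSON Schema resolves each `$ref` against the base URI of the enclosing schema, using ordinary URI reference resolution (RFC 3986). The sketch below uses Python's standard `urllib.parse.urljoin` to mimic only that resolution step; it is not a JSON Schema implementation. The helper `resolve_ref` and the definition name `metadata` are hypothetical, and the per-model URIs are the ones proposed at the top of this issue.

```python
from urllib.parse import urljoin


def resolve_ref(base_id, ref):
    """Resolve a $ref value against a schema's root $id (RFC 3986 resolution)."""
    return urljoin(base_id, ref)


# With a shared $id, the same local reference in two different schema files
# resolves to one absolute URI, so the two definitions cannot be told apart.
shared_id = "http://csrc.nist.gov/ns/oscal/1.0-schema.json"
catalog_ref = resolve_ref(shared_id, "#/definitions/metadata")
profile_ref = resolve_ref(shared_id, "#/definitions/metadata")
print(catalog_ref == profile_ref)

# With one $id per schema, the same local reference resolves to distinct URIs.
catalog_id = "http://csrc.nist.gov/ns/oscal/1.0-schema/catalog.json"
profile_id = "http://csrc.nist.gov/ns/oscal/1.0-schema/profile.json"
distinct_catalog_ref = resolve_ref(catalog_id, "#/definitions/metadata")
distinct_profile_ref = resolve_ref(profile_id, "#/definitions/metadata")
print(distinct_catalog_ref == distinct_profile_ref)

# A relative reference from the profile schema can still reach the catalog schema.
cross_ref = resolve_ref(profile_id, "catalog.json#/definitions/metadata")
print(cross_ref)
```

The instance data validated by either schema is unaffected; only the schema-side identifiers change.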

I think it would be appropriate for each of the OSCAL schema/model layers to have its own namespace - there is no danger of namespace proliferation because the number of layers might grow from 7 to 8 or 9, but not to thousands. It might also be appropriate for the OSCAL XML data to emulate JSON data and be constructed without element prefixes. Data structure provides namespace separation the way filesystem paths ensure that there is no collision between files of the same name in different folders:

<markup>
    <table>
        <head/>
        <body/>
    </table>
</markup>

{
  "markup": {
    "table": {
      "head": [],
      "body": []
}}}

is not confused with:

<furniture>
   <table>
       <material/>
       <weight/>
   </table>
</furniture>

{
  "furniture": {
    "table": {
      "material": "oak",
      "weight": 52
}}}
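The filesystem analogy can be checked mechanically. This is a small sketch under the assumption that a property is identified by its full path from the document root; the helper `key_paths` is hypothetical, and the two documents are the JSON examples above.

```python
def key_paths(node, prefix=()):
    """Yield the full path (a tuple of keys) to every key in nested dicts."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = prefix + (key,)
            yield path
            yield from key_paths(value, path)


markup_doc = {"markup": {"table": {"head": [], "body": []}}}
furniture_doc = {"furniture": {"table": {"material": "oak", "weight": 52}}}

markup_paths = set(key_paths(markup_doc))
furniture_paths = set(key_paths(furniture_doc))

# Both documents use the key "table", but never at the same path.
print(("markup", "table") in markup_paths)
print(("furniture", "table") in furniture_paths)
print(markup_paths.isdisjoint(furniture_paths))
```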

@david-waltermire
Contributor

david-waltermire commented Jun 3, 2021

@davaya A JSON schema does not have a namespace. It has a unique schema identifier expressed as a canonical URI. This is not the same as a namespace.

FWIW, we made an early decision that all of OSCAL will be in the same XML namespace, which I think at this point we need to keep for OSCAL v1. This allows us to reuse common information items across the OSCAL models (and schemas). Since all information items in OSCAL XML are in the same namespace, we avoid having to alternate between namespaces, which has been a confusing and problematic issue for users of other efforts that do this (e.g., OVAL).

@wendellpiez
Contributor

Additional note: the Metaschema back end gives us a great deal of flexibility in this, for generating schemas (both XML and JSON) with specialized namespaces as well as with unified namespaces when/as appropriate. I am not sure everyone will regard this approach as a solution so much as (again) moving the problem. But it might offer options going forward.

@davaya
Author

davaya commented Jun 4, 2021

@david-waltermire-nist:
"A Package is a namespace for its members, which comprise those elements associated via packagedElement (which are said to be owned or contained), and those imported." -- https://www.omg.org/spec/UML/2.5.1/PDF Section 12.2.3.1

"Namespace is an abstract named element that contains (or owns) a set of named elements that can be identified by name. In other words, namespace is a container for named elements." -- https://www.uml-diagrams.org/namespace.html#:~:text=UML%20Common%20Structure,package

"When writing computer programs of even moderate complexity, it’s commonly accepted that “structuring” the program into reusable functions is better than copying-and-pasting duplicate bits of code everywhere they are used." -- https://json-schema.org/understanding-json-schema/structuring.html

A JSON schema file with a root $id acts like a package with a namespace and is used like a namespace, so if there is some terminological technicality that says it is not, the distinction will have to be articulated with much greater precision.

How to structure OSCAL is a design decision, and using a single namespace is certainly a valid option. It does require close coupling between the layers, and since they were apparently developed assuming loose coupling, any name collisions will need to be resolved before the single namespace can be realized. That's easily doable, but I would have favored loose coupling.

Cheers.

@david-waltermire
Contributor

All name collisions within the OSCAL domain are handled by the Metaschema XML and JSON schema processing. The draft JSON and XML schemas produced should not have naming collisions.

FYI: the JSON definition IDs used in the "complete" schema are the same JSON definition IDs used in each "model" schema. The same applies to the XML types used in the "complete" vs. "model" schemas. This allows the common information items to be easily identified.
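As a rough illustration of how matching definition IDs make the common items easy to find, the sketch below intersects the `definitions` keys of a "complete" schema with those of two "model" schemas. Everything here is assumed for the example: the helper `shared_definition_ids` is hypothetical, and the definition IDs are invented placeholders, not the IDs the Metaschema tooling actually generates.

```python
def shared_definition_ids(complete_schema, model_schema):
    """Return the definition IDs that appear in both schemas, sorted."""
    complete_ids = set(complete_schema.get("definitions", {}))
    model_ids = set(model_schema.get("definitions", {}))
    return sorted(complete_ids & model_ids)


# Placeholder schemas; the definition IDs are invented for illustration only.
complete = {
    "definitions": {
        "example-metadata": {},
        "example-catalog": {},
        "example-profile": {},
    }
}
catalog_model = {"definitions": {"example-metadata": {}, "example-catalog": {}}}
profile_model = {"definitions": {"example-metadata": {}, "example-profile": {}}}

catalog_shared = shared_definition_ids(complete, catalog_model)
profile_shared = shared_definition_ids(complete, profile_model)

# Items present in every model schema are the common information items.
common_items = sorted(set(catalog_shared) & set(profile_shared))
print(common_items)
```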

david-waltermire added a commit to usnistgov/metaschema that referenced this issue Jun 6, 2021
* Rework of docs focusing on JSON docs and model pipeline
* Improvements to composition toolchain
* Fixed a few small bugs in the metaschema-check. Improved performance of the compose pruning using an accumulator.
* Moved edge-case samples into testing directory
* Made shadowing warning a warning
* Initial commit of an Oxygen Metaschema framework.
* Creation of new compose schematron unit tests.
* Cross-linking XML and JSON syntax pages and other improvements to links
* Now building XML and JSON indexes to reference pages, with links to steps
* Reconfigured docs pipeline (XSLT entry points); adding new files including pipeline steps
* Migrating schema generation tools to new/improved composition pipeline
* Addressing usnistgov/OSCAL#902 thanks for finding this bug
* Enhancements to JSON Schema definition (with better performance too)
* Adding support for json-base-uri as a metaschema property
* Updated JSON schema $id; factoring out common docs XSLT
* Fixing IDs in JSON schema per issue usnistgov/OSCAL#933.
* Addressing datatype validation issues: whitespace collapsing; non-empty values; ncname-workalike in JSON Schema - see usnistgov/OSCAL#911  usnistgov/OSCAL#805 also #33 #67 #68
* Improvements to XSD production; fully aligning 'token' datatype across XSD and JSON Schema implementations.
* Updating bidirectional XML/JSON converter generators (#143)
* Committing a version that handles test data correctly (so far) from rebuilt metaschema composition addressing #51 #53 #76
* Now displaying constraints in documentation at point of definition;
* Docs generation revamp Reworked reference and other pages to sketch - #128 and others

Co-authored-by: Wendell Piez <wendell.piez@nist.gov>
@david-waltermire
Contributor

The "complete" XML and JSON schemas have been integrated in PR #948. These will be released in OSCAL 1.0.0.

nikitawootten-nist pushed a commit to nikitawootten-nist/metaschema-xslt that referenced this issue Jul 21, 2023