Batch validation of computer samples against their XSD (#102)
Scripts and XProc supporting validation of XSD generation
wendellpiez committed Mar 15, 2024
1 parent 0010340 commit 7637dd2
Showing 8 changed files with 287 additions and 29 deletions.
28 changes: 13 additions & 15 deletions src/schema-gen/InspectorXSLT/TESTING.md
@@ -19,26 +19,16 @@ Find resources for testing the XSLT Inspector and its production in the [testing

Documented in the Makefile -

For example
Use

```
> make test
> make help
```

Runs all tests using provided scripting.
to see the available make targets, supporting smoke testing, unit testing of XSLT production and others.

Don't commit unless this passes.

```
> make smoke-test
```

Runs only the 'smoke tests' (end to end production pipeline testing - are viable artifacts produced irrespective of their functionality or correctness)

Available:
- `smoke-test` - builds an XSLT from the `computer_metaschema.xml` test example, and attempts to execute the resulting XSLT over stub input. A failure indicates a problem in the production pipeline - it is either broken or wrong
- `spec-test` - runs specification tests - does the produced XSLT behave as expected when used on the possible range of (XML) inputs? this is a validator: does it validate?
- `unit-test` - runs transformation template- and function-level unit tests regulating the mapping between source (Metaschema) and target (XSLT) expressions.
Some utilities described on this page are also available using scripts, which will function in place despite not being accessible via `make`.
expressions.

Note this is work in progress and may change over time especially as we bring more tests in.

@@ -70,6 +60,14 @@ This XSD should validate the same set of rules as the Inspector (excluding Metas

A copy of the current-best schema is also here (to be refreshed as necessary): [testing/current/computer_metaschema-xmlschema.xsd](testing/current/computer_metaschema-xmlschema.xsd)

#### XSD Validate the Samples

A bash script `testing/xsd-crosscheck-samples.sh` executes an XProc pipeline that performs batch 'go/no-go' validation of XML sources expected to be either valid, or invalid, to the computer_metaschema model, as found in the `testing` folder.

It will report on the command line whether any files expected to be valid (based on their placement in the `computer-valid` folder) are not valid, or any files expected to be invalid (based on their placement in the `computer-invalid` folder) are in fact valid.

The validating parser used is the Java built-in parser, Xerces, as instrumented in XML Calabash (using `p:validate-with-xml-schema`).
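
The go/no-go logic described above can be sketched in plain bash. This is only an illustration of the expectation check, not the pipeline itself: the function names `expected_status` and `crosscheck` are hypothetical, and the actual validation result is passed in by hand rather than coming from Xerces.

```shell
#!/usr/bin/env bash
# Sketch only: these function names are hypothetical and not part of the repository.

# Expected outcome, judged by the folder a sample lives in.
expected_status() {
  case "/$1/" in
    */computer-valid/*)   echo valid ;;
    */computer-invalid/*) echo invalid ;;
    *)                    echo unknown ;;
  esac
}

# Print a finding only when the actual result contradicts the expectation.
crosscheck() {
  local path="$1" actual="$2" expected
  expected="$(expected_status "$path")"
  if [ "$expected" = valid ] && [ "$actual" = invalid ]; then
    echo "$path: unexpectedly INVALID"
  elif [ "$expected" = invalid ] && [ "$actual" = valid ]; then
    echo "$path: unexpectedly VALID"
  fi
}

# A sample that behaves as expected produces no output.
crosscheck testing/computer-valid/valid1.xml valid
# A nominally invalid sample that validates anyway is reported.
crosscheck testing/computer-invalid/invalid1.xml valid
```

Silence means agreement: as in the pipeline's summary, only mismatches between folder placement and validation outcome are worth reporting.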

### Refresh the 'computer model' Inspector XSLT

Before testing the Computer Inspector XSLT, the copy kept for testing must be refreshed.
28 changes: 20 additions & 8 deletions src/schema-gen/InspectorXSLT/readme.md
@@ -4,9 +4,21 @@ Check your XML with an XSLT to see if it is valid to the rules defined in a meta

Developers are urged to review this file in outline before reading it in detail. For experienced XSLT developers, it explains the interfaces. For beginners, it should tell you what you need to know. Any sections not of immediate interest can be skipped for later.

## Project status

This tool is now at a nominal version **0.8**, with plans for these milestones:

**v0.9** - will support all Metaschema features used in [OSCAL](http://pages.nist.gov/OSCAL/), the Open Security Controls Assessment Language, with conformance demonstrated with tests

**v1.0** - Supports all features specified for [Metaschema](http://pages.nist.gov/metaschema/) v1.0 as applied to XML, also with tests demonstrating conformance.

Metaschema is a NIST project in the Information Technology Lab (ITL) supporting the abstract description of data models expressible in either XML or JSON syntax (or other syntaxes), designed to enable and facilitate standards-based data exchange of information related to systems security.

As of early 2024, Metaschema is not yet finalized at version 1.0. As noted below, providing its definitions with validation in the form of a (second) conformant implementation is a driving motive for this project.

## How this works

A standalone XSLT ("stylesheet" or transformation specification) can be produced by applying a stylesheet (an XSLT) to a metaschema. Using the composition pipeline, it can apply to a top-level module of a modular metaschema.
A standalone XSLT ("stylesheet" or transformation specification) can be produced by applying a stylesheet (an XSLT) to a metaschema. Using the metaschema-xslt [composition pipeline](../../compose/), it can apply to a top-level module of a modular metaschema.

The XSLT that is created this way can be used to test XML instances for errors in view of the rules defined by the metaschema definitions.

@@ -27,7 +39,7 @@ Currently we plan to support only XML-based formats as defined by Metaschema, no

Users of Metaschema-defined JSON can try reformatting their data as XML using automated means such as scripts produced by the [Metaschema XSLT Converter Generators](../../converter-gen). Successful conversion will be valid on the other side. But failures will typically be indicated not by invalid results, but by results missing those parts of the input that went unrecognized by the converter because they are invalid.

## Demo
## Demonstrations

The [testing/current](testing/current) directory shows such an XSLT, which can be applied to an instance or set of instances (documents) to be tested against the rules defined by its metaschema - in this case the Computer Model metaschema example provided.

@@ -94,8 +106,9 @@ For XML- and XSLT-focused developers of Metaschema and Metaschema-based technolo

### Use cases we have not catered to

- Developers who wish to build metaschema-aware applications
- Developers who wish to build their own metaschema-aware applications
- This application is intended to be operated as a black box: while at core this is a code generator, it is not designed to be easily extensible as such or to produce a 'library', so you might prefer to reverse engineer it rather than extend it
- I.e., while it should be easy enough to generate, deploy and use an InspectorXSLT for a metaschema you find or build, for building your own XSLT transpiler, you are welcome to borrow but you are on your own.
- Robots or 'lights out' automated processes (untested)
- The interfaces are designed to be flexible for interfacing but YMMV as to scale/throughput - experience will tell
- Our expectation is that performance will be good under normal loads but metaschemas will also vary considerably
@@ -117,7 +130,6 @@ For XML- and XSLT-focused developers of Metaschema and Metaschema-based technolo
- Validate lexical rules over datatypes
- Validate constraints as defined by Metaschema
- allowed values; string matching; referential integrity; arbitrary queries (assertions)

## Interfaces - how to use

### Schematron harness
@@ -153,7 +165,7 @@ Command line flags and options for using the InspectorXSLT with Saxon - note use

- `-s` required flag indicates the source file or directory - if a directory, `-o` is also required
- `-o` optional flag indicates where to write a report file; if omitted the report comes back to STDOUT; required when `-s` is a directory
- `-it` (or `-initial-template`) settings are supported as aliases of the `format` parameter (see below). If `format` is not given, a template can be called by name to initiate the same behavior. This is mainly useful for debugging or to configure a different fallback behavior from the core default in deployment.
- `-it` (or `--initial-template`) settings are supported as aliases of the `format` parameter (see below). If `format` is not given, a template can be called by name to initiate the same behavior. This is mainly useful for debugging or to configure a different fallback behavior from the core default in deployment.
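
As a rough sketch of how these flags fit together, the following bash helper assembles a Saxon command line and enforces the rule that `-o` is required when `-s` is a directory. The helper name `inspector_command` and the jar name `saxon-he.jar` are assumptions made for illustration; the stylesheet path is the testing copy mentioned elsewhere in these docs. The helper only prints the command and does not run Saxon.

```shell
#!/usr/bin/env bash
# Sketch only: SAXON_JAR's default and the function name are assumptions.
SAXON_JAR="${SAXON_JAR:-saxon-he.jar}"

# Print a Saxon command line for inspecting a source file or directory.
# Usage: inspector_command SOURCE [OUTPUT]
inspector_command() {
  local source="$1" output="${2:-}"
  local xslt="current/computer_metaschema-inspector.xsl"

  # When -s names a directory, -o must also be given.
  if [ -d "$source" ] && [ -z "$output" ]; then
    echo "error: -o is required when -s is a directory" >&2
    return 1
  fi

  local cmd="java -jar $SAXON_JAR -xsl:$xslt -s:$source"
  if [ -n "$output" ]; then
    cmd="$cmd -o:$output"
  fi
  echo "$cmd"
}

# With no output given, the report would come back on STDOUT.
inspector_command computer-valid/valid1.xml
```

Printing the command instead of executing it makes it easy to check what would run before pointing it at a real Saxon installation.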

##### Parameters

@@ -184,7 +196,7 @@ The `echo` parameter can be used to supplement output reports with messages to t

When producing HTML reports, a file name reference to an out-of-line CSS resource can be provided. It will drop from HTML outputs the inlined CSS, and instead provide a link to the named resource. Provide a CSS file with that name to control all the styling of the reports.

- `css=cssfile.css` replaces CSS in your HTML header with `\<link rel="stylesheet" href="cssfile.css">`.
- `css=cssfile.css` replaces CSS in your HTML header with `<link rel="stylesheet" href="cssfile.css">`.

TBD, to be considered (let us know):

@@ -364,7 +376,7 @@ No need to quit after first error; take advantage of the 'pull' process (random

The aims of the reporting are clarity/ease of use; to be unambiguous; to be traceable. To be concise and economical is a secondary goal.

Reporting can be parsimonious - no need to be exhaustive.
Reporting can be parsimonious - sometimes there is no need to be exhaustive.

At the same time, errors anywhere are of interest (see 'no need to quit'). Some amount of redundancy is okay if not too noisy.

@@ -386,7 +398,7 @@ Interestingly, this different perspective on the rule set leads to different str

If any of this is true, the application will show.

### Advantages
### Advantages of this approach

- Open-endedness with respect to arbitrariness of rules including contingent and co-occurrent rules
- Ease of post processing for presentation
162 changes: 162 additions & 0 deletions src/schema-gen/InspectorXSLT/testing/XSD-VALIDATE-COMPUTER-SAMPLES.xpl
@@ -0,0 +1,162 @@
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
xmlns:cx="http://xmlcalabash.com/ns/extensions" version="1.0"
xmlns:metaschema="http://csrc.nist.gov/ns/metaschema/1.0" type="metaschema:XSD-FUNCTIONAL-SAMPLES"
name="XSD-FUNCTIONAL-SAMPLES" xmlns:x="http://www.jenitennison.com/xslt/xspec"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:nm="http://csrc.nist.gov/ns/metaschema"
xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- Input: depends on finding file 'inspector-functional-xspec/validations-in-batch.xspec' in place -->
<!-- Input: Additionally, all inputs named therein (expect errors for files broken or missing)-->
<!-- Input: Additionally, an up-to-date XSD for the computer model located at 'current/computer_metaschema-schema.xsd' -->
<!-- Output: an 'all is well' message, or unexpected results such as errors from files expected to be valid or validity from files expected to be invalid -->
<!-- Purpose: test alignment between XSD-based Metaschema validation and other forms, by providing a basis for comparison -->


<!-- &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& &&& -->
<!-- Ports -->

<p:input port="parameters" kind="parameter"/>


<!-- Lists are being maintained by hand for now, for clarity and robustness -->
<!-- Align with file 'inspector-functional-xspec/validations-in-batch.xspec'-->

<!-- TODO: separate out layers
provide valid and invalid tags as options
(along with file list and XSD)
parameterizing them for XSLT -->

<p:input port="samples" sequence="true">
<p:document href="computer-valid/valid1.xml"/>
<p:document href="computer-valid/valid2.xml"/>

<p:document href="computer-invalid/invalid1.xml"/>
<p:document href="computer-invalid/invalid2.xml"/>
<p:document href="computer-invalid/invalid3.xml"/>
<p:document href="computer-invalid/invalid4.xml"/>
<p:document href="computer-invalid/invalid5.xml"/>
<p:document href="computer-invalid/invalid6.xml"/>
<p:document href="computer-invalid/invalid7.xml"/>
<p:document href="computer-invalid/invalid8.xml"/>
<p:document href="computer-invalid/invalid9.xml"/>
<p:document href="computer-invalid/invalid10.xml"/>
</p:input>

<p:input port="computer-schema">
<p:document href="current/computer_metaschema-schema.xsd"/>
</p:input>

<p:serialization port="survey" indent="true"/>
<p:output port="survey">
<p:pipe port="result" step="assessment"/>
</p:output>

<p:serialization port="summary" indent="true" method="text"/>
<p:output port="summary">
<p:pipe port="result" step="summary"/>
</p:output>

<p:for-each>
<p:iteration-source>
<p:pipe port="samples" step="XSD-FUNCTIONAL-SAMPLES"/>
</p:iteration-source>
<p:variable name="base" select="base-uri(.)"/>

<p:try>
<p:group>
<p:validate-with-xml-schema name="validate-sample" assert-valid="true" mode="strict">
<!--<cx:message>
<p:with-option name="message" select="'here a message'"/>
</cx:message>-->
<p:input port="schema">
<p:pipe port="computer-schema" step="XSD-FUNCTIONAL-SAMPLES"/>
</p:input>
<!-- xsi:VALIDATING will be invalid unless contrived to be otherwise, but coming back valid also indicates success -->

</p:validate-with-xml-schema>

</p:group>
<p:catch>
<p:add-attribute attribute-name="VALIDATION-STATUS" match="/*" attribute-value="XSD-INVALID"/>
</p:catch>
</p:try>

<p:add-attribute attribute-name="base-uri" match="/*">
<p:with-option name="attribute-value" select="$base"/>
</p:add-attribute>


</p:for-each>


<p:wrap-sequence name="wrapup" wrapper="ANY-VALID"/>

<p:xslt name="assessment">
<p:input port="stylesheet">
<p:inline>
<xsl:stylesheet version="3.0" exclude-result-prefixes="#all">
<xsl:mode on-no-match="shallow-copy"/>

<xsl:function name="nm:found-in-path" as="xs:boolean">
<xsl:param name="path" as="xs:anyURI"/>
<xsl:param name="dirname" as="xs:string"/>
<xsl:sequence select="tokenize($path,'/')=$dirname"/>
</xsl:function>

<xsl:template match="/*">
<xsl:copy>
<NOMINALLY-VALID>
<xsl:apply-templates select="*[nm:found-in-path(@base-uri => xs:anyURI(),'computer-valid')]"/>
</NOMINALLY-VALID>
<NOMINALLY-INVALID>
<xsl:apply-templates select="*[nm:found-in-path(@base-uri => xs:anyURI(),'computer-invalid')]"/>
</NOMINALLY-INVALID>
</xsl:copy>
</xsl:template>
<xsl:template match="/*/*">
<document href="{@base-uri}">
<xsl:copy-of select="@VALIDATION-STATUS"/>
</document>
</xsl:template>
</xsl:stylesheet>
</p:inline>
</p:input>
</p:xslt>

<p:xslt name="summary">
<p:input port="stylesheet">
<p:inline>
<xsl:stylesheet version="3.0" exclude-result-prefixes="#all">
<!--<xsl:mode on-no-match="shallow-copy"/>-->


<xsl:template match="/*">

<REPORT>
<xsl:apply-templates select="child::NOMINALLY-VALID/document[@VALIDATION-STATUS='XSD-INVALID']"/>
<xsl:apply-templates
select="child::NOMINALLY-INVALID/document[not(@VALIDATION-STATUS='XSD-INVALID')]"/>

<xsl:on-empty>
<summary>ALL GOOD - confirming expected results from XSD validation</summary>
</xsl:on-empty>

</REPORT>
</xsl:template>

<xsl:template match="NOMINALLY-VALID/document[@VALIDATION-STATUS='XSD-INVALID']">
<finding href="{@href}">Unexpectedly found to be INVALID against the current computer_metaschema XSD</finding>
</xsl:template>

<xsl:template match="NOMINALLY-INVALID/document[not(@VALIDATION-STATUS='XSD-INVALID')]">
<finding href="{@href}">Unexpectedly found to be VALID against the current computer_metaschema XSD</finding>
</xsl:template>

</xsl:stylesheet>

</p:inline>
</p:input>
</p:xslt>

</p:declare-step>
5 changes: 2 additions & 3 deletions src/schema-gen/InspectorXSLT/testing/planning.md
@@ -2,9 +2,8 @@

## Testing

- [ ] [Run CI/CD on forks?]( https://github.com/marketplace/actions/publish-test-results#support-fork-repositories-and-dependabot-branches)
- [ ] If not, then find a graceful way to error on failures, in forks
- [x] Refactor testing in this directory
- [x] Run CI/CD on forks?
- [x] Refactor testing in this directory
- [x] smoke-test: is a functional XSLT produced from a valid Metaschema
- [x] unit-test:
- current production tests (build out a little)
16 changes: 15 additions & 1 deletion src/schema-gen/InspectorXSLT/testing/readme.md
@@ -2,6 +2,12 @@

See the [TESTING](../TESTING.md) docs for Inspector XSLT for explanation of how to test the application using these resources.

*Plus* - to see quickly what utilities for testing and development are supported using `make`, open a command line interface to the [InspectorXSLT directory](..) and use

```
> make help
```

## Apologies to the XML-averse (for now)

Keep in mind when considering testing that Inspector XSLT currently only supports XML-based formats defined by a metaschema. Your JSON can be inspected only if you convert it to XML first - a conversion that is dependable only if the data was already valid to begin with.
@@ -26,10 +32,18 @@ Some [known-valid](valid/) and [known-invalid](invalid/) instances are also prov

Additionally, `tiny_metaschema.xml` is a small metaschema made specifically for trying and testing the markup datatypes.

### XSD validations of computer samples

This can be done with a script in batch, respecting the organization of examples that should be found valid or invalid, as indicated by their file location.

The script is `xsd-crosscheck-samples.sh`, which invokes XProc pipeline `XSD-VALIDATE-COMPUTER-SAMPLES.xpl`.

See [InspectorXSLT/TESTING.md](../TESTING.md) for more information.

### XSpec demonstrating correctness of the Inspector

An Inspector can be generated from a metaschema such as `computer_model.xml` and tested against known inputs to demonstrate that the tests performed by the Inspector bring the correct results.

Exercising these tests, a number of XSpec files in this folder calling `current/computer_metaschema-inspector.xsl` should all complete successfully and report "all green" -- no warnings, no errors, no unexpected 'pending' sections.

See the [TESTING](../TESTING.md) docs for more information.
Again, see the [TESTING](../TESTING.md) docs for more information.
43 changes: 43 additions & 0 deletions src/schema-gen/InspectorXSLT/testing/xsd-crosscheck-samples.sh
@@ -0,0 +1,43 @@
#!/usr/bin/env bash

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=../common/subcommand_common.bash

source "$SCRIPT_DIR/../../../common/subcommand_common.bash"

# XProc pipeline that batch-validates the computer samples against the XSD
XPROC_FILE="${SCRIPT_DIR}/XSD-VALIDATE-COMPUTER-SAMPLES.xpl"

usage() {
cat <<EOF
Usage: ${BASE_COMMAND:-$(basename "${BASH_SOURCE[0]}")} [ADDITIONAL_ARGS]
Produces a validation report for a file set designated in the pipeline XSD-VALIDATE-COMPUTER-SAMPLES.xpl
Get this message with first argument '--help' or '-h'
Otherwise arguments are passed to XML Calabash, so take care (YMMV)
EOF
}

ADDITIONAL_ARGS=$(echo "${*// /\\ }")

CALABASH_ARGS="-osurvey=/dev/null $ADDITIONAL_ARGS \"${XPROC_FILE}\""

# echo "${CALABASH_ARGS}"

## show usage if a first argument is '-h', expanding $1 to '' if not set
if [ "${1:-}" = '-h' ] || [ "${1:-}" = '--help' ];

then

usage

else

invoke_calabash "${CALABASH_ARGS}"

fi

echo GAMBARIMASU!