A generalized XML validation step #135

Open
ndw opened this issue Feb 9, 2015 · 3 comments

@ndw
Collaborator

commented Feb 9, 2015

From Gerrit Imsieke:

Others have already asked for unified report ports for the validation
steps p:validate-with-relax-ng and p:validate-with-xml-schema. While
we see that it might not be easy to change the signature of the
existing standard library steps, here’s a fresh approach that also
saves a lot of verbosity.

It builds upon the xml-model processing instruction that may be
prepended to an XML document (http://www.w3.org/TR/xml-model/).
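
For readers unfamiliar with the PI, here is a minimal sketch of a document that carries two xml-model processing instructions. The pseudo-attribute names (href, type, schematypens) come from the xml-model specification; the schema paths and the document content are placeholders assumed for illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema locations; only the pseudo-attribute names are from the xml-model spec. -->
<?xml-model href="schemas/article.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="schemas/article.sch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Example article</title>
  <para>Placeholder content.</para>
</article>
```

Under the proposal, a processor that honors these PIs would presumably attempt a RELAX NG validation and a Schematron validation of this document.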

One option is to add a step p:validate-according-to-xml-models that
executes each validation and creates a sequence of c:errors and
svrl:schematron-output documents on the report port.

But as we strive for terseness of expression, we could instead add an
attribute use-xml-models="assert-valid|report-only|none" to input and
output ports (for p:input, on both declarations and connections, where
the attribute value on a connection takes precedence).

If the attribute is on a step’s input port and its value is
'report-only', it will add a port 'report' (sequence=true) to the
readable ports within the step. Alternatively, the port could be named
'error', to avoid an additional name for ports that magically spring
into existence.

If the attribute value is 'assert-valid' and if the step is within a
p:try/p:group, it will add these report documents to the error port of
subsequent p:catch instructions.

This will greatly reduce verbosity by eliminating the need to spell
out input/output validation steps explicitly. It is syntactic sugar
that may be expanded to long-form explicit validation instructions (by
means of XSLT transformation, for example).
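
To make the "syntactic sugar" claim concrete, the following is a rough sketch of what the long form might look like in XProc 1.0 for a source document that carries one RELAX NG and one Schematron xml-model PI. The schema paths are hypothetical, and the short form shown in the comment is the proposed attribute, which no processor is known to implement.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <!-- Proposed short form (hypothetical, not valid XProc 1.0):
       <p:input port="source" use-xml-models="assert-valid"/> -->
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Long form: one explicit step per xml-model PI (schema paths are assumptions). -->
  <p:validate-with-relax-ng assert-valid="true">
    <p:input port="schema">
      <p:document href="schemas/article.rng"/>
    </p:input>
  </p:validate-with-relax-ng>

  <p:validate-with-schematron assert-valid="true">
    <p:input port="schema">
      <p:document href="schemas/article.sch"/>
    </p:input>
    <!-- No Schematron parameters are passed in this sketch. -->
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:validate-with-schematron>
</p:declare-step>
```

The expansion grows by one step per schema, which is the verbosity the attribute is meant to remove.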

If there are no xml-model PIs, no validation will occur.

xml-model-based validation should support Relax NG, Relax NG compact
syntax, XSD in different versions, ISO Schematron, NVDL, and DTD.

Because prepending xml-model PIs to documents is a bit cumbersome,
there should be an optional step p:prepend-xml-model like this:

@ndw

Collaborator Author

commented Mar 18, 2015

Proposal for a generalized validation step

The following declaration is for a generalized validation step. Much of what follows describes how this step applies to XML validation, but the actual validation performed is implementation-defined. Validation of JSON data against a JSON Schema would be entirely plausible.

<p:declare-step type="p:validate">
   <p:input port="source" primary="true"
            content-types="application/octet-stream"/>
   <p:input port="schema" sequence="true"
            content-types="application/octet-stream"/>
   <p:input port="models" sequence="true"
            content-types="application/xml */*+xml text/*"/>
   <p:output port="result" primary="true" sequence="true"/>
   <p:output port="report" sequence="true"/>
   <p:output port="validation-attempted" sequence="true"/>
   <p:option name="assert-valid" select="'true'" as="xs:boolean"/>
   <p:option name="group" select="''" as="xs:string"/>
   <p:option name="phase" select="''" as="xs:string"/>
   <p:option name="version" as="xs:string"/>
   <p:option name="parameters" as="map(xs:QName,item())"/>
</p:declare-step>

The semantics of the p:validate step are that the source document is validated in an implementation-defined way. The schema and models ports exist only to provide suggestions to the implementation.

There are several possible outputs:

  1. If the processor considers that no validation was requested (or does not recognize or cannot perform the requested validation), or if the assert-valid option was false and validation failed, then the original document is returned on the result port.
  2. If the processor attempts to validate and succeeds, then the validated document or documents are returned on the result port. In this case, the validation-attempted port should document the validation or validations that were attempted.
  3. If the processor attempts to validate and fails, and the assert-valid option is true, then nothing appears on the result port and an error is raised.

ISSUE: In the case where this error is caught by p:catch, can the validation-attempted and report ports be read, and if so, how?
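
For comparison, here is a sketch of how a failed validation is caught in XProc 1.0 today, using the existing p:validate-with-xml-schema step rather than the proposed p:validate. The schema path and the step name "invalid" are assumptions. Inside p:catch, only the error port is readable, which is why the question above arises for the proposed report and validation-attempted ports.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <p:try>
    <p:group>
      <p:output port="result" sequence="true"/>
      <!-- Raises an error if the source is invalid (schema path is hypothetical). -->
      <p:validate-with-xml-schema assert-valid="true">
        <p:input port="schema">
          <p:document href="schemas/article.xsd"/>
        </p:input>
      </p:validate-with-xml-schema>
    </p:group>
    <p:catch name="invalid">
      <p:output port="result" sequence="true"/>
      <!-- The error port of p:catch is the only place the failure details are visible. -->
      <p:identity>
        <p:input port="source">
          <p:pipe step="invalid" port="error"/>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```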

The output on the report port depends on the validation attempted. For Schematron validation, a report format is defined. For other kinds of validation, the report is implementation-defined.

Although the step is for generalized validation, it does have a couple of options designed to support a specific XML scenario: the XML Model Processing Instruction. In the absence of other information, implementations should use the XML Model PI to determine what kind of validation to perform on XML documents.

The group and phase options provide the corresponding values as discussed in the XML Model PI spec.

The models input port and the validation-attempted output port use XML documents to describe desired validation in the former case and validations attempted in the latter. The following c:model element definition should be supported.

<c:model
   href? = anyURI
   type? = string
   schematypens? = anyURI
   charset? = string
   title? = string
   group? = string
   phase? = string
   />

Additional variations on c:model are allowed, as are entirely different vocabulary elements as appropriate.
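
As an illustration only, a call to the proposed step with two c:model documents on the models port might look like the sketch below. The p:validate step does not exist in XProc 1.0, and the schema paths and the phase value are hypothetical; the c:model attributes are the ones listed in the definition above.

```xml
<!-- Hypothetical invocation of the proposed p:validate step. -->
<p:validate xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:c="http://www.w3.org/ns/xproc-step"
            assert-valid="false">
  <!-- No schema documents are supplied directly in this sketch. -->
  <p:input port="schema">
    <p:empty/>
  </p:input>
  <!-- Each c:model is a separate document in the models sequence. -->
  <p:input port="models">
    <p:inline>
      <c:model href="schemas/article.rng"
               type="application/xml"
               schematypens="http://relaxng.org/ns/structure/1.0"/>
    </p:inline>
    <p:inline>
      <c:model href="schemas/article.sch"
               type="application/xml"
               schematypens="http://purl.oclc.org/dsdl/schematron"
               phase="draft"/>
    </p:inline>
  </p:input>
</p:validate>
```

Presumably the validation-attempted port would echo back the c:model documents for the validations that were actually run, but the proposal leaves that format open.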

Validation with RELAX NG

When RELAX NG validation is selected, the following parameters should be recognized: dtd-attribute-values and dtd-id-idref-warnings.

Validation with XML Schema

When XML Schema validation is selected, the following parameters should be recognized: use-location-hints, try-namespaces, and mode.

Validation with NVDL

If an NVDL schema appears on the models port, NVDL validation should be attempted.

@ndw

Collaborator Author

commented Mar 18, 2015

This was discussed at the 25 Feb 2015 meeting, http://www.w3.org/XML/XProc/2015/02/25-minutes (the issue, that is, not the proposal)

@josteinaj


commented Apr 8, 2015

An output port with basic information about the validation when assert-valid="false" would be useful, such as the total number of assertions and the number that failed, were skipped, raised warnings, or succeeded. Currently a Schematron validation succeeds if count(//svrl:failed-assert) + count(//svrl:successful-report) = 0, and this XPath is different for other kinds of validation. Some metadata about the validation, when available, might also be useful to include in such a document, such as the name (/sch:schema/sch:title) and the base URI of the source document. Maybe just something like:

<c:result name="Test Name" tests="18" skipped="7" errors="5" warnings="3"/>
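
For context, this is roughly what the check looks like in XProc 1.0 today: a sketch that runs p:validate-with-schematron with assert-valid="false" and then tests the SVRL report with the XPath quoted above. The step names "main" and "validate" and the inline c:result payloads are assumptions made for the example, and it assumes the report port delivers a single SVRL document.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:svrl="http://purl.oclc.org/dsdl/svrl"
                version="1.0" name="main">
  <p:input port="source" primary="true"/>
  <p:input port="schema"/>
  <p:output port="result"/>

  <!-- Validate without raising an error, so the report can be inspected. -->
  <p:validate-with-schematron name="validate" assert-valid="false">
    <p:input port="schema">
      <p:pipe step="main" port="schema"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:validate-with-schematron>

  <!-- Schematron-specific validity test applied to the SVRL report. -->
  <p:choose>
    <p:xpath-context>
      <p:pipe step="validate" port="report"/>
    </p:xpath-context>
    <p:when test="count(//svrl:failed-assert) + count(//svrl:successful-report) = 0">
      <p:identity>
        <p:input port="source">
          <p:inline><c:result>valid</c:result></p:inline>
        </p:input>
      </p:identity>
    </p:when>
    <p:otherwise>
      <p:identity>
        <p:input port="source">
          <p:inline><c:result>invalid</c:result></p:inline>
        </p:input>
      </p:identity>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

A uniform summary document on a dedicated port would let the p:when test stay the same regardless of which validation technology produced the report.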