Control over schema validation in parse-xml(), doc(), etc. #490

Open
michaelhkay opened this issue May 10, 2023 · 2 comments
Labels
Enhancement A change or improvement to an existing feature PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 XQFO An issue related to Functions and Operators

Comments

@michaelhkay
Contributor

I'm struggling with a problem with the stylesheet that generates QT4 tests from the examples in the function catalog, and I think it's an example of a more general problem in schema-aware processing.

The spec gives this example (for json-to-xml):

The expression json-to-xml('{"x": "\\", "y": "\u0025"}', map{'escape': true()}) returns (with whitespace added for legibility):

<map xmlns="http://www.w3.org/2005/xpath-functions">
  <string escaped="true" key="x">\\</string>
  <string key="y">%</string>
</map>

But the test we actually generate expects the result:

<map xmlns="http://www.w3.org/2005/xpath-functions">
    <string escaped="true" key="x" escaped-key="false">\\</string>
    <string key="y" escaped="false" escaped-key="false">%</string>
</map>

and the test is failing because the result produced by Saxon correctly omits the escaped="false" and escaped-key="false" attributes that the test expects. How did the attributes get there?

The answer is that the stylesheet is doing parse-xml() followed by some transformation to normalise whitespace, followed by serialize(). The parse-xml() call is invoking schema validation, which adds default attributes.

We probably don't want schema validation here; if we do want it, we probably don't want default attribute values to be expanded. But parse-xml() doesn't give us the choice. It says it's implementation-defined and it gives no options for the user to control it. Saxon provides configuration-level options but they aren't fine-grained enough to use here.

Without being able to control this, the only option seems to be for the stylesheet to transform the result to take out the defaulted attributes that the schema processor has added.
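A minimal sketch of such a clean-up pass, written in Python for illustration rather than in the XSLT that the test-generation stylesheet actually uses. It assumes that escaped="false" and escaped-key="false" are the only defaults the schema supplies, as in the example above; the name `strip_defaulted` is hypothetical.

```python
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"
# Assumption: these are the only attribute defaults the schema supplies.
DEFAULTS = {"escaped": "false", "escaped-key": "false"}

def strip_defaulted(root):
    """Remove attributes whose value equals the assumed schema default."""
    for elem in root.iter():
        # Only touch elements in the xpath-functions namespace.
        if not elem.tag.startswith("{" + FN_NS + "}"):
            continue
        for name, default in DEFAULTS.items():
            if elem.get(name) == default:
                del elem.attrib[name]
    return root

# The schema-validated result from the failing test, as a single string.
validated = ET.fromstring(
    '<map xmlns="http://www.w3.org/2005/xpath-functions">'
    '<string escaped="true" key="x" escaped-key="false">\\\\</string>'
    '<string key="y" escaped="false" escaped-key="false">%</string>'
    '</map>'
)
cleaned = strip_defaulted(validated)
```

Note the limitation: a pass like this cannot tell a defaulted attribute from one that was written explicitly with the default value, so it removes both. That is one reason the workaround is unsatisfactory.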

We need options on functions like doc() and parse-xml() to control whether and how schema validation is performed.

One of the options we need whenever we do validation is probably "validate+strip" - validate the input, report errors if it's invalid, but return the untyped data that was supplied to the validator, not the type-annotated data with expanded defaults.
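To make the intended "validate+strip" behaviour concrete, here is a rough sketch in Python. Nothing in it is a proposed API: `parse_validate_strip` and `toy_validator` are hypothetical names, and the validator is a stand-in for a real schema processor, which the Python standard library does not provide.

```python
import copy
import xml.etree.ElementTree as ET

def parse_validate_strip(xml_text, validate):
    """Parse, validate a copy, and return the untouched untyped tree.

    `validate` is a hypothetical callable standing in for a schema
    processor: it may modify the tree it is given (for example by adding
    defaulted attributes) and raises ValueError if the input is invalid.
    """
    untyped = ET.fromstring(xml_text)
    # Validation errors propagate; the augmented copy is discarded.
    validate(copy.deepcopy(untyped))
    return untyped

def toy_validator(root):
    # Stand-in for schema validation: requires a `key` attribute on every
    # child and expands an assumed default, as a schema processor might.
    for child in root:
        if "key" not in child.attrib:
            raise ValueError("missing key attribute")
        child.attrib.setdefault("escaped-key", "false")

result = parse_validate_strip('<map><string key="y">%</string></map>', toy_validator)
```

Here `result` carries only the attributes present in the input, while invalid input still raises an error, which is the combination the option is meant to offer.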

@ndw
Contributor

ndw commented May 11, 2023

Maybe the short-term solution is to change the schema so that those values aren't default attributes, and change the processing expectation to be that absent values are treated as false?

@michaelhkay
Contributor Author

I've done a short-term fix by adding a transformation pass to remove the unwanted attributes.

@ChristianGruen ChristianGruen added XQFO An issue related to Functions and Operators Editorial Minor typos, wording clarifications, example fixes, etc. Enhancement A change or improvement to an existing feature and removed Editorial Minor typos, wording clarifications, example fixes, etc. labels May 14, 2023
@ndw ndw added PRG-hard Categorized as "hard" at the Prague f2f, 2024 PRG-required Categorized as "required for 4.0" at the Prague f2f, 2024 labels Jun 4, 2024

3 participants