New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodes as document-properties and p:document-properties-document() #371

Closed
xml-project opened this Issue Apr 21, 2018 · 18 comments

Comments

Projects
None yet
5 participants
@xml-project
Copy link
Contributor

xml-project commented Apr 21, 2018

Currently document-properties are defined as map(xs:QName, item()), and there is the suggestion to specify them as map(*). This means, that a map might contain a node.
What is supposed to happen to this node in p:document-properties-document()? The specs say:

Each key in the properties becomes an element, each value becomes the content of that element.

Does this mean the node becomes the child of the element?
And what do we say about maps and arrays?

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 21, 2018

Did some research in the last hours and I think there are some options:

  1. We stay with the current specs and say that nodes, maps, arrays and function are allowed in document-properties. In this case I would recommend to change the prose of p:document-properties-document(), so that it conforms to the standard XPath behaviour for maps-to-xml. This would also work with map(*), when we would have sequences as values and xs:integer as keys.

  2. We change definition of document properties from

The document properties are exposed to the XProc pipeline as a map (map(xs:QName, item())

to map(anyAtomicType, anyAtomicType) and exclude maps, arrays, nodes and functions explicitly as values. Then we could say that the property-key becomes the element name if the key is a string (or derived from xs:string) or a xs:QName (or derived from it). All other keys (which like xs:integer) that will not result in proper element names are prefixed by "property", so become "property-123" for instance.

  1. Also as map(anyAtomicType, anyAtomicType), but the key does not become the element name, but an attribute value, so we would have
<p:document-properties>
  <p:property key="content-type">application/xml</p:property>
  <p:property key="base-uri" xsi:type="xs:anyURI">file:///somepath</p:property>
</p:document-properties>
  1. We use map(xs:QName, anyAtomicType) and leave everything as it is.

  2. We use map(xs:string, anyAtomicType) and leave everything as it is.

All options would also work with if we use anyAtomic+ and thereby allow values to be sequences: Either we have more than one element named after the key or more than one p:property element.

I do not have any preference, except that option (4) means we have to write

p:document-properties(.,xs:QName('content-type'))  or
<p:inline document-properties = "map{xs:QName('content-type'):'text/plain'}">

which I do not really like and option (5) would exclude qualified names which is not perfect.

I do think we should try to reach a consensus in this point before I work on a PR. Also this is a quite central point to any implementation and for the test suite tests.

@eriksiegel

This comment has been minimized.

Copy link
Contributor

eriksiegel commented Apr 22, 2018

I think (and didn't we reach consensus about that) that all maps, including this one, are always map(*). Let's stick to that please.

There seems to be a problem with opening the link for option 1 on my computer right now, so i cant have a look at it. But if there is some standard for it, lets follow that?

I attached some text from the programmers manual to this. You can find how I understood this whole map thing in section 5.3
xproc-book ch5.pdf
. Maybe that helps because it summarizes things?

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 22, 2018

I do not think we had a consensus on this, what did I miss?
I some places, where we refer to XPath function which use maps I think we should follow their lines, but even this could be disputed because we might restrict the range for key and value.

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 22, 2018

Somehow editing comments went away, so I have to pose a second comment:
If, as you said, all maps are map(*), then we can not use the current specs of how to represent it as an XML document any more. It did not fit when it was map(xs:QName, item()) because it does not say how maps, functions, arrays and nodes are to be represented in the XML-document.
If move to map(*) an xs:integer or a xs:dateTime might appear as key and they do not make up a proper element name.
Sorry that I was not able to state my point clear enough.

@eriksiegel

This comment has been minimized.

Copy link
Contributor

eriksiegel commented Apr 24, 2018

After some thinking, here is my opinion on this:
One of the goals of our XProc 3 exercise is to make the language better, understandable and explainable. I think the decision to make all maps a map(star) is one that belongs absolutely in this category. I found it easy to describe in the section in the programmer's manual which is, for me personally, a good measure of how difficult things are.
But let's keep our priorities straight. Steps and functions that convert these maps to XML are important, but not as important as the map(star) decision originating from the goal to make the language better. So when in some edge cases these steps/functions run into trouble thats fine (as long as for the majority of cases everything is ok). I think it's acceptable and explainable that when you put in an entry of type xyz you cannot use the XML conversion and get an error.
Nonetheless, we should define some way to convert the majority of entry types in a map(star) to XML and back. I have absolutely no preference for this. Something like <entry type="xs:integer" key="1234">value</entry> for keys and something nice for map and array values? If there is some standard for that, lets follow it, even if its only for the reason we can refer to it and don't have to describe it.

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 25, 2018

@ndw @gimsieke
Could we please reach a consensus on this as soon as possible, because p:document-properties-document() is used in a lot of tests in my test-suite (ca. 100 or so). If you have to change the format of the returned XML document because we change document-properties to map(*) I have to change all this tests.

@gimsieke

This comment has been minimized.

Copy link
Contributor

gimsieke commented Apr 25, 2018

It will make the life of pipeline authors harder if the result of p:document-properties() were map(*) instead of map(xs:QName,item()). There is some value in having the keys as element names in p:document-properties-document().
If the property values are maps or arrays, we should include the json-to-xml() representation in the corresponding element (whose name is the property name) in the properties document. If the values are functions, I don’t know. Maybe make it implementation-defined whether and how to represent item()s other than atomic values, element/text/comment/PI nodes, maps, and arrays.
Not sure how to represent document nodes as values in the properties document so leave it up to the implementation or say that they will be dissolved and replaced with their content node(s?).

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 26, 2018

I strongly agree with @gimsieke that having keys as element names is an important point for pipeline authors. So we need a type for keys, that allows this construction or (which is the opposite point) excludes those types as key, that will not result in proper element names.
The reservation we had against xs:QName as type for the keys was, that we can not write
p:document-property(., 'content-type') but have to write xs:QName('content-type') over and over again in my pipeline.
Last week I was thinking about a union type for the key as union(xs:string, xs:QName) but I found out that this is not a viable way for Saxon based XProc processors since user-defined types are Saxon EE only.
To solve the problem with the construction of QName I would propose to have an additional function

p:document-property($doc as document-node(), $key as xs:string) as item()?

which is the same as

p:document-property($doc, xs:QName($key))

This shifts the burden of constructing the QName to the implementation and therefor solves the xs:string vs. xs:QName problem.
For the value problem I think we should follow @gimsieke 's proposal.

@gimsieke

This comment has been minimized.

Copy link
Contributor

gimsieke commented Apr 26, 2018

I second the proposed second function that accepts a string as the second argument and then casts it to a QName (as specified in XML Schema Part 2: Datatypes Second Edition).
Not in order to stifle debate but rather merely for the sake of adding some more “second”s to this comment, I suggest that we don’t waste a second waiting for a second opinion and agree to proceed as suggested.

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 26, 2018

OK, I will try to write the prose and make a PR. But since I am on vacation this weekend it will probably take until Monday or Tuesday.

@cmarchand

This comment has been minimized.

Copy link

cmarchand commented Apr 26, 2018

I do agree with the reached consensus. Keeping property names as element names is a strict necessity. Having a second function where property name is a xs:string simplifies the pipeline writing.

Do we allow the use of a qualified name as a string : p:document-property($doc, 'els:myOwnProperty'), and let the processor resolve the string as a QName according to inscope namespace mapping ?

This could be an interesting feature for writers, but is much more complex to implement...

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 26, 2018

@cmarchand I was quite sure that someone would request this feature, one you have the string function.
Actually don't know whether we should implement this feature: Since the second argument will typically be a variable the QName can only be determined at runtime, so all possible namespace binding must be present at runtime. Since the XMLCalabash and MorganaXProc will heavily enforce parallelism, a feature like this will slow the processors done again.

@gimsieke

This comment has been minimized.

Copy link
Contributor

gimsieke commented Apr 26, 2018

I thought being able to supply prefixed strings was already implied in the proposal and that they would be cast as QName using the (readily available, or am I mistaken?) in-scope namespace bindings. If casting fails (because a prefix isn’t bound in the given context), a dynamic error will be thrown.

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 26, 2018

No, it was probably what you wished for but not what I proposed. ;-))
But looking at the specs again: The function call is an XPath expression and for processor evaluated XPath expressions all in-scope namespaces are visible, so a prefixed string could be resolved in this context.
However no in-scope namespace is visible in step evaluated XPath expression, so using a prefix string in these contexts will always result in an error.
I am pretty sure that this will be a recurring question on any help list.

@gimsieke

This comment has been minimized.

Copy link
Contributor

gimsieke commented Apr 26, 2018

But p:document-property(), p:document-properties(), and p:document-properties-document() are available in processor-evaluated XPath expressions only. Therefore the question whether namespace bindings are present or not in step-evaluated XPath expressions isn’t relevant here because these functions aren’t available in step-evaluated XPath anyway.

@xml-project

This comment has been minimized.

Copy link
Contributor Author

xml-project commented Apr 26, 2018

Yes, exactly this argument occurred to me some minutes ago and I was actually editing my previous to remove my mistake!

@gimsieke

This comment has been minimized.

Copy link
Contributor

gimsieke commented Sep 5, 2018

Resolution: Use map(xs:QName, item()*) consistently in the spec

@ndw ndw self-assigned this Sep 6, 2018

@ndw

This comment has been minimized.

Copy link
Contributor

ndw commented Jan 26, 2019

The consensus here has been implemented. I'm closing this.

@ndw ndw closed this Jan 26, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment