Open questions on p:xslt #61

Open
xml-project opened this issue Apr 18, 2019 · 15 comments

@xml-project
Contributor

commented Apr 18, 2019

Unless I missed something important, some questions about p:xslt seem to be open:

  1. An XSLT 3.0 processor is invoked with an initial match selection, which can be a sequence of XDM items. In the current signature of p:xslt, port source is sequence="false". Do we stay with the single-item approach, or do we allow a sequence of documents to appear on port source? [For correction please see below]

  2. An XSLT 3.0 processor may return a sequence of XDM items as primary results, but port result is defined as sequence="false". Do we want to change this?

  3. As far as I understand our discussion so far, we still want to support XSLT 1.0 and 2.0 processors with p:xslt. If so, I think it should be a dynamic error if the invoked processor is not 3.0 and port source does not carry exactly one document with an XML content-type.

@Conal-Tuohy


commented Apr 18, 2019

Aren't the multiple output documents of an XSLT 3.0 transformation always secondary results?

If so, these multiple items (which you mention) should not create a sequence of documents, but a single document containing a sequence, and it seems to me the interpretation of the sequence should be dealt with according to the serialization specification. https://www.w3.org/TR/xslt-xquery-serialization-31/#serdm

@xml-project

Contributor Author

commented Apr 18, 2019

> Aren't the multiple output documents of an XSLT 3.0 transformation always secondary results?

Not as far as I understand the XSLT 3.0 spec (and the Saxon 9.8.xx interface). The spec reads:

> and the results of processing each item are then concatenated into a single sequence, respecting the order of items in the input sequence.

The creation of a secondary result with xsl:result-document is handled in a completely different section, so as I understand it the two are independent.
Did I miss something?

@Conal-Tuohy


commented Apr 18, 2019

I was talking about the outputs of the XSLT transformation, rather than the input.

As I understand it, the only way an XSLT 3 transform can produce a sequence of documents, is by using xsl:result-document; if it produces a sequence as its "principal result", then that should be output as a single document, and should not therefore be an XProc sequence, so sequence='false' is correct for the result port of p:xslt. That was the point I was trying to make.

@xml-project

Contributor Author

commented Apr 18, 2019

I think the quoted section talks about output; what other meaning could I attach to "results of processing"? (BTW, I was using 'document' in the XProc sense: an XML document, a text document, a JSON document, etc.)

Could you please cite any passage from the XSLT 3.0 spec to support your understanding?

I tried the following with Saxon 9.8.x:

<xsl:template match=".">
  <xsl:sequence select="(1, 2, 3)" />
</xsl:template>

The result returned is a sequence of three atomic values. No result document is involved.
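
To reproduce this, the template above could be wrapped in a minimal stylesheet along the following lines. This is only a sketch: the stylesheet wrapper and the comments are assumptions added for illustration and are not part of the original test.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- match="." is an XSLT 3.0 pattern that matches any item
       in the initial match selection -->
  <xsl:template match=".">
    <!-- Returns three atomic values; no xsl:result-document is used -->
    <xsl:sequence select="(1, 2, 3)"/>
  </xsl:template>

</xsl:stylesheet>
```

Whether the caller sees three separate items or one constructed document depends on how the processor is asked to deliver the principal result, which is the question under discussion here.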

@xml-project

Contributor Author

commented Apr 18, 2019

Correction to point (1) of my initial post: of course, in the signature of p:xslt, port 'source' is sequence="true", but as the prose says, only the first item is used as the initial match selection.

@Conal-Tuohy


commented Apr 18, 2019

What I'm thinking of is "sequence normalization" as described here: https://www.w3.org/TR/xslt-xquery-serialization-31/#serdm

It is my belief that sequence normalization is a necessary (or at least highly desirable) step, and if that's true, then the result of your example should be a text document containing a single text node "1 2 3". That is what I would expect the p:xslt step to emit on the result port (unless your XSLT specifies the json serialization method).

@ndw

Collaborator

commented Apr 18, 2019

Given the definition of "initial match selection", I think we have to allow a sequence on source. Then we have to deal with the current semantics of a sequence on that port. I guess if version>=3.0 we say that the sequence becomes the initial match selection; with version<3.0 we keep the first-value/collection semantics.

I'd prefer (I think) not to enforce sequence normalization on the results, so I guess result becomes a sequence as well.

@Conal-Tuohy


commented Apr 18, 2019

I'm actually conflicted about this now; I can see that sequence normalization would have the effect of erasing type information in the output tree (e.g. flattening numeric types to strings, etc).

But failing to normalize the sequence has odd consequences too: a stylesheet which when run from the command line produces a single output might produce many output documents (fragments) when run within an XProc pipeline. I don't think that an XSLT author would as a general rule want their output broken up into separate documents. Although there's no serialization as such in an XProc pipeline, it seems to me that the effect of running an XSLT inside XProc should not be radically different to running it in an environment in which serialization takes place, as a matter of design principle.

@ndw

Collaborator

commented Apr 18, 2019

Yes. I wonder if we're going to need an option for this ☹️

@ndw

Collaborator

commented Apr 18, 2019

Per the 18 April 2019 editorial meeting, we agreed to make source a sequence with the semantics Norm describes above. We agreed to make the result a sequence as well and say that if the stylesheet produces a sequence, that's what you get.

Yes, this means you will possibly get different semantics than if you run the stylesheet outside XProc. But it's simple and consistent, or so we think.
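
A pipeline using the agreed semantics might look roughly like the sketch below. It assumes XProc 3.0 syntax as currently drafted, and `sequence.xsl` is a hypothetical stylesheet name used only for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <!-- With version 3.0, every document arriving here would become
       part of the initial match selection -->
  <p:input port="source" sequence="true"/>

  <!-- May carry several documents if the stylesheet returns a sequence -->
  <p:output port="result" sequence="true"/>

  <p:xslt version="3.0">
    <!-- sequence.xsl is a hypothetical stylesheet name -->
    <p:with-input port="stylesheet" href="sequence.xsl"/>
  </p:xslt>
</p:declare-step>
```

With a version below 3.0, the same step would keep the existing first-value/collection behaviour for the source port, as described earlier in the thread.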

@ndw

Collaborator

commented Apr 18, 2019

Per point 3: yes, we should allow users to specify 1.0 and 2.0, but it's implementation defined what processors do with the different versions.

@Conal-Tuohy


commented Apr 19, 2019

Do I have this right? If, for example, I have a stylesheet which converts an XML document to text using only a template that matches and copies certain text nodes, then this would produce either a single text document, or a multitude of text documents, depending on whether the XSLT is run in an XProc pipeline or not?

Or if my XSLT were to output a sequence consisting of a comment followed by an element, would they also appear in separate documents on the result port?

That would seem a very strange result, to me.

@Conal-Tuohy


commented Apr 19, 2019

I've been reading up on the XSLT spec and I now see that sequence normalization should only be performed if the effective value of the build-tree option (specified as an attribute of xsl:output or xsl:result-document) is true. That is the default when method="xml" which means that so long as this configuration is actually respected, then when I am integrating some legacy XSLT in my XProc app I would not generally have to worry that it would output a spurious sequence. If an XSLT output declaration specified build-tree="false" then it could generate a sequence quite legitimately, and on that basis I'd withdraw my objection to the p:xslt step's result port being a sequence.

NB when method="json" the default value of build-tree is false.
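
As an illustration, a stylesheet that deliberately opts out of tree construction might look like the sketch below. The template content (a comment followed by an element) is an invented example, and how p:xslt would map the resulting items onto documents on its result port is an assumption that the step definition would still need to settle.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- build-tree="no" asks for the raw result sequence instead of a
       single document built by sequence normalization -->
  <xsl:output method="xml" build-tree="no"/>

  <xsl:template match=".">
    <!-- Two items in the principal result: a comment and an element -->
    <xsl:comment>generated</xsl:comment>
    <result/>
  </xsl:template>

</xsl:stylesheet>
```

Omitting the build-tree attribute here would leave the default for method="xml" in place, which, per the reading above, means the result is normalized into one document.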

@ndw

Collaborator

commented Apr 23, 2019

I'm a little concerned about how and where the build-tree option is exposed, but I guess that's an implementor's problem. It would appear to me that we should respect the build-tree option.

@Conal-Tuohy


commented Apr 25, 2019

I think Saxon will expose the build-tree option along with other serialization options, in a SerializationProperties object, passed to this method:
http://www.saxonica.com/documentation/index.html#!javadoc/net.sf.saxon.s9api/XdmDestination@getReceiver
http://www.saxonica.com/documentation/index.html#!javadoc/net.sf.saxon.serialize/SerializationProperties
