Option "serialization" on p:output for atomic steps #428

Open
xml-project opened this Issue Jun 24, 2018 · 18 comments

@xml-project
Contributor

xml-project commented Jun 24, 2018

Currently we have:

<p:output
  port? = NCName
  sequence? = boolean
  primary? = boolean
  content-types? = ContentTypes
  expand-text? = boolean
  serialization? = map(xs:QName,xs:anyAtomicValue) />

for atomic steps, but I wonder what "serialization" is for.

If the processor is not serializing (if, for example, the pipeline has been called from another pipeline), then the serialization must be ignored.

And since an atomic step is always called from another step, the serialization options are always ignored.

Is there any non-proprietary use case for this? I could think of calling an atomic step from within a p:library, but this is not a standard feature of a processor.
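To make the question concrete, here is a minimal sketch of an atomic step declaration that carries the attribute. The `ex:` namespace and the step name `ex:normalize` are hypothetical, and the string-keyed map syntax is an assumption about how a processor would accept the value; the synopsis above only states the type `map(xs:QName,xs:anyAtomicValue)`.

```xml
<!-- Hypothetical extension step; the ex: namespace and the step name are illustrative only. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.org/ns/steps"
                type="ex:normalize" version="3.0">
  <p:input port="source"/>
  <!-- No subpipeline follows, so this declares an atomic step.
       The serialization map is the attribute under discussion: when the step
       is invoked from inside another pipeline, nothing is serialized here. -->
  <p:output port="result" serialization="map{'indent': true()}"/>
</p:declare-step>
```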

Contributor

gimsieke commented Jun 24, 2018

I don’t think that your presupposition that an atomic step is always called from another step is justified. At least for Calabash, there’s nothing wrong with calling, for example, -s p:xslt on the command line. Then the serialization option, at least on the result port, is meaningful.

Contributor

xml-project commented Jun 24, 2018

According to the specs, an atomic step is ALWAYS called from another step.
What you mentioned is what I called a proprietary use case in my comment. In MorganaXProc you can call an atomic step from a library, but there is no place in the specs saying either use case is required.

The point is what to answer if someone asks what "serialization" is for and you cannot find an answer in the specs. Possible solutions:
(1) Say that the processor may execute atomic steps by a processor-specific mechanism.
(2) Remove "serialization". This does not prevent the implementer from doing any magic, but it leaves no unexplained feature in the specs. If you want to set serialization parameters for an output port: Write a pipeline!

Contributor

xml-project commented Jun 24, 2018

@gimsieke Would you prefer to keep it and add a note saying it is used when processors are able to call atomic steps?

Contributor

eriksiegel commented Jun 24, 2018

I suppose this is a case where it is absolutely clear to end users when the serialization options are applied: when the processor serializes the results on the port. I know that for implementers things are often different...
But I don't understand: an atomic step is ALWAYS called from another step? There must always be a first step that is "called", "processed" or whatever. Otherwise nothing could be done. So I suppose the standard is in error here?

I'm all for keeping the serialization option parameter.

Contributor

xml-project commented Jun 24, 2018

@eriksiegel

But I don't understand: an atomic step is ALWAYS called from another step? There must always be a first step that is "called", "processed" or whatever.

Yes, but this step is a compound step, and for compound steps serialization makes sense. But nowhere do we say that, e.g., p:add-attribute may be called outside a compound step.

Contributor

ndw commented Jun 24, 2018

I think that’s a bug. We used to have a p:serialization element to define serialization parameters. When we made it a map, we must have just stuck it on p:output generally.

Atomic steps don’t serialize their output. Even in XML Calabash where you can run a step “directly”, I believe that’s described as syntactic sugar for a p:pipeline containing a single step.

We could leave it, and say it only applies on pipeline outputs, but that might be confusing. We could put p:serialization back in and hang the parameters off it. But it seems nicer not to have to have that level of indirection, I think.

We could make p:output on p:declare-step different than p:output elsewhere, but I wonder if it’s worth the effort?
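As a rough illustration of the "syntactic sugar" reading, running p:xslt "directly" could be thought of as running a wrapper pipeline like the one below, where the serialization map sits on the pipeline's own output rather than on the atomic step. This is only a sketch in XProc 3.0 syntax; the step name `main` and the chosen serialization parameters are assumptions, not something taken from XML Calabash.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" name="main" version="3.0">
  <p:input port="source" primary="true"/>
  <p:input port="stylesheet"/>
  <!-- The pipeline output is what the processor serializes,
       so the serialization map is meaningful here. -->
  <p:output port="result" serialization="map{'method': 'xml', 'indent': true()}"/>

  <!-- The single atomic step; its own output declaration never serializes anything. -->
  <p:xslt>
    <p:with-input port="stylesheet" pipe="stylesheet@main"/>
  </p:xslt>
</p:declare-step>
```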

Contributor

xml-project commented Jun 25, 2018

Maybe we should take "@serialization" away from p:output altogether, because it is also strange to have it on p:for-each etc.

What about adding an option "serialization" to p:declare-step which is "map(xs:string, map(*))" with the port name as key?
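A sketch of what that proposal might look like, assuming it were expressed as an attribute on p:declare-step. This attribute was only floated in this thread; it is hypothetical and is not part of any published XProc syntax. The port names and parameters are illustrative.

```xml
<!-- Hypothetical: a serialization attribute on p:declare-step, keyed by port name. -->
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0"
                serialization="map{
                  'result': map{'method': 'xml', 'indent': true()},
                  'report': map{'method': 'text'}
                }">
  <p:input port="source"/>
  <p:output port="result" primary="true"/>
  <p:output port="report" sequence="true">
    <p:empty/>
  </p:output>

  <p:identity/>
</p:declare-step>
```

The nesting shows why the value could get harder to read as the number of output ports grows.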

Contributor

eriksiegel commented Jun 25, 2018

Hmm, that would become a complicated map and a bit hard to explain...
Re-instate p:serialization, but now with a map as the thing that sets the serialization options?

Contributor

xml-project commented Jun 25, 2018

I am not opposed to p:serialization, but I thought the map would be easier to understand and save an additional XProc element.

Contributor

gimsieke commented Jun 25, 2018

I’d favor differentiating between p:output in

  1. atomic step declarations (p:declare-step that does not contain a subpipeline – standard steps and extension steps, anything else?) and compound steps (p:for-each, p:viewport, p:choose, p:if, p:group, p:try) on the one hand
  2. p:declare-step whose declaration contains a subpipeline on the other hand

Some clarification of what exactly “atomic” means might be due anyway. For example:

User-defined pipelines (identified with pfx:user-pipeline in the preceding syntax summary) are atomic. (http://spec.xproc.org/master/head/xproc/#note-udp)

This is a bit confusing IMHO in that it can make you think that all user-defined steps (those that are defined with p:declare-step and that have a subpipeline) are “atomic steps”. But if you buy into the definition that I tried to give above, then only standard steps and extension steps qualify as atomic steps.

Another interesting point wrt executing declared steps directly, as opposed to executing a p:declare-step document as the only means to execute XProc, is item 35. at http://spec.xproc.org/master/head/xproc/#implementation-defined
It seems to suggest that declared steps (those with a type, whether their declaration contains a subpipeline or not) may be executed directly by a processor. This is certainly interesting for steps in a user-supplied p:library, where the step declarations contain subpipelines, but there is the implicit assumption that processors may offer execution of standard atomic or extension steps.

Contributor

xml-project commented Jun 25, 2018

@gimsieke Sorry, I fail to see what you are suggesting as a solution. Would you mind elaborating on your point a bit more, so I can see what syntax and implementation you are suggesting?

@xml-project xml-project reopened this Jun 25, 2018

Contributor

gimsieke commented Jun 25, 2018

I first suggested that we allow a serialization map on p:output if it is a child of a p:declare-step that has a subpipeline, otherwise not.

In addition, I suggested that we clarify the spec wrt the definition of atomic steps and direct invocation of atomic steps.

Contributor

xml-project commented Jun 25, 2018

Thanks, now I get it. I completely agree with you on the first suggestion, but I am not sure that we could cover it without confusing readers: IMHO it would result in three different p:output constructs (currently we have two):

  1. p:output as child of p:declare-step with subpipeline: p:output may have a connection and serialization options,

  2. p:output on compound steps (for-each etc.): connection, but no serialization.

  3. p:output on atomic step (steps that do not have a subpipeline): No connection, no serialization.

As Norm said, I am not sure it is worth it. Maybe Erik is right and we should use p:serialization again, which is only allowed as a child of p:declare-step with a subpipeline.

Concerning the direct invocation of atomic steps, I do not have an opinion. I do not think that the current specs prohibit someone from inventing this feature, but I feel no need to say that a conformant processor may do this or even should do this.
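The three p:output contexts listed above could be sketched as follows. This is an illustrative XProc 3.0 fragment under the proposed rules, not text from the specification; the `ex:` namespace and the step types and names are hypothetical.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:ex="http://example.org/ns/steps"
           version="3.0">

  <!-- 3. Atomic step (no subpipeline): no connection, no serialization. -->
  <p:declare-step type="ex:atomic">
    <p:input port="source"/>
    <p:output port="result"/>
  </p:declare-step>

  <p:declare-step type="ex:pipeline" name="main">
    <p:input port="source" sequence="true"/>
    <!-- 1. Child of p:declare-step with a subpipeline:
         may have a connection and serialization options. -->
    <p:output port="result" sequence="true"
              pipe="result@loop"
              serialization="map{'indent': true()}"/>

    <p:for-each name="loop">
      <!-- 2. Compound step: connection, but no serialization. -->
      <p:output port="result" pipe="result@copy"/>
      <p:identity name="copy"/>
    </p:for-each>
  </p:declare-step>
</p:library>
```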

Contributor

gimsieke commented Jun 25, 2018

I’d prefer p:output with three slightly different models over re-introduction of p:serialization. I don’t think that there will be much confusion, in particular because these differences can nicely be captured in our Relax NG schema, so tooling (nXML, oXygen, …) will be informed to offer context-dependent completion only for the p:output attributes that are valid according to the schema.

Contributor

eriksiegel commented Jun 25, 2018

Agree with @gimsieke, forget my former suggestion about re-instating p:serialization.

Contributor

gimsieke commented Jun 25, 2018

Because @xml-project asked about RNG-based content completion, I created a small demonstration that oXygen will provide decent completion if configured only by our RNG (or RNC) schema. Achim was afraid that it wouldn’t even offer the serialization attribute unless and until there is a subpipeline further down in p:declare-step. I confirmed that the serialization attribute will indeed be suggested in the absence of a subpipeline. A rather helpful error message will display when the @serialization attribute has been entered and the subpipeline is still missing. I think it’s quite usable and in no way confusing for the user to have 3 different, context-dependent models for p:output (2 of which I have sketched in attached atomic.zip). Likewise, nXML mode should be able to use the RNC variant for content completion.
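For readers without the attachment, the idea of context-dependent p:output models can be approximated with a RELAX NG grammar along these lines. This is not the content of atomic.zip. It is a deliberately reduced sketch in which the pattern names (`output.atomic`, `output.pipeline`, `step`) are made up and a lone p:identity stands in for the whole subpipeline.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         ns="http://www.w3.org/ns/xproc"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="declare-step"/>
  </start>

  <define name="declare-step">
    <element name="declare-step">
      <optional><attribute name="type"><data type="QName"/></attribute></optional>
      <optional><attribute name="version"/></optional>
      <choice>
        <!-- Atomic declaration: outputs only, no subpipeline. -->
        <zeroOrMore><ref name="output.atomic"/></zeroOrMore>
        <!-- Pipeline declaration: outputs followed by at least one step. -->
        <group>
          <zeroOrMore><ref name="output.pipeline"/></zeroOrMore>
          <oneOrMore><ref name="step"/></oneOrMore>
        </group>
      </choice>
    </element>
  </define>

  <!-- Attributes shared by both p:output models. -->
  <define name="output.common">
    <optional><attribute name="port"><data type="NCName"/></attribute></optional>
    <optional><attribute name="sequence"><data type="boolean"/></attribute></optional>
    <optional><attribute name="primary"><data type="boolean"/></attribute></optional>
  </define>

  <define name="output.atomic">
    <element name="output">
      <ref name="output.common"/>
    </element>
  </define>

  <!-- Only this model admits the serialization attribute. -->
  <define name="output.pipeline">
    <element name="output">
      <ref name="output.common"/>
      <optional><attribute name="serialization"/></optional>
    </element>
  </define>

  <!-- Stand-in for a real subpipeline step. -->
  <define name="step">
    <element name="identity"><empty/></element>
  </define>
</grammar>
```

Because RELAX NG does not require deterministic content models, both `output` patterns can coexist under the same parent. A p:output with @serialization is accepted only if a step follows, which matches the behaviour described above.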

Contributor

ndw commented Jul 20, 2018

If we say that the serialization attribute is only allowed when p:output is an immediate child of p:declare-step, does it make sense to say also that the content-types attribute is only allowed in this case?

Contributor

ndw commented Sep 5, 2018

We should have two declarations (stanzas) for p:declare-step, one for pipelines and one for atomic steps. We should restrict what can occur on p:output in the atomic step case.
