xinclude regression: non-recursive, mutual xinclude fails #295

Open
dwcramer opened this issue Sep 12, 2019 · 3 comments

@dwcramer

commented Sep 12, 2019

As I understand the xinclude spec, it is legal for documents to xinclude from each other as long as they don't loop: https://www.w3.org/TR/xinclude/#loops

When I process either of the following documents with xmlcalabash 1.1.22 or earlier, the xincludes resolve successfully. Beginning with xmlcalabash 1.1.23, however, I receive the error "err:XC0029:XInclude document includes itself: b.xml". It's possible that the fix for #273 introduced this regression.

To reproduce

Process the following files with Calabash: $ java -jar xmlcalabash-1.1.27-99.jar -i a.xml xinclude.xpl

  • a.xml
    <article  xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.1">
        <title>Title of a.xml</title>
        <para xml:id="a">Para in a.xml</para>
        <xi:include href="b.xml" xpointer="b"/>
    </article>
  • b.xml
    <article  xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.1">
        <title>Title of a.xml</title>
        <para xml:id="b">Para in b.xml</para>
        <xi:include href="a.xml" xpointer="a"/>
    </article>
  • xinclude.xpl
    <p:declare-step version="1.0" xmlns:p="http://www.w3.org/ns/xproc" name="main">
        <p:input port="source" primary="true"/>
        <p:output port="result"/>
        <p:input port="parameters" kind="parameter"/>
        <p:xinclude />
    </p:declare-step>

Expected result

The pipeline should produce an xml file with the xincludes resolved:

<article xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude" version="5.1">
    <title>Title of a.xml</title>
    <para xml:id="a">Para in a.xml</para>
    <para xml:id="b">Para in b.xml</para>
</article>

Actual result

ERROR: err:XC0029:XInclude document includes itself: b.xml
ERROR: It is a dynamic error if an XInclude error occurs during processing.

@ndw

Owner

commented Sep 15, 2019

Thanks, @dwcramer

I'm teaching at XML Summer School this week, but I'll try to take a look. I have a couple of pending issues that I should try to get wrapped up and push a new release.

@dwcramer

Author

commented Sep 18, 2019

No problem. Another data point about the non-recursive mutual xinclude: @raducoravu from Oxygen reports that he's patched the Xerces libraries they use so that they support some, but not all, cases of non-recursive, mutual xincludes. He has reported the bug to Xerces: https://issues.apache.org/jira/browse/XERCESJ-1694

There is another issue that also cropped up between 1.1.22 and 1.1.23: In certain circumstances, resolving xincludes takes a very long time. I've posted an example doc at http://feline.thingbag.net/slow_xinclude.zip

For this document, resolving xincludes takes much longer in 1.1.23 than 1.1.22:

  • xmlcalabash 1.1.22: 20 seconds
  • xmlcalabash 1.1.23: 83 minutes

There isn't a noticeable performance hit for most other documents, but the problem appears when a large source file (e.g. 5.3M) contains many tables (e.g. 170) and different tables are xincluded into the book in different places.

dcramer@anatine ~/projects/slow_xinclude
$ cat xinclude.xpl
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:c="http://www.w3.org/ns/xproc-step"
    version="1.0" name="main">
    <p:input port="source"/>
    <p:output port="result">
        <p:pipe port="result" step="xinclude"/>
    </p:output>
    <p:xinclude fixup-xml-base="true" name="xinclude"/>
</p:declare-step>

dcramer@anatine ~/projects/slow_xinclude
$ time java -jar ~/Downloads/xmlcalabash-1.1.23-97/xmlcalabash-1.1.23-97.jar -i slow_xinclude.xml -o out.xml xinclude.xpl

real	83m7.465s
user	80m23.892s
sys	0m51.265s

dcramer@anatine ~/projects/slow_xinclude
$ time java -jar ~/Downloads/xmlcalabash-1.1.22-98/xmlcalabash-1.1.22-98.jar -i slow_xinclude.xml -o out.xml xinclude.xpl

real	0m20.700s
user	0m28.958s
sys	0m1.881s

@ndw

Owner

commented Sep 21, 2019

I think this is an illegal loop per 4.2.7 of the XInclude specification.

  • We begin processing a.xml.
  • That requires that we process b.xml.
  • But processing b.xml requires that we process a.xml.

Even though the xpointers identify regions that aren’t explicitly recursive, the infoset expansion has to be completed before the xpointers can be resolved.

I don’t see any way to complete the infoset expansion without encountering the b.xml#a pointer as an ancestor of itself.
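
To make the two readings in this thread concrete, here is a rough Python sketch. It is a toy model only, not how XML Calabash or any XInclude processor is implemented: each document is reduced to a hand-written table of the includes it contains, and the names `DOCS`, `expand`, `process`, and the `expand_whole_document` flag are all made up for illustration. With the flag set, every include in the acquired document is expanded before the xpointer is applied (the reading described above); with it cleared, only includes nested inside the selected element are followed (the reading the original report assumes).

```python
# Toy model of inclusion-loop detection for the a.xml / b.xml example.
# Hypothetical data layout: for each document, the (href, xpointer) pairs of
# every xi:include it contains, and the includes nested inside each xml:id.
DOCS = {
    "a.xml": {
        "all_includes": [("b.xml", "b")],
        "includes_inside": {"a": []},  # <para xml:id="a"> has no xi:include
    },
    "b.xml": {
        "all_includes": [("a.xml", "a")],
        "includes_inside": {"b": []},  # <para xml:id="b"> has no xi:include
    },
}


class InclusionLoop(Exception):
    """Raised when an (href, xpointer) pair reappears in its own chain."""


def expand(href, xpointer, chain, expand_whole_document):
    key = (href, xpointer)
    if key in chain:
        path = " -> ".join(f"{h}#{x}" for h, x in chain + [key])
        raise InclusionLoop(path)
    doc = DOCS[href]
    if expand_whole_document:
        # Reading 1: finish expanding the whole acquired document first.
        nested = doc["all_includes"]
    else:
        # Reading 2: only follow includes inside the selected element.
        nested = doc["includes_inside"][xpointer]
    for child_href, child_xpointer in nested:
        expand(child_href, child_xpointer, chain + [key], expand_whole_document)


def process(root, expand_whole_document):
    """Return "ok", or a description of the first loop that is found."""
    try:
        for href, xpointer in DOCS[root]["all_includes"]:
            expand(href, xpointer, [], expand_whole_document)
    except InclusionLoop as exc:
        return f"loop: {exc}"
    return "ok"


print(process("a.xml", expand_whole_document=True))   # loop: b.xml#b -> a.xml#a -> b.xml#b
print(process("a.xml", expand_whole_document=False))  # ok
```

Under the first reading the chain comes back to `b.xml#b`, which matches the document named in the XC0029 message; under the second reading no loop is reached. Which reading the specification actually requires is the open question in this issue, and the sketch does not settle it.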
