Cannot compare whole documents #227

Closed
tofi86 opened this issue Jul 24, 2018 · 17 comments

@tofi86

tofi86 commented Jul 24, 2018

Hi,

I tried to use XSpec to compare a whole document with an expected result but ran into several issues.
This is my setup:

article.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://docbook.org/xml/5.0/rng/docbook.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://docbook.org/xml/5.0/rng/docbook.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
	<info>
		<title>A Short Introduction To XSpec</title>
	</info>
	<sect1>
		<title>A short introduction to the XSpec Unit Testing Framework</title>
		<para>XSpec is a unit test and <link xlink:href="http://en.wikipedia.org/wiki/Behavior_Driven_Development">behaviour driven development</link> (BDD) framework for XSLT, XQuery and Schematron. It is based on the Spec framework of <link xlink:href="http://rspec.info/">RSpec</link>, which is a BDD framework for Ruby.</para>
		<para>XSpec consists of a syntax for describing the behaviour of XSLT or XQuery code and some code that enables to test the code against those descriptions.</para>
		<sect2>
			<title>Getting Started</title>
			<para>To get started, check out the installation instructions for <link xlink:href="https://github.com/xspec/xspec/wiki/Installation-on-Mac-and-Linux">MacOS/Linux</link> and <link xlink:href="https://github.com/xspec/xspec/wiki/Installation-on-Windows">Windows</link> and how to <link xlink:href="https://github.com/xspec/xspec/wiki/Getting-Started">write your first XSpec test</link>.</para>
		</sect2>
		<sect2>
			<title>Support</title>
			<para></para>
		</sect2>
	</sect1>
</article>
article-to-html.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns="http://www.w3.org/1999/xhtml"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xs="http://www.w3.org/2001/XMLSchema"
	xmlns:math="http://www.w3.org/2005/xpath-functions/math"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	xmlns:pa="https://www.pagina.gmbh"
	exclude-result-prefixes="xs math xlink pa"
	xpath-default-namespace="http://docbook.org/ns/docbook"
	version="3.0">
	
	<xsl:output method="xhtml" encoding="UTF-8" indent="yes" html-version="5.0"/>
	
	
	<xsl:template match="/">
		<xsl:apply-templates/>
	</xsl:template>
	
	<xsl:template match="article">
		<html>
			<head>
				<xsl:apply-templates select="info"/>
			</head>
			<body>
				<xsl:apply-templates select="* except info"/>
			</body>
		</html>
	</xsl:template>
	
	<xsl:template match="info/title">
		<title>
			<xsl:apply-templates/>
		</title>
	</xsl:template>
	
	<xsl:template match="info/author"/>
	
	<xsl:template match="sect1 | sect2">
		<section>
			<xsl:apply-templates/>
		</section>
	</xsl:template>
	
	<xsl:template match="sect1/title">
		<h1>
			<xsl:apply-templates/>
		</h1>
	</xsl:template>
	
	<xsl:template match="sect2/title">
		<h2>
			<xsl:apply-templates/>
		</h2>
	</xsl:template>
	
	<xsl:template match="para">
		<p>
			<xsl:apply-templates/>
		</p>
	</xsl:template>
	
	<xsl:template match="link">
		<a href="{@xlink:href}">
			<xsl:apply-templates/>
		</a>
	</xsl:template>
	
</xsl:stylesheet>

This is what oXygen 20 (Saxon 9.8.0_8) produces as output:

article_expected.html
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml">
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
      		
      <title>A Short Introduction To XSpec</title>
      		
      	
   </head>
   <body>
      <section>
         		
         <h1>A short introduction to the XSpec Unit Testing Framework</h1>
         		
         <p>XSpec is a unit test and <a href="http://en.wikipedia.org/wiki/Behavior_Driven_Development">behaviour driven development</a> (BDD) framework for XSLT, XQuery and Schematron. It is based on the Spec framework
            of <a href="http://rspec.info/">RSpec</a>, which is a BDD framework for Ruby.
         </p>
         		
         <p>XSpec consists of a syntax for describing the behaviour of XSLT or XQuery code and
            some code that enables to test the code against those descriptions.
         </p>
         		
         <section>
            			
            <h2>Getting Started</h2>
            			
            <p>To get started, check out the installation instructions for <a href="https://github.com/xspec/xspec/wiki/Installation-on-Mac-and-Linux">MacOS/Linux</a> and <a href="https://github.com/xspec/xspec/wiki/Installation-on-Windows">Windows</a> and how to <a href="https://github.com/xspec/xspec/wiki/Getting-Started">write your first XSpec test</a>.
            </p>
            		
         </section>
         		
         <section>
            			
            <h2>Support</h2>
            			
            <p></p>
            		
         </section>
         	
      </section>
   </body>
</html>

Then I wrote an XSpec scenario to compare the actual transformation result with the expected result from oXygen:

article-to-html.xspec
<?xml version="1.0" encoding="UTF-8"?>
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
	stylesheet="article-to-html.xsl">
	
	<x:scenario label="The integration test output">
		<x:context href="article.xml"/>
		<x:expect label="should match the expected result." href="article.html"/>
	</x:scenario>
	
</x:description>

First of all: the results do not match:

[Screenshot 2018-07-24 at 12.41.58]

Obviously it's because of the indentation which is different when the XSpec framework transforms the data.

However, if I remove the indent="yes" or change to "no", it doesn't get any better.

Second: The method="xhtml" produces a <meta> element in oXygen which is totally missing in the XSpec result, as you can see on the screenshot.

Third: If I change the output declaration to <xsl:output encoding="UTF-8" indent="no" method="xml"/> the XSpec result seems to match the expected result from oXygen, all highlights in the HTML report are green, but the test fails anyway:

[Screenshot 2018-07-24 at 12.47.28]

This seems like a bug to me.

Fourth: When looking at the XML report, there are lots of empty <test:ws xmlns:pkg="http://expath.org/ns/pkg"> elements which seem to be irrelevant and should be removed, right?


How can these issues be resolved?

How can I compare whole documents as a kind of "integration test"?

Thanks,
Tobias

@AirQuick
Member

AirQuick commented Jul 24, 2018

First of all: the results do not match

Second: The method="xhtml" produces a <meta> element in oXygen which is totally missing in the XSpec result

You expect that a serialized result is equal to another serialized doc when both are parsed again.
So you need to actually serialize the result and parse it. How about this?

<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
    stylesheet="article-to-html.xsl" xslt-version="3.0">
    
    <x:scenario label="The integration test output">
        <x:context href="article.xml"/>
        <x:expect label="serialized result should match the expected result when parsed" href="article_expected.html"
            test="
                .
                => serialize(
                    map {
                        'method': 'xhtml',
                        'indent': true(),
                        'html-version': 5.0
                        }
                    )
                => parse-xml()"/>
    </x:scenario>
    
</x:description>

(For more completeness, you may want to generate the serialization parameters dynamically from the stylesheet xsl:output.)

@AirQuick
Member

AirQuick commented Jul 24, 2018

Fourth: When looking at the XML report, there are lots of empty <test:ws xmlns:pkg="http://expath.org/ns/pkg"> elements...

test:ws represents a non-ignorable whitespace-only text node. (xmlns:pkg is just garbage.)
For how whitespace-only text nodes are ignored or recognized in XSpec, see xspec-space.xspec in #179. Every x:expect there results in Success.

@AirQuick
Member

AirQuick commented Jul 24, 2018

Third: If I change the output declaration to <xsl:output encoding="UTF-8" indent="no" method="xml"/> ...

Note that in your screenshot, the right-hand Expected Result is labeled "XPath / from" while the left-hand Result is not. This suggests that the expected result is a document node while the actual result is an element. They are therefore different items, and the test result must be Failure.

To make the test result Success, you should either fix the stylesheet to produce a document node

<xsl:template match="/">
    <xsl:document><xsl:apply-templates/></xsl:document>
</xsl:template>

or fix x:expect to select an outermost element in the expected document loaded by its @href

<x:expect label="should match the expected element." href="article_expected.xml" select="element()"/>

whichever meets your actual goal.

@tofi86
Author

tofi86 commented Jul 25, 2018

Hi, thanks for your replies @AirQuick!

Regarding your latest comment:

I wasn't aware of the <xsl:template match="/"> template, must have been there by mistake 🙈

However, I also didn't know that, without a root template containing an <xsl:document>, a transformation produces an element() node by default and not a document-node().

Removing the root template and using select="element()" on the <x:expect> element works fine. Thanks.

Now the next step is to take a closer look at the serialized results.

Thanks, Tobias
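
Put together, the scenario after this change might look like the sketch below. It assumes that the match="/" template has been removed from article-to-html.xsl, that xsl:output uses method="xml", and that the expected output is stored as article_expected.xml (the file name used in the suggestion above).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
	stylesheet="article-to-html.xsl">

	<x:scenario label="The integration test output">
		<x:context href="article.xml"/>
		<!-- select="element()" picks the outermost element of the loaded
		     document, so an element is compared with an element -->
		<x:expect label="should match the expected element."
			href="article_expected.xml" select="element()"/>
	</x:scenario>

</x:description>
```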

@tofi86
Author

tofi86 commented Jul 26, 2018

You expect that a serialized result is equal to another serialized doc when both are parsed again.
So you need to actually serialize the result and parse it. How about this?

Unfortunately, this doesn't work for me. The "result" is always empty () and it seems to be an issue with the XSLT 3 serialize() function. If I just use test="." the result is available but doesn't match the expected XML. As soon as I use test=". => serialize(...) => parse-xml()" the result becomes empty ().

Any ideas?

@tofi86
Author

tofi86 commented Jul 26, 2018

As soon as I use test=". => serialize(...) => parse-xml()" the result becomes empty ().

Any ideas?

Okay, this seems to be an issue with the serialization parameter 'html-version' = 5.0. If I leave this out, the results match and the tests succeed. Although I have set html-version="5.0" in my XSLT... 🤷‍♂️

@AirQuick
Member

What is your Saxon edition and version? My example worked on Saxon-EE 9.8.0.12 and 9.8.0.14 on my end.
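
If it is unclear which Saxon build is actually running the tests, a small stylesheet like this sketch can report it. system-property() with the xsl:product-name, xsl:product-version and xsl:vendor properties is standard XSLT. How you run it (for example from oXygen or the Saxon command line, against any XML file) is an assumption, and the exact wording of the output depends on the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	version="3.0">

	<xsl:output method="text"/>

	<!-- Apply to any XML document; the input itself is ignored -->
	<xsl:template match="/">
		<!-- Items are joined with a single space by default -->
		<xsl:value-of select="
			system-property('xsl:product-name'),
			system-property('xsl:product-version'),
			system-property('xsl:vendor')"/>
	</xsl:template>

</xsl:stylesheet>
```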

@AirQuick
Member

AirQuick commented Jul 26, 2018

Looks like Saxon fn:serialize() was not good at handling html-version until very recently:

@tofi86
Author

tofi86 commented Jul 26, 2018

I was running Saxon-HE 9.8.0.7. Now I'm using 9.8.0.14 and it works with the serialization parameter 'html-version'. Thanks!

@tofi86
Author

tofi86 commented Jul 26, 2018

Next question: Is it somehow possible to

a) ignore indentation whitespace or any other whitespace differences when comparing the result document and expected document?

b) ignore certain special elements within a document? (e.g. a timestamp element)?

@tofi86
Author

tofi86 commented Jul 26, 2018

Regarding b) I was able to come up with the following solution to compare only partial trees of the result and the expected document:

<?xml version="1.0" encoding="UTF-8"?>
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
  xmlns:html="http://www.w3.org/1999/xhtml"
  stylesheet="../xslt/docbook-to-html-serialized.xsl" xslt-version="3.0">
  
  <x:scenario label="The integration test">
    <x:context href="demo2/article.xml"/>
    <x:expect label="should match the expected result." href="demo2/article.html" select="/html:html/html:body"
      
      test="
      /html:html/html:body
      => serialize(
        map {
          'method': 'xhtml',
          'indent': true(),
          'html-version': 5.0
        }
      )
      => parse-xml()"/>
  </x:scenario>
  
</x:description>

But it only works partially.

[Screenshot 2018-07-26 at 15.46.55]

The expected result (via select="" attribute) is treated as document node despite selecting an element, and the result is treated as element() although the parse-xml() function should return a document node. I got stuck here. Any idea?

@AirQuick
Member

AirQuick commented Jul 26, 2018

I think what you really want to extract from the actual result is this @test:

    <x:expect label="/html/body should match the expected result."
      href="demo2/article.html"
      select="/html:html/html:body"
      test="
      (
          .
          => serialize(
            map {
              'method': 'xhtml',
              'indent': true(),
              'html-version': 5.0
            }
          )
          => parse-xml()
      )/html:html/html:body"/>

@AirQuick
Member

AirQuick commented Jul 26, 2018

b) ignore certain special elements within a document? (e.g. a timestamp element)?

The three-dot (...) feature mentioned in the wiki could help to some extent. I'm not aware of a built-in feature beyond that.
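
As a rough illustration of the three-dot feature: an element in the expected result whose only content is ... is treated as matching whatever the actual result contains at that position. The stylesheet name and the report, timestamp and status elements below are hypothetical, chosen only to mirror the timestamp case from the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
	stylesheet="report.xsl">
	<!-- report.xsl is a hypothetical stylesheet that adds a timestamp -->

	<x:scenario label="A report with a generated timestamp">
		<x:context>
			<report/>
		</x:context>
		<x:expect label="should match apart from the timestamp value">
			<report>
				<!-- "..." means: any content is accepted here -->
				<timestamp>...</timestamp>
				<status>ok</status>
			</report>
		</x:expect>
	</x:scenario>

</x:description>
```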

When I have to verify complicated things, I usually write a wrapper XSLT. For example, if I have to test actual.xsl and if it's difficult to test the outcome in XSpec directly, I would write:

wrapper.xsl

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="my" version="3.0">
    <!-- Test target -->
    <xsl:include href="../src/actual.xsl" />

    <!-- Test helper -->
    <xsl:function name="my:test-filter" as="element(baz)">
        <xsl:param name="input" as="element(baz)" />
        
        <!-- do whatever filtering on $input and return the filtered one;
             as a placeholder, $input is returned unchanged -->
        <xsl:sequence select="$input" />
    </xsl:function>
</xsl:stylesheet>

test.xspec

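<x:description xmlns:x="http://www.jenitennison.com/xslt/xspec"
    xmlns:my="my" stylesheet="wrapper.xsl">
    <x:scenario label="actual.xsl output, filtered by the test helper">
        <x:context><foo/></x:context>
        <x:expect label="should match the expected baz" test="my:test-filter(/bar/baz)"><baz/></x:expect>
    </x:scenario>
</x:description>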

This usually works for me, although I might have a bug in my:test-filter... 😁

@tofi86
Author

tofi86 commented Jul 26, 2018

Awesome! 👏 Both of these solutions work very well!

I totally forgot about the "three dot" feature ;-)

@AirQuick
Member

a) ignore indentation whitespace or any other whitespace differences when comparing the result document and expected document?

That is already done for whitespace-only text nodes, when you write XML within x:context, x:expect etc. So

<x:context>&#x09;&#x0A;&#x0D;&#x20;<foo/></x:context>

and

<x:context><foo/></x:context>

are treated equal.
Whitespace is kept in the other cases as far as I know, otherwise the test wouldn't make sense...
To remove arbitrary whitespace, I think you have to do it on your own.
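
One way to "do it on your own" is a helper in a wrapper stylesheet, along the lines of the wrapper approach described earlier in the thread. The following is only a sketch: the my:strip-ws function and mode are hypothetical names, and the included path assumes the wrapper sits next to article-to-html.xsl. It returns a copy of its input with every whitespace-only text node dropped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:my="my"
	version="3.0">

	<!-- Test target (path is an assumption) -->
	<xsl:include href="article-to-html.xsl"/>

	<!-- Hypothetical helper: copies $input without whitespace-only text nodes -->
	<xsl:function name="my:strip-ws" as="node()*">
		<xsl:param name="input" as="node()*"/>
		<xsl:apply-templates select="$input" mode="my:strip-ws"/>
	</xsl:function>

	<!-- Identity copy for everything not matched explicitly in this mode -->
	<xsl:mode name="my:strip-ws" on-no-match="shallow-copy"/>

	<!-- Drop text nodes that consist of whitespace only -->
	<xsl:template match="text()[not(normalize-space())]" mode="my:strip-ws"/>

</xsl:stylesheet>
```

With the XSpec file pointing at this wrapper and declaring xmlns:my="my", the actual result could then be filtered with something like test="my:strip-ws(.)". The expected side would have to be free of whitespace-only text nodes as well. Note that this also removes whitespace that is significant in mixed content, such as a single space between two inline elements, so it is a blunt instrument.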

@tofi86
Author

tofi86 commented Jul 26, 2018

Alright, that sounds reasonable.

Thanks a lot for your support, @AirQuick!

We did a team training for XSpec and I think we will come up with a couple of changes and additions to the wiki docs. Stay tuned.

Best, Tobias

tofi86 closed this as completed Jul 26, 2018
@AirQuick
Member

AirQuick commented Sep 5, 2018

test:ws represents a non-ignorable whitespace-only text node. (xmlns:pkg is just garbage.)

The garbage xmlns:pkg is removed by #338.
