Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with Global Variables in coverage report in 3_0_3. #1921

Open
birdya22 opened this issue Apr 29, 2024 · 52 comments
Open

Issue with Global Variables in coverage report in 3_0_3. #1921

birdya22 opened this issue Apr 29, 2024 · 52 comments
Assignees

Comments

@birdya22
Copy link
Collaborator

birdya22 commented Apr 29, 2024

I should say at the beginning that I've done my testing with the changes from #1917.

The stylesheet has the following content:
There are 7 global variables at the start with a variety of contents: xsl:text, xsl:for-each, xsl:choose, xsl:analyze-string. None of them are referenced within the rest of the stylesheet.
There are then 2 named templates:
TestTemplate01 outputs 100 in a root node.
TestTemplate02 outputs 200 in a root node.

CoverageGlobalVariableIssue01.xspec invokes TestTemplate01.
CoverageGlobalVariableIssue02.xspec invokes TestTemplate02.

CoverageGlobalVariableIssue01.xspec coverage report shows some parts of the global variables being hit and some missed, rather than all ignored.

CoverageGlobalVariableIssue02.xspec coverage report looks fine.

Adrian
GlobalVariableCoverageTest.zip

@galtm galtm self-assigned this Apr 29, 2024
@galtm
Copy link
Member

galtm commented Apr 29, 2024

Adrian, thanks for the examples. When working on #1918, I looked out for whether the changes fixed your specific problem but I also wanted to know: Do the changes introduce side effects, especially code being marked as hit, when it's really missed? I generated coverage reports for files in this repo's test suite and noticed that my code changes affected the hit/missed/ignored status of some global XSLT parameters (specifically https://github.com/xspec/xspec/blob/master/test/issue-1564.xspec and https://github.com/xspec/xspec/blob/master/test/global-override.xspec if you or anyone else reading this is interested).

Looking into what the correct behavior should be is on my to-do list. Your examples will probably also be useful for determining whether #1918 is a correct fix or whether it needs adjustments.

@birdya22
Copy link
Collaborator Author

Amanda,
this is rather a long reply after some more investigation but the headline comment is that I think your change is good but some of the rules in the element() | text() template are probably causing problems, and this may be related to any changes that happened to the Saxon trace mechanism in 12.4 compared to when it previously worked (as there is no documentation about what it actually produces it must be difficult).
I haven't managed to get to grips with the references you provided.

Here are my thoughts. This is all from the point of only using Saxon-PE so there are a lot of features that I have never used, and cannot use.

I think the starting point should be a hit for everything that Saxon says is a hit, otherwise things get very tricky if you decide to disagree with Saxon and ignore some hits.
So I don't see anything wrong with your change because it more consistently captures hits in the coverage.xml file which should be the starting point.

So that leaves you with working out what to mark as a hit when it's not in the trace output, and when it's not a hit is it a miss or ignored.

From looking outside at this I would have a couple of questions (which I can partially answer from the coverage-report.xsl but I can't tell the rationale behind the rules in some cases):

  1. Do you have a list of elements that are not in the trace output even when they are hit (and in what circumstances this happens)?
  2. Do you have a written set of criteria that are used to define the situations when a node is hit/miss/ignored if it is not marked as hit in coverage.xml?
    Are the criteria tied down to the minimum use case?
    Do the criteria overlap in any ways such that there is a defined order in which they need to be applied?

From a bit more testing I have the following comments on my points 1 and 2 above (starting with some obvious ones):

  1. Not Traced
    xsl:matching-substring
    xsl:non-matching-substring
    xsl:otherwise
    xsl:when
    xsl:variable - when it is global and uses @select for a string e.g: <xsl:variable name="var02" select="'4'" />
    xsl:param - in a xsl:function (even if it is used it is not in the trace).
    xsl:with-param - when it uses @select for a string <xsl:with-param name="param02" select="'800'" />
    (Not xsl:for-each or xsl:for-each-group as these are in the trace)

  2. Criteria for deciding if hit/miss/ignored
    I don't see any issues with the template for xsl:when etc. except it doesn't need xsl:for-each and xsl:for-each-group.
    A couple of comments on the xsl:choose in the element() | text() template.
    In one of my tests I have a global xsl:param that is not used and it was marked as ignored. The rule which causes this is:
    <xsl:when test="empty(ancestor::xsl:*[parent::xsl:stylesheet or parent::xsl:transform])">ignored</xsl:when>
    I added in a rule above it and it changed to missed as you would expect:
    <xsl:when test="self::xsl:param and ((parent::xsl:stylesheet) or (parent::xsl:transform))">missed</xsl:when>

This may also be the reason why global variables have problems for me. I removed the 2 variable tests because I still think they shouldn't be needed as most xsl:variable elements are hit if used (but see below):
<xsl:when test="self::xsl:variable">
<xsl:sequence select="local:coverage(following-sibling::*[not(self::xsl:variable)][1], $module-id)" />
</xsl:when>
<xsl:when test="ancestor::xsl:variable">
<xsl:sequence select="local:coverage(ancestor::xsl:variable[1], $module-id)" />
</xsl:when>
By itself it makes parts of the xsl:variable ignored and parts missed. So I added in a global variable specific rule, like the xsl:param one:
<xsl:when test="self::xsl:variable and ((parent::xsl:stylesheet) or (parent::xsl:transform))">missed</xsl:when>
This is on the basis that if it isn't hit then it must be missed.

The comment for this rule
<xsl:when test="empty(ancestor::xsl:*[parent::xsl:stylesheet or parent::xsl:transform])">ignored</xsl:when>
says
but is the test having unintended consequences, like my global param being ignored.

I still see a problem for a global xsl:variable which has a select attribute with a string value because they are not hit in the trace when used. So I would say it should be marked as missed, but this is wrong and I don't know how you can get around this without getting Saxonica to change and include it in the trace.

Finally, my suggestion would be a review of the rules in xsl:choose in the element(0 | text() template to see if they are still doing that they are intended to do, whether they do more than was intended and whether any more are needed. I do realise the potential ramifications of this suggestion.

Note: what I found useful in the testing I did was something like a version of the coverage-report that only reports the hits from the coverage.xml file (this made it easy to see what was and wasn't being traced which is how I got some of the suggestions above). I had to knock something up to do this but I assume you have something similar.

Adrian

@birdya22
Copy link
Collaborator Author

birdya22 commented May 1, 2024

I've had a look at the tests in both the links and the reason for the change is the same in both.
In both cases xsl:param elements were ignored but are now hit.
Basically in 3_0_3 there was a hit on the line containing the param element but because it appears as a variable and not a param in the uqname, uqname="Q{http://www.w3.org/1999/XSL/Transform}variable", 3_0_3 didn't record a hit. With your change you don't just rely on the uqname so you recognise it as a hit via the class, class="net.sf.saxon.expr.instruct.GlobalParam".
So I think this is a good thing.

With your change I think this raises a question about whether it is worth doing the comparison with the uqname at all because if it fails you will still record a hit if there is a class name, which seems to make the uqname check redundant. This is your change inside local:hits-on-node
<xsl:if test="($node-uqname eq $hit-traceable-uqname) or
exists($hit-traceable-class)">
There is probably a way to stick with the previous method of relying on the uqname check if this would work:
I believe my initial problem was the fact that the first hit with a class="net.sf.saxon.expr.instruct.TraceExpression" mapped to a choice node and so all other hits of that class also assumed the uqname was choice, which wasn't the case, and so my example missed some hits.
A possible alternative solution might be to check every time whether the class & associated uqname match and if not create a new traceable element to record the new match. That wouldn't require the change to coverage-report.xsl, but also it would mean that the param elements in the 2 tests would still be ignored rather than hit (assuming no other changes).

In 3_0_3 the param lines were shown as ignored which seems a bit strange to me as I would have expected to see them as missed if they weren't hit, particularly as the test xspec expects the param to provide the result value.
This is probably down to this test in the element() | text() template:
<xsl:when test="empty(ancestor::xsl:*[parent::xsl:stylesheet or parent::xsl:transform])">ignored</xsl:when>

Related to my previous comments, does this need re-visiting to think about which global elements should be ignored or missed if they are not hit (this will of course come down to what Saxon records hits on). The XSLT 3.0 spec lists the following declaration elements permitted in the xsl:stylesheet:
xsl:accumulator
xsl:attribute-set
xsl:character-map
xsl:decimal-format
xsl:function
xsl:global-context-item
xsl:import
xsl:import-schema
xsl:include
xsl:key
xsl:mode
xsl:namespace-alias
xsl:output
xsl:param
xsl:preserve-space
xsl:strip-space
xsl:template
xsl:use-package
xsl:variable

From your comment, "I generated coverage reports for files in this repo's test suite and noticed ..." does this mean that you have automated checking of coverage reports against your current baseline?

Here are the details for issue-1564 as it is the simpler of the 2.
In 3_0_3 line 6 was hit in the coverage.xml file but appears as a variable rather than a param and it was shown as ignored (I would have expected it to be shown as missed):
<traceable traceableId="1"
uqname="Q{http://www.w3.org/1999/XSL/Transform}variable"/>
<hit lineNumber="6" columnNumber="82" moduleId="0" traceableId="1"/>

With your changes you've added the @Class into the coverage.xml and that is used to make it a hit via the class and not the uqname. coverage.xml is now
<traceable traceableId="1"
class="net.sf.saxon.expr.instruct.GlobalParam"
uqname="Q{http://www.w3.org/1999/XSL/Transform}variable"/>
<hit lineNumber="6" columnNumber="82" moduleId="0" traceableId="1"/>
with this change inside local:hits-on-node
<xsl:if test="($node-uqname eq $hit-traceable-uqname) or
exists($hit-traceable-class)">

Adrian

@galtm
Copy link
Member

galtm commented May 1, 2024

Adrian, thanks a lot for your analysis and observations. I am hoping to have a block of uninterrupted time sometime in the next few days, to look into this and write back more substantively. Thanks for your patience!

@birdya22
Copy link
Collaborator Author

birdya22 commented May 2, 2024

Amanda,
I noticed something new for you to add into the mix when considering what changes you want to make.
It is a problem in Saxon, a bit similar to my first issue. If there are xsl:if and xsl:choose elements encountered during tracing then whichever is encountered first, all xsl:if and xsl:choose elements are tagged with the first one encountered.

Saxon, and therefore xspec, only includes one of "if" or "choose" in the trace output even if both xsl:if and xsl:choose are present in the stylesheet.
I attached a little test case. 2 named templates with a xsl:if and xsl:choose in different order. The if and when clauses both succeed:
CoverageIfChooseIssue01.xspec calls template "IfThenChoose" and in the trace from 3_0_3 you only see "if" and in the coverage report the choose on lines 14 and 21 are set as missed.
CoverageIfChooseIssue02.xspec calls template "ChooseThenIf" and in the trace from 3_0_3 you only see "choose" and in the coverage report the if on lines 36 and 38 are set as missed.
This is a problem using the uqname check and would need a Saxonica fix (unless, as a special case, you wanted to consider if/choose the same thing).

With your current changes everything is good.

I'm not in any hurry for a response as I'm getting on ok at the moment using your changes and I've also set the tree model to linked in the Saxon config file.

CoverageIfChooseIssue.zip

@galtm
Copy link
Member

galtm commented May 2, 2024

Thanks. I'm looking at this now, in fact.

By the way, I edited your description at the top of this issue, to replace "1" with "2" in two spots ("CoverageGlobalVariableIssue02.xspec invokes TestTemplate02." instead of "CoverageGlobalVariableIssue01.xspec invokes TestTemplate02.", and similar in the last sentence.) I would not normally edit another person's comments, but I think these were merely copy/paste errors that are clearer to fix in place, for the sake of anyone else reading this issue.

@galtm
Copy link
Member

galtm commented May 5, 2024

I think the starting point should be a hit for everything that Saxon says is a hit, otherwise things get very tricky if you decide to disagree with Saxon and ignore some hits.

I agree.

  1. Do you have a list of elements that are not in the trace output even when they are hit (and in what circumstances this happens)?
  2. Do you have a written set of criteria that are used to define the situations when a node is hit/miss/ignored if it is not marked as hit in coverage.xml?
    Are the criteria tied down to the minimum use case?
    Do the criteria overlap in any ways such that there is a defined order in which they need to be applied?

You are asking good questions. XSpec doesn't have a formal specification, and I am not aware of artifacts that would answer your questions. I suspect that some of the XSLT logic was tailored for an older Saxon version that reported hits differently from Saxon 12.

I removed the 2 variable tests...

I think you are right to question those clauses. Perhaps if a past Saxon version never reported variable declarations as hit, it made sense for the coverage XSLT to look at a following sibling of a local variable. Looking at a following sibling of a global variable doesn't seem like the right logic.

I still see a problem for a global xsl:variable which has a select attribute with a string value because they are not hit in the trace when used. So I would say it should be marked as missed, but this is wrong and I don't know how you can get around this without getting Saxonica to change and include it in the trace.

Interesting observation, and it makes the task a little trickier.

The comment for this rule
<xsl:when test="empty(ancestor::xsl:*[parent::xsl:stylesheet or parent::xsl:transform])">ignored</xsl:when>
says <!-- A node within a top-level non-XSLT element -->
but is the test having unintended consequences, like my global param being ignored.

I agree, the comment doesn't match the logic, because that clause also matches top-level elements (including top-level XSLT elements) that don't match an earlier xsl:when clause.

Finally, my suggestion would be a review of the rules in xsl:choose in the element(0 | text() template to see if they are still doing that they are intended to do

Agreed.

Note: what I found useful in the testing I did was something like a version of the coverage-report that only reports the hits from the coverage.xml file (this made it easy to see what was and wasn't being traced which is how I got some of the suggestions above). I had to knock something up to do this but I assume you have something similar.

That's a good idea, and I can make something similar while investigating this.

With your change I think this raises a question about whether it is worth doing the comparison with the uqname at all because if it fails you will still record a hit if there is a class name, which seems to make the uqname check redundant

Yes, I also wondered whether we should merely pass along any hit with matching line/column data, regardless of the uqname or class name. That's not a change I want to do hastily, though. :)

From your comment, "I generated coverage reports for files in this repo's test suite and noticed ..." does this mean that you have automated checking of coverage reports against your current baseline?

Not exactly. The automated testing for coverage reporting in this repo is not as strong as the automated testing for the main XSpec functionality. I made an ad hoc script that generates lots of coverage reports, ran the script in workspaces with and without my code changes, and used a diff tool. Not nearly as automated as I would like.

Re #1921 (comment), I haven't looked at your CoverageIfChooseIssue.zip file yet.

General comment: What I'd like to do is treat these coverage reporting bugs as part of a larger task that involves enhancing automated testing of code coverage reporting in this repo, determining what's correct and what's incorrect, documenting the assumptions that this repo's code makes about what Saxon is reporting, and documenting the logic for when a node that Saxon doesn't report as "hit" should be marked as "missed" or "ignored."

I am happy to hear that you have workarounds at the moment, because I think being able to "refactor without fear" in this corner of the repo is going to require a fair amount of time to gradually build up foundations.

@birdya22
Copy link
Collaborator Author

birdya22 commented May 9, 2024

I've come across another issue which I'll raise with Saxonica direct and let you know what the number is.
There doesn't seem much point in me providing you with a test case and then you raising it with Saxonica (you can always add a comment to the issue).
Basically when you have a match template with a union of nodes the Saxon trace is odd if you don't match the first node in the union e.g.
<template match="nodeA | nodeB">...</template>
When the template is invoked with nodeB you get 0 in some column numbers of the descendants (although I also saw a column number of 23 when I had xsl:apply-templates as a descendant - not the correct number) but the trace also says it matched nodeA.
When the template is invoked with nodeA the trace looks good.

Separately I have put together a stylesheet that includes a large majority of the xslt elements in some simple examples. I need to do more work on it but will share it with you later.

Adrian

@birdya22
Copy link
Collaborator Author

My new Saxonica issue is https://saxonica.plan.io/issues/6419.
I also noticed that Michael has responded to https://saxonica.plan.io/issues/6415#change-26038.

@galtm
Copy link
Member

galtm commented May 10, 2024

My new Saxonica issue is https://saxonica.plan.io/issues/6419.
I also noticed that Michael has responded to https://saxonica.plan.io/issues/6415#change-26038.

Thanks for reporting the new issue, and I see that Michael has already marked it as Resolved.

Separately I have put together a stylesheet that includes a large majority of the xslt elements in some simple examples. I need to do more work on it but will share it with you later.

Fantastic! It sounds like your work can lead to some test cases in this repo. Recent pull request #1923 sets up a structure for code coverage reporting test cases. I invite you to make pull requests with test cases, if/when you are ready (probably after #1923 is merged in). If you want to share your files but have someone else shepherd them through GitHub, that is fine, too.

@birdya22
Copy link
Collaborator Author

GitHub is a mystery to me (I only got an account so I could raise an issue).
See the attached - there is a ReadMe in there.
Can you have a look and let me know what you think - there is quite a lot to digest.
I've provided some thoughts on rules for the elements as well as a test case that covers most of the xslt elements.

XSLTTestExample.zip

Adrian

@birdya22
Copy link
Collaborator Author

I meant to mention about the template variables that are sometimes hit and sometimes missed even though they are all executed. I had a thought about optimisation and tried disabling all optimisations except j (-opt:j). This changed most variables from miss to hit. I also tried it with removing "inline variables" optimisation (-opt:-v)- this was nearly as good.
This is something else to consider, but there is a downside in this disclaimer "Disabling optimizations may in some cases prevent streamed evaluation. For example, some expressions only become streamable because the optimizer is able to work out that sorting nodes into document order is unnecessary."

@galtm
Copy link
Member

galtm commented May 14, 2024

Can you have a look and let me know what you think - there is quite a lot to digest.

Quick look so far -- this looks very, very useful for advancing the testing of the code coverage reporting feature. I will spend more time digesting it, hopefully this week. (I'll be away from computers much of next week.)

Is ../Input/Test01Data.xml referenced from MainStylesheet.xspec supposed to be in the ZIP-file?

@birdya22
Copy link
Collaborator Author

Is ../Input/Test01Data.xml referenced from MainStylesheet.xspec supposed to be in the ZIP-file?

Sorry yes - rename the text file.

I have noticed some issues with some individual tests and although I tried to make them all independent there is some interaction between them, and also having them all in one 1000 line stylesheet means it is easy to get confused. So I think it needs to be split up into small tests for each element to make them easier to understand/check/amend, although there could still be an overarching stylesheet/xspec that includes them all.

If you think this is a good idea I am ok with doing it and would suggest something like the following for a naming convention, unless you already have one:
xsl-element-name-01-c.xsl or xspec. The 01 is in case there are more tests for the element and the -c is to indicate a coverage test (just to make it easier to see when looking at the names). So a couple of examples would be:
xsl-accumulator-01-c.xsl and xsl-accumulator-01-c.xspec
xsl-where-populated-01-c.xsl and xsl-where-populated-01-c.xspec.

Test01Data.txt

@galtm
Copy link
Member

galtm commented May 14, 2024

rename the text file.

Got it. Thanks!

Small tests sound like a good idea for the reasons you mentioned.

The naming convention you proposed sounds good, though I think you can omit the -c part of the name because #1923 proposes a dedicated directory (test/end-to-end/cases-coverage/) for code coverage reporting tests. Initially, the xsl- prefix will be repetitive across the set of files in that directory, but the prefix will be quite useful if XSpec ever supports code coverage reporting for a language other than XSLT. I don't have a strong opinion about the xsl- prefix, so whatever you like is fine with me.

@galtm
Copy link
Member

galtm commented May 14, 2024

I merged #1923 onto the master branch. In case you find it helpful to see the test arrangement for code coverage reporting tests, check out these resources:

@birdya22
Copy link
Collaborator Author

I've started having a look at the links above.

Hopefully a quick question: in my xsl:character-map test the *-result.html output has 3 backward question marks each in a box rather than 100 (Saxon output produces 100). I assume I'm missing a bit of configuration in xspec - do you know what I need to do?

@galtm
Copy link
Member

galtm commented May 15, 2024

I'm not sure yet. Where in the report do those characters appear? Are they literally in the raw HTML (it's not the &#x2E2E; character, ⸮, is it?) or are there some characters your browser can't display and it's showing the backward question marks instead? How can I reproduce the problem?

@birdya22
Copy link
Collaborator Author

I've attached my test case. It uses a character-map hierarchy and maps E000 to 0 and E001 to 1 to produce 100 in the output.
It's not that important at the moment.

characterMapIssue.zip

@galtm
Copy link
Member

galtm commented May 15, 2024

I don't think this is a configuration question after all. If you run the XSLT transformation outside XSpec, Saxon applies the character map using <xsl:output> when serializing the result for you. However, XSpec isn't actually serializing the output using <xsl:output>, so the weird characters you see are how the browser renders the original E000 and E001 characters. I found this out by executing the XPath expression //result//Q{}root/Q{}node/string() => string-to-codepoints() against the .../characterMapIssue/xspec/xsl-character-map-01-result.xml file I got from running the test, and then converting between decimal and hexadecimal.

Using the test attribute of <x:expect>, you can manually make XSpec serialize the result using a character map of your choice, and then re-parse for comparison with the expected result.

<x:expect label="Success"
   test=".
   => serialize(
      map {
         'method': 'xml',
         'use-character-maps': map{'&#xE000;': '0', '&#xE001;': '1'}
      }
   )
   => parse-xml()"
   select="/">
   <root>
      <node type="character-map">100</node>
   </root>
</x:expect>

It isn't pretty, although the details could potentially be hidden away in a helper stylesheet.

The technique I used is similar to the one in #227 (comment) .

@galtm
Copy link
Member

galtm commented May 15, 2024

From a code coverage perspective, the machinations to make XSpec serialize and reparse the result are not helpful, because they don't tell you whether running the transformation outside XSpec will "hit" or use the character map. Would making the XSLT use a character map in a serialize() expression cause XSpec to apply the character map without extra code on the XSpec side? Or does that not work because serialize() expresses the character map differently instead of referring to xsl:character-map?

@galtm
Copy link
Member

galtm commented May 16, 2024

It looks like you've already noted that Saxon does not trace character maps, even when it uses them. But if you need a way in XSpec to exercise a character map without extra XSpec code afterward, you could modify your match="xsl-character-map" template to call transform() with the option 'delivery-format': 'serialized'. Then, the XSpec file you attached in #1921 (comment) works fine. Does this help?

xsl-character-map-01-with-transform.xsl.zip

@birdya22
Copy link
Collaborator Author

I'm going for the very simple option, unless you would prefer me to do something else. My very simple option is:
<xsl:output-character character="0" string="0" />
Use the same value in the character and string attributes (for both 0 and 1). This gets xspec to work and I'm ok with Saxon working.

Also, I've raised another issue with Saxonica 6428 called "Output of xsl:on-non-empty changes when using TraceListener".
Basically you get different results from xsl:on-non-empty when you switch trace on and your using a sequence constructor for xsl:on-non-empty.
I haven't checked if anything else is affected like xsl:on-empty or xsl:where-populated as I'm sure Saxonica will investigate.
Adrian

@galtm
Copy link
Member

galtm commented May 16, 2024

Simplicity is good, and so is reporting an issue you found along the way. Maybe Saxon 13 will be a great release for tracing.

@galtm
Copy link
Member

galtm commented May 17, 2024

Adrian, I looked at your ZIP-file a little more, and it is really a wealth of information. I have some questions:

  1. Do you think "Column=0 in xspec and Saxon trace." in xsl:evaluate reflects a Saxon bug?

  2. Do you think it's reasonable to request that Saxon include xsl:perform-sort in its trace data?

  3. Do you think "Column=0 in xspec trace. Not in Saxon trace." in xsl:assert, xsl:on-non-empty, and xsl:try reflects an XSpec bug?

  4. Can you clarify your rule for xsl:param, "if global (parent is xsl:stylesheet or xsl:transform) or parent is xsl:template miss."? Are you saying that Saxon traces global params and template params, so the rule should be "mark as hit if Saxon says it's hit, and mark as missed otherwise"?

  5. If we add a new "unknown" category to the report, we'll need to explain to users how it differs from the "ignore" category. Are you thinking that "ignore" is a design decision that can stand on its own logic, whereas "unknown" is a result of insufficient information?

@birdya22
Copy link
Collaborator Author

As you're not around next week I'll have a think about your questions and reply later.
I've nearly got the individual test cases done and will look to run them through "Running the Test Suite Locally" next week.
I'm also going to try and put together some information about the impact, or not, or using the linked tree and turning optimization off, but I probably need some different test cases because the MainStylesheet.xsl and my individual test cases don't reflect real use and probably won't be affected very much by the linked tree and optimization changes.
Adrian

@birdya22
Copy link
Collaborator Author

Starting with replies to your questions.

Do you think "Column=0 in xspec and Saxon trace." in xsl:evaluate reflects a Saxon bug?

Yes although from the comments from Michael Kay I'm not sure Saxonica will do anything about it until version 13 when it all might change.

Do you think it's reasonable to request that Saxon include xsl:perform-sort in its trace data?

I would say yes because it is an instruction.

Do you think "Column=0 in xspec trace. Not in Saxon trace." in xsl:assert, xsl:on-non-empty, and xsl:try reflects an XSpec bug?

There is a change in my understanding of xsl:assert and xsl:try (see the XML file) as the TryCatch class trace appears on the line after the xsl:try and depends on what the first element is in xsl:try.
But, no I don't think it is an XSpec bug.

Can you clarify your rule for xsl:param, "if global (parent is xsl:stylesheet or xsl:transform) or parent is xsl:template miss."? Are you saying that Saxon traces global params and template params, so the rule should be "mark as hit if Saxon says it's hit, and mark as missed otherwise"?

Yes.

If we add a new "unknown" category to the report, we'll need to explain to users how it differs from the "ignore" category. Are you thinking that "ignore" is a design decision that can stand on its own logic, whereas "unknown" is a result of insufficient information?

Basically yes. From my tests I think "unknown" will be for elements that use a select attribute and are not traced. If the elements has a sequence constructor it is possible to use the state of the children to determine the state of the parent. I added some more tests in for this and the results are in the updates to the XSLT file.

Here are my thoughts on Optimization and Tree Models.

Optimization

Having played with a bit of Java I've discovered that HE only supports the following optimization values: lmtv.
From a bit of testing, changing the values seems to just affect xsl:variable (but there may be other things as well).
See https://saxonica.plan.io/issues/6415 for a Saxonica comment on tracing and optimization.
With optimization turned on the trace can miss variables that are used.
With optimization turned off the trace can hit variables that are not used.
So neither option is great.
I've don't know what happens with PE or EE.

Optimization would need to be configured in the "Running Tests" part where users could change it by setting a value in the SAXON_CUSTOM_OPTIONS environment variable (this would override a setting in a new saxon config file that could be used although you could set it as a command line option after the SAXON_CUSTOM_OPTIONS).
My suggestion would be to stick with the default optimization and add a warning on the Code Coverage Wiki page that some elements may not appear to be hit because of optimization and possibly describe how to see what optimization does rather than try and explain why variables are hit when there is no optimization (outside of xspec they can set the config item traceOptimizerDecisions to display what Saxon does during optimization. Some examples of the optimization are "Eliminated trivial variable", "Eliminated unused variable", "Inlined constant variable" and these are what causes issues with xsl:variables traced or not).
This also eliminates the Saxonica warning that disabling optimization can affect streaming.

Tree Model

Based on #1920 and https://saxonica.plan.io/issues/6405 with the fact that using the LinkedTree when generating the Coverage Report gives better column number values my suggestion is that the coverage-report-config.xml file has the treeModel="LinkedTree" set in it (I can provide details of the difference it makes to my test set if you want).

Are you thinking about a new release of XSpec before a new release of Saxon or waiting until after the next release?

@birdya22
Copy link
Collaborator Author

It would help if I attached something.
20240526Update.zip

@galtm
Copy link
Member

galtm commented Jun 4, 2024

@birdya22 : I've reviewed all your per-element coverage tests and assessed the effects of #1918 on the results. I'll wait to hear from you about attribution before setting up a pull request with the per-element coverage tests. (I could upload a ZIP-file so you could see the files, but then you wouldn't have the Git commits' comments and groupings.)

In the meantime, there are parts of this conversation I still need to digest that would feed into subsequent pull requests.

@birdya22
Copy link
Collaborator Author

birdya22 commented Jun 5, 2024

You mentioned GitHub being a mystery, so if it is OK with you, I'll create a pull request with your files plus my modifications.

Yes.

To attribute your work to you, should I include your name and GitHub user name in the descriptions of commits that contain your code and in the pull request description? Let me know what you're comfortable with as far as visibility and privacy go.

I'm ok with that.

@galtm
Copy link
Member

galtm commented Jun 5, 2024

if it is OK with you, I'll create a pull request with your files plus my modifications.

Yes.

I created #1932. I can't add you the list of official code reviewers (due to permissions, I think), but you should be able to add comments in the body of the pull request or on a per-line basis in the "Files Changed" tab.

@galtm
Copy link
Member

galtm commented Jun 7, 2024

I was thinking about putting the XsltElementData.xml file (from your ZIP-file in #1921 (comment)) into this repository somewhere. Especially in the absence of a specification document for the code coverage reporting feature, your file could

  • Help XSpec contributors review existing behavior and discuss possible changes to it.
  • Help someone research the changes in the design over time. If a pull request that changes the way the report depicts a certain element also includes the corresponding update in this data file, then the history of the data file makes it easier to find related changes.
  • Serve as a reference for contributors and users.

@birdya22 : Do you think that makes sense? If so, should I use the file from that latest ZIP-file as a starting point?

@birdya22
Copy link
Collaborator Author

birdya22 commented Jun 7, 2024

Do you think that makes sense?

Yes I think it makes sense.

If so, should I use the file from that latest ZIP-file as a starting point?

I would like to tidy it up first:

  • ideally produce a set of named rules (with a separate description of what they mean. There will need to be an ElementSpecific rule)
  • provide an overall description

What do you think about the format? An XML file was easy to produce but not easy to view which is why I produced a xsl stylesheet to turn it into a set of tables. What would work best on here?

@galtm
Copy link
Member

galtm commented Jun 9, 2024

What do you think about the format? An XML file was easy to produce but not easy to view which is why I produced a xsl stylesheet to turn it into a set of tables. What would work best on here?

Good questions. I'd like something that is easy for contributors to modify, is easy for anyone to read, and leads to understandable diffs.

I think your existing XML/XSLT/HTML approach is a good choice. A disadvantage, though, is having to generate the HTML file each time the XML changes. While GitHub can automate that, I'm not sure (without further GitHub research) if such automation can be done in a way that makes the HTML persist indefinitely instead of having an expiration date. For things like log files, expiration dates make sense.

We can make the browser apply XSLT dynamically to avoid having to keep a generated HTML file in sync with the XML source. However, I believe that that works only in some circumstances. In the simple dynamic-XSLT experiments I tried, an HTML previewer like html-preview.github.io (example: rendered HTML vs. raw markup if you don't use the previewer) didn't work, and viewing the experimental pages locally required a local web server rather than the file: protocol. FYI, the 3 experiments I tried used: a PI like <?xml-stylesheet type="text/xsl" href="example.xsl"?> at the top of the XML file; JavaScript to call the browser's native XSLT 1.0 support; and Saxon-JS.

I suppose HTML could be a source format -- with one table, not the different ways of presenting the same data. The html-preview.github.io renderer should work for viewing, without anyone having to convert XML to HTML.

Markdown renders nicely in GitHub and Oxygen, but I'm concerned that it will be a pain to work with a Markdown table that has lengthy content in some of the cells. Even if we use a WYSIWYG Markdown editor to facilitate modifying the Markdown file, diffs will be ugly.

Unless you prefer HTML or Markdown as a source format, my suggestion is to stick to the path you're already on. If you want to take some time to think it over or try experiments, no problem.

@birdya22
Copy link
Collaborator Author

I've had a play with Markdown and restructured the output so each element is in it's own details/summary section with a small table (there aren't any headings in the table and it produces a couple of blank cells).
I moved the comments under a Comment header and into multiple lines so that they will be easier to diff.
I added some descriptions at the top.

I still need to do more checking on the content but what do you think of the format?
I've generated it from the XML data so it's easy to amend the format at the moment.
xsltElementDetails.md

@galtm
Copy link
Member

galtm commented Jun 10, 2024

I like the format. I think it mitigates my concerns about tables with lengthy cell contents.

A few suggestions, especially while it is easy to regenerate from your XML data:

  • Consider not using <details> and instead using the XSLT element name as a heading (e.g., ### xsl:accept). While <details> makes it pretty easy to find a particular XSLT element in the browser-rendered list, GitHub provides a navigation aid that accomplishes the same thing without the disadvantage of hiding content from the browser's "find in page" functionality. To see the navigation aid, view https://github.com/xspec/xspec/blob/master/README.md, click Preview in the Preview/Code/Blame choice at the top left, and then all the way to the right of that, click the icon on the top right that looks like a bulleted list.
  • This repo uses a tool called prettier to enforce consistent formatting of certain file types, including Markdown. For instance, I think it complains when there isn't a blank line between a paragraph and a bulleted list. If you have access to Linux, can you try putting your Markdown file in your clone of this repo and then running the script at test/ci/run-prettier.sh? I don't know how to run prettier on Windows, so if you find a way, please let me know! (I end up having to push files to GitHub, see what happens, fix any errors, and push the fixes to GitHub.) If you're not able to run prettier locally, don't worry about it.
  • In the top matter, do you need the first level of bullets for - Categories and - Rules?
  • There are a few typos, where "it's parent" and "it's own" should use "its".

@birdya22
Copy link
Collaborator Author

  • Consider not using <details> and instead using the XSLT element name as a heading (e.g., ### xsl:accept).

Done.

  • This repo uses a tool called prettier to enforce consistent formatting of certain file types, including Markdown.

I installed prettier as an extension for MS VSCode. As an extension it doesn't seem to support the command line options so my only way of checking the changes was to get it to format my md document and then diff the 2 files.
The other option was to install Nodejs and npm which I haven't done.
I've fixed most of the things prettier complains about except for the fact it expands the table columns to the width of the longest string in the column - I can do this in my stylesheet for the initial version but anyone making changes to the file is unlikely to do it as part of any update.
I've attached my generated md file and the one that prettier produced.

  • In the top matter, do you need the first level of bullets for - Categories and - Rules?

No.

  • There are a few typos, where "it's parent" and "it's own" should use "its".

Yes.

I haven't made any changes to the actual content yet.
xsltElementDetails.md
xsltElementDetailsFormatted.md

@galtm
Copy link
Member

galtm commented Jun 10, 2024

I installed prettier as an extension for MS VSCode. As an extension it doesn't seem to support the command line options so my only way of checking the changes was to get it to format my md document

Great idea, thanks. I've just installed that extension as well.

I don't have further comments about the Markdown formatting. When you work further on the actual content, maybe the main heading should be more specific, including something about code coverage. XSLT Element Code Coverage? Code Coverage for XSLT Elements?

@galtm
Copy link
Member

galtm commented Jun 11, 2024

I had a thought about the content of the Markdown document: Wherever you know that a rule in the document is a proposed behavior rather than the current behavior, can you indicate that, such as by writing "(Proposed)"? It'll help us see where discussions or code changes are still to come. For example, I believe the rule you wrote for xsl:accumulator-rule is a proposal that isn't currently reflected in the code coverage reports.

@birdya22
Copy link
Collaborator Author

I'll have a think about that because the outcome may be the same but it may be that the code changes e.g. for the "Always Ignore" rule the result will still be "ignored" but there might be a code change to introduce a new template that matches all the elements to be ignored.

On a separate point I noticed in a Saxonica issue unrelated to XSPec Michael Kay wrote "We're working on final testing of a 12.5 maintenance release at the moment."

@birdya22
Copy link
Collaborator Author

In terms of discussions I've put "Element Specific - TBD" as a rule where I haven't proposed anything.
In terms of code changes I think they will be significant and it may be worth considering almost re-writing the mode="coverage" section from scratch (we know there are a number of tests in there that aren't sensible). Which leads to the question of how to validate that the code changes and subsequent coverage-report are correct (which is what you are pointing towards in your xsl:accumulator-rule point isn't it?).
Would you like me to write down what changes I expect in the coverage reports for the individual tests, ahead of, or independent of, any code changes? Or I could take the current coverage reports and produce an amended copy to reflect what I think they will look like after the code changes (this should just be a case of amending the class value because the report structure shouldn't change).

@birdya22
Copy link
Collaborator Author

I've done as much as I am going to on the XSLT Element Code Coverage file for now.

  • The attached version is the same as the prettier version
  • I've added a comment about where the elements are tested if it isn't the same as the current element
  • Feel free to change the file name
  • I went through all the *-coverage.xml files you produced to check what I've written
  • As mentioned above there are Element Specific - TBD rules where I haven't suggested what to do
  • The file includes xsl:package related elements (are you ok with leaving them in?)
  • The file includes elements that require Saxon-EE and haven't got any tests (xsl:fork and xsl:import-schema)
  • The file includes potential XSLT 4.0 elements with a TBD rule and no tests.
  • The elements which provide a challenge with respect to a decision are xsl:sort/xsl:perform-sort, xsl:try, xsl:merge and also xsl:variable
  • Is there any benefit in putting the version of Saxon that the file relates to, in the file, or just include it in a commit comment?

The potential XSLT 4.0 elements and the Saxon Extension Instructions (XSLT Only) raise the question as to the scope that XSpec covers with respect to Saxon. And whether it is, or needs to be, defined/documented somewhere.

I discovered an issue related to xsl:param when going through the tests. In the xsl-map-01.xsl test the xsl:template/xsl:param on line 10 isn't in the coverage.xml file. I think there needs to be some additional tests using a sequence constructor but need to investigate a bit more.

xsltElementDetails.md

@galtm
Copy link
Member

galtm commented Jun 11, 2024

Which leads to the question of how to validate that the code changes and subsequent coverage-report are correct (which is what you are pointing towards in your xsl:accumulator-rule point isn't it?).

Yes.

Would you like me to write down what changes I expect in the coverage reports for the individual tests, ahead of, or independent of, any code changes? Or I could take the current coverage reports and produce an amended copy to reflect what I think they will look like after the code changes (this should just be a case of amending the class value because the report structure shouldn't change).

If you have time to spend on recording your thoughts, whatever approach you prefer will be helpful. Thank you for suggesting this.

  • The file includes xsl:package related elements (are you ok with leaving them in?)

Sure.

I haven't looked at your last .md file yet, but I think I can do so by Thursday.

@birdya22
Copy link
Collaborator Author

I didn't include a blank line between the lines in the Comments so here's a better one.
xsltElementDetails.md

I've looked at the xsl:param test and added more tests and it has got quite tricky for the xsl:template case. I included some details in the .md file but I'm not sure I explained it very well.
The main complication is that the default value (as a sequence constructor) is traced if it is used. But the xsl:param trace is set on the first sequence constructor element which can make the first element appear 'hit' and any subsequent elements appear 'missed' if the default value is not used.
I need to look at it some more.

@birdya22
Copy link
Collaborator Author

Here's my list of changes I expect to see in the Coverage Reports based on the rules in the xsltelementDetails.md file (some Report changes are TBD as the Rule is TBD).
Changes in Coverage Reports.txt

The xsl:param test does need updating to include additional tests. How will this be handled?

@galtm
Copy link
Member

galtm commented Jun 13, 2024

Thank you for the list of changes!

The xsl:param test does need updating to include additional tests. How will this be handled?

Do you mean how will it be handled in GitHub? Yesterday, you wrote, "I've looked at the xsl:param test and added more tests". Whenever you're ready to share your updates, I can set up a pull request with them. If you think a new GitHub issue would be a better vehicle for sharing updated files than adding to this issue, feel free to make a new one.

About changing the treatment of xsl:transform and xsl:stylesheet: Yes, one way to simplify diffs, files' histories, and code reviews would be to make a change like that in its own pull request.

galtm added a commit to galtm/xspec that referenced this issue Jun 14, 2024
Document written by user birdya22 (Adrian Bird).

From xsltElementDetails.md file in comment
xspec#1921 (comment)
and formatted with `prettier`
galtm added a commit to galtm/xspec that referenced this issue Jun 14, 2024
Document written by user birdya22 (Adrian Bird).

From xsltElementDetails.md file in comment
xspec#1921 (comment)
and formatted with `prettier`
galtm added a commit to galtm/xspec that referenced this issue Jun 14, 2024
Document written by user birdya22 (Adrian Bird).

From xsltElementDetails.md file in comment
xspec#1921 (comment)
and formatted with `prettier`
@galtm
Copy link
Member

galtm commented Jun 14, 2024

Your xsltElementDetails.md file became the basis for #1934.

@galtm
Copy link
Member

galtm commented Jun 14, 2024

  • Is there any benefit in putting the version of Saxon that the file relates to, in the file, or just include it in a commit comment?

I do think there is benefit in putting the Saxon version in the document. Code coverage reporting has such a tight connection with Saxon behaviors, and some of those behaviors have changed significantly across releases. I added the version number as a section of the document (not just a simple one-liner) in case we have more to say in the future, such as subtle differences among 12.4, 12.5, and 13.x.

@birdya22
Copy link
Collaborator Author

Whenever you're ready to share your updates, I can set up a pull request with them. If you think a new GitHub issue would be a better vehicle for sharing updated files than adding to this issue, feel free to make a new one.

I decided a new issue 1935 would be easier because of what I wanted to write about it.

As I can't attach .xsl or .xspec file to an issue, is it better to add .txt to the end (which is what I did in this case) or put them in a zip file, or do something else?

@galtm
Copy link
Member

galtm commented Jun 14, 2024

I decided a new issue 1935 would be easier because of what I wanted to write about it.

As I can't attach .xsl or .xspec file to an issue, is it better to add .txt to the end (which is what I did in this case) or put them in a zip file, or do something else?

Fantastic, thanks. Your .txt workaround is fine. If there were lots of files instead of only a couple, we'd probably find a ZIP-file easier.

galtm added a commit that referenced this issue Jun 28, 2024
* Per-element doc for code coverage behavior

Document written by user birdya22 (Adrian Bird).

From xsltElementDetails.md file in comment
#1921 (comment)
and formatted with `prettier`

* Editorial-type changes

- Rule descriptions state the rule first and then describe motivation
- Tried to clarify xsl:template xsl:param description
- Various wordsmithing changes

* Add mention of Saxon version

* Clarify xsl:stylesheet and xsl:transform comment

* Include link to Saxonica issue (template param)

Plus a few attempted clarifications also in the xsl:param section

* Use term "coverage status" instead of category
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants