Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boolean operations on nodes #1106

Closed
joewiz opened this issue Aug 27, 2016 · 17 comments · Fixed by #1229
Closed

Boolean operations on nodes #1106

joewiz opened this issue Aug 27, 2016 · 17 comments · Fixed by #1229
Labels
bug issue confirmed as bug
Milestone

Comments

@joewiz
Copy link
Member

joewiz commented Aug 27, 2016

As reported by @dariok in #1097 and confirmed by both @dariok and me using builds of the develop branch as current as 12f422d, boolean operations on a node produce different results based on the order that the node is selected.

We have two reproducible tests, which we tested with eXide (2.2.0):

Test 1: In-memory node

xquery version "3.1";

let $node := 
    <root>
        <foo/>
        <bar/>
    </root>
let $x := $node/blah
let $y := $node/blah
return 
    <test>
        <result>{$x/foo and $x/bar}</result>
        <result>{$y/foo and $y/bar}</result> 
    </test>

Expected result:

<test>
    <result>false</result>
    <result>false</result>
</test>

Actual result:

<test>
    <result/>
    <result>false</result>
</test>

Test 2: On-database node

xquery version "3.1";

let $node := 
    <root>
        <foo/>
        <bar/>
    </root>
let $create-collection := xmldb:create-collection('/db', 'test')
let $store := xmldb:store('/db/test', 'test.xml', $node)
let $x := doc('/db/test/test.xml')/root
let $y := doc('/db/test/test.xml')/root
return 
    <test>
        <result>{$x/foo and $x/bar}</result>
        <result>{$y/foo and $y/bar}</result> 
    </test>

Expected result:

<test>
    <result>true</result>
    <result>true</result>
</test>

Actual result:

<test>
    <result/>
    <result>true</result>
</test>
@jensopetersen
Copy link
Contributor

jensopetersen commented Aug 28, 2016

Very strange.

xquery version "3.1";

let $node := 
    <root>
        <foo/>
        <bar/>
    </root>
let $x := $node/blah
let $y := 'gibberish'
return 
    $x/foo and $x/bar

Commenting out the $y assignment (or swapping the two assignments) evaluates as one item, false(), but as it is this query returns no items, which is wrong, for a boolean evaluation should always result in one item.

If we return

(if ($x/foo) then true() else false()) and (if ($x/bar) then true() else false())

everything is fine, no matter what intervenes between $x and its evaluation, but this actually does not help explain what is going on (see below).

However, if we return

functx:sequence-type($x/foo and $x/bar)

we see the difference: $x/foo and $x/bar evaluates (wrongly) to empty-sequence() if the $y assignment intervenes, whereas it evaluates to xs:boolean (as it should) if the $y assignment is commented out.

(if ($x/foo) then true() else false()) and (if ($x/bar) then true() else false())

of course also comes out as expected, since the empty sequence and the value false here have the same effect.

@joewiz
Copy link
Member Author

joewiz commented Aug 28, 2016

Useful insights, Jens! Particularly relating to the arbitrariness of values assigned to $y.

@jensopetersen
Copy link
Contributor

The bug is not connected to the evaluation of booleans, but more generally to query execution when a variable is bound to an empty sequence. Try

for $z in ($x, $y) return functx:sequence-type($z)

The problem is that $x is discarded.

@joewiz
Copy link
Member Author

joewiz commented Aug 29, 2016

Wow, that's crazy! Thanks for your investigations and insights, Jens!

@joewiz
Copy link
Member Author

joewiz commented Aug 29, 2016

Wait, shouldn't an empty member of a sequence be discarded?

@joewiz
Copy link
Member Author

joewiz commented Aug 29, 2016

i.e., count(($x, $y)) = count($y)

@jensopetersen
Copy link
Contributor

jensopetersen commented Aug 29, 2016

Yes, $x/foo and $x/bar evaluates (wrongly) to empty-sequence(), and is discarded. It is not that an empty sequence is discarded that's the problem, but that the boolean gets mangled into an empty sequence. - I must say that this looks weird to me.

@jensopetersen
Copy link
Contributor

jensopetersen commented Aug 29, 2016

My example with for $z in ($x, $y) return functx:sequence-type($z)for $z in ($x, $y) return functx:sequence-type($z) doesn't tell anything, that's right, @joewiz.

@joewiz
Copy link
Member Author

joewiz commented Aug 29, 2016

Agreed, this is really weird. I wasn't able to see this phenomenon outside the context of boolean expressions, but who knows the extent of this issue?

@jensopetersen
Copy link
Contributor

Well, returning booleans is not that common; I think we can be pretty sure that booleans behave properly in conditionals.

@dariok
Copy link

dariok commented Aug 29, 2016

@jensopetersen I'm afraid, not. That's what brought me to my original question on SO and then to open the bug report. Cf. this example:

xquery version "3.1";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

let $node := doc('/db/test/test.xml')/id('accolti_pietro')/tei:persName

let $tr := if ($node/tei:surname and $node/tei:forename) then ", " else ""
let $tr2 := if ($node/tei:surname and count($node/tei:forename)>0) then ", " else ""

return 
    <test>
        <result>{concat($node/tei:surname, $tr, $node/tei:forename)}</result>
        <result>{concat($node/tei:surname, $tr2, $node/tei:forename)}</result>
    </test>

with /db/test/test.xml as follows:

<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <person xml:id="accolti_pietro">
        <persName>
            <forename>Pietro</forename>
            <surname>Accolti</surname>
        </persName>
    </person>
</TEI>

returns

<test>
    <result>AccoltiPietro</result>
    <result>Accolti, Pietro</result>
</test>

@jensopetersen
Copy link
Contributor

Well, I guess I was trying to comfort myself, but it still has nothing to do with booleans, for with

let $tr := if (doc('/db/test/test.xml')/id('accolti_pietro')/tei:persName/tei:surname and $node/tei:forename) then ", " else ""

you get the expected outcome. Booleans are a way to check what is going wrong, but the problem lies in how variables are handled in the flow of a query.

I now see Joe's point about the potential extent of this issue. We have to see how else we can test this problem.

@dariok
Copy link

dariok commented Aug 30, 2016

Let me know if there's anything I can do to help.

@jensopetersen
Copy link
Contributor

It is curious how little it takes to "wake up" the variable evaluated. You used count(), but it also works with

let $tr := if (boolean($node/tei:surname) and $node/tei:forename) then ", " else ""
let $tr := if (exists($node/tei:surname) and $node/tei:forename) then ", " else ""
let $tr := if (unordered($node/tei:surname) and $node/tei:forename) then ", " else ""
let $tr := if (root($node/tei:surname) and $node/tei:forename) then ", " else ""

@shabanovd
Copy link
Member

There are few bugs at OpAnd (and OpOr). I was able to fix in-memory nodes processing, stored nodes fixing in progress.

BTW, is it possible to write tests for or? Asking for it because I'm quite sure that it have similar bugs.

@joewiz
Copy link
Member Author

joewiz commented Aug 30, 2016

@shabanovd That's great news! Thank you for looking into this. I was able to find a bug with or in the following on-disk scenario:

xquery version "3.1";

let $node := 
    <root>
        <foo/>
        <bar/>
    </root>
let $create-collection := xmldb:create-collection('/db', 'test')
let $store := xmldb:store('/db/test', 'test.xml', $node)
let $x := doc('/db/test/test.xml')/root
let $y := doc('/db/test/test.xml')/root
return 
    <test>
        <result>{$x/foo or $x/bar}</result>
        <result>{$y/foo or $y/bar}</result> 
    </test>

Expected:

<test>
    <result>true</result>
    <result>true</result>
</test>

Actual:

<test>
    <result>false</result>
    <result>true</result>
</test>

I haven't been able to get the same effect with in-memory tests despite several variations of the original (e.g., binding $x and $y to $node, $node/blah, $node/self::root, etc.). These in-memory variants all produced expected results, not buggy ones.

@dizzzz dizzzz added this to the eXist-3.0 milestone Sep 10, 2016
wolfgangmm added a commit to wolfgangmm/exist that referenced this issue Jan 15, 2017
…necessary optimizations. Performance tests did not show any regression. Added tests for boolean operators. Fixes issue eXist-db#1106.
@joewiz
Copy link
Member Author

joewiz commented Jan 23, 2017

I can confirm that since @wolfgangmm's PR has been merged, the test cases I provided in this issue are all returning expected results. It looks like the PR contained a thorough set of tests to prevent future regressions. Many thanks, Wolfgang!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug issue confirmed as bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants