aggregate operators COUNT and SAMPLE should ignore NULL values #563

Closed
pchampin opened this issue Dec 23, 2015 · 1 comment · Fixed by #567
Labels: bug, fix-in-progress, SPARQL

@pchampin
Contributor

Consider the following query:

  SELECT ?x (COUNT(?y) as ?ys) (COUNT(?z) as ?zs) WHERE {
    VALUES (?x ?y ?z) {
      (2 6 UNDEF)
      (2 UNDEF 10)
      (3 UNDEF 15)
      (3 9 UNDEF)
    }
  }
  GROUP BY ?x

it should return the following tuples:

  2 1 1
  3 1 1

as, per the specification:

[COUNT] counts the number of times a given expression has a bound, and non-error value

But instead it returns the following tuples:

  2 2 2
  3 2 2
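
To make the expected COUNT semantics concrete, here is a minimal plain-Python sketch. It is not rdflib's implementation: the `ROWS` list, the use of `None` to stand in for UNDEF, and the helper name `count_bound` are all assumptions made for illustration.

```python
from collections import defaultdict

# Rows from the VALUES block above; None stands in for UNDEF (unbound).
ROWS = [
    {"x": 2, "y": 6, "z": None},
    {"x": 2, "y": None, "z": 10},
    {"x": 3, "y": None, "z": 15},
    {"x": 3, "y": 9, "z": None},
]


def count_bound(rows, group_var, counted_vars):
    """Group rows by group_var and count only bound (non-None) values.

    Hypothetical helper: it mirrors the spec wording "has a bound,
    and non-error value", ignoring the error case for simplicity.
    """
    counts = defaultdict(lambda: {var: 0 for var in counted_vars})
    for row in rows:
        group = counts[row[group_var]]
        for var in counted_vars:
            if row.get(var) is not None:  # skip unbound values
                group[var] += 1
    return {key: tuple(c[var] for var in counted_vars)
            for key, c in counts.items()}


count_result = count_bound(ROWS, "x", ["y", "z"])
print(count_result)  # {2: (1, 1), 3: (1, 1)}
```

Counting every row regardless of binding would instead give `(2, 2)` for each group, which matches the incorrect output shown above.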

There is a similar problem with the SAMPLE operator.
I would expect that:

  SELECT ?x (SAMPLE(?y) as ?ys) (SAMPLE(?z) as ?zs) WHERE {
    VALUES (?x ?y ?z) {
      (2 6 UNDEF)
      (2 UNDEF 10)
      (3 UNDEF 15)
      (3 9 UNDEF)
    }
  }
  GROUP BY ?x

return the following tuples:

  2 6 10
  3 9 15

but instead I get

  2 6 _
  3 _ 15

(where _ means NULL).

Here the specification is less explicit about how to handle NULL values,
but both Virtuoso and Corese give me the expected result, so there seems to be a consensus that SAMPLE should not return NULL values.

(In fact, when a sampled column contains only NULL values, both Virtuoso and Corese populate it with an artificial 0 value.)
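
The behaviour I expect from SAMPLE can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not rdflib's code: `None` stands in for UNDEF, `sample_bound` is a hypothetical helper, and returning `None` for a column with no bound value is one possible choice (Virtuoso and Corese use 0 there, as noted above).

```python
# Same rows as in the VALUES block; None stands in for UNDEF (unbound).
ROWS = [
    {"x": 2, "y": 6, "z": None},
    {"x": 2, "y": None, "z": 10},
    {"x": 3, "y": None, "z": 15},
    {"x": 3, "y": 9, "z": None},
]


def sample_bound(rows, group_var, sampled_vars):
    """Return, per group, the first bound value seen for each variable.

    Hypothetical helper: a variable with no bound value in a group
    stays None (an assumption; other engines may substitute 0).
    """
    samples = {}
    for row in rows:
        group = samples.setdefault(
            row[group_var], {var: None for var in sampled_vars}
        )
        for var in sampled_vars:
            # Only fill a slot that is still empty, and only with a bound value.
            if group[var] is None and row.get(var) is not None:
                group[var] = row[var]
    return {key: tuple(s[var] for var in sampled_vars)
            for key, s in samples.items()}


sample_result = sample_bound(ROWS, "x", ["y", "z"])
print(sample_result)  # {2: (6, 10), 3: (9, 15)}
```

Any bound value from the group would be an acceptable sample; taking the first one simply keeps the sketch deterministic.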

pchampin added a commit to pchampin/rdflib that referenced this issue Dec 23, 2015
@joernhees
Member

good catch

joernhees added this to the rdflib 4.2.2 milestone Dec 23, 2015
joernhees added the bug, fix-in-progress, and SPARQL labels Dec 23, 2015
joernhees added a commit that referenced this issue Feb 15, 2016
* master:
  Revert "Made ClosedNamespace (and _RDFNamespace) inherit from Namespace"
  Revert "re-introduces special handling for DCTERMS.title and test for it"
  Read and discard server response for update queries, otherwise the socket will eventually block
  Remove an unused import
  sparql select nothing doesn't return a row anymore, fixes #554
  test for #554
  minor: isinstance instead of type-equals
  adapted agg_Sample to return None instead of 0 if no binding is found
  SAMPLE may return None if no binding is found
  made test_issue563 py26 compatible
  fixing issue #563
2 participants