Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SKIPLIST index not used in AQL query #2116

Closed
1 task done
thierry-github opened this issue Oct 16, 2016 · 4 comments
Closed
1 task done

SKIPLIST index not used in AQL query #2116

thierry-github opened this issue Oct 16, 2016 · 4 comments

Comments

@thierry-github
Copy link

my environment running ArangoDB

I'm using the latest ArangoDB of the respective release series:

  • 3.0 (3.0.10)

On this operating system:

  • [x ] Windows, version: 10

I'm issuing AQL via:

  • [x ] web interface with this browser: CHROME running on this OS: WIN10

In the documentation, it's written :

A skiplist index will only be used if at least its first attribute is used in a FILTER condition
https://docs.arangodb.com/3.0/Manual/Indexing/IndexUtilization.html

i have on collection named "content" with the following index :

  • type : skiplist
  • unique : false
  • sparse : true
  • fields : [field_one , field_two]

i'have one record in the collection :
{
"field_one": "one",
"field_two": "two"
}

using this AQL query works correctly but doesn't involve index :
for record in content filter record.field_one == "one" return record
Associated explain result :

Query string:
for record in content filter record.field_one == "one" return record

Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 EnumerateCollectionNode 1 - FOR record IN content /* full collection scan /
3 CalculationNode 1 - LET #1 = (record.field_one == "one") /
simple expression / / collections used: record : content */
4 FilterNode 1 - FILTER #1
5 ReturnNode 1 - RETURN record

Indexes used:
none

Optimization rules applied:
none

using this query makes use of index :
for record in content FILTER record.field_one == "one" AND record.field_two == "two" return record
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 skiplist content false true n/a [ field_one, field_two ](%28record.field_one ==) && (record.field_two == "two"))

Am 'i missing somthing in index utilization ?

@DeShadow
Copy link

DeShadow commented Oct 16, 2016

@thierry-github You use sparse index.
The sparse index doesn't store documents with null values. That's why when you make second query for record in content FILTER record.field_one == "one" AND record.field_two == "two" return record
the optimiser knows that no documents with field_one == null or field_two == null will be in the result of query.

But when you make query for record in content filter record.field_one == "one" return record, optimiser knows that there can be record with field_two == null. But this record not in index, because this index is sparse. That's why full scan is required.

You can change your query to:

for record in content filter record.field_one == "one" && record._field_two != null return record

or

for record in content filter record.field_one == "one" && record._field_two > null return record

These two queries say to optimiser that there are no records with null values in result. And optimiser will use your index. :)

Or you can simply use non-sparse index, which indexes documents with null-values too.

@thierry-github
Copy link
Author

thierry-github commented Oct 16, 2016

just tried this
for record in content filter record.field_one == "one" AND record.field_two != null return record

but same result. Here is the explanation result

Query string:
for record in content filter record.field_one == "one" AND record.field_two != null return record

Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 EnumerateCollectionNode 1 - FOR record IN content /* full collection scan /
3 CalculationNode 1 - LET #1 = ((record.field_one == "one") && (record.field_two != null)) /
simple expression / / collections used: record : content */
4 FilterNode 1 - FILTER #1
5 ReturnNode 1 - RETURN record

Indexes used:
none

Optimization rules applied:
none
but the second one works
for record in content filter record.field_one == "one" && record.field_two > null return record
Query string:
for record in content filter record.field_one == "one" && record.field_two > null return record

Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
6 IndexNode 1 - FOR record IN content /* skiplist index scan */
5 ReturnNode 1 - RETURN record

Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 skiplist content false true n/a [ field_one, field_two ](%28record.field_one ==) && (record.field_two > null))

Optimization rules applied:
Id RuleName
1 use-indexes
2 remove-filter-covered-by-index
3 remove-unnecessary-calculations-2

@DeShadow
Copy link

@thierry-github Optimiser doesn't works good with expression !=, that's why it's not use index in obvious case. :( ArangoDB knows about it and will improve it :)

@thierry-github
Copy link
Author

@DeShadow As you said that ArangoDB knows about it, i close the issue. Thanks for your answers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants