This repository was archived by the owner on Dec 13, 2023. It is now read-only.
Merged
2 changes: 0 additions & 2 deletions 3.7/indexing-persistent.md
Original file line number Diff line number Diff line change
@@ -184,5 +184,3 @@ the sort order of those which are returned can be wrong (whenever the persistent
index is consulted).

To fix persistent indexes after a language change, delete and re-create them.
Skiplist indexes are not affected, because they are not persisted and
automatically rebuilt on every server start.
2 changes: 0 additions & 2 deletions 3.8/indexing-persistent.md
@@ -184,5 +184,3 @@ the sort order of those which are returned can be wrong (whenever the persistent
index is consulted).

To fix persistent indexes after a language change, delete and re-create them.
Skiplist indexes are not affected, because they are not persisted and
automatically rebuilt on every server start.
7 changes: 7 additions & 0 deletions 3.9/http/indexes-multi-dim.md
@@ -0,0 +1,7 @@
---
layout: default
---
Working with multi-dimensional Indexes
=======================================

{% docublock post_api_index_zkd %}
135 changes: 135 additions & 0 deletions 3.9/indexing-multi-dim.md
@@ -0,0 +1,135 @@
---
layout: default
description: A multi-dimensional index lets you efficiently intersect multiple range queries
---
# Multi-dimensional indexes

The multi-dimensional index type (also called ZKD) provided by ArangoDB can be
used to efficiently intersect multiple range queries.

A multi-dimensional index is set up by setting the index type to `"zkd"`.
The `fields` attribute describes which fields are used as dimensions.
The value of each dimension has to be a numeric (double) value.

## Querying documents within a 3D box

Assume we have documents in a collection `points` of the form

```json
{"x": 12.9, "y": -284.0, "z": 0.02}
```

and we want to query all documents that are contained within a box defined by
`[x0, x1] * [y0, y1] * [z0, z1]`.

To do so one creates a multi-dimensional index on the attributes `x`, `y` and
`z`, e.g. in _arangosh_:

```js
db.points.ensureIndex({
  type: "zkd",
  fields: ["x", "y", "z"],
  fieldValueTypes: "double"
});
```

Unlike with other index types, the order of the fields does not matter.

`fieldValueTypes` is required and the only allowed value is `"double"`.
Future extensions of the index will allow other types.

Now we can use the index in a query:

```js
FOR p IN points
FILTER x0 <= p.x && p.x <= x1
FILTER y0 <= p.y && p.y <= y1
FILTER z0 <= p.z && p.z <= z1
RETURN p
```
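
To make the semantics of this query concrete, here is a minimal sketch in plain JavaScript that applies the same closed-interval test to an in-memory array. It does not use the index or any ArangoDB API; `inBox`, `points`, and `box` are hypothetical names chosen for illustration only.

```js
// Reference semantics of the box query, evaluated on an in-memory array.
// Not ArangoDB code: `inBox`, `points` and `box` are illustrative names.
function inBox(p, box) {
  // Every dimension must lie within its closed interval [lo, hi].
  return ["x", "y", "z"].every(
    (dim) => box[dim][0] <= p[dim] && p[dim] <= box[dim][1]
  );
}

const points = [
  { x: 12.9, y: -284.0, z: 0.02 },
  { x: 50.0, y: 10.0, z: 3.0 },
];

// [x0, x1] * [y0, y1] * [z0, z1]
const box = { x: [0, 20], y: [-300, 0], z: [0, 1] };

const hits = points.filter((p) => inBox(p, box));
console.log(hits);
```

The AQL query expresses the same condition; the difference is that the index lets the server avoid testing every document.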

## Possible range queries

Having an index on a set of fields does not require you to specify a full range
for every field. For each field you can decide if you want to bound
it from both sides, from one side only (i.e. only an upper or lower bound)
or not bound it at all.

Furthermore, you can use any comparison operator. The index supports `<=` and `>=`
natively; `==` is translated to the bound `[c, c]`. Strict comparisons
are translated to their non-strict counterparts and a post-filter is inserted.

```js
FOR p IN points
  FILTER 2 <= p.x && p.x < 9
  FILTER p.y >= 80
  FILTER p.z == 4
  RETURN p
```
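
The translation described above can be sketched in plain JavaScript. This is only an illustration of the idea (closed bounds plus a post-filter for strict comparisons), not ArangoDB's optimizer code; `toBounds` and the condition objects are hypothetical.

```js
// Sketch: turn comparisons on one field into a closed interval [lo, hi]
// plus post-filters. Hypothetical helper, not ArangoDB's implementation.
function toBounds(conditions) {
  let lo = -Infinity; // no lower bound yet
  let hi = Infinity;  // no upper bound yet
  const postFilters = [];
  for (const { op, value } of conditions) {
    switch (op) {
      case ">=":
        lo = Math.max(lo, value);
        break;
      case "<=":
        hi = Math.min(hi, value);
        break;
      case "==": // equality is the degenerate range [c, c]
        lo = Math.max(lo, value);
        hi = Math.min(hi, value);
        break;
      case ">": // widen to >= and re-check strictly afterwards
        lo = Math.max(lo, value);
        postFilters.push((v) => v > value);
        break;
      case "<": // widen to <= and re-check strictly afterwards
        hi = Math.min(hi, value);
        postFilters.push((v) => v < value);
        break;
      default:
        throw new Error(`unsupported operator: ${op}`);
    }
  }
  return { lo, hi, postFilters };
}

// 2 <= p.x && p.x < 9 becomes the closed range [2, 9] plus the check "x < 9".
const xBounds = toBounds([
  { op: ">=", value: 2 },
  { op: "<", value: 9 },
]);
const matchesX = (v) =>
  xBounds.lo <= v && v <= xBounds.hi && xBounds.postFilters.every((f) => f(v));
```

A field without any condition simply keeps the unbounded range `[-Infinity, Infinity]` in this sketch, which corresponds to not bounding that dimension at all.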

## Example Use Case

If you build a calendar using ArangoDB you could create a collection for each user
that contains her appointments. The documents would roughly look as follows:

```json
{
  "from": 345365,
  "to": 678934,
  "what": "Dentist"
}
```

`from`/`to` are the timestamps when an appointment starts/ends. Having a
multi-dimensional index on the fields `["from", "to"]` allows you to query
for all appointments within a given time range efficiently.

### Finding all appointments within a time range

Given a time range `[f, t]` we want to find all appointments `[from, to]` that
are completely contained in `[f, t]`. Those appointments clearly satisfy the
condition

```
f <= from and to <= t
```

Thus our query would be:

```js
FOR app IN appointments
FILTER f <= app.from
FILTER app.to <= t
RETURN app
```

### Finding all appointments that intersect a time range

Given a time range `[f, t]` we want to find all appointments `[from, to]` that
intersect `[f, t]`. Two intervals `[f, t]` and `[from, to]` intersect if
and only if

```
f <= to and from <= t
```

Thus our query would be:

```js
FOR app IN appointments
FILTER f <= app.to
FILTER app.from <= t
RETURN app
```
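
The two conditions (containment and intersection) are easy to mix up, so here is a small plain-JavaScript sketch that evaluates both on sample data. The helper names `isContained` and `intersects` and the sample appointments are made up for illustration; the comparisons mirror the two AQL queries.

```js
// Closed intervals: the query range is [f, t], an appointment is [from, to].
// Hypothetical helpers that mirror the FILTER conditions of the two queries.
function isContained(app, f, t) {
  return f <= app.from && app.to <= t;
}

function intersects(app, f, t) {
  return f <= app.to && app.from <= t;
}

const appointments = [
  { from: 10, to: 20, what: "Dentist" },
  { from: 18, to: 30, what: "Lunch" },
  { from: 40, to: 50, what: "Gym" },
];

const f = 15;
const t = 35;

const contained = appointments
  .filter((a) => isContained(a, f, t))
  .map((a) => a.what);
const overlapping = appointments
  .filter((a) => intersects(a, f, t))
  .map((a) => a.what);

console.log(contained);   // appointments fully inside [15, 35]
console.log(overlapping); // appointments that touch [15, 35] at all
```

Every contained appointment also intersects the range, so the first result is always a subset of the second.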

## Limitations

Currently there are a few limitations:

- Using array expansions for attributes is not possible (e.g. `array[*].attr`)
- The `sparse` property is not supported.
- You can only index numeric values that are representable as IEEE-754 double.
- A high number of dimensions (more than 5) can impact the performance considerably.
- The performance can vary depending on the dataset. Densely packed points can
lead to a high number of seeks. This behavior is typical for indexing using
space filling curves.
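
As background for the last point, the following is a generic sketch of a Z-order (Morton) space-filling curve on small unsigned integers. It is not ArangoDB's implementation, which indexes doubles; `interleave2` is a hypothetical name, and the 16-bit limit is an assumption made to keep the example short.

```js
// Interleave the bits of two 16-bit unsigned integers into one
// 32-bit Morton (Z-order) code. Generic illustration only.
function interleave2(x, y) {
  let code = 0;
  for (let bit = 0; bit < 16; bit++) {
    code |= ((x >>> bit) & 1) << (2 * bit);     // x fills the even bit positions
    code |= ((y >>> bit) & 1) << (2 * bit + 1); // y fills the odd bit positions
  }
  return code >>> 0; // reinterpret as an unsigned 32-bit value
}

// Neighbouring coordinates are not always neighbours on the curve.
console.log(interleave2(7, 0), interleave2(8, 0));
```

In this sketch the adjacent values `x = 7` and `x = 8` (with `y = 0`) map to the codes 21 and 64, which hints at why a box in space does not generally correspond to one contiguous range on the curve.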
2 changes: 0 additions & 2 deletions 3.9/indexing-persistent.md
@@ -184,5 +184,3 @@ the sort order of those which are returned can be wrong (whenever the persistent
index is consulted).

To fix persistent indexes after a language change, delete and re-create them.
Skiplist indexes are not affected, because they are not persisted and
automatically rebuilt on every server start.
20 changes: 19 additions & 1 deletion 3.9/indexing-which-index.md
@@ -94,6 +94,24 @@ different usage scenarios:
result TTL indexes will likely not be used for filtering and sort operations in user-land
AQL queries.

- **multi-dimensional index** (ZKD): a multi-dimensional index lets you
efficiently intersect multiple range queries. Typical use cases are querying
intervals that intersect a given point or interval. For example, if intervals
are stored in documents like

```json
{ "from": 12, "to": 45 }
```

then you can create an index over `from, to` and utilize it with this query:

```js
FOR i IN intervals FILTER i.from <= t && t <= i.to RETURN i
```

Currently only floating-point numbers (doubles) are supported as underlying
type for each dimension.

- **Geo index**: the geo index provided by ArangoDB allows searching for documents
within a radius around a two-dimensional earth coordinate (point), or to
find documents which are closest to a point. Document coordinates can either
@@ -110,7 +128,7 @@ different usage scenarios:
and a SORT or FILTER statement is used in conjunction with the distance
function.

- **fulltext index**: a fulltext index can be used to index all words contained in
- **fulltext index**: a fulltext index can be used to index all words contained in
a specific attribute of all documents in a collection. Only words with a
(specifiable) minimum length are indexed. Word tokenization is done using
the word boundary analysis provided by libicu, which is taking into account
1 change: 1 addition & 0 deletions 3.9/indexing.md
@@ -16,5 +16,6 @@ There are special sections for
- [Persistent Indexes](indexing-persistent.html)
- [TTL Indexes](indexing-ttl.html)
- [Fulltext Indexes](indexing-fulltext.html)
- [Multi-dimensional Indexes](indexing-multi-dim.html)
- [Geo-spatial Indexes](indexing-geo.html)
- [Vertex-centric Indexes](indexing-vertex-centric.html)
2 changes: 2 additions & 0 deletions _data/3.9-http.yml
@@ -84,6 +84,8 @@
href: indexes-persistent.html
- text: TTL
href: indexes-ttl.html
- text: Multi-dimensional
href: indexes-multi-dim.html
- text: Geo-Spatial
href: indexes-geo.html
- text: Fulltext
2 changes: 2 additions & 0 deletions _data/3.9-manual.yml
@@ -293,6 +293,8 @@
href: indexing-ttl.html
- text: Fulltext Indexes
href: indexing-fulltext.html
- text: Multi-dimensional Indexes
href: indexing-multi-dim.html
- text: Geo-spatial Indexes
href: indexing-geo.html
- text: Vertex Centric Indexes