Enable multiQuery optimization for has step [cql-tests] [tp-tests]
Fixes #3244

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov committed Jun 6, 2023
1 parent d435987 commit 8e88b60
Showing 31 changed files with 1,265 additions and 315 deletions.
4 changes: 4 additions & 0 deletions docs/changelog.md
@@ -185,6 +185,10 @@ GraphBinary is now used as the default MessageSerializer.hb
`query.batch` is now a configuration namespace. Thus, previous `query.batch` configuration is replaced by `query.batch.enabled`.
`query.limit-batch-size` configuration option is changed to `query.batch.limited`.

`query.batch-property-prefetch` was replaced by a more configurable option. If the previous behaviour is desired, use
`query.batch.has-step-mode = none` as a replacement for `query.batch-property-prefetch = false`, or use
`query.batch.has-step-mode = all_properties` as a replacement for `query.batch-property-prefetch = true`.

[Batch processing](https://docs.janusgraph.org/operations/batch-processing/) allows JanusGraph to fetch a batch of
vertices from the storage backend together instead of requesting each vertex individually which leads to a high number
of backend queries.
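
The `query.batch-property-prefetch` replacement described in the changelog hunk above can be sketched as a small mapping over a flat configuration dictionary. This is only an illustration: the function name `migrate_prefetch_option` and the dictionary representation are hypothetical and are not part of JanusGraph; only the option names and values come from the changelog text.

```python
# Hypothetical helper, not a JanusGraph API: rewrites the removed option
# into its replacement as described in the changelog entry.
OLD_KEY = "query.batch-property-prefetch"
NEW_KEY = "query.batch.has-step-mode"


def migrate_prefetch_option(config: dict) -> dict:
    """Return a copy of `config` with the removed option replaced."""
    migrated = dict(config)
    if OLD_KEY not in migrated:
        # Nothing to migrate; the new option keeps whatever value it has.
        return migrated

    old_value = str(migrated.pop(OLD_KEY)).strip().lower()
    # Assumption: an explicitly set new option takes precedence over the old one.
    if NEW_KEY not in migrated:
        # true -> all_properties, false -> none (per the changelog).
        migrated[NEW_KEY] = "all_properties" if old_value == "true" else "none"
    return migrated


if __name__ == "__main__":
    print(migrate_prefetch_option({OLD_KEY: "false"}))
    print(migrate_prefetch_option({OLD_KEY: "true", "query.batch.enabled": "true"}))
```

Configurations that never set the old option are left unchanged, so they pick up the new default of `required_and_next_properties` rather than the old default of `false`.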
9 changes: 8 additions & 1 deletion docs/configs/janusgraph-cfg.md
@@ -346,7 +346,6 @@ Configuration options for query processing

| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch-property-prefetch | Whether to do a batched pre-fetch of all properties on adjacent vertices against the storage backend prior to evaluating a has condition against those vertices. Because these vertex properties will be loaded into the transaction-level cache of recently-used vertices when the condition is evaluated this can lead to significant performance improvement if there are many edges to adjacent vertices and there is a non-trivial latency to the backend. | Boolean | false | MASKABLE |
| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties | Boolean | true | MASKABLE |
| query.force-index | Whether JanusGraph should throw an exception if a graph query cannot be answered using an index. Doing so limits the functionality of JanusGraph's graph queries but ensures that slow graph queries are avoided on large graphs. Recommended for production use of JanusGraph. | Boolean | false | MASKABLE |
| query.hard-max-limit | If smart-limit is disabled and no limit is given in the query, query optimizer adds a limit in light of possibly large result sets. It works in the same way as smart-limit except that hard-max-limit is usually a large number. Default value is Integer.MAX_VALUE which effectively disables this behavior. This option does not take effect when smart-limit is enabled. | Integer | 2147483647 | MASKABLE |
@@ -363,6 +362,14 @@ Configuration options to configure batch queries optimization behavior
| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE |
| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all_properties` Pre-fetch all vertex properties on any property access<br>- `required_properties_only` Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps<br>- `required_and_next_properties` Pre-fetch the same properties as with `required_properties_only` mode, but also pre-fetch properties which may be needed in the next properties access step like `values`, `properties`, `valueMap`, or `elementMap`. In case the next step is not one of those properties access steps then this mode behaves the same as `required_properties_only`. In case the next step is one of the properties access steps with a limited scope of properties, those properties will be pre-fetched together in the same multi-query. In case the next step is one of the properties access steps with an unspecified scope of property keys then this mode behaves the same as `all_properties`.<br>- `required_and_next_properties_or_all` Pre-fetch the same properties as with `required_properties_only`, but in case the next step is not `values`, `properties`, `valueMap`, or `elementMap` then acts like `all_properties`.<br>- `none` Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.limited | Configure a maximum batch size for queries against the storage backend. This can be used to ensure responsiveness if batches tend to grow very large. The used batch size is equivalent to the barrier size of a preceding `barrier()` step. If a step has no preceding `barrier()`, the default barrier of TinkerPop will be inserted. This option only takes effect if `query.batch.enabled` is `true`. | Boolean | true | MASKABLE |
| query.batch.limited-size | Default batch size (barrier() step size) for queries. This size is applied only in cases where `LazyBarrierStrategy` didn't apply a `barrier` step and the user didn't apply a barrier step either. This option is used only when `query.batch.limited` is `true`. Note that the value `2147483647` is considered to be unlimited. | Integer | 2500 | MASKABLE |
| query.batch.repeat-step-mode | Batch mode for `repeat` step. Used only when `query.batch.enabled` is `true`.<br>These modes control how child steps with batch support behave if they are placed at the start of the `repeat`, `emit`, or `until` traversals.<br>Supported modes:<br>- `closest_repeat_parent` Child start steps receive vertices for batching from the closest `repeat` step parent only.<br>- `all_repeat_parents` Child start steps receive vertices for batching from all `repeat` step parents.<br>- `starts_only_of_all_repeat_parents` Child start steps receive vertices for batching from the closest `repeat` step parent (both for the parent start and for next iterations) and also from all `repeat` step parents for the parent start. | String | all_repeat_parents | MASKABLE |
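
The `query.batch.has-step-mode` values in the table above can be read as a decision rule for which property keys get pre-fetched. The sketch below is a reading aid only, not JanusGraph's implementation: the function `keys_to_prefetch`, the `ALL` sentinel, and the argument shapes are hypothetical, and the `required_and_next_properties_or_all` branch follows the table's wording literally, which may differ from the actual behaviour.

```python
# Illustrative model of the documented modes; all names here are hypothetical.
ALL = "ALL_PROPERTIES"  # sentinel meaning "pre-fetch every property"
PROPERTY_STEPS = {"values", "properties", "valueMap", "elementMap"}


def keys_to_prefetch(mode, has_keys, next_step=None, next_keys=()):
    """Return None (no pre-fetch), ALL, or a frozenset of property keys.

    has_keys  -- keys used by the chain of foldable `has` steps
    next_step -- name of the step following the `has` chain, if any
    next_keys -- keys requested by that step; empty means unspecified scope
    """
    required = frozenset(has_keys)
    following = frozenset(next_keys)
    next_is_property_step = next_step in PROPERTY_STEPS

    if mode == "none":
        return None
    if mode == "all_properties":
        return ALL
    if mode == "required_properties_only":
        return required
    if mode == "required_and_next_properties":
        if not next_is_property_step:
            return required
        # Unspecified key scope behaves like all_properties.
        return required | following if following else ALL
    if mode == "required_and_next_properties_or_all":
        # Literal reading of the table: required keys when a property access
        # step follows, otherwise everything.
        return required if next_is_property_step else ALL
    raise ValueError(f"unknown has-step-mode: {mode}")


if __name__ == "__main__":
    # Models something like g.V().has("name", ...).has("age", ...).values("city")
    print(keys_to_prefetch("required_and_next_properties",
                           ["name", "age"], "values", ["city"]))
```

Under the default mode, a `has` chain followed by `values("city")` would, in this model, fetch `name`, `age`, and `city` in one multi-query, while a bare `valueMap()` would fall back to fetching all properties.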
2 changes: 1 addition & 1 deletion docs/operations/batch-processing.md
@@ -158,7 +158,7 @@ when the query is accessing many vertices.
Batched query processing takes into account two types of steps:

1. Batch compatible step. This is the step which will execute batch requests. Currently, the list of such steps
-is the next: `out()`, `in()`, `both()`, `inE()`, `outE()`, `bothE()`, `values()`, `properties()`.
+is the next: `out()`, `in()`, `both()`, `inE()`, `outE()`, `bothE()`, `has()`, `values()`, `properties()`.
2. Parent step. This is a parent step which has local traversals with the same start. Such parent steps also implement the
interface `TraversalParent`. There are many such steps, but as for an example those could be: `and(...)`, `or(...)`,
`not(...)`, `order().by(...)`, `project("valueA", "valueB", "valueC").by(...).by(...).by(...)`, `union(..., ..., ...)`,