Enable multiQuery optimization for has step

Fixes #3244

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
JanusGraph · May 9, 2023 · commit 6e04a8c
1 parent e332af6

Showing 26 changed files with 1,040 additions and 300 deletions.
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -269,6 +269,12 @@ In case previous behaviour is desired then `storage.cql.executor-service.enabled
 but it's recommended to tune CQL queries parallelism using CQL driver configuration options (like `storage.cql.max-requests-per-connection`, 
 `storage.cql.local-max-connections-per-host`) and / or `storage.parallel-backend-ops.*` configuration options.  
 
+##### `query.batch-property-prefetch` configuration option is removed
+
+`query.batch-property-prefetch` was replaced by the more configurable `query.has-step-batch-mode` option. To keep the previous behaviour,
+use `query.has-step-batch-mode = none` in place of `query.batch-property-prefetch = false`, or
+`query.has-step-batch-mode = all_properties` in place of `query.batch-property-prefetch = true`.
+
 ##### Removal of deprecated classes/methods/functionalities
 
 ###### Methods
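
The replacement rule in the changelog entry above can be sketched as a small migration helper. This is only an illustration: `migrate_batch_property_prefetch` is a hypothetical function, not part of JanusGraph, and it assumes the configuration has already been loaded into a plain dictionary of strings. Note that a configuration which never set the old option is left untouched, so it picks up the new default (`required_and_next_properties`) rather than the old default of `false`.

```python
from typing import Dict

OLD_KEY = "query.batch-property-prefetch"
NEW_KEY = "query.has-step-batch-mode"


def migrate_batch_property_prefetch(props: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of props with the removed option mapped to the new one.

    Hypothetical helper: true -> all_properties, false -> none, as described
    in the changelog. An explicitly set new option always wins.
    """
    migrated = dict(props)
    old_value = migrated.pop(OLD_KEY, None)  # the old key is dropped in all cases
    if old_value is None or NEW_KEY in migrated:
        return migrated
    is_enabled = old_value.strip().lower() == "true"
    migrated[NEW_KEY] = "all_properties" if is_enabled else "none"
    return migrated
```

To preserve the old default explicitly, set `query.has-step-batch-mode = none` in configurations that never mentioned `query.batch-property-prefetch`.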

diff --git a/docs/configs/janusgraph-cfg.md b/docs/configs/janusgraph-cfg.md
@@ -347,10 +347,17 @@ Configuration options for query processing
 | Name | Description | Datatype | Default Value | Mutability |
 | ---- | ---- | ---- | ---- | ---- |
 | query.batch | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. | Boolean | true | MASKABLE |
-| query.batch-property-prefetch | Whether to do a batched pre-fetch of all properties on adjacent vertices against the storage backend prior to evaluating a has condition against those vertices. Because these vertex properties will be loaded into the transaction-level cache of recently-used vertices when the condition is evaluated this can lead to significant performance improvement if there are many edges to adjacent vertices and there is a non-trivial latency to the backend. | Boolean | false | MASKABLE |
 | query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties | Boolean | true | MASKABLE |
 | query.force-index | Whether JanusGraph should throw an exception if a graph query cannot be answered using an index. Doing so limits the functionality of JanusGraph's graph queries but ensures that slow graph queries are avoided on large graphs. Recommended for production use of JanusGraph. | Boolean | false | MASKABLE |
 | query.hard-max-limit | If smart-limit is disabled and no limit is given in the query, query optimizer adds a limit in light of possibly large result sets. It works in the same way as smart-limit except that hard-max-limit is usually a large number. Default value is Integer.MAX_VALUE which effectively disables this behavior. This option does not take effect when smart-limit is enabled. | Integer | 2147483647 | MASKABLE |
+| query.has-step-batch-mode | Properties pre-fetching mode for the `has` step. Used only when `query.batch` is enabled.<br>Supported modes:<br>- `all_properties` Pre-fetch all vertex properties on any property access<br>- `required_properties_only` Pre-fetch only the vertex properties needed by the whole chain of foldable `has` steps<br>- `required_and_next_properties` Pre-fetch the same properties as `required_properties_only` mode, but also pre-fetch
+properties which may be needed by the next property access step, such as `values`, `properties`, `valueMap`, or `elementMap`.
+If the next step is not one of those property access steps, this mode behaves the same as `required_properties_only`.
+If the next step is a property access step with a limited scope of property keys, those properties are
+pre-fetched together in the same multi-query.
+If the next step is a property access step with an unspecified scope of property keys, this mode
+behaves the same as `all_properties`.<br>- `required_and_next_properties_or_all` Pre-fetch the same properties as `required_properties_only`, but if the next step is not
+`values`, `properties`, `valueMap`, or `elementMap`, act like `all_properties`.<br>- `none` Skip the `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
 | query.ignore-unknown-index-key | Whether to ignore undefined types encountered in user-provided index queries | Boolean | false | MASKABLE |
 | query.index-select-strategy | Name of the index selection strategy or full class name. Following shorthands can be used: <br>- `brute-force` (Try all combinations of index candidates and pick up optimal one)<br>- `approximate` (Use greedy algorithm to pick up approximately optimal index candidate)<br>- `threshold-based` (Use index-select-threshold to pick up either `approximate` or `threshold-based` strategy on runtime) | String | threshold-based | MASKABLE |
 | query.index-select-threshold | Threshold of deciding whether to use brute force enumeration algorithm or fast approximation algorithm for selecting suitable indexes. Selecting optimal indexes for a query is a NP-complete set cover problem. When number of suitable index candidates is no larger than threshold, JanusGraph uses brute force search with exponential time complexity to ensure the best combination of indexes is selected. Only effective when `threshold-based` index select strategy is chosen. | Integer | 10 | MASKABLE |
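
The `query.has-step-batch-mode` description above is easier to follow as a decision table. The sketch below is a toy model of that description, not JanusGraph code: `keys_to_prefetch`, the `ALL` sentinel, and the argument names are all hypothetical. One point is an assumption rather than something the description states: when the next step is a property access step, `required_and_next_properties_or_all` is modelled as behaving like `required_and_next_properties`.

```python
from typing import FrozenSet, Iterable, Optional, Union

ALL = "ALL"  # hypothetical sentinel meaning "pre-fetch every vertex property"
PROPERTY_STEPS = frozenset({"values", "properties", "valueMap", "elementMap"})


def keys_to_prefetch(
    mode: str,
    has_keys: Iterable[str],
    next_step: Optional[str] = None,
    next_keys: Iterable[str] = (),
) -> Union[str, FrozenSet[str]]:
    """Model which property keys a has-step batch mode would pre-fetch.

    has_keys: keys used by the chain of foldable `has` steps.
    next_step: name of the step after that chain, or None.
    next_keys: keys requested by next_step; empty means unspecified scope.
    """
    required = frozenset(has_keys)
    if mode == "none":
        return frozenset()
    if mode == "all_properties":
        return ALL
    if mode == "required_properties_only":
        return required
    if mode in ("required_and_next_properties", "required_and_next_properties_or_all"):
        if next_step in PROPERTY_STEPS:
            # Assumption: both modes treat a property access step the same way.
            scoped = frozenset(next_keys)
            return required | scoped if scoped else ALL
        # The next step is not a property access step.
        return ALL if mode.endswith("_or_all") else required
    raise ValueError(f"unknown has-step batch mode: {mode}")
```

For instance, a traversal shaped like `has("name", ...).values("age")` corresponds to `keys_to_prefetch("required_and_next_properties", ["name"], "values", ["age"])`, which yields both keys so they can be fetched in one multi-query.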