Add multi-query support to label step
Fixes #3813

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov committed Aug 18, 2023
1 parent 20abdde commit d8d40e5
Showing 13 changed files with 296 additions and 2 deletions.
3 changes: 3 additions & 0 deletions docs/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -204,6 +204,9 @@ use `query.batch.has-step-mode = none` as replacement for `query.batch-property-
In case previous behavior is desired, use `query.batch.properties-mode = required_properties_only` for `query.fast-property = false`
or use `query.batch.properties-mode = all_properties` for `query.fast-property = true`.

The `label` step now uses a pre-fetching strategy by default. Use `query.batch.label-step-mode = none` to disable the
pre-fetching optimization for the `label` step.

[Batch processing](https://docs.janusgraph.org/operations/batch-processing/) allows JanusGraph to fetch a batch of
vertices from the storage backend together instead of requesting each vertex individually which leads to a high number
of backend queries.
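
The changelog entry above says that label pre-fetching is on by default and can be switched off with `query.batch.label-step-mode = none`. A minimal plain-Java sketch of that decision follows. The class `LabelStepModeConfigSketch` and its helper are hypothetical and are not part of JanusGraph; only the two option names and the values `all` and `none` come from the documentation in this commit. The sketch also assumes that batching counts as enabled only when `query.batch.enabled` is set explicitly.

```java
import java.util.Properties;

// Hypothetical helper, not part of JanusGraph: it only mirrors the documented option semantics.
final class LabelStepModeConfigSketch {

    static final String BATCH_ENABLED = "query.batch.enabled";
    static final String LABEL_STEP_MODE = "query.batch.label-step-mode";

    // Label pre-fetching applies only when batching is on and the mode is `all`.
    static boolean labelPrefetchEnabled(Properties props) {
        // Assumption for this sketch: batching is treated as off unless set explicitly to `true`.
        boolean batchEnabled = Boolean.parseBoolean(props.getProperty(BATCH_ENABLED, "false"));
        // `all` is the documented default of the label step mode.
        String mode = props.getProperty(LABEL_STEP_MODE, "all");
        return batchEnabled && "all".equals(mode);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(BATCH_ENABLED, "true");
        System.out.println(labelPrefetchEnabled(props)); // mode not set, so the default `all` applies

        props.setProperty(LABEL_STEP_MODE, "none");
        System.out.println(labelPrefetchEnabled(props)); // pre-fetching switched off
    }
}
```

How these properties are handed to a graph instance is outside the scope of this sketch.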
1 change: 1 addition & 0 deletions docs/configs/janusgraph-cfg.md
Original file line number Diff line number Diff line change
Expand Up @@ -373,6 +373,7 @@ pre-fetched together in the same multi-query.
In case the next step is one of the properties access steps with unspecified scope of property keys then this mode
behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` - Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.label-step-mode | Labels pre-fetching mode for `label()` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all` - Pre-fetch labels for all vertices in a batch.<br>- `none` - Skips vertex labels pre-fetching optimization.<br> | String | all | MASKABLE |
| query.batch.limited | Configure a maximum batch size for queries against the storage backend. This can be used to ensure responsiveness if batches tend to grow very large. The used batch size is equivalent to the barrier size of a preceding `barrier()` step. If a step has no preceding `barrier()`, the default barrier of TinkerPop will be inserted. This option only takes effect if `query.batch.enabled` is `true`. | Boolean | true | MASKABLE |
| query.batch.limited-size | Default batch size (barrier() step size) for queries. This size is applied only for cases where `LazyBarrierStrategy` strategy didn't apply `barrier` step and where user didn't apply barrier step either. This option is used only when `query.batch.limited` is `true`. Notice, value `2147483647` is considered to be unlimited. | Integer | 2500 | MASKABLE |
| query.batch.properties-mode | Properties pre-fetching mode for `values`, `properties`, `valueMap`, `propertyMap`, `elementMap` steps. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on non-singular property access (fetches all vertex properties in a single slice query). On single property access this mode behaves the same as `required_properties_only` mode.<br>- `required_properties_only` - Pre-fetch necessary vertex properties only (uses a separate slice query per each required property)<br>- `none` - Skips vertex properties pre-fetching optimization.<br> | String | required_properties_only | MASKABLE |
3 changes: 2 additions & 1 deletion docs/operations/batch-processing.md
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,7 @@ Batched query processing takes into account two types of steps:

1. Batch compatible step. This is the step which will execute batch requests. Currently, the list of such steps
is the next: `out()`, `in()`, `both()`, `inE()`, `outE()`, `bothE()`, `has()`, `values()`, `properties()`, `valueMap()`,
`propertyMap()`, `elementMap()`.
`propertyMap()`, `elementMap()`, `label()`.
2. Parent step. This is a parent step which has local traversals with the same start. Such parent steps also implement the
interface `TraversalParent`. There are many such steps, but as for an example those could be: `and(...)`, `or(...)`,
`not(...)`, `order().by(...)`, `project("valueA", "valueB", "valueC").by(...).by(...).by(...)`, `union(..., ..., ...)`,
Expand Down Expand Up @@ -330,3 +330,4 @@ access when direct vertex properties are requested (for example `vertex.properti
See configuration option `query.batch.has-step-mode` to control properties pre-fetching behaviour for `has` step.
See configuration option `query.batch.properties-mode` to control properties pre-fetching behaviour for `values`,
`properties`, `valueMap`, `propertyMap`, and `elementMap` steps.
See configuration option `query.batch.label-step-mode` to control labels pre-fetching behaviour for `label` step.
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,7 @@
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertiesStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryStrategyRepeatStepMode;
import org.janusgraph.graphdb.transaction.StandardJanusGraphTx;
Expand Down Expand Up @@ -217,6 +218,7 @@
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.HAS_STEP_BATCH_MODE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.IDS_STORE_NAME;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.INITIAL_JANUSGRAPH_VERSION;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.LABEL_STEP_BATCH_MODE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.LIMITED_BATCH;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.LIMITED_BATCH_SIZE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.LOG_BACKEND;
Expand Down Expand Up @@ -5008,6 +5010,7 @@ public void testLimitBatchSizeForMultiQuery() {

testLimitBatchSizeForHasStep(numV, barrierSize, limit, bs, cs);
testLimitBatchSizeForPropertySteps(numV, barrierSize, limit, cs);
testLimitBatchSizeForLabelStep(numV, barrierSize, limit, bs, cs);

// test batching for `out()`
profile = testLimitedBatch(() -> graph.traversal().V(bs).barrier(barrierSize).out());
Expand Down Expand Up @@ -6166,6 +6169,45 @@ private void testLimitBatchSizeForMultiQueryOfConnectiveSteps(JanusGraphVertex[]
assertEquals((int) Math.ceil((double) Math.min(bs.length, limit) / barrierSize), countOptimizationQueries(profile.getMetrics()));
}

private void testLimitBatchSizeForLabelStep(int numV, int barrierSize, int limit, JanusGraphVertex[] bs, JanusGraphVertex[] cs) {

TraversalMetrics profile;

// test batching for `label()`
profile = testLimitedBatch(() -> graph.traversal().V(cs).barrier(barrierSize).label(),
option(USE_MULTIQUERY), true, option(LIMITED_BATCH), true,
option(LABEL_STEP_BATCH_MODE), MultiQueryLabelStepStrategyMode.ALL.getConfigName());
assertEquals(3, countBackendQueriesOfSize(barrierSize, profile.getMetrics()));
assertEquals(1, countBackendQueriesOfSize((numV - 3 * barrierSize), profile.getMetrics()));

// test batching for `label()` with default labels
profile = testLimitedBatch(() -> graph.traversal().V(bs).barrier(barrierSize).label(),
option(USE_MULTIQUERY), true, option(LIMITED_BATCH), true,
option(LABEL_STEP_BATCH_MODE), MultiQueryLabelStepStrategyMode.ALL.getConfigName());
assertEquals(Math.ceil(bs.length / (double) barrierSize), countOptimizationQueries(profile.getMetrics()));

// test batching for `label()` in a parent step
profile = testLimitedBatch(() -> graph.traversal().V(cs).barrier(barrierSize).union(__.label()),
option(USE_MULTIQUERY), true, option(LIMITED_BATCH), true,
option(LABEL_STEP_BATCH_MODE), MultiQueryLabelStepStrategyMode.ALL.getConfigName());
assertEquals(3, countBackendQueriesOfSize(barrierSize, profile.getMetrics()));
assertEquals(1, countBackendQueriesOfSize((numV - 3 * barrierSize), profile.getMetrics()));

// test batching for `label()` with `limit`. In this case TinkerPop optimizes the usage by moving limit before
// `label` step.
profile = testLimitedBatch(() -> graph.traversal().V(cs).label().limit(limit),
option(USE_MULTIQUERY), true, option(LIMITED_BATCH), true,
option(LABEL_STEP_BATCH_MODE), MultiQueryLabelStepStrategyMode.ALL.getConfigName());
assertEquals(1, countBackendQueriesOfSize(limit, profile.getMetrics()));

// test disabled batching for `label()`
profile = testLimitedBatch(() -> graph.traversal().V(cs).barrier(barrierSize).label(),
option(USE_MULTIQUERY), true, option(LIMITED_BATCH), true,
option(LABEL_STEP_BATCH_MODE), MultiQueryLabelStepStrategyMode.NONE.getConfigName());
assertEquals(0, countBackendQueriesOfSize(barrierSize, profile.getMetrics()));
assertEquals(0, countOptimizationQueries(profile.getMetrics()));
}
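
The assertions in `testLimitBatchSizeForLabelStep` expect three backend queries of size `barrierSize`, one query for the remainder, and `ceil(n / barrierSize)` optimization queries overall. The arithmetic behind those expectations can be sketched as below. The class `BatchSizeSketch` and the concrete values `100` and `27` are hypothetical and chosen only so that three full batches and one partial batch result; the actual test values are not shown in this hunk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only illustrates how a barrier size splits vertices into batches.
final class BatchSizeSketch {

    // Splits `total` vertices into consecutive batches of at most `barrierSize` vertices.
    static List<Integer> batchSizes(int total, int barrierSize) {
        if (total < 0 || barrierSize <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and barrierSize must be > 0");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = total; remaining > 0; remaining -= barrierSize) {
            sizes.add(Math.min(remaining, barrierSize));
        }
        return sizes;
    }

    public static void main(String[] args) {
        int numV = 100;        // hypothetical vertex count
        int barrierSize = 27;  // hypothetical barrier size
        List<Integer> sizes = batchSizes(numV, barrierSize);
        System.out.println(sizes); // [27, 27, 27, 19]
        // The number of batches equals ceil(numV / barrierSize).
        System.out.println(sizes.size() == (int) Math.ceil(numV / (double) barrierSize)); // true
    }
}
```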

@Test
public void testMultiSliceDBCachedRequests(){
clopen(option(DB_CACHE), false);
Original file line number Diff line number Diff line change
Expand Up @@ -198,4 +198,13 @@ public List<Long> getAdjacentVerticesLocalCounts() {
tx.rollback();
return result;
}

@Benchmark
public List<String> getLabels() {
JanusGraphTransaction tx = graph.buildTransaction()
.start();
List<String> result = tx.traversal().V().has("name", "inner").label().toList();
tx.rollback();
return result;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@

import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;

import java.time.Instant;
Expand Down Expand Up @@ -167,6 +168,15 @@ public interface TransactionBuilder {
*/
TransactionBuilder setPropertiesStrategyMode(MultiQueryPropertiesStrategyMode propertiesStrategyMode);

/**
* Sets `label` step strategy mode.
* <p>
* Doesn't have any effect if multi-query was disabled via config `query.batch.enabled = false`.
*
* @return Object with the set labels strategy mode settings
*/
TransactionBuilder setLabelsStepStrategyMode(MultiQueryLabelStepStrategyMode labelStepStrategyMode);

/**
* Sets the group name for this transaction which provides a way for gathering
* reporting on multiple transactions into one group.
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@
import org.janusgraph.graphdb.query.index.BruteForceIndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.IndexSelectionStrategy;
import org.janusgraph.graphdb.query.index.ThresholdBasedIndexSelectionStrategy;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryStrategyRepeatStepMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
Expand Down Expand Up @@ -376,6 +377,15 @@ public class GraphDatabaseConfiguration {
MultiQueryPropertiesStrategyMode.NONE.getConfigName()),
ConfigOption.Type.MASKABLE, MultiQueryPropertiesStrategyMode.REQUIRED_PROPERTIES_ONLY.getConfigName());

public static final ConfigOption<String> LABEL_STEP_BATCH_MODE = new ConfigOption<>(QUERY_BATCH_NS,"label-step-mode",
String.format("Labels pre-fetching mode for `label()` step. Used only when `"+USE_MULTIQUERY.toStringWithoutRoot()+"` is `true`.<br>" +
"Supported modes:<br>" +
"- `%s` - Pre-fetch labels for all vertices in a batch.<br>" +
"- `%s` - Skips vertex labels pre-fetching optimization.<br>",
MultiQueryLabelStepStrategyMode.ALL.getConfigName(),
MultiQueryLabelStepStrategyMode.NONE.getConfigName()),
ConfigOption.Type.MASKABLE, MultiQueryLabelStepStrategyMode.ALL.getConfigName());

// ################ SCHEMA #######################
// ################################################

Expand Down Expand Up @@ -1348,6 +1358,7 @@ public boolean apply(@Nullable String s) {
private String unknownIndexKeyName;
private MultiQueryHasStepStrategyMode hasStepStrategyMode;
private MultiQueryPropertiesStrategyMode propertiesStrategyMode;
private MultiQueryLabelStepStrategyMode labelStepStrategyMode;

private StoreFeatures storeFeatures = null;

Expand Down Expand Up @@ -1470,6 +1481,10 @@ public MultiQueryPropertiesStrategyMode propertiesStrategyMode() {
return propertiesStrategyMode;
}

public MultiQueryLabelStepStrategyMode labelStepStrategyMode() {
return labelStepStrategyMode;
}

public boolean adjustQueryLimit() {
return adjustQueryLimit;
}
Expand Down Expand Up @@ -1599,6 +1614,7 @@ private void preLoadConfiguration() {
repeatStepMode = selectExactConfig(REPEAT_STEP_BATCH_MODE, MultiQueryStrategyRepeatStepMode.values());
hasStepStrategyMode = selectExactConfig(HAS_STEP_BATCH_MODE, MultiQueryHasStepStrategyMode.values());
propertiesStrategyMode = selectExactConfig(PROPERTIES_BATCH_MODE, MultiQueryPropertiesStrategyMode.values());
labelStepStrategyMode = selectExactConfig(LABEL_STEP_BATCH_MODE, MultiQueryLabelStepStrategyMode.values());

indexSelectionStrategy = Backend.getImplementationClass(configuration, configuration.get(INDEX_SELECT_STRATEGY),
REGISTERED_INDEX_SELECTION_STRATEGIES);
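
The new `labelStepStrategyMode` field is resolved from the configured string by matching it against the config names of the enum constants. A minimal sketch of such a lookup is shown below, assuming an exact, case-sensitive match. `ModeLookupSketch`, `LabelStepMode`, and `fromConfigName` are hypothetical stand-ins and are not the implementation of `selectExactConfig`; only the constants `ALL` and `NONE` and the accessor name `getConfigName()` are taken from this commit.

```java
import java.util.Arrays;

final class ModeLookupSketch {

    // Stand-in for an enum such as MultiQueryLabelStepStrategyMode.
    enum LabelStepMode {
        ALL("all"),
        NONE("none");

        private final String configName;

        LabelStepMode(String configName) {
            this.configName = configName;
        }

        String getConfigName() {
            return configName;
        }
    }

    // Hypothetical lookup: returns the constant whose config name equals `value` exactly.
    static LabelStepMode fromConfigName(String value) {
        return Arrays.stream(LabelStepMode.values())
            .filter(mode -> mode.getConfigName().equals(value))
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Unsupported label step mode: " + value));
    }

    public static void main(String[] args) {
        System.out.println(fromConfigName("all"));  // ALL
        System.out.println(fromConfigName("none")); // NONE
    }
}
```

Rejecting unknown values early is a design choice of this sketch; how the real configuration loader reacts to an unsupported value is not shown in this diff.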
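114 changes: 114 additions & 0 deletions janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/step/JanusGraphLabelStep.java
Original file line number Diff line number Diff line change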
@@ -0,0 +1,114 @@
// Copyright 2023 JanusGraph Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package org.janusgraph.graphdb.tinkerpop.optimize.step;

import org.apache.tinkerpop.gremlin.process.traversal.Traverser;
import org.apache.tinkerpop.gremlin.process.traversal.step.Profiling;
import org.apache.tinkerpop.gremlin.process.traversal.step.map.LabelStep;
import org.apache.tinkerpop.gremlin.process.traversal.util.MutableMetrics;
import org.apache.tinkerpop.gremlin.structure.Element;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.BaseVertexQuery;
import org.janusgraph.graphdb.query.profile.QueryProfiler;
import org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder;
import org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryUtil;
import org.janusgraph.graphdb.tinkerpop.optimize.step.fetcher.LabelStepBatchFetcher;
import org.janusgraph.graphdb.tinkerpop.profile.TP3ProfileWrapper;
import org.janusgraph.graphdb.util.CopyStepUtil;
import org.janusgraph.graphdb.util.JanusGraphTraverserUtil;

/**
* This class extends TinkerPop's default {@link LabelStep} and adds vertex multi-query optimization to this step.
* <p>
* Before this step is evaluated, it usually receives multiple future vertices which might be processed next by this step.
* The step stores all of these vertices, and whenever it receives a vertex for evaluation which wasn't pre-fetched
* previously, it sends a multi-query for a batch of vertices to fetch their
* labels.
* <p>
* This step optimizes only access to Vertex labels and skips optimization for any other Element.
*/
public class JanusGraphLabelStep<S extends Element> extends LabelStep<S> implements Profiling, MultiQueriable<S,String> {

private boolean useMultiQuery = false;
private QueryProfiler queryProfiler = QueryProfiler.NO_OP;
private int batchSize = Integer.MAX_VALUE;
private LabelStepBatchFetcher labelStepBatchFetcher;

public JanusGraphLabelStep(LabelStep<S> originalStep){
super(originalStep.getTraversal());
CopyStepUtil.copyAbstractStepModifiableFields(originalStep, this);

if (originalStep instanceof JanusGraphLabelStep) {
JanusGraphLabelStep originalJanusGraphLabelStep = (JanusGraphLabelStep) originalStep;
setBatchSize(originalJanusGraphLabelStep.batchSize);
setUseMultiQuery(originalJanusGraphLabelStep.useMultiQuery);

}
}

@Override
protected String map(final Traverser.Admin<S> traverser) {
if (useMultiQuery && traverser.get() instanceof Vertex) {
return labelStepBatchFetcher.fetchData(getTraversal(), (Vertex) traverser.get(), JanusGraphTraverserUtil.getLoops(traverser));
}
return super.map(traverser);
}

@Override
public void setUseMultiQuery(boolean useMultiQuery) {
this.useMultiQuery = useMultiQuery;
if(this.useMultiQuery && labelStepBatchFetcher == null){
labelStepBatchFetcher = new LabelStepBatchFetcher(this::makeLabelsQuery, batchSize);
}
}

private <Q extends BaseVertexQuery> Q makeLabelsQuery(Q query) {
return (Q) BasicVertexCentricQueryUtil.withLabelVertices((BasicVertexCentricQueryBuilder) query)
.profiler(queryProfiler);
}

@Override
public void setBatchSize(int batchSize) {
this.batchSize = batchSize;
if(labelStepBatchFetcher != null){
labelStepBatchFetcher.setBatchSize(batchSize);

}
}

@Override
public void registerFirstNewLoopFutureVertexForPrefetching(Vertex futureVertex, int futureVertexTraverserLoop) {
if(useMultiQuery){
labelStepBatchFetcher.registerFirstNewLoopFutureVertexForPrefetching(futureVertex);

}
}

@Override
public void registerSameLoopFutureVertexForPrefetching(Vertex futureVertex, int futureVertexTraverserLoop) {
if(useMultiQuery){
labelStepBatchFetcher.registerCurrentLoopFutureVertexForPrefetching(futureVertex, futureVertexTraverserLoop);
}
}

@Override
public void registerNextLoopFutureVertexForPrefetching(Vertex futureVertex, int futureVertexTraverserLoop) {
if(useMultiQuery){
labelStepBatchFetcher.registerNextLoopFutureVertexForPrefetching(futureVertex, futureVertexTraverserLoop);

}
}

@Override
public void setMetrics(MutableMetrics metrics) {
queryProfiler = new TP3ProfileWrapper(metrics);
}
}
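
The class comment of `JanusGraphLabelStep` describes the technique: future vertices are registered first, and the first vertex that has not been pre-fetched triggers one multi-query for a whole batch. The sketch below illustrates that idea in plain Java under simplifying assumptions. `LabelPrefetchSketch` and all of its members are hypothetical; vertices are reduced to `long` ids, the storage backend is a plain function, and loop handling is left out, so this is not how `LabelStepBatchFetcher` is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Illustrative only: vertices are plain ids and the backend is a function.
final class LabelPrefetchSketch {

    private final Function<List<Long>, Map<Long, String>> backend; // one call stands for one multi-query
    private final int batchSize;
    private final Set<Long> pending = new LinkedHashSet<>();       // registered but not yet fetched
    private final Map<Long, String> fetched = new HashMap<>();     // labels that are already loaded
    int backendCalls = 0;

    LabelPrefetchSketch(Function<List<Long>, Map<Long, String>> backend, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        this.backend = backend;
        this.batchSize = batchSize;
    }

    // Remembers a vertex that this step will probably have to evaluate later.
    void registerFutureVertex(long vertexId) {
        if (!fetched.containsKey(vertexId)) {
            pending.add(vertexId);
        }
    }

    // Returns the label and loads a whole batch of pending vertices on a miss.
    String fetchLabel(long vertexId) {
        if (!fetched.containsKey(vertexId)) {
            List<Long> batch = new ArrayList<>();
            batch.add(vertexId);
            pending.remove(vertexId);
            Iterator<Long> it = pending.iterator();
            while (it.hasNext() && batch.size() < batchSize) {
                batch.add(it.next());
                it.remove();
            }
            fetched.putAll(backend.apply(batch));
            backendCalls++;
        }
        return fetched.get(vertexId);
    }

    public static void main(String[] args) {
        // Fake backend that derives a label from the id.
        Function<List<Long>, Map<Long, String>> fakeBackend = ids -> {
            Map<Long, String> labels = new HashMap<>();
            for (Long id : ids) {
                labels.put(id, id % 2 == 0 ? "even" : "odd");
            }
            return labels;
        };
        LabelPrefetchSketch fetcher = new LabelPrefetchSketch(fakeBackend, 4);
        for (long id = 1; id <= 10; id++) {
            fetcher.registerFutureVertex(id);
        }
        for (long id = 1; id <= 10; id++) {
            fetcher.fetchLabel(id);
        }
        System.out.println(fetcher.backendCalls); // 3 (batches of 4, 4 and 2)
    }
}
```

With ten registered vertices and a batch size of four, the fake backend is called three times instead of ten, which is the effect the `label` step optimization aims for.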