Implement multi-key slice async queries for CQL storage backend [cql-tests] [tp-tests] #3760

Merged: 5 commits, May 29, 2023

Conversation

porunov
Member

@porunov porunov commented May 3, 2023

  • Adds multiQuery support to the CQL storage backend.
  • Ensures the storage.parallel-backend-ops thread pool is created only for storage backends which don't support multiQuery (multi-key slice operations).
  • Changes the purpose of storage.cql.executor-service so that it is used for result deserialization jobs only (not for IO operations).

Performance improvements for multi-queries using CQL storage backend:
#3760 (comment)
#3760 (comment)

Fixes #2406
Fixes #3747 (see comment)
Fixes #3759
Suppress #3170


@porunov porunov added this to the Release v1.0.0 milestone May 3, 2023
@janusgraph-bot janusgraph-bot added the cla: external Externally-managed CLA label May 3, 2023
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from 651c969 to edc4106 Compare May 3, 2023 12:09
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from edc4106 to d783b5d Compare May 3, 2023 16:01
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from d783b5d to b2c669b Compare May 4, 2023 16:39
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from b2c669b to d8c91ea Compare May 4, 2023 19:41
@porunov porunov changed the title Implement multi-key slice async queries for CQL storage backend Implement multi-key slice async queries for CQL storage backend [cql-tests] [tp-tests] May 4, 2023
@porunov
Member Author

porunov commented May 4, 2023

Default Benchmarks

Execution commands:

mvn clean install -Pjanusgraph-benchmark -DskipTests=true --batch-mode --also-make -Dgpg.skip=true
mvn verify --projects janusgraph-benchmark

master branch:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     7.206 ±    0.479  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.258 ±    0.162  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   233.100 ±   13.731  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   255.417 ±   19.681  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.442 ±    0.410  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.112 ±    0.163  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   222.019 ±    9.407  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   135.902 ±    4.751  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   355.840 ±  440.654  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1273.393 ± 1346.682  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7182.857 ± 3092.063  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     1.015 ±    0.110  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    12.025 ±    0.575  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   189.415 ±   13.942  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   220.029 ±    3.415  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   272.134 ±   22.127  ms/op

Current PR:

Benchmark                                   (hardMaxLimit)  (numberOfVertices)  (size)  (useSmartLimit)  Mode  Cnt     Score      Error  Units
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000             true  avgt    5     6.908 ±    1.028  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A   10000            false  avgt    5     5.182 ±    0.154  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000             true  avgt    5   225.227 ±   10.261  ms/op
GraphCentricQueryBenchmark.getVertices              100000                 N/A  250000            false  avgt    5   263.993 ±   15.523  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000             true  avgt    5     7.188 ±    0.326  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A   10000            false  avgt    5     5.099 ±    0.109  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000             true  avgt    5   248.036 ±   13.761  ms/op
GraphCentricQueryBenchmark.getVertices          2147483647                 N/A  250000            false  avgt    5   134.410 ±    8.252  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A                1000     N/A              N/A  avgt    5   349.861 ±  435.534  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A               10000     N/A              N/A  avgt    5  1266.867 ± 1310.859  ms/op
JanusGraphSpeedBenchmark.basicAddAndDelete             N/A              100000     N/A              N/A  avgt    5  7616.284 ± 3593.961  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A                1000     N/A              N/A  avgt    5     0.991 ±    0.110  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A               10000     N/A              N/A  avgt    5    12.173 ±    0.381  ms/op
JanusGraphSpeedBenchmark.basicCount                    N/A              100000     N/A              N/A  avgt    5   185.954 ±    7.452  ms/op
MgmtOlapJobBenchmark.runClearIndex                     N/A                 N/A   10000              N/A  avgt    5   220.121 ±    2.997  ms/op
MgmtOlapJobBenchmark.runReindex                        N/A                 N/A   10000              N/A  avgt    5   275.552 ±   39.752  ms/op

Conclusion: The existing benchmarks show no measurable difference (neither a regression nor an improvement). It looks like no CQL benchmark tests were executed.

@porunov porunov force-pushed the feature/cql-multi-key-slices branch from d8c91ea to d491add Compare May 4, 2023 23:06
@porunov
Member Author

porunov commented May 4, 2023

@li-boxuan do you know how to run the CQL benchmark tests? It looks like CQLMultiQueryBenchmark isn't executed by default. I have never used CcmBridge.Builder, so I'm not sure whether I need to start a Cassandra instance locally via a local JVM process or Docker. I tried to spin up Cassandra locally, but the CQL tests are still not executed. I also tried running pip install ccm, but that didn't help either.
If you have a script I could use to quickly execute CQLMultiQueryBenchmark, that would be awesome. Otherwise I will try to figure it out this week.

@li-boxuan
Member

li-boxuan commented May 5, 2023

@porunov This benchmark CI does not run on PR. You can find the result (posted by github-actions bot) on your own branch: porunov@d491add#commitcomment-111883622

If you want to run it locally, do the following (or see ci-benchmark.yml):

pip install git+https://github.com/li-boxuan/ccm.git
export CCM_CLUSTER_START_DEFAULT_TIMEOUT=300
export CCM_UPDATE_PID_DEFAULT_TIMEOUT=300
mvn verify --projects janusgraph-benchmark

Tested the above commands on my laptop. You don't need to start ccm or install Cassandra by yourself.

@porunov porunov force-pushed the feature/cql-multi-key-slices branch from d491add to bde81d2 Compare May 5, 2023 17:07
@porunov
Member Author

porunov commented May 5, 2023

Tried to execute this benchmark but couldn't. Probably something with my local environment. Need to investigate.

19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr> Traceback (most recent call last):
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>   File "/home/workenv/.local/bin/ccm", line 107, in <module>
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>     parser = cmd.get_parser()
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>              ^^^^^^^^^^^^^^^^
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>   File "/home/workenv/.local/lib/python3.11/site-packages/ccmlib/cmds/command.py", line 60, in get_parser
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>     get_remote_usage()
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>   File "/home/workenv/.local/lib/python3.11/site-packages/ccmlib/remote.py", line 32, in get_remote_usage
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>     return RemoteOptionsParser().usage()
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>   File "/home/workenv/.local/lib/python3.11/site-packages/ccmlib/remote.py", line 497, in usage
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>     usage = self.parser.format_help().split("optional arguments:")[1]
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr>             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
19:09:40 ERROR com.datastax.oss.driver.api.testinfra.ccm.CcmBridge$2.processLine - ccmerr> IndexError: list index out of range
Exception in thread "main" java.lang.RuntimeException: The command '[ccm, create, ccm_1, -i, 127.0.0., -n, 1:0, -v, 4.0.0, --config-dir=/tmp/ccm12649312140642473186]' failed to execute
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:389)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:342)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:326)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.create(CcmBridge.java:227)
	at org.janusgraph.BenchmarkRunner.runCqlBenchmarks(BenchmarkRunner.java:73)
	at org.janusgraph.BenchmarkRunner.main(BenchmarkRunner.java:105)
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
	at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
	at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
	at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:381)
	... 5 more
Exception in thread "Thread-0" java.lang.RuntimeException: The command '[ccm, remove, --config-dir=/tmp/ccm12649312140642473186]' failed to execute
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:389)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:342)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:326)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.remove(CcmBridge.java:287)
	at org.janusgraph.BenchmarkRunner.lambda$runCqlBenchmarks$0(BenchmarkRunner.java:69)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
	at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
	at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
	at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
	at com.datastax.oss.driver.api.testinfra.ccm.CcmBridge.execute(CcmBridge.java:381)
	... 5 more
[ERROR] Command execution failed.

@li-boxuan
Member

Let me run the benchmark locally and update here.

@porunov
Member Author

porunov commented May 5, 2023

Let me run the benchmark locally and update here.

Thank you. The PR is still WIP, and the CQL tests are failing right now. You could run the benchmarks on the master branch for now, and I can ping you to run them for this PR once it is ready.

Also, I think it could be helpful to change the CQL benchmark tests to start a Cassandra Testcontainer (or a Docker image directly) and run against that container, as we do for other tests right now. That way, Docker would be the only requirement for running them.

@li-boxuan
Member

li-boxuan commented May 5, 2023

The original thought was to be able to run benchmarks in a distributed setting and to follow what the official Cassandra testing suite does. I didn't expect it would be troublesome to run on different environments since the CI seems to be happy with it.

@porunov porunov force-pushed the feature/cql-multi-key-slices branch from bde81d2 to d68e117 Compare May 5, 2023 18:50
@porunov
Member Author

porunov commented May 5, 2023

That's a good point. I think you are right. Let's leave it as it is. I will try to set up my local environment and then add a documentation page on setting up the environment for the benchmark tests, in case anyone runs into similar problems.

BTW, I'm not 100% sure, but I think I fixed the tests. At least the KeyColumnValueStoreTest tests are passing locally right now. If you can execute CQLMultiQueryBenchmark (only this benchmark is needed), that would be great. I suspect there should be no regression unless I missed something.

@li-boxuan
Member

li-boxuan commented May 5, 2023

Result on master (5159135, Apr 27)

Benchmark                                (fanoutFactor)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getNames                     100  avgt    5    374.157 ±  239.950  ms/op
CQLMultiQueryBenchmark.getNames                     500  avgt    5  12169.690 ± 5296.357  ms/op
CQLMultiQueryBenchmark.getNeighborNames             100  avgt    5    352.033 ±  111.833  ms/op
CQLMultiQueryBenchmark.getNeighborNames             500  avgt    5  11567.970 ± 5539.629  ms/op

Result on master (e332af6, May 3)

Benchmark                                (fanoutFactor)  Mode  Cnt      Score      Error  Units
CQLMultiQueryBenchmark.getNames                     100  avgt    5    414.251 ±  176.041  ms/op
CQLMultiQueryBenchmark.getNames                     500  avgt    5  11035.765 ±  727.674  ms/op
CQLMultiQueryBenchmark.getNeighborNames             100  avgt    5    356.241 ±   42.617  ms/op
CQLMultiQueryBenchmark.getNeighborNames             500  avgt    5  10531.771 ± 1442.964  ms/op

Result on your change (d68e117, May 5)

Benchmark                                (fanoutFactor)  Mode  Cnt     Score      Error  Units
CQLMultiQueryBenchmark.getNames                     100  avgt    5   203.071 ±  120.440  ms/op
CQLMultiQueryBenchmark.getNames                     500  avgt    5  7190.759 ± 1377.739  ms/op
CQLMultiQueryBenchmark.getNeighborNames             100  avgt    5   194.245 ±   62.117  ms/op
CQLMultiQueryBenchmark.getNeighborNames             500  avgt    5  6756.798 ±  955.217  ms/op

Looks like significant performance improvement 👍
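
For a rough sense of scale, the ratio between the two most recent runs can be computed directly from the scores above. The sketch below is only illustrative: the class and method names are made up for this note, and with Cnt = 5 and wide error bars the ratios should be read as approximate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SpeedupSketch {
    // Ratio of baseline to candidate average time; values above 1.0 mean the candidate is faster.
    static double speedup(double baselineMsPerOp, double candidateMsPerOp) {
        return baselineMsPerOp / candidateMsPerOp;
    }

    // Scores copied from the "master (e332af6, May 3)" and "your change (d68e117, May 5)" tables.
    static Map<String, Double> speedups() {
        Map<String, Double> result = new LinkedHashMap<>();
        result.put("getNames/100", speedup(414.251, 203.071));
        result.put("getNames/500", speedup(11035.765, 7190.759));
        result.put("getNeighborNames/100", speedup(356.241, 194.245));
        result.put("getNeighborNames/500", speedup(10531.771, 6756.798));
        return result;
    }

    public static void main(String[] args) {
        speedups().forEach((name, ratio) -> System.out.printf("%-22s %.2fx%n", name, ratio));
    }
}
```

The error intervals of the runs are large relative to the scores for fanoutFactor 100, so those two ratios carry less weight than the fanoutFactor 500 ones.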

@porunov
Member Author

porunov commented May 5, 2023

Thank you @li-boxuan for running the tests! I think the results on master may be held back by too few threads for CQL I/O operations. I suspect this branch shows a CQL performance improvement only because of its higher back pressure settings.

Would you be able to run the same test on the master branch again, but with the following line added to the CQLMultiQueryBenchmark.getConfiguration() method?

config.set(GraphDatabaseConfiguration.PARALLEL_BACKEND_EXECUTOR_SERVICE_CORE_POOL_SIZE, 1024);

With this setting, multi-query runs will use 1024 threads, which should improve parallelism for some CQL I/O operations and make the comparison fairer.

@li-boxuan
Member

Would you be able to execute the same test for master branch again please but adding the next line into CQLMultiQueryBenchmark.getConfiguration() method

Here you go:

Benchmark                                (fanoutFactor)  Mode  Cnt     Score      Error  Units
CQLMultiQueryBenchmark.getNames                     100  avgt    5   429.718 ± 1010.393  ms/op
CQLMultiQueryBenchmark.getNames                     500  avgt    5  9839.199 ± 1524.108  ms/op
CQLMultiQueryBenchmark.getNeighborNames             100  avgt    5   429.245 ±  506.169  ms/op
CQLMultiQueryBenchmark.getNeighborNames             500  avgt    5  9330.174 ± 1406.735  ms/op

@porunov
Member Author

porunov commented May 5, 2023

Thanks a lot! Performance is indeed better in this PR. I think removing the thread context-switching overhead is what allowed for better overall performance.
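
The difference between the two approaches can be sketched with plain java.util.concurrent types. Instead of parking one pool thread per key on a blocking call, each per-key request is issued asynchronously and the futures are combined. This is a minimal sketch, not the PR's code: MultiKeySliceSketch, getMultiSlices, and fetchSliceAsync are hypothetical names, with fetchSliceAsync standing in for whatever asynchronous query call the driver provides.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class MultiKeySliceSketch {
    // Issues one asynchronous slice request per key and returns a future of all results.
    // No caller thread blocks while the requests are in flight.
    static <K, V> CompletableFuture<Map<K, V>> getMultiSlices(
            List<K> keys, Function<K, CompletableFuture<V>> fetchSliceAsync) {
        Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();
        for (K key : keys) {
            pending.put(key, fetchSliceAsync.apply(key)); // hypothetical async query call
        }
        return CompletableFuture
            .allOf(pending.values().toArray(new CompletableFuture[0]))
            .thenApply(ignored -> {
                Map<K, V> results = new LinkedHashMap<>();
                // Every future is already complete here, so join() does not block.
                pending.forEach((key, future) -> results.put(key, future.join()));
                return results;
            });
    }
}
```

With this shape, the number of concurrent requests is no longer tied to the size of a thread pool, which is why a separate limit on in-flight queries becomes useful.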

@porunov porunov marked this pull request as ready for review May 6, 2023 18:35
@porunov porunov requested a review from a team May 6, 2023 18:35
@porunov
Member Author

porunov commented May 6, 2023

The PR is now ready for review.

@li-boxuan pinging you as you showed interest in it earlier.

@cdegroc you previously showed interest in changing the KeyColumnValueStore interface to return CompletableFuture<EntryList> instead of EntryList. This PR doesn't introduce this breaking change, but I think this work can be used later for changing the interface because in this PR I changed CQLSliceFunction interface to return CompletableFuture<EntryList>.
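
To illustrate why this is not a breaking change, here is a minimal sketch of how a future-returning slice function can sit behind an unchanged blocking interface. The types below are simplified, hypothetical stand-ins (generic Q and R instead of the real query and EntryList types), not the actual JanusGraph interfaces.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncSliceSketch {
    // Hypothetical stand-in for an async slice function: returns a future instead of a value.
    interface AsyncSliceFunction<Q, R> {
        CompletableFuture<R> getSlice(Q query);
    }

    // Hypothetical stand-in for the existing blocking store contract.
    interface BlockingStore<Q, R> {
        R getSlice(Q query);
    }

    // Adapter: callers keep the blocking signature while the backend works with futures.
    static <Q, R> BlockingStore<Q, R> blocking(AsyncSliceFunction<Q, R> async) {
        return query -> async.getSlice(query).join();
    }
}
```

If the public interface were later changed to return futures, the adapter would simply be dropped and callers would receive the future directly.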

Member

@li-boxuan li-boxuan left a comment

Wow, this is a very interesting PR! I didn't get a chance to finish reviewing, but I'll leave some thoughts and questions first.

@porunov
Member Author

porunov commented May 26, 2023

@li-boxuan I was wondering if you have a chance to review this PR. In case you don’t have enough capacity for the review, would you be OK with lazy consensus?
I’m also good to have a call in case you think it may help with the review.

@li-boxuan
Member

I was wondering if you have a chance to review this PR

Sorry for my procrastination. I'll make sure to finish the review by the end of this weekend.

…tests] [tp-tests]

- Adds multiQuery support to CQL storage backend.
- Ensure `storage.parallel-backend-ops` thread pool is created only for storage backends which don't support multiQuery (multi-key slice operations).
- Change purpose of `storage.cql.executor-service` to be used for results deserialization jobs only (not for IO operations).

Fixes JanusGraph#2406
Fixes JanusGraph#3747
Fixes JanusGraph#3759
Related to JanusGraph#3170

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
…re tests

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
@li-boxuan li-boxuan requested a review from cdegroc May 26, 2023 16:32
Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from 0134eb9 to 1e17c63 Compare May 26, 2023 19:59
@porunov
Member Author

porunov commented May 26, 2023

A note on tests.
Codecov may report that CompletableFutureUtil, QueryBackPressure, and QueryBackPressureBuilder are not covered by tests. That is because it doesn't pick up some unit tests (CompletableFutureUtilTest, QueryBackPressureTest, QueryBackPressureBuilderTest).
Integration tests were added to CQLGraphTest to cover the main functionality.
Other relevant tests are backPressureLimitOverflowTest and testParallelBackendOps.
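
For readers unfamiliar with the idea behind the back pressure classes, the general technique of capping in-flight asynchronous queries can be sketched with a semaphore. This is an assumption-laden illustration and not the PR's QueryBackPressure implementation: BackPressureSketch, submit, and maxInFlight are hypothetical names chosen for this note.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class BackPressureSketch {
    private final Semaphore permits;

    BackPressureSketch(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Blocks the caller until a permit is free, then starts the async query.
    // The permit is released when the query completes, whether it succeeds or fails.
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> asyncQuery) {
        permits.acquireUninterruptibly();
        try {
            return asyncQuery.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the query never started, so give the permit back
            throw e;
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

A test in the spirit of an overflow check would submit more queries than maxInFlight and verify that the extra submissions wait until earlier queries complete.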

Member

@li-boxuan li-boxuan left a comment

I finally did one pass on this review. Great job @porunov ! I still have some confusion around the ChunkedJobDefinition and its usage, but overall it's a pretty solid implementation.

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
@porunov
Member Author

porunov commented May 27, 2023

@li-boxuan I pushed changes per your comments. Please let me know if you have any concerns.

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
@porunov porunov force-pushed the feature/cql-multi-key-slices branch from e5902ad to e1b5434 Compare May 28, 2023 11:29
Member

@li-boxuan li-boxuan left a comment

I love how you handled paging elegantly so that the CQL thread pool is now only for CPU workloads rather than I/O workloads!

I remember in my (unpublished) implementation, paging was a headache. I ended up using the CQL thread pool for pagination - which unfortunately meant that long queries (both deserialization and I/O workloads) could block short queries (only deserialization workloads). Your approach perfectly solved the problem 🎉
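
The paging pattern described here can be sketched as follows: each page is deserialized on a CPU executor, and the next fetch is started from the completion callback, so no pool thread waits on I/O. This is a simplified sketch under stated assumptions, not the PR's code. Page, fetchNext, readAll, and inMemory are hypothetical names; a real driver's paged result type would take the place of Page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Function;

final class AsyncPagingSketch {
    // Hypothetical page of raw rows with a non-blocking way to request the next page.
    interface Page<R> {
        List<R> rows();
        boolean hasNext();
        CompletableFuture<Page<R>> fetchNext();
    }

    // Collects all pages. Deserialization runs on cpuExecutor; waiting for I/O occupies no thread.
    static <R, E> CompletableFuture<List<E>> readAll(
            CompletableFuture<Page<R>> firstPage, Function<R, E> deserialize, Executor cpuExecutor) {
        return readFrom(firstPage, deserialize, cpuExecutor, new ArrayList<>());
    }

    private static <R, E> CompletableFuture<List<E>> readFrom(
            CompletableFuture<Page<R>> pageFuture, Function<R, E> deserialize,
            Executor cpuExecutor, List<E> collected) {
        return pageFuture.thenComposeAsync(page -> {
            for (R row : page.rows()) {
                collected.add(deserialize.apply(row)); // CPU-bound work only
            }
            if (!page.hasNext()) {
                return CompletableFuture.completedFuture(collected);
            }
            // Start the next fetch and return immediately; the pool thread is freed.
            return readFrom(page.fetchNext(), deserialize, cpuExecutor, collected);
        }, cpuExecutor);
    }

    // In-memory pages for demonstration only.
    static <R> CompletableFuture<Page<R>> inMemory(List<List<R>> pages, int index) {
        return CompletableFuture.completedFuture(new Page<R>() {
            public List<R> rows() { return pages.get(index); }
            public boolean hasNext() { return index + 1 < pages.size(); }
            public CompletableFuture<Page<R>> fetchNext() { return inMemory(pages, index + 1); }
        });
    }
}
```

Because a long multi-page query only borrows an executor thread while deserializing a page, short queries are not stuck behind its I/O waits.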

@porunov
Member Author

porunov commented May 28, 2023

Thank you for your thoughtful review and help with running the benchmarks! I'm currently running the benchmarks against an external Cassandra instance (i.e. by changing the hostname in the benchmark tests), but I will figure out later how to run them directly.

@porunov
Member Author

porunov commented May 28, 2023

@cdegroc please ping me if you plan to review this PR. Otherwise I'm going to merge it tomorrow.

@porunov porunov merged commit 3a7ba53 into JanusGraph:master May 29, 2023
189 checks passed