Dedicated Scylla backend #1778

Closed
FlorianHockmann opened this issue Sep 3, 2019 · 12 comments · Fixed by #3578

Comments

@FlorianHockmann
Member

Scylla now has its own Java driver, a fork of the DataStax CQL driver that we currently use for the CQL backend, with some Scylla-specific optimizations. These optimizations are described in this presentation from the 2018 Scylla Summit, which hints at significant latency reductions thanks to shard awareness in the driver.

It would be great if we could use this driver together with our existing CQL backend, so we get the performance improvements without having to maintain yet another full backend.

@dorlaor

dorlaor commented Sep 3, 2019

There is also a BYPASS_CACHE flag that allows a per-query cache bypass, so scans do not pollute the cache.
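
For illustration, here is a minimal sketch of how a client could opt a scan query out of the cache. It assumes that Scylla's CQL accepts a `BYPASS CACHE` clause at the end of a `SELECT` statement. The class and method names (`ScanQueries`, `withBypassCache`) are hypothetical and are not part of JanusGraph or of either driver, and the keyspace and table names are placeholders.

```java
import java.util.Locale;

/** Hypothetical helper; not part of JanusGraph or the Scylla driver. */
final class ScanQueries {
    private static final String BYPASS = " BYPASS CACHE";

    private ScanQueries() {}

    /**
     * Appends BYPASS CACHE to a SELECT statement unless it is already present.
     * Assumption: the statement has no trailing clause (for example USING TIMEOUT)
     * that would have to come after BYPASS CACHE.
     */
    static String withBypassCache(String cql) {
        String trimmed = cql.trim();
        // Drop an optional trailing semicolon so the clause can be appended.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        String upper = trimmed.toUpperCase(Locale.ROOT);
        if (!upper.startsWith("SELECT ")) {
            throw new IllegalArgumentException("BYPASS CACHE only applies to SELECT statements");
        }
        return upper.endsWith(BYPASS) ? trimmed : trimmed + BYPASS;
    }

    public static void main(String[] args) {
        // Placeholder keyspace and table names.
        System.out.println(withBypassCache("SELECT key, column1, value FROM example_ks.example_table;"));
    }
}
```

Point lookups would normally keep using the cache; a helper like this would only be applied to statements that are known to be scans.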

@FlorianHockmann
Member Author

I have started working on this and have a first implementation that should make it possible to use JanusGraph embedded with the Scylla driver instead of the DataStax driver. Some first JanusGraph tests are at least running, and I get a log message from the driver saying that the "advanced" driver is used, which is the Scylla driver.
But I don't know yet how to make this work for JanusGraph Server, as that also depends on our usual CQL backend, which still uses the DataStax driver. So it has both drivers on its classpath, and then probably the first one found on the classpath will be used, irrespective of whether cql or scylla is configured as the storage backend.

If anyone has an idea how we could select at runtime which driver we want to use, that would be much appreciated.
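
As a diagnostic aid for the classpath problem described above, the following sketch reports which jar a given class was actually loaded from. It does not select a driver at runtime; it only helps confirm which of two jars that ship the same class name wins. The class `DriverOrigin` is hypothetical and not part of JanusGraph, and the class name to inspect is supplied by the caller (for the 3.x drivers that could be an entry point such as `com.datastax.driver.core.Cluster`, which is an assumption about what is on the classpath).

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of JanusGraph. */
final class DriverOrigin {
    private DriverOrigin() {}

    /** Returns the jar or directory a class was loaded from, or a marker if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typically the case for classes loaded by the bootstrap class loader.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Resolves a class by name without initializing it and reports its origin. */
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false, DriverOrigin.class.getClassLoader());
            return locationOf(type);
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Pass the fully qualified class name to inspect as the first argument.
        String name = args.length > 0 ? args[0] : DriverOrigin.class.getName();
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

Running this inside the same JVM configuration as JanusGraph Server would show whether the DataStax or the Scylla jar supplied the class.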

@porunov
Member

porunov commented Jul 31, 2020

@FlorianHockmann I'm not sure whether it will work after we update the DataStax driver to 4.x (#2169).
The ScyllaDB driver is based on DataStax driver 3.x. DataStax drivers 3 and 4 are binary incompatible. Most likely you will need to copy the entire content of the current CQLStoreManager into ScyllaStoreManager.

Also, I guess it would be interesting to compare the DataStax 4.8.0 and Scylla 3.7.1-scylla-2 drivers. The ScyllaDB Java driver was forked from the DataStax Cassandra driver (I guess 3.7.1). Many bug fixes and improvements have landed in the 3.x branch since then that Scylla driver 3.7.1-scylla-2 didn't get (~8 months of development; see the Changelog). Moreover, this year driver 4.x is being developed even more actively than driver 3.x.
If the ScyllaDB driver is compared to the DataStax 3.x branch, there are actually not many changes:
apache/cassandra-java-driver@3.x...scylladb:3.7.1-scylla

I didn't research them, but I guess the changes are related to improving the token-aware policy as described here https://docs.scylladb.com/using-scylla/scylla-java-driver/ and to the possibility of setting a BYPASS_CACHE flag, as noted by @dorlaor. That's definitely great and I would love to use it, but the fact that the ScyllaDB driver doesn't include bug fixes that are implemented in the recent DataStax driver gives me some doubts.

Here are the commits from the DataStax 3.x branch that the ScyllaDB Java driver is missing:
scylladb/java-driver@3.7.1-scylla...datastax:3.x

That said, I'm totally fine with adding ScyllaDB driver into JanusGraph. I just think it could be useful if we could compare them.

@FlorianHockmann
Member Author

Thanks for your detailed comment, @porunov! You definitely make some good points. I didn't follow the work on updating the driver to version 4 closely, so I wasn't aware of this.
In general, I would really like to avoid having to copy lots of code just to get the Scylla driver.
But it looks like it's best for us right now to wait and see whether Scylla will also upgrade their driver to version 4, and whether they will then maintain it, at least by merging in fixes and improvements from the upstream DataStax driver.

@dorlaor

dorlaor commented Aug 4, 2020

@FlorianHockmann cheers for picking this up!
@porunov Good points, we should rebase the scylla driver over 4.x
@haaawk when can we rebase?

@haaawk

haaawk commented Aug 4, 2020

I finished rebasing it last week, @dorlaor. If everything goes well, the 4.x driver should be published to the Maven repository this week.

@sergeymetallic

It looks like version 4.9.0 of the Scylla driver has been released (https://github.com/scylladb/java-driver/commits/4.9.0-scylla-0), which is based on DataStax driver version 4.9.0. So theoretically there should not be any problem if we use it with the latest JanusGraph?

@porunov porunov added this to the Release v1.0.0 milestone Sep 7, 2021
porunov added a commit to porunov/janusgraph that referenced this issue Feb 11, 2023
Fixes JanusGraph#1778

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 11, 2023
- Move CQL Hadoop implementation to a separate module `janusgraph-cql-hadoop`
- Move CQL tests to `janusgraph-cql-testutils` to be able to share tests between different storage modules
- Add ScyllaDB support via `janusgraph-scylla`

Fixes JanusGraph#1778
Fixes JanusGraph#2451

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 11, 2023
- Move CQL Hadoop implementation to a separate module `janusgraph-cql-hadoop`
- Move CQL tests to `janusgraph-cql-testutils` to be able to share tests between different storage modules
- Move previous CQL implementation to `janusgraph-cql-base` to share it between different compatible storage modules (i.e. `janusgraph-cql` and `janusgraph-scylla`)
- Add ScyllaDB support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Fixes JanusGraph#2451

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 12, 2023
- Move CQL Hadoop implementation to a separate module `janusgraph-cql-hadoop`
- Move CQL tests to `janusgraph-cql-testutils` to be able to share tests between different storage modules
- Move previous CQL implementation to `janusgraph-cql-base` to share it between different compatible storage modules (i.e. `janusgraph-cql` and `janusgraph-scylla`)
- Add ScyllaDB support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Fixes JanusGraph#2451

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 12, 2023
- Move CQL tests to `janusgraph-cql-testutils` to be able to share tests between different storage modules
- Move previous CQL implementation to `janusgraph-cql-base` to share it between different compatible storage modules (i.e. `janusgraph-cql` and `janusgraph-scylla`)
- Add ScyllaDB support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Fixes JanusGraph#2451

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 12, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Fixes JanusGraph#2451

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 12, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Fixes JanusGraph#2451
Fixes JanusGraph#2505

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 13, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.

Fixes JanusGraph#1778
Related to JanusGraph#2451
Related to JanusGraph#2505

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 14, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.
- Fix ScyllaDB tests (configs were not applied to ScyllaDB tests previously)

Fixes JanusGraph#1778
Related to JanusGraph#2451
Fixes JanusGraph#2505

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit to porunov/janusgraph that referenced this issue Feb 15, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.
- Fix ScyllaDB tests (configs were not applied to ScyllaDB tests previously)

Fixes JanusGraph#1778
Related to JanusGraph#2451
Fixes JanusGraph#2505

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov added a commit that referenced this issue Feb 17, 2023
- Add ScyllaDB driver support via `janusgraph-scylla`
- `janusgraph-cql` contains tests for Cassandra3, Cassandra4, ScyllaDB. `janusgraph-scylla` contains tests for ScyllaDB only.
- Fix ScyllaDB tests (configs were not applied to ScyllaDB tests previously)

Fixes #1778
Related to #2451
Fixes #2505

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
@dorlaor

dorlaor commented Feb 17, 2023 via email

Fantastic!

@porunov
Member

porunov commented Feb 17, 2023

We still have some work to do to improve user experience in JanusGraph Server (to easily switch between CQL and Scylla drivers), but the dedicated ScyllaDB driver is now available in JanusGraph and will be released in JanusGraph version 1.0.0 😄

@dorlaor

dorlaor commented Feb 17, 2023 via email

Shard aware will be great. Also need to consider when to hint 'bypass cache'

@porunov
Member

porunov commented Feb 17, 2023

Definitely makes sense to bypass cache on full-scan queries (which might happen during re-indexing or simply when full-scan is needed for users). Thus, opened the next PR to track that feature: #3582

@dorlaor

dorlaor commented Feb 17, 2023

It will be nice to benchmark this, see the results, and analyze the Grafana dashboards.
