Upgrade TinkerPop to 3.5.0 [tp-tests] #2619
...rg/janusgraph/graphdb/tinkerpop/optimize/strategy/JanusGraphLocalQueryOptimizerStrategy.java
@farodin91 Do you want to create a separate PR for that clean-up?
It requires TinkerPop 3.5. I can wait until this PR is merged and then create a new PR. I think it should be part of JanusGraph 0.6.
Got it, I think that would be better. I just want to avoid unnecessary changes in this PR since it's already large.
Could you update mkdocs.yml and docs/changelog.md?
LGTM! Thank you for upgrading TinkerPop.
I just have some questions.
@@ -269,9 +270,11 @@ public static Configuration getJobConf(List<SliceQuery> queries, Long modulus) {
    }

    public static Configuration getJobConf(List<SliceQuery> queries, Long modulus, Long modVal) {
        BaseConfiguration baseConfiguration = new BaseConfiguration();
        baseConfiguration.setListDelimiterHandler(new DefaultListDelimiterHandler(','));
Why do we need this setting?
To maintain backward compatibility. In the old version of commons-configuration, comma-separated values are parsed into a list of values, while in the new version (configuration2), commas are read as they are.
ref:
Do we want the old behavior?
If we don't call baseConfiguration.setListDelimiterHandler(new DefaultListDelimiterHandler(','));, then comma-delimited values such as "host1,host2,host3" will be read as a single string rather than as a string array ["host1","host2","host3"] (which is the old behavior).
In theory, we could read the value as a single string and split it into a string array whenever we need it. The drawback is that this splitting might happen many times at runtime and would thus be inefficient. Another problem is that I am not sure I could implement that change correctly without missing any place.
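The trade-off described in this comment can be sketched without any configuration library. The following is only an illustration, not JanusGraph or commons-configuration code: `ListDelimiterSketch`, `splitOnEveryRead`, and `SplitOnce` are hypothetical names. It contrasts keeping the raw string and splitting it on each access with splitting once when the value is stored, which is roughly the effect of installing a list delimiter handler.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ListDelimiterSketch {

    // Approach A (hypothetical): keep the raw comma-delimited string and
    // split it every time a caller needs the individual values.
    static List<String> splitOnEveryRead(String rawValue) {
        return Arrays.asList(rawValue.split(","));
    }

    // Approach B (hypothetical): split once when the value is stored and
    // hand out the already-parsed list on every later read.
    static final class SplitOnce {
        private final List<String> values;

        SplitOnce(String rawValue) {
            this.values = Collections.unmodifiableList(Arrays.asList(rawValue.split(",")));
        }

        List<String> get() {
            return values;
        }
    }

    public static void main(String[] args) {
        String raw = "host1,host2,host3";

        // Approach A repeats the split on each call.
        System.out.println(splitOnEveryRead(raw));

        // Approach B pays the parsing cost only in the constructor.
        SplitOnce parsed = new SplitOnce(raw);
        System.out.println(parsed.get());
    }
}
```

Both approaches yield the same three host names; the difference is only where the parsing cost is paid and how many call sites have to remember to split.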
janusgraph-dist/src/assembly/static/conf/gremlin-server/gremlin-server-berkeleyje.yaml
@@ -88,7 +88,8 @@
        <scylladb.version>1.7.1</scylladb.version>
        <!-- align with org.apache.hbase:hbase -->
        <jackson1.version>1.9.13</jackson1.version>
        <jackson2.version>2.12.2</jackson2.version>
        <!-- align with org.apache.spark:spark-core_2.12 -->
Should we upgrade Spark to 3 in a follow-up PR?
Not sure if I got you. spark-gremlin 3.5.0 depends on org.apache.spark:spark-core_2.12 3.0.0: https://mvnrepository.com/artifact/org.apache.tinkerpop/spark-gremlin/3.5.0
Is the 2.12 for Jackson or not?
The Spark package name is org.apache.spark:spark-core_2.12, and its version is 3.0.0. It requires Jackson 2.10.0; otherwise, it throws runtime exceptions.
It looks like the 2.12 is for the support scale version.
You mean Scala, right? Yes, the "2.12" suffix of "spark-core" seemingly matches the Scala version.
Thanks a lot @li-boxuan for this upgrade!
I mostly just have some nitpicks.
janusgraph-core/src/main/java/org/janusgraph/core/JanusGraphFactory.java
janusgraph-core/src/main/java/org/janusgraph/core/JanusGraphFactory.java
...ore/src/main/java/org/janusgraph/diskstorage/configuration/backend/CommonsConfiguration.java
...ph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java
janusgraph-core/src/main/java/org/janusgraph/graphdb/transaction/StandardJanusGraphTx.java
...ph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java
janusgraph-core/src/main/java/org/janusgraph/graphdb/vertices/AbstractVertex.java
janusgraph-core/src/main/java/org/janusgraph/util/system/ConfigurationLint.java
1. Bump TinkerPop dependency from 3.4.10 to 3.5.0 and resolve dependency conflicts. Several jackson packages are ignored from requireUpperBoundDeps maven enforce plugin now, because Spark 3.0.0 requires a lower version of jackson (2.10.0) while other libraries, including CQL driver, require a higher version of jackson.
2. Support null value mutation semantics to comply with TinkerPop 3.5.0. property(single, prop, null) removes the property while property(list/set, prop, null) is simply ignored.
3. Introduce a workaround for traversal strategies which can be removed once TINKERPOP-2568 is fixed.
4. Upgrade apache common configuration to configuration2 to comply with the same TinkerPop breaking change. Now users won't experience the problem reported at JanusGraph#1447 anymore, since configuration2 by default does not read comma delimited string as a list of values.
5. Other minor changes mostly caused by breaking changes in TinkerPop.

Closes JanusGraph#2611, JanusGraph#2624, JanusGraph#1447

Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk>
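The null value mutation semantics in item 2 can be modelled in a few lines. This is a simplified sketch under stated assumptions, not JanusGraph's or TinkerPop's implementation: `NullPropertySketch`, its `Cardinality` enum, and the map-backed storage are hypothetical stand-ins used only to show the rule that a null value removes a single-cardinality property and is ignored for list and set cardinality.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullPropertySketch {

    // Hypothetical stand-in for the cardinality passed to property(...).
    enum Cardinality { SINGLE, LIST, SET }

    // Hypothetical storage: property key -> current values.
    private final Map<String, List<Object>> properties = new HashMap<>();

    void property(Cardinality cardinality, String key, Object value) {
        if (value == null) {
            // Rule from the commit message: null removes a SINGLE property
            // and is simply ignored for LIST and SET.
            if (cardinality == Cardinality.SINGLE) {
                properties.remove(key);
            }
            return;
        }
        List<Object> values = properties.computeIfAbsent(key, k -> new ArrayList<>());
        switch (cardinality) {
            case SINGLE:
                values.clear();
                values.add(value);
                break;
            case LIST:
                values.add(value);
                break;
            case SET:
                if (!values.contains(value)) {
                    values.add(value);
                }
                break;
        }
    }

    List<Object> values(String key) {
        return properties.getOrDefault(key, new ArrayList<>());
    }

    public static void main(String[] args) {
        NullPropertySketch vertex = new NullPropertySketch();

        vertex.property(Cardinality.SINGLE, "name", "alice");
        vertex.property(Cardinality.SINGLE, "name", null); // removes "name"
        System.out.println(vertex.values("name"));

        vertex.property(Cardinality.LIST, "tag", "x");
        vertex.property(Cardinality.LIST, "tag", null); // ignored
        System.out.println(vertex.values("tag"));
    }
}
```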
...ph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/JanusGraphTraversalUtil.java
LGTM. Thank you @li-boxuan for this upgrade!
Anyone who is still reviewing, or intends to review, please let me know within 24 hours. Otherwise, I'll merge this PR after that.
The version compatibility matrix should list TP
I think it makes sense to add a separate commit which won't trigger full tp tests because it's just a documentation change.
This adds test cases to demonstrate that comma-delimited values can be loaded by HadoopGraph in the Spark use case. Prior to the TinkerPop 3.5.0 upgrade, if the config file has `gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph`, which is typically the case for MapReduce/Spark traversals, then `GraphFactory.open(filename)` would first read all comma-delimited values into a list and then retain only the last value. A workaround is to create a `PropertiesConfiguration` config object, disable the comma delimiter, and then call `GraphFactory.open(config)`. This is auto-resolved by JanusGraph#2619 because, since TinkerPop 3.5.0, comma-delimited values are loaded as they are by default. This commit adds some test cases to demonstrate this behaviour. The same tests fail without JanusGraph#2619. Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk>
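The symptom described in this commit message can be illustrated with the standard library alone. The sketch below does not use TinkerPop or commons-configuration: `CommaValueSketch`, `legacyRetainLast`, and `loadAsIs` are hypothetical names, `java.util.Properties` merely stands in for a loader that keeps commas untouched, and the `storage.hostname` key is borrowed from the issue title of #1447.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class CommaValueSketch {

    // Hypothetical model of the old symptom: the value is split on commas
    // into a list, and only the last element survives.
    static String legacyRetainLast(String rawValue) {
        String[] parts = rawValue.split(",");
        return parts[parts.length - 1];
    }

    // Stand-in for a loader that keeps comma-delimited values as they are.
    static String loadAsIs(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String text = "storage.hostname=host1,host2,host3\n";

        String raw = loadAsIs(text, "storage.hostname");
        System.out.println("as-is:  " + raw);

        // With the modelled old behaviour, two of the three hosts are lost.
        System.out.println("legacy: " + legacyRetainLast(raw));
    }
}
```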
1. Bump TinkerPop dependency from 3.4.10 to 3.5.0 and resolve dependency conflicts. Several jackson packages are ignored from requireUpperBoundDeps maven enforce plugin now, because Spark 3.0.0 requires a lower version of jackson (2.10.0) while other libraries, including CQL driver, require a higher version of jackson.
2. Support null value mutation semantics to comply with TinkerPop 3.5.0. property(single, prop, null) removes the property while property(list/set, prop, null) is simply ignored.
3. Introduce a workaround for traversal strategies which can be removed once TINKERPOP-2568 is fixed.
4. Upgrade apache common configuration to configuration2 to comply with the same TinkerPop breaking change. Now users won't experience the problem reported at "ConfigurationGraphFactory configuration with multiple storage.hostname not working" (#1447) anymore, since configuration2 by default does not read comma delimited string as a list of values.
5. Other minor changes mostly caused by breaking changes in TinkerPop.

Closes #2611, #2624, #1447