Improved contains check for bulkset with elements#2425
Improved contains check for bulkset with elements#2425vkagamlyk merged 2 commits intoapache:3.6-devfrom
Conversation
|
javascript build errors seem unrelated: |
Cole-Greer
left a comment
There was a problem hiding this comment.
Thanks for the submission @steigma. I see this approach as a good step towards recovering some of the efficiency of the old comparison semantics. I left a few comments with minor tweaks to simplify it slightly.
| ((BulkSet<?>)second).allContainedElementsSameClass() && | ||
| ((BulkSet<?>)second).getAllContainedElementsClass() != null && | ||
| Element.class.isAssignableFrom(((BulkSet<?>)second).getAllContainedElementsClass()) && |
There was a problem hiding this comment.
Could these checks be removed? I believe they are redundant. If any of these checks fail, then the final first.getClass() == ((BulkSet<?>)second).getAllContainedElementsClass() check would also fail.
| ((BulkSet<?>)second).allContainedElementsSameClass() && | |
| ((BulkSet<?>)second).getAllContainedElementsClass() != null && | |
| Element.class.isAssignableFrom(((BulkSet<?>)second).getAllContainedElementsClass()) && |
There was a problem hiding this comment.
Thanks, I think you are right, these checks are redundant. Removed them in new revision.
| boolean hadNull = false; | ||
| for (final S key : this.map.keySet()) { | ||
| if ((key == null || key.getClass() == null)) { | ||
| if (allContainedElementsClass != null) { | ||
| allContainedElementsClass = null; | ||
| break; | ||
| } | ||
| hadNull = true; | ||
| } else if (hadNull) { | ||
| break; |
There was a problem hiding this comment.
Can be simplified. If a single null is found in the set, then we can return false.
| boolean hadNull = false; | |
| for (final S key : this.map.keySet()) { | |
| if ((key == null || key.getClass() == null)) { | |
| if (allContainedElementsClass != null) { | |
| allContainedElementsClass = null; | |
| break; | |
| } | |
| hadNull = true; | |
| } else if (hadNull) { | |
| break; | |
| for (final S key : this.map.keySet()) { | |
| if ((key == null || key.getClass() == null)) { | |
| allContainedElementsClass = null; | |
| return false; |
There was a problem hiding this comment.
Thanks, simplified it in new revision as you suggested.
| allContainedElementsClassChecked = true; | ||
| boolean hadNull = false; | ||
| for (final S key : this.map.keySet()) { | ||
| if ((key == null || key.getClass() == null)) { |
There was a problem hiding this comment.
In what situation can key.getClass() == null be true?
There was a problem hiding this comment.
Thanks, getClass probably never returns null, removed it in new revision.
| } | ||
|
|
||
| @Test | ||
| public void shouldNotHaveSameClassForNull() { |
There was a problem hiding this comment.
Missing test case for null in the middle.
Something like Vertex1, null, Vertex2
There was a problem hiding this comment.
Thanks, I added a second case where null is added in the "middle", but not sure whether it is testing something different as it is stored in different "random" order in the bulkset.
| allContainedElementsClass = null; | ||
| break; | ||
| } | ||
| hadNull = true; |
There was a problem hiding this comment.
probably hadNull = true; should be before if in line 89
There was a problem hiding this comment.
Thanks, simplified it as you suggested below.
|
Thanks for opening the PR with the improvement. I don't have much to add to what's already been commented, but having a CHANGELOG entry would be helpful. |
1588ad8 to
a7781f5
Compare
…ibute that represents whether all elements are of same type/class
a7781f5 to
7bdc66d
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## 3.6-dev #2425 +/- ##
=============================================
+ Coverage 75.16% 76.17% +1.01%
- Complexity 12316 13135 +819
=============================================
Files 1057 1084 +27
Lines 63470 65044 +1574
Branches 6936 7264 +328
=============================================
+ Hits 47706 49548 +1842
+ Misses 13191 12799 -392
- Partials 2573 2697 +124 ☔ View full report in Codecov by Sentry. |
|
Thanks for contribution @steigma. VOTE+1 |
|
Thanks @steigma, VOTE +1 |
Improved within test check for bulkset with elements (i.e., Vertex, Edge, VertexProperty) by using contains method. Due to changes w.r.t. Gremlin comparison semantics (cf. https://tinkerpop.apache.org/docs/3.7.0/dev/provider/#gremlin-semantics-concepts) this check was no longer done efficiently, which led to some regressions (see query/example below). In some cases, we can however ensure that the contains of the bulkset (using hash code and Object.equals) leads to the same results as the GremlinValueComparator.COMPARABILITY.equals. In fact, for elements, both checks are only be done with the ids of these elements.
This change re-enables an efficient check for elements (if the bulkset also contains these elements and only contains these kind of elements). This is realized via a transient attribute (allContainedElementsSameClass) in the bulkset class that represents whether all elements are of same type/class, which is checked by the within test method. Tje attribute is computed lazily when accessed to avoid overhead if the information is not required.
Pseudo code for sample data:
Sample query (which is very inefficiently executed without this change):
The query is obviously not optimally formulated, but reproduces the issue