-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an interface to add conditions in ScanAll #889
Conversation
@@ -34,6 +37,7 @@ public class Scan extends Selection { | |||
private Optional<Key> endClusteringKey; | |||
private boolean endInclusive; | |||
private int limit; | |||
private final Set<Conjunction> conjunctions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally, we use a disjunctive set of conjunctions (i.e., as a DNF) to manage whole conditions.
scan.withConjunctions( | ||
Sets.cartesianProduct(disjunctions).stream() | ||
.map(Scan.Conjunction::of) | ||
.collect(Collectors.toSet())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When building a where clause with ".and().and()...", the builder holds a (conjunctive) set of disjunctions, and its conjunction should be empty. In this case, we convert CNF to DNF according to the distributive law.
.map(Scan.Conjunction::of) | ||
.collect(Collectors.toSet())); | ||
} else { | ||
scan.withConjunctions(conjunctions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In contrast, when building a where clause with ".or().or()...", the builder naturally holds conditions as a DNF. So, it just adds them.
*/ | ||
@Deprecated | ||
@Override | ||
public ScanAll withOrdering(Ordering ordering) { | ||
throw new UnsupportedOperationException(); | ||
return (ScanAll) super.withOrdering(ordering); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike a current ScanAll, a relational scan accept orderings. So, now we need to implement this, although we would like to use this internally for the builder and users do not have to use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not guaranteed that super.withOrdering()
returns ScanAll in terms of the type signature and it seems an unsafe downcast to me. I'm wondering if any type check is needed (or applying Generics type parameter, but it takes effort...) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was safe because the caller object of super.withOrdering()
here is always ScanAll
(and we are already doing the same in other similar methods).
@brfrn169 What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, it is not safe. If we change the withOrdering()
method of the superclass Scan
so that it does not return this
, a ClassCastException
may be thrown. However, at present, Scan.withOrdering()
does return this
, and since we have control over the Scan
class, I am comfortable with the downcast.
@komamitsu What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seems there is no reasonable solution to it. I think we can go with the implementation as is.
I wish Java had a special generics type THIS
or something which represents the concrete class type itself so that we could write the method like public THIS withOrdering()
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brfrn169 @komamitsu Thank you! I left this as is.
*/ | ||
@Deprecated | ||
@Override | ||
public ScanAll withOrdering(Ordering ordering) { | ||
throw new UnsupportedOperationException(); | ||
return (ScanAll) super.withOrdering(ordering); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not guaranteed that super.withOrdering()
returns ScanAll in terms of the type signature and it seems an unsafe downcast to me. I'm wondering if any type check is needed (or applying Generics type parameter, but it takes effort...) ?
} | ||
|
||
@Immutable | ||
public static class AndConditionSet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, Conjunction
and AndConditionSet
sound similar to me. Could you add a comment to clarify what they are ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Essentially, both are the same. But I intentionally used the different classes for the internal representation (Conjunction
) and user API (AndConditionSet
) for ease of understanding since the conjunction would be unfamiliar for many users. From the developer's perspective, I just prefer the term "conjunction" since it's short and technically correct to use many places internal modules, including gRPC proto. Do you think "internal use" or something like that is enough here? Or should we use AndConditionSet
also for the internal representation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe we should move AndConditionSet
and OrConditionSet
and their builder classes into ScanBuilder
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks for the explanation. Either option, adding a comment that describes the relationship between Conjunction
and AndConditionSet
and the usages or using AndConditionSet
for internal use, sounds fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@komamitsu @brfrn169 Thanks for the comments. I will move both sets to the builder and describe the relationship and usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in d3bb194.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed it briefly and left some comments. As we should examine it thoroughly, let me review it again later.
Scan withConjunctions(Collection<Conjunction> conjunctions) { | ||
this.conjunctions.addAll(conjunctions); | ||
return this; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are going to make Scan
an immutable class, please don't add a mutation method or make it deprecated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstand something, but this method is package private. Is "deprecated" still necessary? I imagined current public withFooBar() mutation methods would be changed to package private ones like this function in v5.0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brfrn169 Just a confirmation. Will those mutation methods themselves be deleted in v5.0.0? In addition, for example, Scan
will have ScanBuilder
, and the builder can directly access the private member of Scan
without using withFooBar()
. That's why we should make it deprecated even if a package private method, is my guess correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Since the method in question is package-private and not public, there's no need to remove it or mark it as deprecated. Let's keep it as it is. If further changes or considerations are needed in the future, we can revisit this decision.
* @return set of {@code ConditionalExpression} which this conjunction is composed of | ||
*/ | ||
public Set<ConditionalExpression> getConditions() { | ||
return ImmutableSet.copyOf(conditions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the private constructor always gets ImmutableSet from the static factory methods, we don't need to copy the conditions here.
return ImmutableSet.copyOf(conditions); | |
return conditions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pointing it out. Fixed in d3bb194.
|
||
@Immutable | ||
public static class Conjunction { | ||
private final Set<ConditionalExpression> conditions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the private constructor always gets ImmutableSet
from the static factory methods, we might be able to make it ImmutableSet
to make it more explicit.
private final Set<ConditionalExpression> conditions; | |
private final ImmutableSet<ConditionalExpression> conditions; |
@komamitsu What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. It's a good improvement in terms of type consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for pointing it out. Fixed in d3bb194.
|
||
private final Set<ConditionalExpression> conditions; | ||
|
||
AndOrConditionSetBuilder(Set<ConditionalExpression> conditions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, we can make it private
:
AndOrConditionSetBuilder(Set<ConditionalExpression> conditions) { | |
private AndOrConditionSetBuilder(Set<ConditionalExpression> conditions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thank you! Fixed in d3bb194.
*/ | ||
@Deprecated | ||
@Override | ||
public ScanAll withOrdering(Ordering ordering) { | ||
throw new UnsupportedOperationException(); | ||
return (ScanAll) super.withOrdering(ordering); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, it is not safe. If we change the withOrdering()
method of the superclass Scan
so that it does not return this
, a ClassCastException
may be thrown. However, at present, Scan.withOrdering()
does return this
, and since we have control over the Scan
class, I am comfortable with the downcast.
@komamitsu What do you think?
// private Set<Conjunction> prepareConjunctions() { | ||
// return ImmutableSet.of( | ||
// RelationalScan.Conjunction.conjunction( | ||
// ConditionBuilder.column(ANY_NAME_1).isEqualToText(ANY_TEXT_1)), | ||
// RelationalScan.Conjunction.conjunction( | ||
// ConditionBuilder.column(ANY_NAME_1).isEqualToText(ANY_TEXT_2))); | ||
// } | ||
// | ||
// private Set<Conjunction> prepareConjunctionsWithDifferentOrder() { | ||
// return ImmutableSet.of( | ||
// RelationalScan.Conjunction.conjunction( | ||
// ConditionBuilder.column(ANY_NAME_1).isEqualToText(ANY_TEXT_2)), | ||
// RelationalScan.Conjunction.conjunction( | ||
// ConditionBuilder.column(ANY_NAME_1).isEqualToText(ANY_TEXT_1))); | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this commented out code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. Fixed in d3bb194.
private BuildableScanAllWithWhere(String namespaceName, String tableName) { | ||
this.namespaceName = namespaceName; | ||
this.tableName = tableName; | ||
this.condition = null; | ||
} | ||
|
||
private BuildableScanAllWithWhere( | ||
String namespaceName, String tableName, ConditionalExpression condition) { | ||
this.namespaceName = namespaceName; | ||
this.tableName = tableName; | ||
this.condition = condition; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private BuildableScanAllWithWhere(String namespaceName, String tableName) { | |
this.namespaceName = namespaceName; | |
this.tableName = tableName; | |
this.condition = null; | |
} | |
private BuildableScanAllWithWhere( | |
String namespaceName, String tableName, ConditionalExpression condition) { | |
this.namespaceName = namespaceName; | |
this.tableName = tableName; | |
this.condition = condition; | |
} | |
private BuildableScanAllWithWhere(String namespaceName, String tableName) { | |
this(namespaceName, tableName, null); | |
} | |
private BuildableScanAllWithWhere( | |
String namespaceName, String tableName, @Nullable ConditionalExpression condition) { | |
this.namespaceName = namespaceName; | |
this.tableName = tableName; | |
this.condition = condition; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the advice. Fixed in d3bb194.
throw new UnsupportedOperationException( | ||
"This operation is supported only when no conditions are specified at all."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of UnsupportedOperationException
, I think IllegalStateException
is proper in this case. And I think we can suggest using clearConditions()
to clear the conditions in the message:
throw new UnsupportedOperationException( | |
"This operation is supported only when no conditions are specified at all."); | |
throw new IllegalStateException( | |
"This operation is supported only when no conditions are specified at all. " | |
+ "If you want to modify the condition, please use clearConditions() to remove all existing conditions first."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. Fixed in d3bb194.
} | ||
|
||
@Immutable | ||
public static class AndConditionSet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So maybe we should move AndConditionSet
and OrConditionSet
and their builder classes into ScanBuilder
?
@brfrn169 @komamitsu Thank you for the valuable feedback. I fixed all other than this. PTAL again when you get a chance. |
protected final Set<Scan.Conjunction> conjunctions = new HashSet<>(); | ||
protected final List<Set<ConditionalExpression>> disjunctions = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for this delayed question. Why the types of conjunctions
and disjunctions
are different (disjunctions
needs an order? disjunctions
should be an internal class as well as conjunctions
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the question. Actually, disjunctions don't need the order information, but Sets.cartesianProduct()
, which is used for converting a CNF, requires a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. If the reason of having different types for conjunctions
and disjunctions
is only for Sets.cartesianProduct()
, we can write like this?
Sets.cartesianProduct(
disjunctions.stream().map(Disjunction::getConditions).collect(Collectors.toList()))
.stream()
.map(Conjunction::of)
.collect(Collectors.toSet()));
It would be great if the types of conjunctions
and disjunctions
are the same-level abstraction because I guess the readers of this code expect that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's better for readers. Thank you. I think we can do just like the following. BTW, I'm not going to introduce Scan.Disjunction
since the CNF and disjunctions are only used in the ScanBuilder
internally at least at this moment. Even so, is it weird and should we have it?
scan.withConjunctions(
Sets.cartesianProduct(new ArrayList<>(disjunctions)).stream()
.map(Scan.Conjunction::of)
.collect(Collectors.toSet()));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I'm not going to introduce Scan.Disjunction since the CNF and disjunctions are only used in the ScanBuilder internally at least at this moment. Even so, is it weird and should we have it?
@komamitsu Or, how about using Set<Set<ConditionalExpression>>
for conjunctions when building and converting them at the last, similar to the disjunctions? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about using Set<Set> for conjunctions
It means like this?
protected final Set<Set<ConditionalExpression>> conjunctions = new HashSet<>();
protected final Set<Set<ConditionalExpression>> disjunctions = new HashSet<>();
If so, completely agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's what I meant. Fixed in a64862d. PTAL when you get a chance.
Dear all, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
scan.withConjunctions(ImmutableSet.of(Scan.Conjunction.of(condition))); | ||
} else if (conjunctions.isEmpty()) { | ||
scan.withConjunctions( | ||
Sets.cartesianProduct(new ArrayList<>(disjunctions)).stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry for the delayed comment. I thought I put the following comment here, but it's gone...)
This seems to generate O x A conditions (O sets of A conditions), where O is the number of ORs and A is the number of ANDs. I'm not sure how these sets of conditions would be used in next PRs, but I guess ScalarDB fetches all entries and then filter the fetched records using these conditions set.
(This would be a silly question..., but) it's not possible to pushdown original conditions basically as-is to underlying RDBMS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the question. It will be used in something like the following way. We don't filter results using the conditions (at least for JDBC storage) in the ScalarDB layer.
private String relationalConditionSqlString() {
(snip)
List<String> conjunctionList = new ArrayList<>();
conjunctions.forEach(
conjunction -> {
List<String> conditions = new ArrayList<>();
conjunction
.getConditions()
.forEach(
condition ->
conditions.add(condition.getColumn().getName() + convert(condition.getOperator())));
conjunctionList.add(String.join(" AND ", conditions));
});
return " WHERE " + String.join(" OR ", conjunctionList);
}
(This would be a silly question..., but) it's not possible to pushdown original conditions basically as-is to underlying RDBMS?
Do you mean pushing down both DNF and CNF as is? It's technically possible, and it sometimes would perform better, but I think it trades off a development effort and maintainability for two internal representations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the answer. It sounds good!
What I meant was...
converting the following Scan
Scan.newBuilder()
.namespace("my_ns")
.table("my_tbl")
.all()
.where(ConditionBuilder.column("c1").isEqualToInt(10))
.and(ConditionBuilder.column("c2").isEqualToBigInt(20))
.and(
ConditionSetBuilder.condition(
ConditionBuilder.column("c3").isEqualToInt(30))
.or(ConditionBuilder.column("c4").isEqualToInt(40))
.build())
:
.build();
into the following SQL
SELECT ...
FROM my_ns.my_tbl
WHERE c1 = 10 AND c2 = 20 AND (c3 = 30 OR c4 = 40)
although I'm not sure it works well 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. That's pushing down a CNF as is. Yeah, as I said, technically, it works if we make Scan
always have either conjunctions or disjunctions. But my concern is that we probably need to do the same for the gRPC proto and take care of which one should be used in many places. Anyway, this is kind of an internal part; we can revisit this in the future. Any additional discussions are welcome. Thanks for the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you!
(Very sorry for the late review 🙇 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thank you!
(Very sorry for the late review 🙇 )
This PR adds an interface to add arbitrary conditions in ScanAll, i.e., to support a relational scan for JDBC databases. We decided to add them to the existing ScanAll instead of adding a new RelationalScan to keep the backward compatibility and simplicity of the current class inheritance. Actual relational scan processing will be introduced by another PR.
By this PR, users can add a kind of "where()" clause after calling "all()" in a Scan builder. We currently support only a disjunctive normal form (DNF) or a conjunctive normal form (CNF), like the following patterns.
Users can use
ConditionSetBuilder
to buildAndConditionSet
andOrCondtionSet
, which is a set of conjunctive (a && b && ...) or disjunctive (a || b || ...) conditions. Note that users need to callclearConditions()
before updating a where clause in an existing ScallAll object.See also inline comments for details.