Expression base operations #4466
@rdblue I think I've made the changes you suggested and also referred to the link for some further modifications. I have some CI issues to work out, but if I'm on the right or wrong track with these changes please let me know.

Fixed my failing tox checks. Had a few logical errors and a mistakenly overwritten file.
            return left
        return super().__new__(cls)

    def __init__(self, left: BooleanExpression, right: BooleanExpression, *rest: BooleanExpression):
Should this ensure that rest is empty since it should only be handled using __new__? If someone calls this directly, it would currently discard rest.
Sorry, I was wrong. If rest is non-empty then it ignores left and right.
If rest is non-empty then __new__ uses reduce to construct the object with only left and right specified.
Yes, but not in this method. If this method is called directly then it would do the wrong thing. That could happen in a subclass, right?
rdblue
left a comment
Overall, this looks great. I think there are a couple of slight problems to fix that I noted.
Thanks @dramaticlly I think it's good now in that respect. @rdblue are we all set?
    ):
        if not rest:
            self.left = left
            self.right = right
This still needs to be fixed. Even if we don't expect calls to __init__ with both left/right and rest, we should not ignore the possibility.
This was meant to be a convenience feature and I saw it in the Java file you shared which is why it is here. If you don't think this is necessary I can remove it
I think it's a great idea to handle this in __new__. But __init__ still exists and can be called directly. I don't think it should assume it is always called through __new__. I think the only thing that you need to do is add an else case that throws ValueError.
__init__ still exists and can be called directly. I don't think it should assume it is always called through __new__.
Would you mind elaborating on what you mean by this? To my knowledge, there is no calling init directly without implicitly calling new since new gives you the base object and init essentially adds/modifies the attributes
But __init__ still exists and can be called directly.
I don't think there's a way to do this while avoiding a call to __new__. The one way I wasn't sure of was if a child class calls init via super but I did this quick test and even that calls __new__.
    class Foo:
        def __new__(cls):
            print("New called")
            return cls

        def __init__(self, val):
            self.val = val

    class Bar(Foo):
        def __init__(self):
            super().__init__(5)

    Bar()  # "New called"
I do think this could use some doc-strings describing this behavior though. Specifically mentioning __new__
    """AND operation expression - logical conjunction

    Args:
        left(BooleanExpression): The left hand boolean expression
        right(BooleanExpression): The right hand boolean expression
        *rest(BooleanExpression): Additional boolean expressions to fold into the AND expression

    Note:
        Additional boolean expressions provided to `rest` will be folded/reduced to create an
        object with only `left` and `right` specified (see the `__new__` constructor). For
        example, And(a, b, c) will result in the equivalent of And(And(a, b), c).
    """
Once predicates are added we could include an Example: section here too without failing the pydoc tests.
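The fold described in the Note can be illustrated without the real expression classes. This is a minimal sketch, assuming a hypothetical `and_node` helper that stands in for a two-argument `And(left, right)` and simply builds tuples; `fold_and` is likewise a made-up name for illustration.

```python
from functools import reduce


def and_node(left, right):
    """Hypothetical stand-in for a two-argument And(left, right)."""
    return ("and", left, right)


def fold_and(left, right, *rest):
    """Left-fold any number of operands into nested two-argument nodes."""
    return reduce(and_node, (left, right, *rest))


print(fold_and("a", "b", "c"))  # ('and', ('and', 'a', 'b'), 'c')
```

Because `reduce` folds from the left, the first two operands are combined first and each later operand wraps the result so far.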
Calling Foo.__init__(obj) directly doesn't call __new__:

    f = Foo()  # "New called"
    Foo.__init__(f, 6)  # avoids __new__

I'm not saying that these are likely cases, I'm just saying that we should prefer throwing exceptions for cases like this rather than failing silently.

    def __init__(self, left, right, *rest):
        if not rest:
            self.left = left
            self.right = right
        else:
            raise ValueError("Expected only left and right operands")

It isn't that intrusive to have the else case and it will catch crazy cases because we can't prevent people from calling __init__.
Looks like you can also avoid calling __new__ by overriding it in the subclass:
    class Bar(Foo):
        def __new__(cls):
            print("Subclass new called")

    Bar()  # "Subclass new called"
I see, yeah I agree it's a small addition to protect against these cases. It also makes the init easier to understand.
        right: BooleanExpression,
        *rest: BooleanExpression,
    ):
        if not rest:
Looks like the tests are failing on the new ValueError. Maybe updating this to if not rest or len(rest) <= 0:?
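One likely cause of the failure, sketched under the assumption that `__new__` folds `rest` and returns an instance of the same class: when `__new__` returns an instance of the class being constructed, Python then calls `__init__` on that object with the original arguments, so `rest` is still non-empty inside `__init__`. The `Node` class below is hypothetical and only mirrors the shape of the code under review.

```python
from functools import reduce


class Node:
    """Hypothetical stand-in for And: folds extra operands in __new__."""

    def __new__(cls, left, right, *rest):
        if rest:
            # Returns an already-built Node, i.e. an instance of cls.
            return reduce(Node, (left, right, *rest))
        return super().__new__(cls)

    def __init__(self, left, right, *rest):
        if rest:
            raise ValueError("Expected only left and right operands")
        self.left = left
        self.right = right


pair = Node("a", "b")  # fine: rest is empty

try:
    # __new__ folds the operands, then __init__ is called again
    # on the folded object with rest == ("c",).
    Node("a", "b", "c")
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```

Under that assumption, the guard fires on every call that passes more than two operands, not only on direct calls to `__init__`.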
|
Hey @rdblue I wanted to circle back on the original without raising the error. I threw it in without considering the implications based on your suggestion, but I feel as though we still don't need such a thing. When one creates an

    class A:
        def __new__(cls):
            return B()

        def __init__(self):
            print('A init')

    class B:
        def __init__(self):
            print('B init')

    A()  # prints "B init"

So A's |
|
@CircArgs, the method can still be called with bad arguments. Since we cannot hide it entirely, I think throwing an exception is a reasonable step to ensure it isn't misused. It's also interesting that we're actually hitting the Why is that case getting hit? I thought that we were controlling the inputs to |
|
The ValueError getting raised comes from the fact that However for the Two other alternatives I can think of are:

1.

    class And(BooleanExpression):
        """AND operation expression - logical conjunction"""

        def __new__(cls, left: BooleanExpression, right: BooleanExpression, *rest: BooleanExpression):
            if rest:
                return reduce(And, (left, right, *rest))
            if left is AlwaysFalse() or right is AlwaysFalse():
                return AlwaysFalse()
            elif left is AlwaysTrue():
                return right
            elif right is AlwaysTrue():
                return left
            _instance = super().__new__(cls)
            _instance.args = (left, right)
            vars(_instance)["left"] = left
            vars(_instance)["right"] = right
            return _instance

2.

    def and_expression(left: BooleanExpression, right: BooleanExpression, *rest: BooleanExpression) -> Union[And, AlwaysTrue, AlwaysFalse]:
        if rest:
            return reduce(And, (left, right, *rest))
        if left is AlwaysFalse() or right is AlwaysFalse():
            return AlwaysFalse()
        elif left is AlwaysTrue():
            return right
        elif right is AlwaysTrue():
            return left
        return And(left, right)

A note on option 1, someone can still override

A note about option 2 is that it may make checking if something is an instance of

    from iceberg.expressions.base import and_expression, And
    expr = and_expression(left, right, rest)
    isinstance(expr, And)

as opposed to:

    from iceberg.expressions.base import And
    expr = And(left, right, rest)
    isinstance(expr, And)

(sorry for the super long comment...)
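For readers who want to try the factory-function alternative, here is a self-contained sketch. Everything in it is an assumption made for illustration: the stub `BooleanExpression`, the singleton `AlwaysTrue` and `AlwaysFalse`, and a plain two-argument `And`. It also differs from the snippet above by folding through `and_expression` itself rather than `And`, so the simplifications apply at every step of the fold.

```python
from functools import reduce


class BooleanExpression:
    """Hypothetical minimal base class."""


class AlwaysTrue(BooleanExpression):
    _instance = None

    def __new__(cls):
        # Singleton so that `x is AlwaysTrue()` comparisons work.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


class AlwaysFalse(BooleanExpression):
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


class And(BooleanExpression):
    def __init__(self, left, right):
        self.left = left
        self.right = right


def and_expression(left, right, *rest):
    """Build a conjunction, simplifying constant operands away."""
    if rest:
        return reduce(and_expression, (left, right, *rest))
    if left is AlwaysFalse() or right is AlwaysFalse():
        return AlwaysFalse()
    if left is AlwaysTrue():
        return right
    if right is AlwaysTrue():
        return left
    return And(left, right)


a, b = BooleanExpression(), BooleanExpression()
print(isinstance(and_expression(a, b), And))                 # True
print(and_expression(a, AlwaysTrue()) is a)                  # True
print(isinstance(and_expression(a, AlwaysFalse(), b), And))  # False
```

In this sketch the factory's result is not always an `And`, so callers should not rely on `isinstance(expr, And)` after calling it.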
Thanks, @CircArgs!

Thank you for all the helpful commentary
Created the basic boolean expression types.