-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generic utility for expression traversal #1336
Conversation
| self.validator = ExprValidator([self.parent.table]) | ||
|
|
||
| self.valid = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this for caching purposes? If so then I won't remove.
|
|
||
|
|
||
| # these could be callables instead | ||
| proceed = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let 's make these be functions with the following signature:
def default_proceed(expr, result):
return True
def default_halt(expr, result):
return not default_proceed(expr, result)The the signature of traverse becomes:
def traverse(fn, expr, type=ir.Expr, proceed=default_proceed, halt=default_halt, container=Stack):
passThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result = fn(expr) in my function signatures above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what You're thinking of, but I've sketched the new versions in the comments.
Do I understand it correctly?
ibis/expr/analysis.py
Outdated
| if isinstance(expr, ir.TableExpr): | ||
| return lin.stop, expr | ||
| else: | ||
| return lin.proceed, None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this would look like the following?
def finder(expr):
if isinstance(expr, ir.TableExpr):
return expr
def halt(expr, result):
return isinstance(expr, ir.TableExpr):
lin.traverse(finder, expr.op().flat_args(), halt=halt)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, now I'm not so sure I like my suggestion. This duplicates the condition which isn't good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep your API for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, should I rename stop to halt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, then will merge
ibis/expr/analysis.py
Outdated
| if isinstance(expr.op(), ops.And): | ||
| return lin.proceed, None | ||
| else: | ||
| return lin.stop, expr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lin.traverse(
lambda expr: expr if not isinstance(expr.op(), ops.And) else None,
expr,
proceed=lambda expr, result: isinstance(expr.op(), ops.And),
type=ir.BooleanColumn
)| data = None | ||
| return lin.proceed, data | ||
|
|
||
| return collections.OrderedDict(lin.traverse(finder, expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def fn(expr):
op = expr.op()
if hasattr(op, 'source'):
return (op, op.source.dictionary.get(op.name, None))
elif isinstance(op, ir.Literal):
return (op, op.value)
collections.OrderedDict(lin.traverse(fn, expr))| op = expr.op() | ||
| if isinstance(op, ops.TableColumn): | ||
| return lin.proceed, self._validate_column(expr) | ||
| return lin.proceed, None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one would be simpler.
def validate(expr):
op = expr.op()
if isinstance(op, ops.TableColumn):
return self._validate_column(expr)
lin.traverse(validate, expr, type=ir.ValueExpr)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
ibis/expr/analysis.py
Outdated
| if isinstance(expr, ir.TableExpr): | ||
| return lin.stop, expr | ||
| else: | ||
| return lin.proceed, None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def finder(expr):
if isinstance(expr, ir.TableExpr):
return expr
def proceed(expr, result):
return not isinstance(expr, ir.TableExpr)
lin.traverse(finder, expr, proceed=proceed)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ehh, You need to convince me about the advantages :)
I favor the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of lineage function the callable must return the next traversable expressions.
|
@kszucs can you rebase? |
|
@cpcloud rebase done |

It could be further improved to support more specific cases, e.g. with passing callable as control variable. Probably we should avoid specific scenarios (like
_get_args) and refactor the underlying structure instead.