Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic utility for expression traversal #1336

Closed
wants to merge 4 commits into from

Conversation

kszucs
Copy link
Member

@kszucs kszucs commented Feb 6, 2018

It could be further improved to support more specific cases, e.g. with passing callable as control variable. Probably we should avoid specific scenarios (like _get_args) and refactor the underlying structure instead.

self.validator = ExprValidator([self.parent.table])

self.valid = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this for caching purposes? If so then I won't remove.

@cpcloud cpcloud self-requested a review February 6, 2018 15:06
@cpcloud cpcloud self-assigned this Feb 6, 2018
@cpcloud cpcloud added feature Features or general enhancements expressions Issues or PRs related to the expression API labels Feb 6, 2018
@cpcloud cpcloud added this to the 0.13 milestone Feb 6, 2018


# these could be callables instead
proceed = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let 's make these be functions with the following signature:

def default_proceed(expr, result):
    return True


def default_halt(expr, result):
    return not default_proceed(expr, result)

The the signature of traverse becomes:

def traverse(fn, expr, type=ir.Expr, proceed=default_proceed, halt=default_halt, container=Stack):
    pass

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result = fn(expr) in my function signatures above.

Copy link
Member Author

@kszucs kszucs Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what You're thinking of, but I've sketched the new versions in the comments.
Do I understand it correctly?

if isinstance(expr, ir.TableExpr):
return lin.stop, expr
else:
return lin.proceed, None
Copy link
Member Author

@kszucs kszucs Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would look like the following?

def finder(expr):
    if isinstance(expr, ir.TableExpr):
        return expr

def halt(expr, result):
    return isinstance(expr, ir.TableExpr):

lin.traverse(finder, expr.op().flat_args(), halt=halt)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, now I'm not so sure I like my suggestion. This duplicates the condition which isn't good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep your API for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, should I rename stop to halt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, then will merge

if isinstance(expr.op(), ops.And):
return lin.proceed, None
else:
return lin.stop, expr
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lin.traverse(
    lambda expr: expr if not isinstance(expr.op(), ops.And) else None, 
    expr, 
    proceed=lambda expr, result: isinstance(expr.op(), ops.And),
    type=ir.BooleanColumn
)

data = None
return lin.proceed, data

return collections.OrderedDict(lin.traverse(finder, expr))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def fn(expr):
    op = expr.op()
    if hasattr(op, 'source'):
        return (op, op.source.dictionary.get(op.name, None))
    elif isinstance(op, ir.Literal):
        return (op, op.value)

collections.OrderedDict(lin.traverse(fn, expr))

op = expr.op()
if isinstance(op, ops.TableColumn):
return lin.proceed, self._validate_column(expr)
return lin.proceed, None
Copy link
Member Author

@kszucs kszucs Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one would be simpler.

def validate(expr):
    op = expr.op()
    if isinstance(op, ops.TableColumn):
        return self._validate_column(expr)

lin.traverse(validate, expr, type=ir.ValueExpr)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

if isinstance(expr, ir.TableExpr):
return lin.stop, expr
else:
return lin.proceed, None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def finder(expr):
    if isinstance(expr, ir.TableExpr):
        return expr

def proceed(expr, result):
    return not isinstance(expr, ir.TableExpr)

lin.traverse(finder, expr, proceed=proceed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that looks good to me.

Copy link
Member Author

@kszucs kszucs Feb 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ehh, You need to convince me about the advantages :)
I favor the previous one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of lineage function the callable must return the next traversable expressions.

@cpcloud
Copy link
Member

cpcloud commented Feb 8, 2018

@kszucs can you rebase?

@kszucs
Copy link
Member Author

kszucs commented Feb 8, 2018

@cpcloud rebase done

@cpcloud cpcloud closed this in e1928ed Feb 9, 2018
@cpcloud
Copy link
Member

cpcloud commented Feb 9, 2018

@kszucs kszucs deleted the traverse branch May 30, 2022 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expressions Issues or PRs related to the expression API feature Features or general enhancements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants