Implement generic AST pattern matcher helper #699

JukkaL · 2015-05-31T23:47:53Z

Currently the mypy implementation has a lot of ad-hoc code that checks whether some part of an AST looks like something specific. For example, we have code that detects various special forms:

N = NamedTuple('N', [('x', int), ...])
T = TypeVar('T', int, str)

It would be better to have a generic utility that makes it easy to write code to detect cases like the above.

A straw man proposal:

pattern = AstPattern('NameExpr = "typing.NamedTuple"(StrExpr, [(StrExpr, Type), ...])')
result = pattern.match(ast_node)
if result:
    name, strlit, tuples = result

The pattern syntax would be a subset of Python syntax to make the implementation easier (but with different semantics).

JukkaL · 2015-05-31T23:49:55Z

Something like this could be useful when implementing #698.

elazarg · 2016-09-03T18:47:16Z

@JukkaL I felt this need too - I wanted to suggest that, but now I see there's an issue for this.

I think I'd like to work on it, if nobody else did it already.

gvanrossum · 2016-09-03T18:48:51Z

Go for it!

elazarg · 2016-09-14T15:19:35Z

I have something. The matcher looks like

Or(StringOfIds(_names), MatchList(_names, etc))

Where _names is a handle, bound through mutation.

A code excerpt (note that it uses inspect inside call.bind(), but that's an orthogonal issue):

def check_namedtuple(self, node: Node, var_name: str = None) -> TypeInfo:
    ...
    _typename, _names, _module, _types = Identifier(), Identifier(), Identifier(), WildCard()
    if callee.fullname == 'collections.namedtuple':
        def namedtuple(typename, field_names, *, verbose = ..., rename = ...): ...
        args = call.bind(namedtuple)
        match = Or(StringOfIds(_names), MatchList(_names, etc))
        bad_fields = match.find_mismatch(args.get('field_names'))
        bad_module = None
    elif callee.fullname == 'typing.NamedTuple':
        def NamedTuple(typename, fields, *, verbose = ..., rename = ..., module = ...): ...
        args = call.bind(NamedTuple)
        match = MatchList(MatchTuple(_names, _types), etc)
        bad_fields = match.find_mismatch(args.get('fields'))
        bad_module = _module.find_mismatch(args.get('module'))
    else:
        return None
    if not args:
        return self.build_namedtuple_typeinfo('namedtuple', [], [])
    if _typename.find_mismatch(args['typename']):
        self.fail("namedtuple() expects a string literal as the first argument", call)
    ...

It could have been cleaner if it need not be typechecked...

Together with the inspection, it saves about 40 lines in the main logic. It's paid back with interest, but mostly in a dedicated file, and it should be reusable (although I only tried it on namedtuple so far).

gvanrossum · 2016-09-14T18:06:28Z

@JukkaL, can you gude @elazarg here?

JukkaL · 2016-09-14T19:09:29Z

Sure! I'm going to PyCon UK tomorrow so I may not have much time until next week.

JukkaL · 2016-09-20T18:57:20Z

The proposed syntax feels too magical for me. It's shorter than the current code but I had to read through it several times before I understood what is going on.

The real challenge seems to be to come up with something that is both easy to understand and compact (and not too difficult to implement, of course). Each line of code will be read many times but written only once, so ease of understanding is the key thing.

Here's another idea (again, not convinced it's a good idea, but here goes anyway):

if callee.fullname == 'collections.namedtuple':
    pattern = CallPattern('def _(typename, field_names, *, verbose=..., rename=...): ...')
    pattern.describe_arg('typename', StrExpr)
    pattern.describe_arg('field_names', Either(StrExpr, ListPattern(StrExpr)))
    result = pattern.match(call, self.fail)  # Generate error on failure
elif callee.fullname == 'typing.NamedTuple':
    pattern = CallPattern('def _(typename, fields, *, verbose=..., rename=..., module=...): ...')
    pattern.describe_arg('typename', StrExpr)
    pattern.describe_arg('fields', ListPattern(TuplePattern(StrExpr, Type)))
    result = pattern.match(call, self.fail)
else:
    return None
if not result:
    return ...  # Error
# use result[<argname>] to access things
...

Automating the generation of meaningful error messages would also be nice.

elazarg · 2016-09-20T19:21:51Z

Your API is indeed more readable.

On first impression: one problem with CallPattern was writing the parser. I guess we can simply use existing facilities, but inspect will not be so easily applicable (and I want it - I don't want to reimplement argument matching). Besides, the example I gave has an actual implementation. Anyway, I may give CallPattern suggestion a try later on.

The reasons for making the mutation-based patterns was to be able to find and report the exact point of error, differentiate between ListPattern(TuplePattern(StrExpr, Type)) (a single element) and ListPattern(TuplePattern(StrExpr, Type), ...), and yet keep working if the error is not very serious. I'm not sure now these cannot be accommodated with less magic; I will think about it.

elazarg · 2016-09-20T19:37:35Z

Another decision is what to leave to the type checker. I don't like the idea of rejecting valid code just because it's not a string literal.

JukkaL · 2018-05-17T22:37:19Z

I don't think that is worth implementing, since we aren't adding new special forms that often any more, and just reviewing (not to mention implementing) a generic solution could take more effort than we'd save using it.

JukkaL added the needs discussion label May 31, 2015

gvanrossum added refactoring Changing mypy's internals topic-named-tuple labels Sep 14, 2016

elazarg mentioned this issue Dec 12, 2016

Proposal: Refactor extract class AssignmentStatementAnalyzer from SemanticAnalyzer #2198

Closed

JukkaL closed this as completed May 17, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement generic AST pattern matcher helper #699

Implement generic AST pattern matcher helper #699

JukkaL commented May 31, 2015

JukkaL commented May 31, 2015

elazarg commented Sep 3, 2016 •

edited by gvanrossum

Loading

gvanrossum commented Sep 3, 2016

elazarg commented Sep 14, 2016

gvanrossum commented Sep 14, 2016

JukkaL commented Sep 14, 2016

JukkaL commented Sep 20, 2016

elazarg commented Sep 20, 2016

elazarg commented Sep 20, 2016

JukkaL commented May 17, 2018

Implement generic AST pattern matcher helper #699

Implement generic AST pattern matcher helper #699

Comments

JukkaL commented May 31, 2015

JukkaL commented May 31, 2015

elazarg commented Sep 3, 2016 • edited by gvanrossum Loading

gvanrossum commented Sep 3, 2016

elazarg commented Sep 14, 2016

gvanrossum commented Sep 14, 2016

JukkaL commented Sep 14, 2016

JukkaL commented Sep 20, 2016

elazarg commented Sep 20, 2016

elazarg commented Sep 20, 2016

JukkaL commented May 17, 2018

elazarg commented Sep 3, 2016 •

edited by gvanrossum

Loading