Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

Migration to the new DSL #154

Merged
merged 4 commits into from
Jun 1, 2018
Merged

Migration to the new DSL #154

merged 4 commits into from
Jun 1, 2018

Conversation

juanjux
Copy link
Contributor

@juanjux juanjux commented Apr 19, 2018

Signed-off-by: Juanjo Alvarez juanjo@sourced.tech

@juanjux juanjux self-assigned this Apr 19, 2018
@juanjux juanjux requested a review from dennwc April 19, 2018 15:56
@dennwc
Copy link
Member

dennwc commented Apr 20, 2018

We will need a second PR for Python driver to remove additional logic from the native driver (synthetic tokens, field names, etc)


Reviewed 110 of 110 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks broke.


driver/fixtures/fixtures_test.go, line 25 at r1 (raw file):

		Code:   normalizer.Code,
	},
	BenchName: "json",

This name should correspond to a largest .py file (with no extension) in fixtures


driver/normalizer/annotation.go, line 249 at r1 (raw file):

	AnnotateType("TryFinally", nil, role.Try, role.Finally, role.Statement),

	// FIXME: Check if "args" are correctly inside the Call of the "exc" field

It seems like this is a new FIXME. Is it already addressed?


driver/normalizer/annotation.go, line 290 at r1 (raw file):

	// FIXME: Incomplete for lacking of an Async role
	functionAnnotate("AsyncFunctionDef", role.Function, role.Declaration, role.Name, role.Identifier, role.Incomplete),
	AnnotateType("FunctionDef.body", nil, role.Function, role.Declaration, role.Body),

This should be in FunctionDef rule


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 20, 2018

Mmm, could you be more specific? I would prefer to fix everything in this PR if possible.


Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks broke.


driver/fixtures/fixtures_test.go, line 25 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

This name should correspond to a largest .py file (with no extension) in fixtures

Fixed, thanks.


driver/normalizer/annotation.go, line 249 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

It seems like this is a new FIXME. Is it already addressed?

Yes, is one of those that I add while developing and supposedly remove before PR (except that I forgot to remove that one).


driver/normalizer/annotation.go, line 290 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

This should be in FunctionDef rule

They already are, so I'm removing these two.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 20, 2018

Sure, will add more comments inline.


Reviewed 2 of 2 files at r2.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 361 at r1 (raw file):

_TOKEN_KEYS = set(
    ("module", "name", "id", "attr", "arg", "LiteralValue", "s", "n")

These token fields can be processed in DSL


native/python_package/python_driver/astimprove.py, line 364 at r1 (raw file):

)

_SYNTHETIC_TOKENS = {

These tokens can be added in DSL


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

        if isinstance(node, dict):
            node_type = node["ast_type"]
            if "ctx" in node:

I would say we can drop this logic


native/python_package/python_driver/astimprove.py, line 496 at r1 (raw file):

        # else in the AST (dictionaries, properties or list of objects) so we
        # convert those names to Name objects
        names_as_nodes = [self.visit({"ast_type": "Name",

Can be done in DSL


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

        return node

    def visit_other_field(self, node: Node) -> VisitResult:

Ideally, this should be the only visitor function, and everything else can be done on Go side (to be able to reverse it later)


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 20, 2018

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks broke.


a discussion (no related file):
SDK now implements Unannotated - please don't forget to test/include it.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 23, 2018

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 361 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

These token fields can be processed in DSL

This is used for the tokenizer that sync positions of nodes with tokens. This is done before the DSL, so it must remain.


native/python_package/python_driver/astimprove.py, line 364 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

These tokens can be added in DSL

Ditto (needed by the tokenizer to sync positions).


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

I would say we can drop this logic

The ctx field It's not used by anything in the annotations so I would prefer to leave this simple transformation here to keep the go transformations on topic about things related to the UAST.


native/python_package/python_driver/astimprove.py, line 496 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Can be done in DSL

I don't think so (correct me if I'm wrong) since the DSL/SDK doesn't seem to support arrays of strings (had to do something else in the native driver to avoid the "is not a node or array" error).


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Ideally, this should be the only visitor function, and everything else can be done on Go side (to be able to reverse it later)

Doing a visitor is the recommended way to access the Python AST so the result of this is the native AST, not the input of this visitor (the input are binary python objects with python types that always have to be processed in some way or other to export outside of Python). Anyway what this method does is to call another visit methods so it can't really operate by itself without the others.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 23, 2018

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 361 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

This is used for the tokenizer that sync positions of nodes with tokens. This is done before the DSL, so it must remain.

Can we move position fixer to DSL as well then?


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

The ctx field It's not used by anything in the annotations so I would prefer to leave this simple transformation here to keep the go transformations on topic about things related to the UAST.

The reason why I propose to drop it is that it deforms the native AST for no specific reason.


native/python_package/python_driver/astimprove.py, line 496 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

I don't think so (correct me if I'm wrong) since the DSL/SDK doesn't seem to support arrays of strings (had to do something else in the native driver to avoid the "is not a node or array" error).

It supports everything, including arrays of strings. We should check the problem you mentioned to get the real cause.


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Doing a visitor is the recommended way to access the Python AST so the result of this is the native AST, not the input of this visitor (the input are binary python objects with python types that always have to be processed in some way or other to export outside of Python). Anyway what this method does is to call another visit methods so it can't really operate by itself without the others.

What do you mean by "binary python objects"? They still support reflection, right? So we can make a single visitor that justs dump any object by visiting all fields and appends the ast_type field at the end.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 23, 2018

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 361 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Can we move position fixer to DSL as well then?

We would need to implement a python tokenizer in Go so not realistic.


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

The reason why I propose to drop it is that it deforms the native AST for no specific reason.

The native AST will get it's pass once


native/python_package/python_driver/astimprove.py, line 496 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

It supports everything, including arrays of strings. We should check the problem you mentioned to get the real cause.

Ok


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

What do you mean by "binary python objects"? They still support reflection, right? So we can make a single visitor that justs dump any object by visiting all fields and appends the ast_type field at the end.

Yes, but then you lost the comments (will have to add them again in Go, which would be much more complicated) and the sync of positions (ditto). Not worth it IMHO.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 23, 2018

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


a discussion (no related file):

Previously, dennwc (Denys Smirnov) wrote…

SDK now implements Unannotated - please don't forget to test/include it.

Done.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 23, 2018

Review status: 54 of 110 files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 496 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Ok

Since I'm having the same problem with the PHP driver I'll revisit this: it fails when the native driver has any property with a list of strings value like:

"parts": ["foo", "bar"]: 

like in: https://github.com/bblfsh/php-driver/blob/master/fixtures/comparison.php.native#L340

The error produced is:

Warning: parse request for  fixtures/variable.php returned errors:
argument should be a node or a list, got: uast.String


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 24, 2018

Reviewed 56 of 56 files at r3.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


a discussion (no related file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Done.

And the fun thing is that we now decided to revert it, it seems.


driver/normalizer/annotation.go, line 321 at r3 (raw file):

	// Formal Arguments
	// FIXME: opt: true + arr: true seems to cause a crash in the SDK

I think it's not a crash, rather a "friendly" notice that you are doing something wrong - the text of the panic indicates that arrays cannot be optional, they always act like this.


fixtures/issue62.py.uast, line 25 at r3 (raw file):

.  .  .  }
.  .  .  Children: {
.  .  .  .  0: ImportFrom.level {

Why? Does it deserves a separate node?


fixtures/issue62.py.uast, line 104 at r3 (raw file):

.  .  .  .  .  .  0: alias.asname {
.  .  .  .  .  .  .  Roles: Identifier,Import,Pathname
.  .  .  .  .  .  .  TOKEN "numpy"

Something strange here - there was another token (np) here. Are you sure you are not overriding tokens for this node?


fixtures/issue76.py.uast, line 10 at r1 (raw file):

.  .  0: With {
.  .  .  Roles: Block,Scope,Statement
.  .  .  TOKEN "with"

Was this intentional?


fixtures/u2_import_module_alias.py.uast, line 27 at r3 (raw file):

.  .  .  .  .  .  0: alias.asname {
.  .  .  .  .  .  .  Roles: Identifier,Import,Pathname
.  .  .  .  .  .  .  TOKEN "x"

This is definitely an overwrite of the token. Be careful!


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 24, 2018

Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

The native AST will get it's pass once

...and this pass is irreversible because of such manipulations.

In any case, considering that everything other transformation cannot be moved to DSL, this does not matter that much.


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Yes, but then you lost the comments (will have to add them again in Go, which would be much more complicated) and the sync of positions (ditto). Not worth it IMHO.

Checked the tokenizer again, and yes, I see no other way how to solve it right now. But it's something that we should keep in mind in case we will embed Python to the driver.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 24, 2018

Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks broke.


a discussion (no related file):

Previously, dennwc (Denys Smirnov) wrote…

And the fun thing is that we now decided to revert it, it seems.

hahaha


driver/normalizer/annotation.go, line 321 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

I think it's not a crash, rather a "friendly" notice that you are doing something wrong - the text of the panic indicates that arrays cannot be optional, they always act like this.

Mmm, I saved part of the text but I don't remember that message, what I saved was:

juanjux, [23.04.18 12:02]
gopkg.in/bblfsh/sdk.v1/uast/transformer.AnnotateTypeFieldsCustom(0xa17d18, 0x9f488c, 0x9, 0xc4201a77d0, 0x0, 0x0, 0xc420019f28, 0x4, 0x4, 0x0, ...)
  /go/src/gopkg.in/bblfsh/sdk.v1/uast/transformer/ast.go:370 +0x650
gopkg.in/bblfsh/sdk.v1/uast/transformer.AnnotateTypeFields(0x9f488c, 0x9, 0xc4201a77d0, 0xc420019f28, 0x4, 0x4, 0x0, 0x0, 0x0, 0x0, ...)
  /go/src/gopkg.in/bblfsh/sdk.v1/uast/transformer/ast.go:385 +0xbe
github.com/bblfsh/python-driver/driver/normalizer.init()
  /go/src/github.com/bblfsh/python-driver/driver/normalizer/annotation.go:303 +0x5d26

fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Why? Does it deserves a separate node?

The only properties supported by the UAST (in theory) are the position ones, the roles and the internalType, any others are not part of it so to get the level we need a child node holding the number as token.


fixtures/issue62.py.uast, line 104 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Something strange here - there was another token (np) here. Are you sure you are not overriding tokens for this node?

Yes, it definitely wrong (numpy should be the token of the Import node and np of the alias).


fixtures/issue76.py.uast, line 10 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Was this intentional?

Yes. Statements should have the statement text as token (that whats the SyntheticTokens map was in the old ToNoder). And now I noticed I miss a few of them like "for", "if", etc...


fixtures/u2_import_module_alias.py.uast, line 27 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

This is definitely an overwrite of the token. Be careful!

Yep.


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

...and this pass is irreversible because of such manipulations.

In any case, considering that everything other transformation cannot be moved to DSL, this does not matter that much.

Ups, I didn't finished writing the comment before clicking "publish" I was saying that the native AST will get some love and movement of stuff to the driver when it makes sense (like in this case) but not on this first PR that I want focused on migrating what was being annotated with the old DSL to the new one.


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Checked the tokenizer again, and yes, I see no other way how to solve it right now. But it's something that we should keep in mind in case we will embed Python to the driver.

I don't think embedding will be a problem (it will just run the astimprove module instead of the higher level one that calls it, the same with the embedded interpreter), but soon I'll think about refactoring the native part once I'm more confident with the new DSL.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 24, 2018

Review status: all files reviewed at latest revision, 7 unresolved discussions, some commit checks broke.


driver/normalizer/annotation.go, line 321 at r3 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Mmm, I saved part of the text but I don't remember that message, what I saved was:

juanjux, [23.04.18 12:02]
gopkg.in/bblfsh/sdk.v1/uast/transformer.AnnotateTypeFieldsCustom(0xa17d18, 0x9f488c, 0x9, 0xc4201a77d0, 0x0, 0x0, 0xc420019f28, 0x4, 0x4, 0x0, ...)
  /go/src/gopkg.in/bblfsh/sdk.v1/uast/transformer/ast.go:370 +0x650
gopkg.in/bblfsh/sdk.v1/uast/transformer.AnnotateTypeFields(0x9f488c, 0x9, 0xc4201a77d0, 0xc420019f28, 0x4, 0x4, 0x0, 0x0, 0x0, 0x0, ...)
  /go/src/gopkg.in/bblfsh/sdk.v1/uast/transformer/ast.go:385 +0xbe
github.com/bblfsh/python-driver/driver/normalizer.init()
  /go/src/github.com/bblfsh/python-driver/driver/normalizer/annotation.go:303 +0x5d26

It misses the very beginning of the panic - there should be a message like:

field should either be a list or optional

fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

The only properties supported by the UAST (in theory) are the position ones, the roles and the internalType, any others are not part of it so to get the level we need a child node holding the number as token.

But was is the reasoning behind this? Any design documents or minutes that explain this decision?

Because it makes a total sense in graphs (multiple references to the same token), but it makes no sense for trees (it can reference it directly from the property).


fixtures/issue76.py.uast, line 10 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Yes. Statements should have the statement text as token (that whats the SyntheticTokens map was in the old ToNoder). And now I noticed I miss a few of them like "for", "if", etc...

But the last commit removes the token.


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Ups, I didn't finished writing the comment before clicking "publish" I was saying that the native AST will get some love and movement of stuff to the driver when it makes sense (like in this case) but not on this first PR that I want focused on migrating what was being annotated with the old DSL to the new one.

And this is exactly what I mentioned a week ago - that this PR looks good with fixes, but AST changes can wait until the next PR. I remember you said that you want everything in this PR :)

So yes, nevermind, marking this as resolved.


native/python_package/python_driver/astimprove.py, line 535 at r1 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

I don't think embedding will be a problem (it will just run the astimprove module instead of the higher level one that calls it, the same with the embedded interpreter), but soon I'll think about refactoring the native part once I'm more confident with the new DSL.

The nice thing about embedding is that it can run both parser and tokenizer directly since both are C packages. So even if part of parser will be in Python, we can still call tokenizer from Go.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented Apr 24, 2018

Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks broke.


fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

But was is the reasoning behind this? Any design documents or minutes that explain this decision?

Because it makes a total sense in graphs (multiple references to the same token), but it makes no sense for trees (it can reference it directly from the property).

It was already decided when I joined and the company didn't do minutes then like we do now, we could ask @smola and @mcuadros but it's the way it currently is and has been since its inception. I guess the intention is that the user doesn't have to guess language dependent names but instead can just follow the children and check the nodes.

AFAIK it should NOT have effect on the high level structures whose concept is precisely to have an standart set of structs (now Objects) with specific properties that are different for every type but the same for all languages.


fixtures/issue76.py.uast, line 10 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

But the last commit removes the token.

Ups.


native/python_package/python_driver/astimprove.py, line 445 at r1 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

And this is exactly what I mentioned a week ago - that this PR looks good with fixes, but AST changes can wait until the next PR. I remember you said that you want everything in this PR :)

So yes, nevermind, marking this as resolved.

Oh yes, my fault, I read about "synthetic tokens" in your comment and tough it was specifically about the Go part (where the synthetic tokens are/were defined), totally skipping the important "native" word. Still I got a pretty useful review so there is that :)


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented Apr 24, 2018

:lgtm: if all integration tests pass


Reviewed 25 of 25 files at r4.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks broke.


fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

It was already decided when I joined and the company didn't do minutes then like we do now, we could ask @smola and @mcuadros but it's the way it currently is and has been since its inception. I guess the intention is that the user doesn't have to guess language dependent names but instead can just follow the children and check the nodes.

AFAIK it should NOT have effect on the high level structures whose concept is precisely to have an standart set of structs (now Objects) with specific properties that are different for every type but the same for all languages.

OK, since it won't block the progress on high-level structures, marking this as resolved and waiting for more info from @smola and @mcuadros.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 3, 2018

Updated imports and API to SDK.v2.

@juanjux
Copy link
Contributor Author

juanjux commented May 3, 2018

Build fixed, @dennwc PTAL and re-approve y LGTyou (or comment if not).

@smola
Copy link
Member

smola commented May 3, 2018

fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

OK, since it won't block the progress on high-level structures, marking this as resolved and waiting for more info from @smola and @mcuadros.

Maybe I just don't remember, but I don't think there was any hard decision preventing us from using custom properties. What we initially aimed for was at trying to use nodes for everything present in the language. For example, having operands and operators all as nodes, instead of having things like a "Operator" node with the operator itself stored in a property. We prefer an "Operator" node with the operator itself as a token. Rationale was keeping the amount of concepts to keep in mind low and reduce divergence between languages.

Here this "level" node seems problematic, since the "0" token itself is not something that appears in the code, right?


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 4, 2018

Review status: 95 of 119 files reviewed at latest revision, 1 unresolved discussion.


fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, smola (Santiago M. Mola) wrote…

Maybe I just don't remember, but I don't think there was any hard decision preventing us from using custom properties. What we initially aimed for was at trying to use nodes for everything present in the language. For example, having operands and operators all as nodes, instead of having things like a "Operator" node with the operator itself stored in a property. We prefer an "Operator" node with the operator itself as a token. Rationale was keeping the amount of concepts to keep in mind low and reduce divergence between languages.

Here this "level" node seems problematic, since the "0" token itself is not something that appears in the code, right?

Well, I remember a standup where I proposed to use properties for stuff like scope indicators and you told me that right now only roles and types would be used :) Right now, that's the way is stated in the docs on the node formats (there are properties, but are considered language independent). But I'm all for expanding this into using more properties, we just need to make sure it can be done in a way that the user can programmatically explore what these properties are without needing him knowing language specifics. In this specific case about the Python level it will probably be fixed with the high level import object.


Comments from Reviewable

@smola
Copy link
Member

smola commented May 4, 2018

fixtures/issue62.py.uast, line 25 at r3 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

Well, I remember a standup where I proposed to use properties for stuff like scope indicators and you told me that right now only roles and types would be used :) Right now, that's the way is stated in the docs on the node formats (there are properties, but are considered language independent). But I'm all for expanding this into using more properties, we just need to make sure it can be done in a way that the user can programmatically explore what these properties are without needing him knowing language specifics. In this specific case about the Python level it will probably be fixed with the high level import object.

Just for the record: we continued this discussion offline. I still thing that we should try to reduce the use of properties when possible, but I would prefer not adding fake tokens. So the token for an "import level" should be "." or ".." (not "1" and "2"). I think we should try to limit tokens (not nodes) to what actually appears in the source code.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 7, 2018

@smola change implemented (check fixtures/u2_import_relativepath.py.uast for example).

@dennwc please review the level2Dots function since it's the first Op func I wrote and I'm unsure it is totally kosher.

@dennwc
Copy link
Member

dennwc commented May 10, 2018

Reviewed 5 of 6 files at r5, 6 of 18 files at r6, 2 of 2 files at r7, 12 of 12 files at r8.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


driver/normalizer/annotation.go, line 99 at r8 (raw file):

			ival := int(i64val)
			for i := 0; i < ival; i++ {
				buffer.WriteString(".")

strings.Repeat(".", ival)


fixtures/u2_import_relativepath.py.uast, line 10 at r4 (raw file):

.  .  0: ImportFrom {
.  .  .  Roles: Declaration,Import,Statement
.  .  .  TOKEN "import from"

"Import from" tokens are now gone. Was it intentional?


fixtures/u2_import_relativepath.py.uast, line 81 at r8 (raw file):

.  .  .  .  0: ImportFrom.level {
.  .  .  .  .  Roles: Import,Incomplete
.  .  .  .  .  TOKEN ".."

Should it be ./../../?


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

.  .  .  }
.  .  .  Children: {
.  .  .  .  0: ImportFrom.level {

This node seems unused. Also, level property value is lost here.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 11, 2018

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


fixtures/u2_import_relativepath.py.uast, line 10 at r4 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

"Import from" tokens are now gone. Was it intentional?

Yes, I removed that "synthetic token" since it didn't have direct correspondence with the string in the code (that would be import x from y).


fixtures/u2_import_relativepath.py.uast, line 81 at r8 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Should it be ./../../?

No, the Python code uses just dots.


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

This node seems unused. Also, level property value is lost here.

It's because the level property in the source code goes trough the level2dots() func and thus the original numeric value is saved converted to dots and, as happens with other properties that are processed (positions, comment tokens, etc) they don't go into the Properties object, anyway it should be reversible but I we wanted to keep the numeric level I'm not sure how I could do it since once the level property is converted to dots is saved as such in the shared state, any way I could save the same property twice with this DSL?


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 11, 2018

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


driver/normalizer/annotation.go, line 99 at r8 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

strings.Repeat(".", ival)

Thanks!


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented May 11, 2018

Reviewed 1 of 1 files at r9.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

It's because the level property in the source code goes trough the level2dots() func and thus the original numeric value is saved converted to dots and, as happens with other properties that are processed (positions, comment tokens, etc) they don't go into the Properties object, anyway it should be reversible but I we wanted to keep the numeric level I'm not sure how I could do it since once the level property is converted to dots is saved as such in the shared state, any way I could save the same property twice with this DSL?

It could be saved multiple times as long as variable value is the same.

But the question is: in this particular case there is no "level" property and no "token" with dots. Just wondering why.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 11, 2018

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

It could be saved multiple times as long as variable value is the same.

But the question is: in this particular case there is no "level" property and no "token" with dots. Just wondering why.

How would I save the numeric level in this specific case?

	MapAST("ImportFrom", Obj{
		"module": Var("module"),
		"level":  level2Dots("level"),
		"names":  Var("names"),
	}, Obj{...}

The level doesn't have dots in this example because is level "0" that is not represented by any dots (actually, the level2dots return an empty string in this case but it seems in these representation that is skipped).


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented May 11, 2018

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

Previously, juanjux (Juanjo Alvarez Martinez) wrote…

How would I save the numeric level in this specific case?

	MapAST("ImportFrom", Obj{
		"module": Var("module"),
		"level":  level2Dots("level"),
		"names":  Var("names"),
	}, Obj{...}

The level doesn't have dots in this example because is level "0" that is not represented by any dots (actually, the level2dots return an empty string in this case but it seems in these representation that is skipped).

Hmm, it could be that legacy representation ignores empty properties.

Regarding int level I would suggest accepting two variable names to level2Dots function and store an int value in one variable and dots in another.


Comments from Reviewable

@juanjux
Copy link
Contributor Author

juanjux commented May 16, 2018

Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


fixtures/u2_import_subsymbols_namespaced.py.uast, line 24 at r8 (raw file):

Previously, dennwc (Denys Smirnov) wrote…

Hmm, it could be that legacy representation ignores empty properties.

Regarding int level I would suggest accepting two variable names to level2Dots function and store an int value in one variable and dots in another.

Ok, looks like if I need to save two vars I can't use the predefined ValueConv function which use ValueFunc types and returns the opValueConv struct, I'll investigate how to define custom versions of all these to work with two variables.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented May 16, 2018

Annotations looks good, but few changes required in the code for a custom transform.


Reviewed 12 of 12 files at r10.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


driver/normalizer/annotation.go, line 96 at r10 (raw file):

func num2dots(n uast.Value) (uast.Value, error) {
	if intval, ok := n.(uast.Int); ok {
		if i64val, ok := intval.Native().(int64); ok {

intval.Native().(int64) -> int(intval)


driver/normalizer/annotation.go, line 109 at r10 (raw file):

func (op opLevelDotsNumConv) Check(st *State, n uast.Node) (bool, error) {
	v, ok := n.(uast.Value)

it should cast to uast.Int directly


driver/normalizer/annotation.go, line 114 at r10 (raw file):

	}
	nv, err := num2dots(v)
	if ErrUnexpectedType.Is(err) {

but num2dots doesn't throw this kind of error


driver/normalizer/annotation.go, line 123 at r10 (raw file):

	if err != nil {
		return false, err
	}

it should fail fast if !res1


driver/normalizer/annotation.go, line 139 at r10 (raw file):

	}

	v, ok := n.(uast.Value)

cast to uast.Int to be more specific


driver/normalizer/annotation.go, line 461 at r10 (raw file):

opLevelDotsNumConv{Var("level"), Var("origlevel")}

This should specify names for fields that are initialized. In other cases changing an order of fields in this struct will break this code.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented May 16, 2018

:lgtm:


Reviewed 1 of 1 files at r11.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed.


Comments from Reviewable

@dennwc
Copy link
Member

dennwc commented May 16, 2018

Annotations looks good, now we need to make Sign-off bot happy squashing/signing commits. Also, CI should be fixed.

@juanjux
Copy link
Contributor Author

juanjux commented May 16, 2018

No idea why CI is failing with that message.

Juanjo Alvarez added 3 commits May 16, 2018 18:20
Updated python skeleton driver to the latest changes

Completed covering all the node types, added some missing tokens

Fixed the problem with comment annotations

Remove some outdates FIXME reminders and other unneeded stuff

Address some issues from review

Add the Unannotated transformer. Fix some unannotated nodes

Fix import and alias tokens, add more synth tokens.

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Update integration tests

Fixed build with v2

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Use strings.repeat instead of reinventing the wheel

Keep the numeric level as internal property in the UAST

Remove the now unneeded dots2num and simplify Construct

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@smola
Copy link
Member

smola commented May 18, 2018

Successfully tagged bblfsh/python-driver:dev-f8f92a2
CONTAINER_ID=`docker run -d \
		-p 39432:9432 \
		bblfsh/python-driver:dev-f8f92a2-dirty`; \
	echo "CONTAINER_ID: $CONTAINER_ID"; \
	/home/travis/gopath/bin/bblfsh-sdk-tools test --endpoint localhost:39432 || true; \
	docker kill $CONTAINER_ID;
Unable to find image 'bblfsh/python-driver:dev-f8f92a2-dirty' locally
docker: Error response from daemon: manifest for bblfsh/python-driver:dev-f8f92a2-dirty not found.
See 'docker run --help'.

@juanjux I guess this means that the sources were modified during build (maybe with code generation), so the tag for the contianer was camputed at two different points with two different results (dev-f8f92a2 and dev-f8f92a2-dirty).

You can check the build locally and see if git status is clean.

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>

gitignore

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux
Copy link
Contributor Author

juanjux commented Jun 1, 2018

Build fixed, after much searching I found the problem was the python caches marking the build as "dirty", adding them to the .gitignore solved the problem, merging!

@juanjux juanjux merged commit 19ec088 into bblfsh:master Jun 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants