Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

QUALIFIED_IDENTIFIER is not SIMPLE_IDENTIFIER and duplication of CALL_CALLEE role #51

Closed
zurk opened this issue Aug 3, 2017 · 15 comments
Assignees

Comments

@zurk
Copy link

zurk commented Aug 3, 2017

I found that QUALIFIED_IDENTIFIER is not SIMPLE_IDENTIFIER, but @vmarkovtsev say that it is supposed to be.
Also, I found duplication of CALL_CALLEE role.

How to reproduce:

from bblfsh.client import BblfshClient

filepath = "./matplotlib_example.py"
bc = BblfshClient("0.0.0.0:9432")
res = bc.parse(filepath, language='Python')
print(res)

matplotlib_example.py:

from matplotlib import pyplot as plt
plt.figure()

Output (lines 76-83):

       token: "figure"
        start_position {
          line: 2
          col: 1
        }
        roles: CALL_CALLEE
        roles: CALL_CALLEE
        roles: QUALIFIED_IDENTIFIER

The problem is in figure token. It means that we do not take into account function names during our machine learning analysis.

P.S.: Moved from bblfsh/bblfshd#82 because it is a python specific problem.

@smola
Copy link
Member

smola commented Aug 3, 2017

In principle plt.figure would be a QUALIFIED_IDENTIFIER with two children plt and figure which would be SIMPLE_IDENTIFIER. At least, that was the original intent when defining these roles, but I don't know how that fits Python AST.

@juanjux juanjux self-assigned this Aug 3, 2017
@juanjux
Copy link
Contributor

juanjux commented Aug 3, 2017

Python doesn't make a distinction on the AST level from method calls and module calls (since modules are really object instances), so CallCallee I think is correct in this case, specially considering that is a child of a Call (assigned to the plt node).

The definition of CallCallee is:

	// CallCallee is the callable being called. It might be the name of a
	// function or procedure, it might be a method, it might a simple name
	// or qualified with a namespace.

QualifiedIdentifier is used for the Python AST Attribute node that it uses for everything dotted. I think with some annotation-fu it could also have the SimpleIdentifier role as @smola said. Now, adding QualifiedIdentifier to the parent (plt) that have already SimpleIdentifier could be tricky, but I'll take a look.

@zurk
Copy link
Author

zurk commented Aug 4, 2017

As I can understand:

  1. CallCallee is ok in this case, but, duplication is not necessary.
  2. I should get figure as SimpleIdentifier , but it is not possible now.

Please, correct me if I am wrong.
By the way, do we have some strong definitions of all roles?

@juanjux
Copy link
Contributor

juanjux commented Aug 4, 2017

  1. Yes and yes.
  2. Yes and it should be fixable changing the annotations code.

@abeaumont
Copy link
Contributor

@zurk https://godoc.org/github.com/bblfsh/sdk/uast#Role is the reference documentation for roles

@juanjux
Copy link
Contributor

juanjux commented Aug 4, 2017

@zork my current task involves working with the annotations so I'll try to shoehorn this too on the next release of the driver (ETA: today or monday).

@juanjux
Copy link
Contributor

juanjux commented Aug 4, 2017

All problems reported should be fixed by #54

@juanjux
Copy link
Contributor

juanjux commented Aug 4, 2017

Fix is in version v0.4.5, thanks for the good report!

@zurk
Copy link
Author

zurk commented Aug 4, 2017

It seems not to work.

What I do to update docker:

$ docker rmi -f bblfsh/server
Untagged: bblfsh/server:latest
Untagged: bblfsh/server@sha256:995edbd4bc3f1304059670892322492d37c625ce99d749a9bd6eba60df8d3fc0
Deleted: sha256:d1d3218cfb85213b8ab865b93b2eb996e21b22b8de3081e1c8dd3d5dd8a9645d
Deleted: sha256:52960978090142da160beb362760fd3b735c6b4d65b6383993ef17d8b6df4a62
Deleted: sha256:5532368a31059f5aabdfa6834c46364d9d16fe3c985f857fb26f17a4d05cc912
Deleted: sha256:5bef08742407efd622d243692b79ba0055383bbce12900324f75e56f589aedb0
$ docker rmi -f bblfsh/server
Error response from daemon: No such image: bblfsh/server:latest

then run it.

docker run --privileged -p 9432:9432 --name bblfsh bblfsh/server

bblfsh logs:

Unable to find image 'bblfsh/server:latest' locally
latest: Pulling from bblfsh/server
88286f41530e: Pull complete
5f6338622042: Pull complete
2e6d5a920060: Pull complete
Digest: sha256:995edbd4bc3f1304059670892322492d37c625ce99d749a9bd6eba60df8d3fc0
Status: Downloaded newer image for bblfsh/server:latest
time="2017-08-04T16:54:59Z" level=info msg="binding to 0.0.0.0:9432"
time="2017-08-04T16:54:59Z" level=info msg="initializing runtime at /tmp/bblfsh-runtime"
time="2017-08-04T16:54:59Z" level=info msg="setting maximum size for sending and receiving messages to 104857600"
time="2017-08-04T16:54:59Z" level=info msg="starting gRPC server"
time="2017-08-04T16:55:22Z" level=info msg="container started bblfsh/python-driver:latest (01BPQ2RGZV0W197S3XSGHM70MF)"
time="2017-08-04T16:55:22Z" level=info msg="parsing matplotlib_example.py (50 bytes)"
proto: no encoder for Filename string [GetProperties]
proto: no encoder for Language string [GetProperties]
proto: no encoder for Content string [GetProperties]
proto: no encoder for Encoding protocol.Encoding [GetProperties]

And the output of example in the first msg is the same:

        token: "figure"
        start_position {
          line: 2
          col: 1
        }
        roles: CALL_CALLEE
        roles: CALL_CALLEE
        roles: QUALIFIED_IDENTIFIER

May be it is the same driver image but I have no idea how to check it.

@abeaumont
Copy link
Contributor

@zurk could you upload the python file you're parsing somewhere?

@abeaumont abeaumont reopened this Aug 7, 2017
@zurk
Copy link
Author

zurk commented Aug 7, 2017

@abeaumont It is in the first message.

from matplotlib import pyplot as plt
plt.figure()

@abeaumont
Copy link
Contributor

@zurk oh, you're right, I missed that. Thanks!

@juanjux
Copy link
Contributor

juanjux commented Aug 7, 2017

@zurk this works for me to workaround the latest tag not updating:

BBLFSH_DRIVER_IMAGES="python=docker://bblfsh/python-driver:v0.4.5" docker run -e BBLFSH_DRIVER_IMAGES --privileged -p 9432:9432 --name bblfsh bblfsh/server

@zurk
Copy link
Author

zurk commented Aug 7, 2017

Yes, it works. Babelfish overrides image for python:

$ BBLFSH_DRIVER_IMAGES="python=docker://bblfsh/python-driver:v0.4.5" docker run -e BBLFSH_DRIVER_IMAGES --privileged -p 9432:9432 --name bblfsh bblfsh/server
Unable to find image 'bblfsh/server:latest' locally
latest: Pulling from bblfsh/server
88286f41530e: Pull complete
5f6338622042: Pull complete
2e6d5a920060: Pull complete
Digest: sha256:995edbd4bc3f1304059670892322492d37c625ce99d749a9bd6eba60df8d3fc0
Status: Downloaded newer image for bblfsh/server:latest
time="2017-08-07T09:09:18Z" level=info msg="binding to 0.0.0.0:9432"
time="2017-08-07T09:09:18Z" level=info msg="initializing runtime at /tmp/bblfsh-runtime"
time="2017-08-07T09:09:18Z" level=info msg="Overriding image for python: docker://bblfsh/python-driver:v0.4.5"
time="2017-08-07T09:09:18Z" level=info msg="setting maximum size for sending and receiving messages to 104857600"
time="2017-08-07T09:09:18Z" level=info msg="starting gRPC server"

Thank you @juanjux, @abeaumont.

@juanjux
Copy link
Contributor

juanjux commented Aug 8, 2017

No need for the workaround anymore: bblfsh/sdk#159

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants