proposals: Add new BIP-003 proposal, Agglutinative Roles language #82

abeaumont · 2017-09-04T16:39:48Z

juanjux · 2017-09-05T11:01:27Z

proposals/bip-003.md

+
+This presents some issues:
+* It doesn't scale well.
+  For a set of N properties, in the worst case 2^N roles would be needed.


Missing comma after case.
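To make the scaling point in the quoted line concrete, here is a minimal sketch that compares the worst-case number of monolithic roles with the number of agglutinative roles for the same N independent properties. The property names are only illustrative and are not meant to reproduce the actual role set.

```python
from itertools import combinations

# Hypothetical property names, used only for illustration.
properties = ["Function", "Declaration", "Argument", "Name"]


def all_combinations(props):
    """Return every subset of props, including the empty one."""
    return [c for r in range(len(props) + 1) for c in combinations(props, r)]


# Worst case for monolithic roles: one dedicated role per combination.
monolithic = len(all_combinations(properties))
# Agglutinative roles: one role per property, combined on the node itself.
agglutinative = len(properties)

print(monolithic, agglutinative)
```

The first number grows as 2^N while the second grows linearly with N.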

juanjux · 2017-09-05T11:02:54Z

proposals/bip-003.md

+
+This combination of roles is already done to some extent,
+since a preincrement operator would actually be annotated with 2 roles: `Expression`, `OpPreIncrement`.
+Current proposal just deepens this property separation into roles.


The current proposal.

juanjux · 2017-09-05T11:05:49Z

proposals/bip-003.md

+arithmetic operators, would have an easier way to filter the `UAST` to find the interesting nodes.
+
+Additionally, this agglutination of roles makes unsupported node types degrade more gracefully.
+For example, a `log` operator, which currently lacks an specific role,


Nitpicking: at first I thought about logging; maybe change it to "natural logarithm operator"?

lacks a specific

Changed to ** (pow), to avoid potential confusion.

juanjux · 2017-09-05T11:06:58Z

proposals/bip-003.md

+The set of Roles are changed by this proposal.
+It's limited to partition the multiple property roles currently defined,
+leaving the potential addition of new property roles
+(`Arithmetic`, `Comparsion`, `Loop`, ...) to a future BIP.


Comparison

juanjux · 2017-09-05T11:07:27Z

proposals/bip-003.md

+
+## Impact
+
+Imcompatible changes to the Role set are proposed, in order to do that,


Incompatible

juanjux · 2017-09-05T11:08:50Z

proposals/bip-003.md

+* Versioning should be added to the [SDK](https://github.com/bblfsh/sdk/),
+  to allow existing server and drivers work with a previous version of the SDK.
+* Roles should be updated in the SDK.
+* Protobuf generated code should be updated for [server](https://github.com/bblfsh/server) and [python](https://github.com/bblfsh/client-python) and [go](https://github.com/bblfsh/client-go) clients


Python and Go (uppercase).

juanjux · 2017-09-05T11:09:39Z

proposals/bip-003.md

+* `VisibleFromInstance`: `Visibility`, `Instance`
+* `VisibleFromType`: `Visibility`, `Type`
+* `VisibleFromSubtype`: `Visibility`, `Subtype`
+* `VisibleFromPackage`: `Visiblity`, `Package`


juanjux · 2017-09-05T11:09:58Z

Good job!

About possible improvements, I miss a note in the Impact section about the current users of the UAST having to update their code and the way they interpret the roles.

Also, a third alternative could be the "two-level UAST" I proposed some time ago, where the first level roles are the more generic (loop, branch, procedure) and a second level would have a more concrete semantic meaning (foreach, if, coroutine). But it is probably tangential for this proposal (both levels could use agglutinative roles) so it doesn't need to be added.

Other than that, this BIP has my ACK for the main proposal.
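As a rough illustration of the kind of update existing UAST users would face, the sketch below expands old composite roles into their agglutinative equivalents. The mapping only contains entries quoted in this review (with `Visibility` spelled correctly), it is not the full table from the BIP, and `migrate_roles` is a hypothetical helper, not part of the SDK or any client.

```python
# Old composite role -> new agglutinative roles, taken from the lists quoted
# in this review. This is NOT the complete table from the BIP.
OLD_TO_NEW = {
    "VisibleFromInstance": ("Visibility", "Instance"),
    "VisibleFromType": ("Visibility", "Type"),
    "VisibleFromSubtype": ("Visibility", "Subtype"),
    "VisibleFromPackage": ("Visibility", "Package"),
    "FunctionDeclarationName": ("Function", "Declaration", "Identifier"),
    "FunctionDeclarationReceiver": ("Function", "Declaration", "Receiver"),
    "FunctionDeclarationArgument": ("Function", "Declaration", "Argument"),
}


def migrate_roles(old_roles):
    """Expand old roles into new ones, keeping order and dropping duplicates."""
    new_roles = []
    for old in old_roles:
        # Assumed fallback: roles without a known mapping are kept unchanged.
        for role in OLD_TO_NEW.get(old, (old,)):
            if role not in new_roles:
                new_roles.append(role)
    return new_roles


migrated = migrate_roles(["FunctionDeclarationArgument", "VisibleFromPackage"])
print(migrated)
```

Code that previously matched a single composite role would instead need to check that all of the expanded roles are present on a node.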

abeaumont · 2017-09-06T09:34:23Z

Updated with typo and grammar fixes from the review.

mcuadros · 2017-09-06T10:28:30Z

proposals/bip-003.md

+* `Import`
+* `Path`
+* `Alias`
+* `Function`


we don't have a Class role?

That would be TypeDeclaration or Type+Declaration with the new roles.

ajnavarro · 2017-09-06T10:31:17Z

proposals/bip-003.md

+would be just left as: `Expression`, `Incomplete`.
+With the new language, the node could still retain most of the information:
+
+* `Expression`


In this or other cases, is it possible for two slightly different nodes to end up with exactly the same set of roles, property by property?

For nodes with incomplete roles, that can certainly be the case. You could have unsupported operators (let's say pow and log) that you know are arithmetic operators, but that you cannot tell apart by their roles. In that case you'd have to go and check the token, if available.
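A minimal sketch of that situation, assuming a node is a plain dictionary with a token and a set of roles. The node layout and the `Operator` and `Arithmetic` role names are hypothetical; `Arithmetic` is only mentioned in the BIP as a possible future role.

```python
# Two unsupported operators annotated with the same (hypothetical) roles.
ROLES = {"Expression", "Operator", "Arithmetic", "Incomplete"}
pow_node = {"token": "**", "roles": set(ROLES)}
log_node = {"token": "log", "roles": set(ROLES)}


def same_roles(a, b):
    """True when roles alone cannot tell the two nodes apart."""
    return a["roles"] == b["roles"]


def is_arithmetic_operator(node, token):
    """Fall back to the token when the role set is ambiguous."""
    return "Arithmetic" in node["roles"] and node.get("token") == token


indistinguishable = same_roles(pow_node, log_node)
print(indistinguishable)
print(is_arithmetic_operator(pow_node, "**"), is_arithmetic_operator(log_node, "**"))
```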

eiso · 2017-09-06T13:52:58Z

First of all, thank you for the work on bip-3. It's a really interesting and very well written proposal. Having given it some thought today, (de)composing the roles makes a lot of sense to me, for the reasons that you state.

The one concern that I have is around higher-level abstractions on the roles for easy usability (what I believe @juanjux calls two-level roles), in particular in the spark-api project. Since that project will be the way a large part of our intended audience will use Babelfish, I can imagine some difficulties with the composed roles.

Here is some pseudo code from the Spark API design document.

//DS - PySpark
//for 1000 Python repositories (repos.txt or LanguageDataset demo)
src-d.select()

//clone to local FS (or use .siva files in hdfs://)
src-d.clone(repos.txt, "/path/to/cloned/repos")

//get UASTs for HEAD
uasts = src-d.read.gitLocal("/path/to/cloned/repos")
  .getReferences().filter($"reference_name" === "HEAD")
  .getFiles()
  .applyEnry()
  .applyBblfsh()

//extract SimpleIdentifier roles (ids)
uasts.filter("uast uast_lib('//*[@class=SimpleIdentifier]')")

uast_lib is essentially libuast, as you can see here. If, instead of being able to extract all simple identifiers directly, I now need a rule-based system for roles (e.g. include x, but exclude y and z), I can imagine the syntax becoming difficult, harder to get started with, and also more expensive to process on the Spark side. That's why I am tagging @ajnavarro, @bzz and @erizocosmico here as well.

The above could be partially solved with the two-level UAST suggestion of @juanjux.

abeaumont · 2017-09-06T15:06:12Z

@eiso that is a very valid concern, and I see now that I forgot to give a proper answer to @juanjux's suggestion, sorry.
I've two comments on this topic:

1. I think what both @juanjux and @eiso are suggesting is a particular case of the alternative approach I included at https://github.com/bblfsh/documentation/pull/82/files#diff-b3182c12132bb09dddfe38e712ae248bR289, which talks about categorization in general. The current proposal doesn't go deeper into that approach for the reasons given there, but if you think otherwise, we can explore it further.
2. It's true that with this proposal search may become more complex, but apart from categorization there are two additional tools we can use to address that:
   * Use appropriate roles to facilitate code analysis. It may be true that a code analyst wants to look for simple identifiers and doesn't want to get the qualified identifiers. Let's say that's the case, for discussion's sake. Then we could add a Simple role to handle this properly and make the analyst's life easier. I think the new approach makes this easier. Look, for example, at the arithmetic operator example I used: it would make it easier to look for arithmetic operators instead of all operators. In the same way we can add roles for, say, relational operators, control flow constructs, looping constructs, etc.
   * Use the power of XPath syntax to do the search. Even if our role set missed some valid searches needed by code analysts, I think it would still be a reasonable approach (at the usability level) to use something like `//*[@roleIdentifier and not(@roleQualified)]`. I'm not sure about performance; we may need to check that.

Note that these points are presented to have a wider view of the possibilities, not to discourage a categorization (of two levels or otherwise), which I consider a valid approach.
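The include/exclude search expressed by the `//*[@roleIdentifier and not(@roleQualified)]` query can be sketched in plain Python as follows. This assumes roles are exposed as `role*` attributes on XML-like nodes, as the query suggests; the sample tree and the `select` helper are hypothetical and do not use libuast. The filtering is done in Python rather than with an XPath string because `xml.etree.ElementTree` only supports a subset of XPath without `and`/`not()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical UAST fragment; roles are modelled as attributes named role<Name>.
UAST_XML = """
<File>
  <Name token="x" roleIdentifier="true"/>
  <Name token="os.path" roleIdentifier="true" roleQualified="true"/>
  <Num token="1" roleLiteral="true"/>
</File>
""".strip()


def select(root, include, exclude=()):
    """Return nodes that carry every role in include and none in exclude."""
    return [
        node for node in root.iter()
        if all(role in node.attrib for role in include)
        and not any(role in node.attrib for role in exclude)
    ]


root = ET.fromstring(UAST_XML)
simple = select(root, include=["roleIdentifier"], exclude=["roleQualified"])
tokens = [node.get("token") for node in simple]
print(tokens)
```

The same helper covers the "include x, but exclude y, z" case by passing several names in `exclude`.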

eiso · 2017-09-06T18:27:34Z

@abeaumont regarding topic 1, could you take your suggested approach and have a 'category type' role that gets added in the same manner? Or would this be mixing concepts?

abeaumont · 2017-09-07T09:38:26Z

@eiso I'm not sure I understand what you mean. Do you mean having a special field/attribute in a node, named type, which would contain the main category of the node? Something like:

internalType: SimpleIdentifier
type: Identifier
roles: Simple, OtherRole, ...

If that's what you mean, yes, that could be a way to do it. If not, please elaborate a bit on your approach.

eiso · 2017-09-07T09:58:54Z

@abeaumont the options I meant were:

Option 1. When ForEach becomes For, Iterator, adding Loop as a sub-role. So having (For, Iterator, Loop). Where Loop is defined in the spec as a higher level role.

Option 2. ...or having (For, Iterator, Loop) but typed as:

role: For
type: level2

role: Iterator
type: level2

role: Loop
type: level1

Please ignore the terrible naming.
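The two options can be sketched side by side as follows. The `Role` class, the integer `level` field, and the `names_at_level` helper are assumptions made for illustration and are not part of the Babelfish SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    level: int  # assumed encoding: 1 = generic category, 2 = concrete construct


# Option 1: flat role names; Loop is just one more role on the node.
option1 = {"For", "Iterator", "Loop"}

# Option 2: the same roles, each tagged with an explicit level.
option2 = {Role("For", 2), Role("Iterator", 2), Role("Loop", 1)}


def names_at_level(roles, level):
    """Sorted names of the roles declared at the given level."""
    return sorted(role.name for role in roles if role.level == level)


generic = names_at_level(option2, 1)
concrete = names_at_level(option2, 2)
print(generic, concrete)
```

With option 1 the analyst has to know from the spec that `Loop` is the higher-level role, while option 2 carries that information on each role at the cost of a more verbose annotation.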

abeaumont · 2017-09-07T10:13:42Z

@eiso ok, I understand now. I think both options would be similar from a Babelfish point of view. I think we could add automatic support for option 2, to make the role 'level' explicit, without much work; it would just be more verbose.

So the main question would be if this approach would be of any use for code analysis. I think an analyst would need to know the roles and their categories beforehand anyway, but you surely have a better code analysis perspective and I guess you may have some use case in mind where this kind of annotation would be of help?

juanjux · 2017-09-12T07:47:49Z

proposals/bip-003.md

+* `FunctionDeclarationName`: `Function`, `Declaration`, `Identifier`
+* `FunctionDeclarationReceiver`: `Function`, `Declaration`, `Receiver`
+* `FunctionDeclarationArgument`: `Function`, `Declaration`, `Argument`
+* `FunctionDeclarationArgumentName`: `Function`, `Declaration`, `Argument`, `Name, `


, at the end.

proposals: Add new BIP-003 proposal, Agglutinative Roles language

3dd9255

abeaumont requested a review from juanjux September 4, 2017 16:39

abeaumont mentioned this pull request Sep 4, 2017

Use an agglutinative language for Roles bblfsh/sdk#167

Closed

5 tasks

juanjux suggested changes Sep 5, 2017


mcuadros approved these changes Sep 6, 2017


proposals: Fix grammar and typos

6dffbad

juanjux approved these changes Sep 6, 2017


juanjux requested a review from eiso September 6, 2017 09:36

abeaumont requested review from ajnavarro and vmarkovtsev September 6, 2017 09:37

mcuadros reviewed Sep 6, 2017


ajnavarro reviewed Sep 6, 2017


ajnavarro approved these changes Sep 6, 2017


juanjux reviewed Sep 12, 2017


proposals: Fix typos

cad7566

juanjux approved these changes Sep 12, 2017


Changed status to "accepted"

e943c6e

juanjux merged commit 320ec3b into bblfsh:master Sep 12, 2017
