-
Notifications
You must be signed in to change notification settings - Fork 30
proposals: Add new BIP-003 proposal, Agglutinative Roles language #82
proposals: Add new BIP-003 proposal, Agglutinative Roles language #82
Conversation
proposals/bip-003.md
Outdated
|
||
This presents some issues: | ||
* It doesn't scale well. | ||
For a set of N properties, in the worst case 2^N roles would be needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing comma after case.
proposals/bip-003.md
Outdated
|
||
This combination of roles is already done to some extent, | ||
since a preincrement operator would actually be annotated with 2 roles: `Expression`, `OpPreIncrement`. | ||
Current proposal just deepens this property separation into roles. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current proposal.
proposals/bip-003.md
Outdated
arithmetic operators, would have an easier way to filter the `UAST` to find the interesting nodes. | ||
|
||
Additionally, this agglutination of roles makes unsupported node types degrade more gracefully. | ||
For example, a `log` operator, which currently lacks an specific role, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpicking: at first I tough about logging, maybe change for natural logarithm
operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lacks a specific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to **
(pow), to avoid potential confusion.
proposals/bip-003.md
Outdated
The set of Roles are changed by this proposal. | ||
It's limited to partition the multiple property roles currently defined, | ||
leaving the potential addition of new property roles | ||
(`Arithmetic`, `Comparsion`, `Loop`, ...) to a future BIP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparison
proposals/bip-003.md
Outdated
|
||
## Impact | ||
|
||
Imcompatible changes to the Role set are proposed, in order to do that, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incompatible
proposals/bip-003.md
Outdated
* Versioning should be added to the [SDK](https://github.com/bblfsh/sdk/), | ||
to allow existing server and drivers work with a previous version of the SDK. | ||
* Roles should be updated in the SDK. | ||
* Protobuf generated code should be updated for [server](https://github.com/bblfsh/server) and [python](https://github.com/bblfsh/client-python) and [go](https://github.com/bblfsh/client-go) clients |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python and Go (uppercase).
proposals/bip-003.md
Outdated
* `VisibleFromInstance`: `Visibility`, `Instance` | ||
* `VisibleFromType`: `Visibility`, `Type` | ||
* `VisibleFromSubtype`: `Visibility`, `Subtype` | ||
* `VisibleFromPackage`: `Visiblity`, `Package` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Visibility
Good job! About possible improvements, I miss a note in the Also, a third alternative could be the "two-level UAST" I proposed some time ago, where the first level roles are the more generic ( Other than that, this BIP has my ACK for the main proposal. |
Updated with typo and grammar fixes from the review. |
* `Import` | ||
* `Path` | ||
* `Alias` | ||
* `Function` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have a Class role?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be TypeDeclaration
or Type+Declaration
with the new roles.
would be just left as: `Expression`, `Incomplete`. | ||
With the new language, the node could still retain most of the information: | ||
|
||
* `Expression` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this or other cases, can be a possibility that a role with specific properties can match one by one properties from another role and be nodes slightly different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For nodes with incomplete roles, that can certainly be the case. You could have unsupported operators (let's say pow
and log
), that you know they're arithmetic operators, but cannot distinguish between them. You'd have to go and check the token, if available, in that case.
First of all, thank you for the work on bip-3. It's a really interesting and very well written proposal. Having been giving it some thought today it makes a lot of sense to me (de)composing the roles, for the reasons that you state. The one concern that I have, is around higher level abstractions on the roles for easy usability (what @juanjux I believe calls 2 level roles), in particular in the spark-api project. Since that project will be the way a large # of our intended audience will use babelfish, with the composed roles I can imagine some difficulties. Here is some pseudo code from the Spark API design document. //DS - PySpark
//for 1000 Python repositories (repos.txt or LanguageDataset demo)
src-d.select()
//clone to local FS (or use .siva files files in hdfs://)
src-d.clone(repos.txt, “/path/to/cloned/repos”)
//get UASTs for HEAD
uasts = src-d.read.gitLocal(“/path/to/cloned/repos”)
.getReferences().filter($“reference_name” === “HEAD”)
.getFiles()
.applyEnry()
.applyBblfsh()
//extract SimpleIdentifier roles (ids)
uasts.filter("uast uast_lib('//*[@class=SimpleIdentifier]')")
The above could be partially solved with the two-level UAST suggestion of @juanjux . |
@eiso that is a very valid concern, and I see now that I forgot to give a proper answer to @juanjux suggestion, sorry.
Note that these points are presented to have a wider view of the possibilities, not to discourage a categorization (of two levels or otherwise), which I consider a valid approach. |
@abeaumont regarding topic 1, could you take your suggested approach and have a 'category type' role that gets added in the same manner. Or would this be mixing concepts? |
@eiso I'm not sure I understand what you mean, do you mean having a special field/attribute in a node named internalType: SimpleIdentifier
type: Identifier
roles: Simple, OtherRole, ... If that's what you mean, yes, that could be a way to do it. If not, please elaborate a bit on your appoach. |
@abeaumont the options I meant were: Option 1. When Option 2. ...or having (
Please ignore the terrible naming. |
@eiso ok, I understand now. I think both options would be similar from a Babelfish point of view. I think we could add automatic support for option 2, to make role 'level' explicit, without much work, it would just be more verbose. So the main question would be if this approach would be of any use for code analysis. I think an analyst would need to know the roles and their categories beforehand anyway, but you surely have a better code analysis perspective and I guess you may have some use case in mind where this kind of annotation would be of help? |
proposals/bip-003.md
Outdated
* `FunctionDeclarationName`: `Function`, `Declaration`, `Identifier` | ||
* `FunctionDeclarationReceiver`: `Function`, `Declaration`, `Receiver` | ||
* `FunctionDeclarationArgument`: `Function`, `Declaration`, `Argument` | ||
* `FunctionDeclarationArgumentName`: `Function`, `Declaration`, `Argument`, `Name, ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
,
at the end.
Part of bblfsh/sdk#167