Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise tagging mechanism. Add the notion of @ProvidedTags. #123

Merged
merged 9 commits into from Mar 29, 2016

Conversation

chumer
Copy link
Member

@chumer chumer commented Mar 14, 2016

Since there was bad feedback on the SourceSection#hasTag solution, I tried to come up with a more flexible approach to tagging. I changed to a protected method Node#isTaggedWith(String). So its now up to the language how tagging is implemented.

Also new and noteworthy:

  • ProvidedTags to specify for a language which tags are provided by a language. (isTaggedWith is never called with other tags than that)
  • RequiredTags to specify which tags are required by an instrument. One cannot create queries for tags they do not require anymore.
  • PolyglotEngine#Instrument#isCompatibleWith(Language) to check wheter an instrument is compatible to a language. (contains all tags)
  • Instrumenter#isNodeTaggedWith(Node, String) to query other nodes not instrumented for a specific tag.

Since the behavior for provided and required tags is incompatible with the original version, I've introduced a temporary compatibility mode flag -Dtruffle.instrumentation.compatibility=true. For a language to verify that it is compatible to the new required/provided tags schema you need to run all you tests with compatibility set to false.

Also please note that in compatibility mode this change is fully compatible also SourceSection#hasTag is still supported.

@mickjordan @chrisseaton @woess please review for use in language
@mlvdv @jtulach please review for everything else.

@mlvdv
As you requested: here is a sketch to display all tags of a node:

    public static void displayAllTags(PolyglotEngine engine, Instrumenter instrumenter, Node node) {
        Set<String> allTags = new HashSet<>();
        for (Instrument instrument : engine.getInstruments().values()) {
            allTags.addAll(instrument.getRequiredTags());
        }
        for (Language language : engine.getLanguages().values()) {
            allTags.addAll(language.getProvidedTags());
        }

        for (String tag : allTags) {
            if (instrumenter.isNodeTaggedWith(node, tag)) {
                System.out.println(String.format("Node %s is tagged with %s.", node, tag));
            }
        }
    }

@jtulach
Copy link
Contributor

jtulach commented Mar 14, 2016

I don't understand why one should be able to display all tags of a node.

@Override
protected boolean isTaggedWith(String tag) {
switch (tag) {
case Debugger.HALT_TAG:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't this be solved much simpler now, by just checking whether the parent is a sequence node?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to check for the parent beeing a wrapper. Also isTaggedWith is defined as unchanged after AST creation, its hard to gurantee that if the parent is changing. Also its not true that each statement of a block is tagged halt in SL. For conditionals you want to step before the condition and not the full if expression. So the boolean solution is a lot easier to fine-tune.

@smarr
Copy link
Contributor

smarr commented Mar 14, 2016

In responds to @lukasstadler question about 'just checking a parent' for something.
This brings up a problem, which might or might not have been discussed before, don't remember:

How much support is needed/desired to 'just check' a node, a parent, or child for something in the potential presence of wrapper nodes?

Do wrapper nodes delegate questions about tags to their wrapped nodes? When asking a parent node, this could give very confusing results. Do you expect every usage side to guard with node instanceof WrapperNode? I got quite a number of these checks in SOMns, and I am not sure I put all in that are necessary.

@chumer
Copy link
Member Author

chumer commented Mar 14, 2016

@smarr Yes this is the downside of using wrappers. The alternative would be to make each leaf node class instrumentable individually without wrappers.

Do wrapper nodes delegate questions about tags to their wrapped nodes?

Yes since wrappers derive the automatically inherit the tags behavior. But wrapper nodes are never asked for tags by the instrumentation framework.

Do you expect every usage side to guard with node instanceof WrapperNode?

I don't think its good practice to rely on the parent/childs type. If you do, eg for pattern matching, you can work with helpers that filter wrapper nodes.

@chumer
Copy link
Member Author

chumer commented Mar 14, 2016

@jtulach @mlvdv wants to have that for instrument debugging.

@woess
Copy link
Member

woess commented Mar 14, 2016

I believe relying on a parent node is fragile and thus should be avoided.

@lukasstadler
Copy link
Member

I believe relying on a parent node is fragile and thus should be avoided.

Why? as long as there's a way to identify the next "real" parent, it should be fine.
Are you thinking about situations where the parent of still-executing code is changed to null?

@smarr
Copy link
Contributor

smarr commented Mar 14, 2016

Method-level recursion and AST-level self-optimization can lead to very surprising results when relying on certain invariants wrt. parent nodes.

This bit me because some information I need for tagging only becomes available during execution, i.e., after self-optimization in dynamic languages. So, the tags-are-stable-after-AST creation is very optimistic.

@chumer
Copy link
Member Author

chumer commented Mar 14, 2016

@lukasstadler no parent do ever become null after adoption. this is necessary also for multithreading.
I agree with @smarr that the parent is always that stable is optimistic. It might lead to hard to debug errors and its hard to verify. Wheras having a simple state in the node is easy to verify unchanged after AST creation, but it needs state. But for FastR it might make sense to do that. The good thing is that each language can now choose how to do it.

@woess
Copy link
Member

woess commented Mar 14, 2016

@lukasstadler Simply because there's no real guarantee that your parent is always going to be what you think it will be. You're right, it can work if you can identify the "real" parent. I'm thinking more about potential refactoring hazards: someone might later change the AST structure a bit or introduce a new parent node and forget to update the tag detection. (e.g., you might skip the SequenceNode for a single statement and suddenly the statement tagging is gone.)
Obviously, these aren't obstacles, and it's totally up to the guest language implementation.

@chrisseaton
Copy link
Contributor

Works for JRuby jruby/jruby@truffle-head...truffle-new-tagging

@woess
Copy link
Member

woess commented Mar 16, 2016

It would have been good to split this into 2 separate changes (which I believe are mostly independent):

  1. isTaggedWith
  2. ProvidedTags, RequiredTags
  3. I like it and I think this is the right approach.
  4. doesn't really allow for optional tags, does it? I'm not sure all these restrictions are really necessary and if not implementing one tag should already mean that the instrument is incompatible.

Please also add a changelog entry. Thanks!

@chumer
Copy link
Member Author

chumer commented Mar 16, 2016

@woess ad 2: Its never going to get as easy as now to integrate this. I think adding @OptionalTags as well could solve most of our problems, right?

Ad changelog. Will add it.

@woess
Copy link
Member

woess commented Mar 16, 2016

@chumer I'm more concerned that we're introducing something that we don't actually need. That said I don't object to the annotations.

@ghost
Copy link

ghost commented Mar 16, 2016

In FastR we have some utililty functions to "unwrap" both down and up that takes care of wrapper nodes.

@jtulach
Copy link
Contributor

jtulach commented Mar 18, 2016

Objections:

  1. instrumentation presents itself as an optional feature - it used to be once (in early January) - but since then it sneaked into core API and now it even has its central method in Node. Something is rotten in the kingdom of Denmark!
  2. while the list of supported tags per language/instrument Require/Provide tags is nice, we don't need it for anything right now. E.g. there is some truth on Andreas @woess comment that this could be separated into two changes - and only one of them is incompatible -e.g. needs to be done in somoe hurry.
  3. discussion about parent is completely out of scope of this change as that aspect isn't influenced at all.
  4. listing all tags is a non-goal - e.g. Instrument.getRequiredTags shouldn't be public. If we really want to use that for verification of compatibility between language/instrument, such methods should be at most protected.

@chumer
Copy link
Member Author

chumer commented Mar 18, 2016

  1. Its still optional in the sense that you don't need to implement anything at all. In the current state its sneaking into the SourceSection API which is an even more central API. Do you have any suggestions on how to fix that? I think its a good compromise. This solution comes with no overhead (as opposed to having a SourceSection#tags field)
  2. Will separate into separate PR. Will reuse this PR for the tagging mechanism. I don't agree that any change in here is incompatible to the previous Truffle version. Since we introduce an all new instrumentation framework its now that we can introduce such an api like require/provide very easily. Later its going to be more difficult.
  3. Its not at all out of scope as the languages want to know how they should implement isTaggedWith. How getParent() behaves is very relevant to this question.
  4. Agreed. Will remove Instrument#getRequiredTags from the API.

@woess
Copy link
Member

woess commented Mar 18, 2016

(1.) Yes, it's better than the SourceSection solution. But we could extract it into an interface.
(4.) Agreed. There's no reason to get all tags.

@chumer
Copy link
Member Author

chumer commented Mar 20, 2016

@woess
(1.) So you suggest an interface InstrumentableNode? We had this in the first prototype. @jtulach is that better in your opinion as well?

@jtulach
Copy link
Contributor

jtulach commented Mar 21, 2016

It is really interesting to see how the design walks in spiral (hopefully it is a spiral, not a circle).

In December we had relatively working design based on annotations. Then we found it too constraining (Ruby had a node which might or might not be a statement) and switched to string tags to give us flexibility. Now we see that strings are bit too flexible, and we need a way to find out if language/instrument can agree on a set of tags. Thus we build a new protocol (provides/requires annotation) to find out if the tags match.

This whole round trip makes me ask: What was the core reason for rejecting annotations? My answer: the inflexibility shown by a Ruby wanting to represent statement/non-statement by a single class.

Well, that can easily be fixed by Node.isTaggedWith or similar. Then the dynamism is there, yet we can keep using standard Java meta programming facility - e.g. annotations. Here is my rewrite of the API: jtulach@7b30201 - it is not fully polished, but following works:

truffle$ mx clean && mx build && mx unittest sl.test && mx javadoc

The test execution includes SLDebugTest which makes me believe it is working correctly for the debugger usecase and adopting languages to this API will be straightforward.

If we switch to the annotations again, we'll get static typing between instrument and a language and ability to perform static analysis during compilation - for example find out which annotations a language nodes support automatically. We'll be able to use standard IDE tools as home made protocols like provides/requires tags will no longer be needed or will operate on typed annotations.

@chrisseaton
Copy link
Contributor

My answer: the inflexibility shown by a Ruby wanting to represent statement/non-statement by a single class.

The problem isn't that the Ruby team is just stubborn about adding a new class to represent statements. I can still do that if it's wanted. I would just wrap each expression that is also a statement with a new StatementNode, which could be annotated with the right tags.

The problem is that this new node is essentially a wrapper node - a node that just takes up space and slows down partial evaluation - and this is what the original rewrite of instrumentation was supposed to remove. That was my only objection.

If we want to go back to wrapper nodes and annotations on nodes, that's fine I can do that any time.

@smarr
Copy link
Contributor

smarr commented Mar 21, 2016

From my experience with the dynamic metrics tool, annotations alone are not sufficient. You don't want to restructure your AST nodes just because you want to support a new tool. Also, it is unclear that tools are not going to have conflicting requirements when it comes to tags, so, we need more flexibility there.

@woess
Copy link
Member

woess commented Mar 21, 2016

@chumer: It's a possibility if @jtulach thinks it should not be part of the Node API. However, I don't think it has to be extracted, now that it's just one method querying a tag.

@lukasstadler
Copy link
Member

From the FastR perspective, what constitutes a statement cannot be tied to specific node types, because it depends on where something is, not what it is.
Whether the tag itself is a string or an annotation type should be decided based upon whether the tighter coupling of class references as opposed to matching stings is desired and/or possible.
The main difference is that you actually have to compile against the source of whoever provides the tags, right?

@jtulach
Copy link
Contributor

jtulach commented Mar 21, 2016

Hello @chrisseaton, no I don't want to re-introduce wrappers into Ruby. The system that I am proposing should have the same level of flexibility. See https://github.com/jtulach/jruby/commit/8aaed63d02eb5407a46e9cae19f0701bd4d4b724

Good point, @lukasstadler. The compilation makes the difference. Please note that in both systems (e.g. @ProvidedTags vs. annotations) the list of tags supported by a language is hardcoded during compilation of the language binary. Thus I think @ProvidedTags and annotations have the same flexibility.

@chrisseaton
Copy link
Contributor

isAnnotationPresent is a new method - this isn't related to Java's Class#isAnnotationPresent is it? And I guess there's a default that does the obvious thing (presumably using Class#isAnnotationPresent)?

Seems fine to me.

@woess
Copy link
Member

woess commented Mar 21, 2016

@jtulach: seems like a good compromise, although it's a bit of an abuse of annotations.

@chumer
Copy link
Member Author

chumer commented Mar 25, 2016

@chrisseaton think of it as: you need to be able to run code that uses local variables at that location. If you would do that before the prolog your expression might not be evaluatable.

That is actually true for any @Instrumentable source location.

@lukasstadler
Copy link
Member

tags can be defined by what the debugger and other tools do with this information, or at an abstract language level, e.g., create a definition for what a statement is.
the former requires tool writers to collaborate on creating a set of tags, while the latter additionally requires involvement form the language writers.

I'd go with the former - what difference does it make whether something is a statement, call or expression, other than what tools do with it?

+1 though on discussion this as a separate issue.

@chumer
Copy link
Member Author

chumer commented Mar 25, 2016

I'd go with the former - what difference does it make whether something is a statement, call or expression, other than what tools do with it?

Are you opting for no tag sharing between tools? We want a standard set of tags shared between the languages independent of tools. There is/was consensus about that in the meetings. How exactly the tags look like, we don't know. But we can evolve them easily. So it might turn out that we have a DEBUGGER_HALT standard tag in the future. Lets wait and see.

+1 though on discussion this as a separate issue.

Yes we should continue the discussion. For this release I will go with this set of tags (or maybe some minor modifications) because we have use-cases. If it turns out they are wrong we deprecate them. I do this now to fix concerns raised by @mickjordan .

@smarr
Copy link
Contributor

smarr commented Mar 25, 2016

Was there a specific reason not to introduce a common super class for tags? If I am not mistaken someone else also asked for it. Could make finding tags easier, I think. (and I think typing was brought up as another benefit.)

@woess
Copy link
Member

woess commented Mar 25, 2016

Note that statement is not necessarily the same as debugger-halt (i.e., you might want to halt on something that's not a statement (in the language) and not halt on some specific statements). Can this be a problem for our set of standard tags?

@woess
Copy link
Member

woess commented Mar 25, 2016

Was there a specific reason not to introduce a common super class for tags? If I am not mistaken someone else also asked for it. Could make finding tags easier, I think. (and I think typing was brought up as another benefit.)

One reason not to have a common superclass could be to allow annotation classes to be used as tags.

@smarr
Copy link
Contributor

smarr commented Mar 25, 2016

@woess, I think it is more of an indication that tags will likely be rather tool specific. For my tool, I see only the root node as useful.

wrt. superclass: ok, I see.

@lukasstadler
Copy link
Member

Are you opting for no tag sharing between tools?

Quite the opposite. My point is that it doesn't make sense for the languages to add tags and hope that they are useful. The set of tags should result from discussions on what would be useful for tools, and individual tags should be defined in terms of the expected behavior (as opposed to names of programming language concepts, which can be interpreted in wildly different ways by different languages).

@woess
Copy link
Member

woess commented Mar 25, 2016

@smarr Regarding superclass/interface: Not sure if that's relevant to users, though. The standard tags aren't annotation classes, so these can't be used as annotations right now.

@mlvdv
Copy link
Contributor

mlvdv commented Mar 28, 2016

This is a good discussion, but sometimes the most important requirements get lost in a sea of implementation detail. This is a big deal because tags are the single most important functional thing about Truffle Instrumentation (runtime overhead seems well under control now, and everything else is secondary).

Here is the first of three comments about the actual requirements for tags in the Instrumentation Framework

First: the usefulness of Truffle Instrumentation depends almost completely on their flexibility.

Here’s why. Tags provide the essential level of indirection between language engineering and tool behavior. Tags make it possible to configure tool behavior with the least possible impact on language engineering. Their concerns are orthogonal:

  • Language engineering is strongly constrained by correctness and mainly motivated by performance.
  • Tool behavior, on the other hand, is mainly sensitive to user expectations, part of an evolving tool set, and subject to frequent fine-tuning.

So, do not restrict how or when language implementors apply tags any more than absolutely necessary for fast path performance. Every bit of coupling between tool behavior and language engineering reduces the usefulness of tags, and this includes anything that relates to AST node types. This is true for the standard tools we are building now, and it will becomes more true for tools not yet built.

Other than fast path concerns, flexibility is the only thing that matters. A tag gets bound to one or more source locations; the rest is implementation detail.

@mlvdv
Copy link
Contributor

mlvdv commented Mar 28, 2016

A second comment on the real requirements for tags.

The critical functional requirement for Truffle tags is flexibility, but this discussion is also about how to use them. Instrumentation/tools will only get adopted in practice if we get the tags right. Our strategy for adoption is:

  • provide workable tools to both language implementors and end users
  • for very little effort
  • that can be incrementally fine-tuned and improved as needed.

This strategy creates two hard requirements:

  1. The platform must define standard tags that configure a collection of standard tools (debugger, profiler, etc.), and it must be as easy as possible for language implementors to apply tags and to adjust them frequently.
  2. The platform must provide standard tools that are useful (if not optimal) for every language implementation that applies standard tags, and it must be as easy as possible to fine-tune tool behavior using tool-specific tags.

Adoption depends on this positive feedback loop. A language implementor adds a few standard tags, and the Truffle platform rewards her/him immediately with tools that do something useful. Since they are useful, the language implementor and colleagues use tools during development and notice rough edges. This motivates him/her to improve tool behavior, possibly in collaboration with tool builders, by adjusting standard tags and fine-tuning tool-specific tags.

@mlvdv
Copy link
Contributor

mlvdv commented Mar 28, 2016

A third comment on the real requirements for tags.

As recent discussion shows, we are still stuck on how to define standard tags. The only way forward is to keep working with tags collaboratively, tool builders together with language implementors. We have to do this, even if this costs some API stability.

@chumer want to return to tags as objects, with some new (to me) features, and detach them from SourceSection. This is the right thing to do (except for not enough tagging flexibility!).

In order to move forward and get some experience, we should start now with an absolutely minimal set of standard tags. For Debugger only STATEMENT and CALL. I know that CALL can be tricky, so let’s get started working through those issues with Debugger and Profiler as clients. There seems to be an understood use case for ROOT, so that could also be included. All other tags can be postponed.

Then we can redirect most of this discussion, based on very little experience, back to working with the actual tags, languages, and tools. We'll all learn a lot.

Can we manage this within our API constraints? I’d be happy to keep a new set of standard tags out of the published API during this experimental period, if that’s workable; otherwise we just have to take some stability hit so we can work out issues between language implementations and tools.

@chumer
Copy link
Member Author

chumer commented Mar 29, 2016

@mlvdv don't my latest changes exactly do what you want? Introducing a minimal StandardTags class? I want to add the tags to the published API. No problem to deprecate and remove them later. Could you comment on the proposed set of StandardTags (with javadoc)?

Also another small correction: we don't move to objects as tags but classes as tags which is somewhat different.

@jtulach
Copy link
Contributor

jtulach commented Mar 29, 2016

I did a quick mapping of @mlvdv high-level goals to the actual code included in this pull request and as far as I can tell, the essentials are part of @chumer changes. The flexibility is there, type-safety has been increased, and now we even converged on a minimal set of general purpose tags!

That is great and I am looking forward to offer these changes to public as part of Truffle 0.12 release.

@chumer
Copy link
Member Author

chumer commented Mar 29, 2016

Will merge after the check is through.

@chumer chumer merged commit 162e68a into oracle:master Mar 29, 2016
dougxc pushed a commit that referenced this pull request Jul 2, 2016
…LE.COM/truffle:bug/empty-source-section to master

Including test that now passes.

* commit 'f827720a2922962c3e869b17b9438a6324cf62a1':
  SourceSection:  include a test that now passes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants