Attaching additional information to the AST #17
Comments
The conventional way to do this is the Decorator pattern: You define a
But surely, these problems aren’t unique to |
Maybe the people who worked on the current compiler can say if this would be useful? @gavinking @FroMage @tombentley @quintesse @thradec? |
This is definitely a great idea. We've wanted to attach custom info for a while on both AST and model nodes. |
Thanks for the feedback! I’m adding this right now, and there’s one thing I didn’t think about: For |
Ouch. Without ceylon/ceylon-spec#791, this makes the AST very annoying to use when you don’t want additional info, because I can’t provide a default Postponed :( |
In fact, I don’t think I would add this even with I suppose we could have type-unsafe |
What is an |
I don’t care, I simply give you a place where you can put an |
So if you don't care, why impose a constraint? What's wrong with |
What do you mean? I’m not imposing a constraint. The problem is that even if I declare OTOH… if I make |
But what on earth is the point of the type parameter? You are either imposing a constraint: that every |
So you think I should just have It doesn’t make my code any more type safe, but it makes my user’s code more type safe. At least, that’s the intention. (But I’m not sure if the constraint that all |
This sounds bogus to me. Where would they get a
For
The second is more verbose and has worse performance. It's not like the whole tree is going to have the same kind of |
That was the intention. In the examples I’m thinking of, all nodes would have the same However, I just noticed another problem when I tried to add
shared actual Boolean equals(Object that) {
if (is CurrentType<What?> that) { // <--
return attr == that.attr /* && etc. */;
} else {
return false;
}
}
Looks like this just can’t be done. |
But I think that's just extremely unlikely. Why would a |
Hm, maybe you’re right. In that case, (This would also solve the |
In our case most of the nodes would have the same info type. Like boxing, erasure, etc, so it's not crazy at all. |
Okay, but I still can’t implement |
|
No, I don’t care about my This is the current
shared actual Boolean equals(Object that) {
if (is QualifiedType that) {
return nameAndArgs == that.nameAndArgs && qualifyingType == that.qualifyingType;
} else {
return false;
}
}
If |
Ah, that's a good question for @gavinking ;) |
|
Okay, dang. That rules out typesafe I guess I could be happy about that, since the unsafe version is much easier to implement ;) The only remaining question: |
|
Well I'm not sure what you're using the |
@luolong: You’re completely right, that needs to be As to the second point… it’s probably better, because my way allows people to break the typesafety by using two keys with different type arguments but the same key argument. @gavinking: This way, you could never store two objects of the same type as extra info. I’m not sure if that’s a disadvantage. It would probably encourage people to use more “wrapper” classes ( |
Sure. But that's a good thing. |
Why? |
Because if you store two objects of the same type, how will you distinguish them, other than by using an untypesafe string name? |
Hm, I’m beginning to see your point. I was thinking of them as identifiers, but of course these keys aren’t checked by the compiler, so it is a bit less safe. But leaving that aside… can you actually implement the functions in Ceylon as you suggested them? I don’t think you can pass a type parameter as a key to an underlying map in Ceylon… (You could of course do it in Java, having access to the |
Well, I suppose you could iterate over the values in the underlying map, and check for
|
T getExtra<T>() given T satisfies Object {
assert (is T info = map.get(`T`));
return info;
}
void setExtra<T>(T info) => map.put(`T`, info); |
Okay,
https://gist.github.com/lucaswerkmeister/971f5e01b387b0ddbf56 (In short: “Type param” uses |
Or:
T? getExtra<T>() given T satisfies Object {
for (info in list) {
if (is T info) {
return info;
}
}
else {
return null;
}
}
void setExtra(Object info) => list.add(info); |
P.S. I'm pretty certain that this is not a highly performance-sensitive feature! |
Depends… if someone builds a compiler using As to the
|
Why the
outside of the loop so it's obtained only once. |
I highly doubt that. It would likely be faster than a hashmap most of the time. What, you have nodes with 100s of info objects on them? Not likely. You'll have one, maybe two. |
Premature optimization, etc, etc. The three rules of optimization: you know the rest. |
I show the results on my machine for 1) the original code, 2) without the 1..5 for loops and 3) obtaining the Type for T only once in the method and 4) finally taking that completely outside the test loop and getting each type exactly once :
NB: seems like we could gain a lot with some clever caching somewhere, but it does underscore Gavin's point about premature optimization. |
Tako: The 'T' optimization is invalid, because get() will have to construct the metamodel every time - otherwise, we might as well use a proper key. IIRC I added the 1..5 to reflect that get() will most likely be used more than set().
Gavin: Well, this is an "optimization" that affects the API of Node - I can't really change it later.
----- Original Message -----
I show the results on my machine for 1) the original code, 2) without the 1..5 for loops and 3) obtaining the Type for T only once in the method and 4) finally taking that completely outside the test loop and getting each type exactly once :
Type param, String: 1516916732ns = 1516ms
Type param, String: 1130417180ns = 1130ms
Type param, String: 68354966ns = 68ms |
What I meant is that right now that might be really expensive, but with proper caching or other optimizations it might be improved, possibly a lot. @FroMage might be able to give some more info on that. |
Sure, if the type descriptor could cache its metamodel, performance would improve a lot. It appears that all type meta expressions use the (undocumented!) toplevel |
This way, if the extraInfo interface changes in the future (see discussion in #17), I’ll only need to update Node and not the dozens of copy() methods as well. Done with the following command:
    find source/ceylon/ast/core/ -name '[A-Z]*.ceylon' \
        -exec \
        sed -i \
        -e 's|ret\.extraInfo = extraInfo|copyExtraInfoTo(ret)|' \
        {} +
The type descriptor performance becomes (almost) irrelevant if we use key objects, by the way, because then we can trivially determine With key objects, we could also speed up the lookup even more by having a constant number of All in all, I’m still in favour of this solution. |
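For readers following along, here is a minimal sketch of what the key-object approach could look like. It is not the code from ceylon.ast: `SketchNode`, the `name` attribute on `Key`, and the string-keyed backing map are all assumptions made for illustration, and the sketch assumes the module imports `ceylon.collection`. The point is only that the key carries the value type, so callers get a typed result without passing a type argument at every call site.

```ceylon
import ceylon.collection { HashMap }

"Hypothetical key: [[Type]] is the type of the value stored under this key."
shared class Key<Type>(shared String name)
        given Type satisfies Object {
    string => name;
}

"Hypothetical stand-in for an AST node; not the real Node class."
shared class SketchNode() {
    // Assumption: extra info is stored by key name.
    value extraInfo = HashMap<String,Object>();

    "Returns the info stored under [[key]], or null if there is none
     or if the stored value is not of the key's type."
    shared Type? get<Type>(Key<Type> key)
            given Type satisfies Object {
        if (is Type info = extraInfo[key.name]) {
            return info;
        } else {
            return null;
        }
    }

    "Stores [[info]] under [[key]], replacing any previous value."
    shared void set<Type>(Key<Type> key, Type info)
            given Type satisfies Object {
        extraInfo.put(key.name, info);
    }
}

shared void run() {
    value boxed = Key<Boolean>("boxed");
    value node = SketchNode();
    node.set(boxed, true);
    Boolean? isBoxed = node.get(boxed);
    print(isBoxed else "no info");
}
```

In this sketch, two keys with the same name but different type arguments still collide in the map; the `is Type` check just makes `get` return null rather than a wrongly typed value.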
Depending on whether a Bound is used as a lower or an upper bound, its token (< / <=) appears before or after the endpoint, which affects the correct token order. We now record the information whether it’s a lower or upper bound in the node’s extra information (#17) when transforming the WithinOperation, and then use that when transforming the Bound itself.
get() and remove() now take a covariant key. put() can’t be variant since it also returns the previous value, so a new set() is added that takes a contravariant key. CC #17. This can be used to solve the problem from this comment: #17 (comment) Key<out Anything> or Key<in Nothing> are supertypes of any key type.
Both document additional information (#17), but they differed slightly.
This removes thisInstance, superInstance, outerInstance and packageInstance. They should never have made it into the 1.1.0 release in the first place; the assumption that they are stateless was invalidated by the introduction of extra info on nodes (#17). If compilers and other tools attach information to these nodes, it’s actually very dangerous to reuse them, since that information will unexpectedly be shared between instances where the information should be different (worse, neither put() nor set() warn you about this).
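To make the hazard concrete, here is a small sketch with made-up names (`DemoNode`, `sharedThis`, the `"enclosingClass"` key); none of them come from ceylon.ast, and the sketch assumes the module imports `ceylon.collection`. It only illustrates how information attached through one reference to a reused node becomes visible through every other reference.

```ceylon
import ceylon.collection { HashMap }

"Hypothetical stand-in for a node that can carry extra info."
class DemoNode() {
    value extraInfo = HashMap<String,Object>();
    shared Object? get(String key) => extraInfo[key];
    shared void set(String key, Object info) => extraInfo.put(key, info);
}

"A single reused instance, in the style of the removed thisInstance."
DemoNode sharedThis = DemoNode();

shared void run() {
    // Two occurrences in a program, both represented by the same object.
    value firstUse = sharedThis;
    value secondUse = sharedThis;

    // A tool annotates what it believes is only the first occurrence.
    firstUse.set("enclosingClass", "Outer");

    // The second occurrence now carries the same information.
    print(secondUse.get("enclosingClass") else "<none>");
}
```

Creating a fresh node per occurrence avoids this, which is presumably why the shared instances were removed.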
I wonder if it would be useful to provide a place in the AST nodes where other tools / modules / programs can attach additional information. For example:
For example, a Ceylon compiler modeled after the current one might use
With ceylon/ceylon-spec#594, we could even have:
With this, you could make extraInfo non-optional as long as you provide a valid initial value.