-
Notifications
You must be signed in to change notification settings - Fork 267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Explicit Shadowing of Functions/Classes/Vars #259
Comments
At the risk of exposing my large lack of understanding, I just want to share here that according to a certain prominent LLVM developer, trying to get around the One Definition Rule inevitably ends up with getting sucked into a singularity. However, in a REPL, there should be a way to implement redefinition behavior by making everything a compound statement. If I understand correctly, that would transform something like int a = 0;
void hi() { std::cout << "hi" << std::endl; }; into {
int a = 0;
{
void hi() { std::cout << "hi" << std::endl; };
// following statements nest here
}
} Which means that upon subsequent redefinition of hi, a separate new "thread" could be constructed, so that the following: int a = 0;
void hi() { std::cout << "hi" << std::endl; };
void hi() { std::cout << "hi again" << std::endl; }; would be translated in cling to: {
int a = 0;
{
void hi() { std::cout << "hi" << std::endl; };
}
{
void hi() { std::cout << "hi again" << std::endl; };
// NOW following statements nest here instead
}
} Would this be a feasible way of implementing this behavior in cling? It would clearly not be maximally performant with a lot of statements, but at least allowing such behavior optionally, it would allow people to experiment with it. |
I'll own up to my terrible ideas. ;] I don't use Cling, but I'll hypothesize a REPL interface by showing lines after I was focusing on non global variables which may or may not be what you want to focus on, but would suggest:
transforms into:
This allows shadowing naturally. However, it assumes you are operating at block scope... And that in turn means you probably aren't defining functions. And even if you are, shadowing may not work as well for that as those go into the enclosing namespace, not the block. If you are working at block scope, and you define lambdas instead of actual functions, then this technique works fine. Let's consider other scopes briefly.... If you are working at global scope or namespace scope, then inline namespaces might be used to allow this kind of chained shadowing:
(Compiler Explorer link: https://godbolt.org/z/VhCnX4) I can't come up with any useful semantics for allowing shadowing within a class body's scope, nor is there any syntax I can think of to do this. |
Also worth noting that you can just use nested anonymous namespaces for the namespace scope:
But it may make some things harder (and some things easier) due to not having a reliable mangled name. The key is that by using increasing layers of nesting you can shadow things "safely". |
Thank you very much for your ideas. This issue is really about finding suitable workflows for an iterative programming style as people are used to, e.g. in Jupyter. Maybe we can already do it, maybe we need to extend cling, well actually C++ for a REPL, for it. Anonymous and inline This would then be similar to rolling back the state of cling, which one could already to already. But this is then still "linear" programming (evaluating all input instead of the latest added block) contrary to the nice "jumping back and forth" users love with Jupyter cells. |
@egpbos if I understand your examples correctly, all declarations are scoped and could not be called (functions) / used (vars) from the underlying "global" scope. But the drafted |
I'm suggesting that the linear sequence of things in the notebook is mapped into the nested structure. You'll always want to evaluate from top to bottom, but you can trigger that as users jump around the notebook and make changes. The idea is to use a chain of nested scopes to represent the sequence in either the interpreter or the notebook. This essentially is just an implementation strategy for shadowing. |
cc @SimeonEhrig any thoughts on this? :) |
I definitely know nothing about how For example: double (*foo)(double);
foo = static_cast<double (*)(double)>([](double d){ return 1.0; });
foo(1.0) // returns 1.0
foo = static_cast<double (*)(double)>([](double d){ return 2.0; });
foo(1.0) // return 2.0 If the function signature is different, then it won't be a problem, right? For redefining classes, maybe it is a good idea to use macros?? // First definition of A
#define A underlying_A1
class A { ... };
// Refine A
#undef A
#define A underlying_A2
class A { ... };
// For any class, using directive, function that relies on class A,
// turn them into a template.
// A class depend on A
template <class _A = A>
class ... { ... }; |
Thanks for your thoughts! Function pointers: yes, a modern C++ way instead of function pointers that can be re-defined at runtime would likely be lambdas and #include <any>
#include <functional>
#include <iostream>
using namespace std;
int main()
{
any f = function<bool (double)>([]( double x ) { return x < 5.0; });
cout << any_cast<function<bool (double)> >(f)( 3.0 ) << " ";
f = function<int (int)>([]( int x ) { return x * 2.0; });
cout << any_cast<function<int (int)> >(f)( 42 );
return 0;
}
// This C++17 example allows to store a changing function signature in the same variable name:
// bool function and int function
// "1 84" (True, 84)
// One would write in a Jupyter/Cling cell the assignment to that std::any variable,
// which can be called multiple times. I wonder if we can force inlining with cling here. Macros & Classes: I don't fully understand how the shown macros might help here, but deriving as you suggest CRTP-style as a wrapper and constantly increasing the template argument, maybe in combination with the Too bad we don't have inline, anonymous class definitions for template arguments. |
@ax3l I don't think However, I do think a special callable class with a variadic function call overload that wraps all the overloaded version of the function should be written to enable overload. A possible implementation will be using a hash table that uses Whenever a function call is made, a search is done to find whether that overload is available or not. If it i available, then it is called; otherwise, an exception is thrown. This is the best way I can think of since I don't know anything about the compiler. Maybe somebody who understands |
@ax3l About overloading the class, maybe it can be done by defining a macro that will be substituted with the real class name, and when it is overloaded, But regardless of how it is implemented, all |
Sorry guys, just forget about the post before the previous post I posted. That method I mentioned can't deduce the return type, so it is probably better to introduce compiler instrinct in |
Awesome idea, @chandlerc - we never came up with nested unnamed namespaces! Our problem is people doing
Here,
Currently, our only known and theoretically possible solution is to rip declarations out of the AST and CodeGen, and to make the corresponding symbols inaccessible to RuntimeDyld's symbol lookup. That's super awkward because it's nothing a compiler would ever want to do. It works okay for some, and does not work for other declarations (here with ROOT ):
For now this is what cell-replay / -unloading should be based on. My main concern is that this approach is an uphill battle with any future clang changes, needs a compiler expert for maintenance and is thus a sustainability issue. But barring any better approach we will have to invest dev time to fix the remaining issues, or at least the major ones. |
I agree with @Axel-Naumann . At the moment, I see a verification problem at the C++ statements, which should to be used to allow redefinition. I'm not sure, if anybody can verify, that the semantic of the source code isn't changed beside the deliberately redefinition. I think manipulating the AST is most safety way, to keep the semantic. In this case, I also agree with Axel, it is a really basic feature, which needs a lot of modification of cling, clang and llvm code and for this, we need a compiler expert. |
Still not sure why you can't map this to nested namespaces to model the incremental additions.... You can even inject You could even put the current iteration into the cling prompt, and tie that to the nested namespace allowing people to reference specific Consider something like:
This could be implemented by synthesizing the following C++:
|
Thank you, @chandlerc and @SimeonEhrig ! When writing out a value we ask "what's your type", and the answer would be Regarding the overload issue:
would do. Now we're left with other mangling-style issues, e.g. explicit specializations:
For this one we could try to detect specializations, through pre-parsing. And then what? :-) In general the fundamental difference between prompt-parsed AST nodes and code injected through files is worrisome, to me. We certainly cannot move file-based code into Given that this is such a pivotal point for the future / maintainability of cling I will discuss this with a couple of experienced interpreter people (@vgvassilev and @pcanal for instance) and come back with a potentially more cohesive reply. Thanks everyone for sharing your ideas! |
Just to add to the example above, could we be auto- namespace {
int f(int) { return 1; } // <- user input
namespace {
// let cling pull in all known declarations automatically:
using ::f; // <- automatic "reflection" of :: or previous namespace?
// or just pull it in here, if an `f` in `::` exists already:
const char* f(const char*); // <- user input
// update: overwrites don't work when using is always done...
// int f(int) { return 2; } // <- user input
namespace {
int i = f(42); // sees f(int) now as well // <- user input
}
}
} Update: hm, seems that's basically what Chandler said ;-) |
No, you shouldn't do that IMO. The user needs to decide whether they want to define a new
|
I just have a question with this solution. Consider the following case: inline int F()
{/* ... */}
// F2 depends on F
// and it is beneficial to inline the call
void F2()
{
auto ret = F();
// ...
} If the |
And also consider class A {};
class B: A {};
template <class A_template_arg = A>
class C: A_template_arg {};
// Redefinition of A
class A {/**/};
|
And how to handle if the user creates an object to them? A a;
int (*fp)() = &F; If If not, how to change them? |
@NobodyXu If something uses a previous definition then subsequent re-definitions will not change that. Very similar to
I don't find that surprising behavior. @chandlerc I don't think our users will appreciate the need to explicitly pull in previous declarations to create overloads. But as I said - we'll discuss, I'll come back with something more substantial than my immediate reactions here. |
The way I would phrase this is slightly different. When you get asked "what's your type", I think the question should be "what is the shortest name I could use to refer to that type". Because you conceptually never close one of these nested namespaces, these prefixes aren't necessary. I would suggest using just the base name unless the name has been shadowed (which you can query out of the AST). When this fails, I would suggest using the namespace alias directly to the step: Put differently, we can think of this as a nested-namespace rewrite step to pick a minimal way to unambiguously reference the type from the current nest level, with added use of the namespace aliases introduced in the process.
Through out this, you seem to be using Anyways, for explicit specializations, I don't think there is anything you can do w/o changing the language rules. I'm also not sure this is a bad thing, as it seems like a real edge case. What I would suggest is that you make the fact that these aren't the same namespace a bit more explicit in the model of the interpreter and let the user override it to handle cases where you have to use a particular namespace. As an example, you could have special repl syntax that escapes the nesting sequence and instead allows the user to add some code to an explicitly specified namespace, potentially a prior one, thereby allowing them to explicitly specialize a thing. I also think it might make sense to make it explicit in the REPL when you "step" to a fresh scope, allowing folks to easily write more than one component in a single scope within the REPL. You can reflect this cleanly in the prompt so users are aware of their context and have the control needed to complex things while the simple/easy things remain simple and easy. The other nice thing is that it lets you explain and document all of this in terms of the language without having to invent and document special rules.
|
Oh, one thing that the explicit model of the scope-transition in the repl would help with is overload sets. It becomes easy to build those up w/o tons of using declarations, you just use "shift enter" or whatever the interactive facility is to put multiple things into a single step's scope before returning to the fresh-scope-each-time step mode, letting you accumulate an overload set, etc. |
Seems like this has now been successfully developed, really awesome work by @jalopezg-uc3m et al.! https://dl.acm.org/doi/abs/10.1145/3377555.3377901 So probably this issue can be closed? |
Absolutely, added in root-project/root#4214 and it's amazing! Thank you all for your input and @jalopezg-uc3m in particular for the implementation! Congrats on the paper! |
As mentioned at the ROOT User's Workshop 2018, many people using Cling or Cling in Jupyter for interactive workflows struggle with the following problem. When interactively defining and improving functions, classes and variables, the state of the interpreter is in the way of fast iteration of prototypes.
Let's take the following example:
If a user now wants to change
foo()
, they will just add it again, e.g. asYou can also imagine this in a Notebook: a users just scrolls up, changes and re-evaluates the cell.
Of course that is not working right now, and throws something like:
Possible Approach: Active Shadowing
Implementation?
I know that this is asking for a dynamic type system, but only for explicitly selected functions/classes/vars, in Cling C++. I have absolutely no idea, how one could efficiently implement such re-definitions but think it would be an important workflow for the code in cling's
main()
wrapper. I am aware that problems, like indirect calls frombar
that is not explicitly re-defined whenfoo()
is shadowed, can be very challenging, e.g. iffoo()
was inlined intobar
.If such an implementation is only possible under side-conditions, e.g. that
main()
-defined andallow_shadowing
attributed functions can not be inlined, that is probably fine as well. As a user, I don't expect the most efficient code from my control logic that I am interactively typing (between notebook cells), but from the code that is used from external imports.Alternatives
For functions represented as lambdas and variables one could probably establish work-around workflows by pushing them into a C++17
std::any
variables. For re-defining classes I have no good idea at the moment besides resetting the state of the interpreter / restarting the jupyter kernel.The text was updated successfully, but these errors were encountered: