STC object file #66
Comments
See src/exm/stc/ic/opt/ICOptimizer.java for how the optimizer pipeline is constructed. In particular, you'll see that it adds an instance of PruneFunctions() in a couple of places. What happens is that it builds the pipeline with all the passes, then they are skipped based on config settings. Then look at src/exm/stc/ic/opt/PruneFunctions.java - this is the one that's causing you problems. (Functions are also pruned in FunctionInline, but that is definitely disabled at O0). The issue is that getConfigEnabledKey() is returning null - there is no config key that can turn off PruneFunctions. You'll need to add a config key in Settings.java like OPT_PRUNE_FUNCTIONS, add a default value of "true", then add it to get_compiler_opt_name in bin/stc. You can look at a different optimizer pass like FunctionInline.java to see how the options are set up and named. Once you've done that you should be able to give the -F prune-functions option to turn off function pruning. |
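The "build everything, then skip by config" behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not STC's actual code: the OptimizerPass interface, SimplePass, Pipeline, and the key strings passed in are all hypothetical. Only the idea that a pass whose getConfigEnabledKey() returns null can never be switched off is taken from the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an optimizer pass; not STC's real interface.
interface OptimizerPass {
    String name();

    // Config key that enables this pass, or null if no key controls it.
    String getConfigEnabledKey();
}

// Hypothetical pass that only carries a name and an optional config key.
final class SimplePass implements OptimizerPass {
    private final String name;
    private final String key;

    SimplePass(String name, String key) {
        this.name = name;
        this.key = key;
    }

    public String name() { return name; }

    public String getConfigEnabledKey() { return key; }
}

final class Pipeline {
    private final List<OptimizerPass> passes = new ArrayList<>();

    void add(OptimizerPass pass) { passes.add(pass); }

    // The pipeline always contains every pass; passes are skipped at run time
    // only when their key is explicitly set to "false" (default assumed "true").
    List<String> passesToRun(Map<String, String> settings) {
        List<String> toRun = new ArrayList<>();
        for (OptimizerPass pass : passes) {
            String key = pass.getConfigEnabledKey();
            boolean enabled = key == null
                || Boolean.parseBoolean(settings.getOrDefault(key, "true"));
            if (enabled) {
                toRun.add(pass.name());
            }
        }
        return toRun;
    }
}
```

In this sketch a pass constructed with a null key always runs regardless of the settings map, which mirrors the reported problem; giving it a key makes it skippable like any other pass.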
Let me know if you hit any blockers here. We use a whole-program compilation model where it just slurps in all the include files and compiles everything at once so I don't think we're going to go to fully separate compilation and linking. But it seems like it should be possible to kludge something so that we can at least smush multiple output files together. I think the proper solution to the pruning problem is to just have multiple root functions that are always retained. Currently it knows not to touch __entry (or whatever it is), shouldn't be too hard to prevent it from touching any functions in the main file under compilation. |
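The "multiple root functions that are always retained" idea amounts to a reachability walk over the call graph from a set of roots rather than from a single entry point. The sketch below is only an illustration of that idea: RootedPruner, the map-based call graph, and the function names used with it are hypothetical and do not reflect how PruneFunctions.java is actually written.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RootedPruner {
    // Returns every function reachable from any root. Anything not in the
    // returned set would be a candidate for pruning.
    static Set<String> reachable(Map<String, List<String>> callGraph,
                                 Collection<String> roots) {
        Set<String> keep = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String function = work.pop();
            if (!keep.add(function)) {
                continue; // already visited
            }
            for (String callee : callGraph.getOrDefault(function, List.of())) {
                work.push(callee);
            }
        }
        return keep;
    }
}
```

With only the entry function as a root, a function defined in the main file but never called is dropped; adding the main file's functions to the root set keeps them and everything they call.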
Also, feel free to add in any new command-line options using the -f/-F flags in bin/stc and the config mechanism in Settings.java. This is good if you need to have weird special-case or experimental behaviour conditionally enabled. You could also add the -c flag by modifying bin/stc and using the config mechanism in Settings.java. |
I managed to disable pruning pretty easily by doing what you said. I had to run :D EDIT: |
Nice work. Let's move function-prune from O2_OPTS to O0_opts - we'll just retain the The issue may be the ordering of flags - if you call disable_opt before On 15 July 2015 at 16:21, Basheer Subei notifications@github.com wrote:
|
ok, I placed function-prune in It's probably the other optimizations that are somehow enabling or performing pruning, like you said in FunctionInline.java. EDIT: |
it's right here. Boy, I'm glad you commented your code! :D |
Shouldn't we remove function pruning from inside function inlining, since pruning is done separately now? Or is this a different kind of pruning (has a different effect)? EDIT: |
Feel free to remove it, it would be nice to reduce the complexity of the I think I added it there because it was easy to do and the PruneFunctions On 15 July 2015 at 17:19, Basheer Subei notifications@github.com wrote:
|
I added a function turbine::get_globals_map in the commit I just pushed to github. This should give you a Tcl dict (https://www.tcl.tk/man/tcl/TclCmd/dict.htm) containing all of the global variables declared. |
I just saw your edit. That's fine if you remove that functionality - if you're going to remove it, can you completely remove usage of toRemove so that the code is a little cleaner? |
I just had some thoughts about how to possibly implement the REPL: I think if you're clever with how you use Turbine's task prioritisation, targeting, and data dependencies, you can do some clever things. So I think the key thing is that the rank 0 worker is the one that has to run the actual REPL code and interact with the user. You could actually potentially suspend execution of the REPL until data is available by creating a task targeted back to rank 0 (via turbine::rule) that is dependent on the data. Another thing is that you're going to have to somehow push new compiled code out to workers and ensure that the new code is loaded before those workers try to evaluate any new functions - STC/Turbine freely assume that the same Tcl functions exist on all workers. You could do this by sending out a max priority task, one targeted to each worker. This would for sure run before any new functions. If you wanted to be extra-sure, you could wait to get a confirmation from each worker by creating an adlb variable with write refcount and having each worker decrement the write refcount when it loads the new functions. Then you can have another task dependent on that variable to actually start executing the code you want to run. Anyway, just some thoughts - it may not align with the approach you're taking - but Turbine/ADLB is pretty powerful now if you know some of the tricks. |
I removed all the toRemove calls and uses, but I'm not sure about this if-block here. Can you double-check it? |
Looks safe to remove to me. On 19 July 2015 at 21:40, Basheer Subei notifications@github.com wrote:
|
The new globals_map looks clear enough for me to use. I'll just have to deal with the optimization for now (my REPL works for O2, not for O0. And O2 removes most of the globals for basic scripts). Once I fix that, I'll be able to use the globals_map. |
The REPL breaks for O0? |
yes, only because of the way I currently grep and remove the boilerplate tic code. I'm adding a new compiler setting so that the |
Currently, this is what the REPL does: user inputs swift code --> REPL runs it through stc normally --> greps through generated tic code and removes boilerplate (
This is what I'm planning to make it do: user inputs swift code --> REPL runs it through stc with |
that previous toRemove commit I made had a bug in it (that's what I get for not using an IDE). I fixed it and pushed it again. This time I actually ran the tests and checked it. |
It's worth learning Eclipse - it's a huge time saver with Java. I still On 20 July 2015 at 17:06, Basheer Subei notifications@github.com wrote:
|
I modified STC so it doesn't generate the boilerplate tic code when a I've noticed very strange behavior when the REPL tries to eval the tic code (can I call it .tric code, stands for Turbine-REPL intermediate code? xD). For example, when I use Maybe I have a fundamental misunderstanding with how the whole runtime works. All this REPL code I'm running (that |
I think it's a little tricky to get this to behave right. The way the runtime works is that it invokes mpiexec, which invokes n copies of the Tcl interpreter, each running the .tic file. In MPI each process has a rank from 0 to n-1. There's a setup stage where everything gets initialized and everyone agrees on which rank is doing what. Rank 0 is always a worker and rank n-1 is always an ADLB server, but in-between ranks can be allocated in different ways.
After the setup stage, the ADLB servers start running their server loops, the rank 0 worker starts running swift:main and the other workers wait to receive a task. The ADLB servers will stay in the server loop until the whole run shuts down, and the workers will sit in their own loops executing a task (i.e. a fragment of Tcl code that calls a function with some arguments), then requesting another task, and so on. So yeah, workers are the only things that can execute Tcl code.
I think one thing you need to do is make sure that the REPL code is executed on rank 0. I believe MPI generally routes STDIN to rank 0 (http://www.mcs.anl.gov/research/projects/mpi/mpich1-old/docs/mpichman-chp4mpd/node13.htm). So if your REPL executes on, say, rank 1, it will never get any input. I suspect this probably has something to do with your O0 problems.
There might also be some trickiness with evaluating the modified .tic code - you probably want to evaluate it in the global scope in all functions. There are ways in Tcl to control the scope in which code is evaled. I believe you want something like
uplevel #0 $the_command
Each worker should be able to execute any Turbine code. They're symmetrical aside from the fact that rank 0 starts execution and that MPI sends input to rank 0. The exception is if you have special executors (like Coasters) which are only on certain worker ranks. |
Well, the thing is, my REPL does get user input, and it compiles it in STC just fine. But when it |
Ohhhhh, I think I get it! Since |
Well, I'm completely stumped. There is no way to make the other ranks call create_globals with the same data. And we can't remove the collective nature of create_globals, since all the ranks must have the same globals (that's the reason for the collective call, right?) |
Oh right... yeah create_globals as-is won't work. I think there are multiple possible solutions, might require some thought about which is best:
|
I think you could probably do something with a pair of functions like:
|
(phew) I thought there was no way around this problem... I think I generally understand where to go from here. But a few questions: 1- if we can create "global" variables using I'm probably going to have tons of questions about how to code certain parts of these, but I'll ask them as I get there. Thanks for all the help! :) |
All good questions.
To create an int with default properties (1r, 1w, not permanent), the args would be: If you change the last arg to 1 it will be permanent.
a) how IDs are allocated (collectively between all workers versus created on-demand). The negative ID space is allocated from -1 downwards whenever create_global is called by all ranks. The positive ID space is divided between servers round-robin and then each server can allocate its next ID to a caller without any coordination with other servers. |
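The two ID spaces described in (a) can be pictured with the sketch below. It is an illustration only: GlobalIdSpace and ServerIdAllocator are hypothetical names, and the exact striding (server i of n handing out i+1, i+1+n, i+1+2n, ...) is an assumption about what "divided between servers round-robin" means, not something confirmed by the ADLB source.

```java
// Negative ID space: one shared counter that every rank advances in lockstep,
// because the creating call is collective.
final class GlobalIdSpace {
    private long next = -1;

    // Returns -1, then -2, then -3, ...
    long allocate() {
        return next--;
    }
}

// Positive ID space: each server owns a disjoint stride of IDs, so it can
// hand out its next ID without talking to the other servers.
final class ServerIdAllocator {
    private final int serverCount;
    private long next;

    ServerIdAllocator(int serverIndex, int serverCount) {
        this.serverCount = serverCount;
        this.next = serverIndex + 1L; // assumed: positive IDs start at 1
    }

    long allocate() {
        long id = next;
        next += serverCount;
        return id;
    }
}
```

Because the strides never overlap, two servers can allocate concurrently and still never produce the same positive ID, while negative IDs stay identical on every rank only as long as all ranks make the same sequence of collective calls.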
Looking at the generated .tic code, it seems the args used in multicreate are different: Upon taking a closer look at |
I can't figure out how to get STC to add the permanent arg in multicreate. It's in |
not sure if it's a bug, but it seems that even There's no permanent flag set. Is this a bug? (probably not since everything works) |
Yeah it isn't set in the generated code, but in the C implementation of create_globals, the default value for permanent is set to true. https://github.com/swift-lang/swift-t/blob/master/turbine/code/src/tcl/adlb/tcl-adlb.c#L1509 |
Just saw the earlier comment. It looks like you may need to add an additional permanent argument here: I would probably have it allowed to be null and only add the extra arg on if permanent is non-null. This is just to avoid increasing the size of output code with extra parameters when not needed. That part of the code generator is a bit messy - we intend to keep the syntactic details in Turbine.java and the higher-level logic in TurbineGenerator.java but it's not separated well here. |
There's no need to pass the globals around explicitly as part of the task. The global declarations are in the tic anyways. I'm thinking something like this:
|
I guess we'll just assume that the user won't interactively redefine any functions or globals while a task (using those definitions) is running on a worker. Also, we can rename |
Would it make sense to append a special prefix to functions generated from the REPL so that they're unique? IIRC the Tcl function name generation is done in a single spot. I think the background tasks idea is cool but probably quite a bit of work beyond this problem. I think sending the tasks out maximum priority is a good idea regardless. If you did that and omitted the barrier I think it would work ok in 99.99+% of cases since in the current implementation max-priority targeted tasks will be at the head of the targeted worker's queue as soon as the task creation returns to the caller. We don't guarantee this in the API. |
For the issue with the permanent arg, I'll work on that in a separate issue #67 |
the unique prefix sounds like a good and easy solution. I was under the impression that the main purpose of a REPL or interactive Swift/T was so that a user could interactively issue multiple tasks and have them run in the background as they monitor and probe the variable states etc. I guess we'll find out soon enough whether it's feasible or not. Regardless, I'll be working towards that until then. For the max priority task idea: max priority tasks (even if at the head of the queue) cannot stop what the worker is doing, right? Is there a way to notify the worker to stop the current task and move it to the back of the queue (or does it periodically poll the servers for such information)? |
If not, then I guess we could just do away with the barrier, like you said. 99.99% sounds good enough for me (and for whatever basic use cases that apply to interactive Swift scripting). |
You're right, having lots of background tasks would be very useful. Let's see how this solution works out, I think there are other possibilities too if that doesn't work out. There's no way to interrupt a worker, no, since the task is just arbitrary code. The standard worker loop is here, actually: https://github.com/swift-lang/swift-t/blob/master/turbine/code/src/turbine/worker.c#L50 - it just alternates between ADLB_Get and Tcl_Eval. |
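A rough model of why a max-priority task runs ahead of queued work but cannot interrupt a running task is sketched below. Everything here is hypothetical (the Task and WorkerQueue classes, the integer priorities, the task labels); the real worker loop is the C code linked above, and this sketch only mimics its "get a task, evaluate it to completion, repeat" shape with an in-memory queue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical task record: higher priority runs first, ties run in FIFO order.
final class Task {
    final int priority;
    final long seq;
    final String label;

    Task(int priority, long seq, String label) {
        this.priority = priority;
        this.seq = seq;
        this.label = label;
    }
}

final class WorkerQueue {
    private final PriorityQueue<Task> queue = new PriorityQueue<>(
        Comparator.comparingInt((Task t) -> t.priority).reversed()
                  .thenComparingLong(t -> t.seq));
    private long nextSeq = 0;

    void submit(int priority, String label) {
        queue.add(new Task(priority, nextSeq++, label));
    }

    // One iteration of the loop: take the best queued task and "run" it to
    // completion. Nothing submitted meanwhile can preempt it.
    String runNext() {
        Task task = queue.poll();
        return task == null ? null : task.label;
    }

    // Runs everything that is queued and reports the execution order.
    List<String> drain() {
        List<String> order = new ArrayList<>();
        String label;
        while ((label = runNext()) != null) {
            order.add(label);
        }
        return order;
    }
}
```

If a worker has already taken a task when the max-priority "load new code" task arrives, that task finishes first; the new task then runs before anything else still in the queue, which is the behaviour the no-barrier approach relies on.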
It works! The REPL can actually run swift code that spawns tasks on other workers! :D Here's a copy-paste of a shell session: But I messed up on the global variables thing. I misunderstood how it works, and now it creates a copy of the variables in each worker (and you end up with multiple copies). I think I'll make a modified version of |
Awesome work :) |
Also, the second link I posted above shows some errors related to unfilled subscribes (also a bunch of nulls where the variables are supposed to be in the work queue). Any idea what that unfilled subscribes could be related to? |
The nulls are the usual behaviour when there isn't a debug symbol for the variable. It would be nice to have cleaner output for that case but that's how it works right now. The //'s are used to indicate which variable the task was waiting on, e.g. this one is waiting on ID <20>. The unfilled subscribe messages are closely related: subscribes were made for <20> and <30> (either a task was dependent on those IDs, or someone manually called ADLB_Subscribe for some reason), but the write refcount of <20> and <30> never dropped to zero so a notification was never sent out and the subscribe was "unfilled".
So it looks like x, the third argument to printf, was never assigned. |
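The relationship between write refcounts and unfilled subscribes can be modelled in a few lines. This is a simplified sketch, not ADLB's implementation: TrackedVar and its methods are hypothetical names, and real ADLB notifications go through servers rather than local callbacks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a variable that notifies subscribers once its
// write refcount reaches zero (i.e. no writers remain).
final class TrackedVar {
    private int writeRefcount;
    private final List<Runnable> subscribers = new ArrayList<>();

    TrackedVar(int writeRefcount) {
        this.writeRefcount = writeRefcount;
    }

    // A task that depends on this variable registers a callback.
    void subscribe(Runnable onClosed) {
        if (writeRefcount == 0) {
            onClosed.run(); // already closed: notify immediately
        } else {
            subscribers.add(onClosed);
        }
    }

    // A writer finishing its assignment drops the write refcount.
    void decrementWriters() {
        if (writeRefcount == 0) {
            throw new IllegalStateException("write refcount is already zero");
        }
        writeRefcount--;
        if (writeRefcount == 0) {
            for (Runnable subscriber : subscribers) {
                subscriber.run();
            }
            subscribers.clear();
        }
    }

    // Subscribes still pending at shutdown correspond to "unfilled" ones.
    int unfilledSubscribes() {
        return subscribers.size();
    }
}
```

A variable that is created but never assigned keeps its write refcount above zero, so any task subscribed to it is still pending when the run ends, which matches the symptom reported for <20> and <30>.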
hmm... sounds like it's caused by my multicreate problem. I'll work on it next and see if it changes the results. |
I may have found a solution for being able to compile Swift scripts into "object" files and then "linking" them, at least for global variables in multiple scripts. The nice thing about global variables is that STC generates the same names for them in the tic code. For example, every script that declares a global called Example solution:
This seems to work fine in the REPL use case, as we can implicitly add all global variable declarations (using extern) to every line of user input in the REPL. For example (given the above solution):
The one thing I'm worried about is adding the |
I've also been thinking about complex use cases of
You can have arbitrarily complex scenarios where figuring out which script depends on which variable definitions is non-trivial. Should I altogether ignore this issue for now? |
Ignore deadlock - that kind of case would be problematic in Swift anyway. |
Are you looking at editing the ANTLR grammar to support the extern keyword? |
I assume I would have to, right? |
Yes. |
so, am I wrong in assuming that the ExM.g file contains the actual ANTLR grammar, while the ExMLexer and ExMParser are generated based on that grammar? Is there a toolset I need to install to do that generating? |
That's all correct. The existing Ant build script manages everything and allows for changes to the grammar. |
update:
|
STC can now compile However, I need to change STC to not check for undefined variables when it's an |
STC no longer gives an undefined var error when dealing with externs. But I realized the extern Tcl statements are generated in the wrong place. Instead of generating them in the global namespace in the tic file, they have to be generated inside |
I just fixed it to generate the extern declaration in |
Summary
It would be great if STC could compile separate Swift script fragments, sort of "link" them together, and run them. Example: f.swift contains the definition of function f(), while main.swift calls f() and prints the result to screen. We would like STC to be able to compile f.swift and main.swift separately into f.tic and main.tic (where f.tic doesn't have any Turbine startup or shutdown boilerplate code, since it's only a function definition), and then be able to run main.tic (which calls f() defined in f.tic).
Purpose
Potential Issues
__entry:if1). Does this also happen when STC optimization is on?
Current Tasks
-c flag so that it modifies the turbine code generating (essentially removing the boilerplate code)
-O0 no optimizations, as well.
adlb::create_globals won't work since it's a collective call.
extern keyword functionality to ANTLR grammar parser and STC code generator.
extern declared variables.