-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dearging pass on lambda box #29
Conversation
fix fragmented sentence
This adds a pass that analyzes and removes unused function parameters and unused data type parameters on lambda box with an accompanying correctness proof. A few things to note: * The analysis is not currently proven correct, but the spec required from the analysis is boolean so checking it is easy. However, in the future I want to eventually prove it correct. * The correctness statement requires the program to be eta expanded. The pass should still be correct even when it is not, as it will eta expand on its own, but there is not formal statement proven about this situation.
Hmm, is the CI working? The error does not seem related to any of my additions, AFAICT. |
I don't think Travis is properly set up, and I didn't start configuring GitHub Actions yet. The issue is that we don't directly rely on So atm the PR merge process is that somebody (mostly Zoe) compiles on her machine to verify that everything indeed works. I can try to set up a basic GitHub actions CI this afternoon based on submodules for and opam for Coq. |
Hi @jakobbotsch, thanks for the PR! I will have a deeper look sometime in the next days to see what's the effect on the final program. BTW I changed the base branch to dearg for now. Indeed, CI was not set up correctly and we are using the sophisticated method that Yannick described :). Yannick just set up github actions and I disabled Travis for now. |
Hi @jakobbotsch, I was just having a look at this (I’m sorry it took so long) and here’s what I dicovered: I was looking at the sha_fast benchmark from lib/sha256.v, for which the generated file is smaller when your transformation is enabled.
I also want to understand why your transformation appears to increase the size of the generated code for the VeriStar benchmark (vs_easy and vs_hard). By the way, in order to compare the diffs, I looked at λANF code (which can be pretty printed with the command CertiCoq Show IR), as the generated C files are very large and it’s very hard to find our what is going on. I changed (locally) the show_IR function in pipeline.v so that it calls your dearg phase. |
So I fixed the bug in our dead_param_elim and 1 and 3 do not apply anymore, since CertiCoq removes all non-relevant arguments from functions. Overall dearg is beneficial as it removes propositional arguments of constructors (e.g. Mkint in the sha_fast bechmark, Mkt and PureClause in vs_easy). In some cases, it prevents some trivial free variables from being allocated in closure environments resulting in smaller closures. Some functions seem to be inlined when dearg is on. There are some functions seem to only appear in dearg’ed code, most likely because of eta-expansions done by dearg. That might also (partially) explain the larger files sizes in some cases. There’s something else that I don’t fully understand. In vs_easy/vs_hard with dearg on, the id function ends up in a lot of closure environments (although functions without free variables should never end up in closure environments). Most likely it comes from some partially applied higher-order function that gets inlined later. The best way to figure out what is going on is to print λANF code before inlining/closure conversion/hoisting but currently the flag that does this conflicts with the dearg flag (both are -dev X) so I didn’t try that yet. For reference here’s the sizes of affected files. vs_easy w/ dearg 55641 vs_easy w/o dearg 54971 diff 670 (I’m comparing only the default CertiCoq configuration which produces the most optimized one.) |
Great, thanks for the analysis @zoep! It does indeed seems like there are limited benefits on the final code compared to your default pass. I suppose the main benefit then is that our pass can live earlier in the pipeline and that it manages to do something on constructors too. But I'm not sure if integration is worth it for this, that's up to you.
I imagine this is because the type parameters are inside the fix point (instead of outside)? If so, indeed our pass does not touch fixpoints.
That makes sense. It is not always easy to know when to remove some dead parameters, eg. it might be beneficial to avoid removing a dead argument if the function is often partially applied where an eta expansion would be required. Anyway, thanks again for the analysis! |
Hi @jakobbotsch, No problem, it was fun investigating! Thanks for the work in adding dearg to the pipeline. The benefits are that 1.) dearg is verified (unlike our constructor parameter elimination -- afaik there are no plans to prove it correct in the near future) and 2.) it also removes propositional arguments of constructors. I think that both of these are useful features. The reasons why I would't suggest integrating at this point are:
|
Hi Zoe, thanks for the clarifications. Just to answer your questions:
This is indeed true for the proof, although the implementation does do eta expansion on the fly if this is necessary. We deal with this using a certifying eta expansion pass on the template AST, and there was some talk with the MetaCoq folks about trying to verify an eta expansion pass on the AST, although I don't think this is happening yet.
Indeed, the analysis as it is implemented in this PR uses only the body of functions to determine whether a parameter can be removed. This means that if the function is partially applied somewhere it will potentially need eta expansion if the missing arguments are unused in the function. The analysis could relatively easily be updated to take this into account. In any case I understand your reasons for not integrating it at this point. I think that the unverified eta-expansion is the biggest blocker currently since the analysis would not be hard to improve. For now I will close this and then we can revisit it if/when we get an eta-expansion pass that is proven correct in the deep embedding. |
This adds a pass that analyzes and removes unused function parameters
and unused data type parameters on lambda box with an accompanying
correctness proof. A few things to note:
from the analysis is boolean so checking it is easy. However, in the
future I want to eventually prove it correct.
pass should still be correct even when it is not, as it will eta
expand on its own, but there is not formal statement proven about this
situation.
We integrated it in the makefiles/pipeline in a quick and dirty way to be able to test things which should probably be improved upon. If you have any feedback on how you'd like this to be done then please let me know. There are also some scripts that we should probably delete. Feel free to make any changes to the PR that you'd like.
cc @zoep