Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid killing the register allocator with the initialization functions in flambda #1401

Merged
merged 2 commits into from Jan 21, 2018

Conversation

Projects
None yet
7 participants
@chambart
Copy link
Contributor

chambart commented Oct 5, 2017

With flambda, large modules can generate large problems, for the register allocator, where every register is in conflict with every other one. The initialization code is a long sequence of independent assignments from one symbol field to one another. This does not generates any conflict in this form, but the problems appear when CSE shares the various symbol address loading.

This PR propose to sidestep the problem by seriously reducing the size of that code sequence. Usually, in a module, most of the fields are constants (mainly closures). Hence these fields do not requires to be initialized, but can be set to the right value from the beginning. In the stdlib for instance there are never more than 9 fields per module that needs to be initialized.

That should also reduce a bit the code size, but I didn't measure it.

(related mantis PR: https://caml.inria.fr/mantis/view.php?id=7630)

@gasche
Copy link
Member

gasche left a comment

I reviewed the code and it looks correct to me. I'm not knowledgeable enough to "approve", though, and would wait for someone with opinions about the backend -- and who knows something about possible issues with the proposed compilation strategy.

From an implementation point of view, there is a design choice to define a new type Clambda.uconst_block_field to characterize what precisely can be used in pre-initialized fields, and it is not clear to me what is gained compared to just using Clambda.data_item directly (one would think that maybe the code producer could be extended later to pre-initialize more kind of data items, and that the current design would then require to change the custom type). Are there some data_item values that it would be morally wrong to put in those fields?

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Oct 5, 2017

Well there isn't a Clambda.data_item type; So I suppose you are either refering to Clambda.uconstant / Clambda.ustructured_constant or Cmm.data_item.

the closest type is uconstant which allows to define a constant with the symbol. This is not something that is useful here and would require a bigger change to handle correctly.
The type ustructured_constant can't represent symbols.
and Cmm.data_item does not really works the same way. These are basically assembly directives lifted to cmm and some wouldn't have any meaning in this context.

@lpw25

lpw25 approved these changes Oct 5, 2017

Copy link
Contributor

lpw25 left a comment

This looks correct to me.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 5, 2017

You need a Changes entry (preferably acknowledging the input of the Coq people involved: Emilio Jesús Gallego Arias, Pierre-Marie Pédrot and Paul Steckler).

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 5, 2017

If I understand correctly this will go into OCaml 4.06 ?

[Asking because this is the last showstopper for us to recommend flambda usage in Coq 8.7]

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 5, 2017

@ejgallego, would you be able to test that Coq-with-flambda indeed works with this patch applied?

For this right now the easiest way may be to clone the 4.06 branch from the git repo and apply 1401.patch directly (and then use my opam-compiler-conf script, if you know about it). But I think in a few hours or tomorrow you should be able to install the 4.06.0+pr1041 switch from the opam-pr-repository repository overlay.

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 5, 2017

I will be able to test this on Monday; would that work for you?

[I usually defer this kind of testing for when I have access to my reasonably beefy desktop instead of my 5yo laptop, but I can try testing on the laptop if you need confirmation by tomorrow]

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 5, 2017

Monday sounds fine.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 5, 2017

(I have not confirmed about whether this will go into 4.06 as I'm not able to give any guarantees, but I think this is somewhat orthogonal to your test results.)

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Oct 5, 2017

Thanks for the quick review. Here is the Change. I'll wait for Coq's guys report to merge.

@gasche gasche added this to the 4.06.0 milestone Oct 5, 2017

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 9, 2017

I'm much afraid that I couldn't test this due to 4.06.0 being ahead of camlp5's copy: camlp5/camlp5#11 @roglo any ETA to regain compatibility?

By the way, testing with the opam-pr repos doesn't work as it doesn't offer +flambda switches.

@roglo

This comment has been minimized.

Copy link
Contributor

roglo commented Oct 9, 2017

New version 7.02 of camlp5 released, compatible with ocaml 4.06.

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 9, 2017

Great thanks!

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 9, 2017

So Coq builds fine with this patch and with 4.06, however the problematic file doesn't seem to improve at all. Have you folks tried with the example posted in Mantis?

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Oct 13, 2017

I didn't try the Coq generated file. I can try If you provides the context of how to build it.
Otherwise, the large file with lots of arrays is expected not to get any better with that. The declared values are built at run-time and can't be preallocated.

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 13, 2017

I didn't try the Coq generated file. I can try If you provides the context of how to build it.

You can just build Coq with a flambda compiler; that is usually straighforward, the problematic file is called OrdersEx.v. Otherwise, I can upload a self-contained file in a few days.

@gasche gasche modified the milestones: 4.06.0, 4.07-or-later Oct 13, 2017

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Oct 13, 2017

This PR does not solve the Coq issue, but it is still an improvement for the flambda backend that has been reviewed and approved. I think it would be natural to merge it in trunk.

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Oct 19, 2017

@ejgallego how do you dump the generated .ml files and how is it built ?

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 19, 2017

@chambart -debug passed to Coq should do the trick. You can use make VERBOSE=1 to build Coq until you hit the problematic file. Options passed to Ocaml follow the logic here

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 19, 2017

I had uploaded the bad file to a host, but it turned out to be a non-free one, sorry for that. A more stable hosted example is now at https://x80.org/tmp/bad.ml

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Oct 27, 2017

I did a few more tests. I realized that in that specific problematic file, -Oclassic does not behaves that well and a lot of the cases expected to be constants are not due to some missing dead code elimination. It works a lot better with the default situation (O1). It is still too much, ~half the definitions are still not constants and 4GB of memory is not sufficient to compile. My next suggestion would be to use linscan for the toplevel initialization function which is almost never performance critical.

@ejgallego

This comment has been minimized.

Copy link

ejgallego commented Oct 27, 2017

Thank you @chambart , Coq's case is a bit particular as when to recommend -linscan to native_compute users as it will depend on the depth of the used module structure, which may not be obvious at all. I guess we will avoid too much file-specific flags for the moment and just assume users have enough memory.

@mshinwell

This comment has been minimized.

Copy link
Contributor

mshinwell commented Dec 6, 2017

@chambart Let's decide whether to merge this now or to just close it. I think given it didn't solve the Coq problem, that we have a complete re-implementation of the language for statically-allocated values in Flambda 2.0, and that this GPR does not touch only Flambda then my inclination is to close without merging.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Dec 7, 2017

I'm fine with either decisions but I'll note that the PR was carefully reviewed by both @lpw25 and myself, so I think it would also be perfectly fine to include it in trunk -- which may be useful if Flambda 2.0 is not merged right away. (It does not solve Coq's issue, which are caused by unnatural code produced by Coq extraction, but it does solve an instance of this issue and speedup compilation time for programs written by humans which use flambda.)

@mshinwell

This comment has been minimized.

Copy link
Contributor

mshinwell commented Dec 7, 2017

@lpw25 agrees with you @gasche , so let's merge it. @chambart : could you rebase?

@chambart chambart force-pushed the chambart:partial_toplevel_constants branch from 0cd3e10 to faf246a Dec 7, 2017

@chambart chambart changed the base branch from 4.06 to trunk Dec 7, 2017

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Dec 7, 2017

Here it is on trunk.

@gasche gasche closed this Dec 8, 2017

@gasche gasche reopened this Dec 8, 2017

@damiendoligez

This comment has been minimized.

Copy link
Member

damiendoligez commented Jan 19, 2018

@gasche @chambart any reason why this wasn't merged last month?

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Jan 19, 2018

I forgot about this. I guess that we had a CI error at the time -- that would be consistent with the close-reopen action. @chambart needs to rebase and I think he can merge. (I think this is complementary with the disable-cse appraoch taken in the other PR.)

@chambart chambart force-pushed the chambart:partial_toplevel_constants branch from faf246a to 1088e1f Jan 21, 2018

@chambart

This comment has been minimized.

Copy link
Contributor Author

chambart commented Jan 21, 2018

Rebased, and the CI is happy. I merge.

@chambart chambart merged commit b4db864 into ocaml:trunk Jan 21, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.