Flambda manual chapter #503

Merged
merged 1 commit into from Apr 1, 2016

8 participants
@mshinwell
Contributor

mshinwell commented Mar 11, 2016

User documentation about Flambda for the manual.
If you would like to read this without building the manual, there is a temporary link here:
https://ocaml.janestreet.com/ocaml-core/flambda_manual/

This doesn't constitute developer documentation. That is intended to be provided by the source code itself and the comments in that code. We will try to do some more improving of comments before the 4.03 release.

In this GPR there is a hack in fix_index.sh (so it would work for me) which I shall revert before merging this.

@mshinwell mshinwell added this to the 4.03.0 milestone Mar 11, 2016

\item[{\bf Arbitrary effects:}] All other expressions.
\end{options}
There is a single classification for coeffects:

bobot Mar 11, 2016

Contributor

Since there is an "Arbitrary effects" classification for effects, it is strange that there is only a single classification for coeffects. Could you add an "Arbitrary coeffects" classification?

mshinwell Mar 11, 2016

Contributor

There has not yet been a need, and the manual is following the code here.

algorithms in loops, it may be found that the optimiser entirely
elides the code being benchmarked. This behaviour can be prevented by
using the {\tt Sys.opaque\_identity} function (which indeed behaves as a
normal OCaml function and does not possess any ``magic'' semantics). The

bobot Mar 11, 2016

Contributor

"normal OCaml function" -> "normal OCaml identity function", just to highlight that the type is indeed 'a -> 'a.

mshinwell Mar 11, 2016

Contributor

I think I prefer the existing wording. The point being made is not that it's an identity function, but that it is specifically not "magic".
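
A minimal sketch of the benchmarking use that the quoted manual text describes. The function `sum_to` and the iteration counts are made up for illustration and are not taken from the manual; only `Sys.opaque_identity` (of type `'a -> 'a`) comes from the discussion above.

```ocaml
(* Hypothetical micro-benchmark: [sum_to] and the loop counts are
   illustrative names and values only. *)
let sum_to n =
  let acc = ref 0 in
  for i = 1 to n do
    acc := !acc + i
  done;
  !acc

let () =
  for _i = 1 to 1_000 do
    (* Passing the input and the result through [Sys.opaque_identity] is
       intended to stop the optimiser from treating [n] as a known constant
       or from discarding the otherwise unused result. *)
    let n = Sys.opaque_identity 100 in
    ignore (Sys.opaque_identity (sum_to n))
  done
```

The values themselves are unchanged by the wrapper; it only hides them from the optimiser.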

to add type annotations that claim some mutable value is always immediate
if it might be possible for an unsafe operation to update it to a boxed
value.

bobot Mar 11, 2016

Contributor

At OUPS, @chambart presented other unsafe operations that flambda (rightly!) breaks. In the end, my understanding is that if you want to use Obj.* functions to mutate an OCaml value (set_tag, set_field, truncate, ...), you should allocate it with make_block. Any other use is too dangerous. Perhaps something can be added about that.

mshinwell Mar 11, 2016

Contributor

Perhaps "write to any value" isn't clear enough. I will clarify that to encompass some of these other operations.
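
A rough sketch of the pattern bobot describes, under two assumptions: that "make_block" refers to the standard library's `Obj.new_block`, and that a block obtained this way is the kind of value he considers acceptable to mutate. The helper `make_pair` is hypothetical, and nothing here is a statement of what Flambda guarantees.

```ocaml
(* Illustration only: code using [Obj] is unsafe and is best avoided. *)
let make_pair (a : int) (b : int) : int * int =
  (* Allocate a fresh block with tag 0 and two fields via the runtime, rather
     than mutating a value the compiler built from a literal or constructor. *)
  let block = Obj.new_block 0 2 in
  Obj.set_field block 0 (Obj.repr a);
  Obj.set_field block 1 (Obj.repr b);
  (* Reinterpret the block as a pair only once it is fully initialised. *)
  (Obj.obj block : int * int)
```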

bobot Mar 11, 2016

Contributor

This chapter is very interesting to read, thank you!

that the code size along the hot path is kept smaller, so as to increase
locality.
The inliner is directed using attributes.

Octachron Mar 11, 2016

Contributor

It might be nice to update the description of built-in attributes in exten.etex to list all attributes used by the inliner in order to have an exhaustive list of built-in attributes in one place.

mshinwell Mar 11, 2016

Contributor

I've added that to the list of things to do.

produces
smaller {\tt .cmx} files, shorter compilation times and code that probably
runs rather slower. When using {\tt -Oclassic}, only the following options
described in this section are relevant: {\tt -inlining-report} and

edwintorok Mar 11, 2016

Contributor

Do I get an error if I try to mix -Oclassic with non-relevant inlining options?

mshinwell Mar 11, 2016

Contributor

Probably not, unfortunately. I've added a note in the code to fix this, but it probably won't be for 4.03.

As such, $n$ behaves as the ``maximum depth of unrolling''.
\end{options}
A compiler warning will be emitted if it was found impossible to obey an

edwintorok Mar 11, 2016

Contributor

Should I expect to see warnings like this when using the stdlib, i.e. are any of the functions in the stdlib annotated with these?

mshinwell Mar 11, 2016

Contributor

I don't think there are any such annotations. When there are such annotations I don't think they should be producing the warning in any case.

edwintorok Mar 11, 2016

Contributor

Impressive amount of work (both flambda and this manual).
There are some things that are unclear, perhaps these could be briefly mentioned in the manual as well?

  • What happens with bounds checks?
    Does simplification eliminate them when it is statically known that they are never exceeded?
    And would this only work for constants or symbolic expressions as well?
    E.g. if I have a loop from 0 to String.length s - 1 and access each character in the string, will Flambda be able to remove the bounds check?
  • Can Flambda use assert false, or assert cond as hints for its optimizations?
  • Would dead code elimination allow Flambda to skip optimizing functions that are not used inside modules?
    E.g. if a module is created by applying functors, not all of its functions are used, and the module is not visible outside of the .ml file.
  • Is it possible to defer Flambda to link time and avoid generating native code (cf. GCC/Clang -flto)?
    I would guess no, although considering that "inlining of functions from previously-compiled units will subject their code to the optimisation parameters of the unit currently being compiled, rather than
    those specified when they were previously compiled." could I tell Flambda to recursively recompile all functions used based on their Flambda form in the .cmx at link time?
    Of course there is a tradeoff here: optimizing/generating code at link time is not easily parallelizable, OTOH it can avoid optimizing/generating code for functions/modules that are not used at all.
Consult the {\em Glossary} at the end of this chapter for definitions of
technical terms used below.
\section{Command-line flags}

edwintorok Mar 11, 2016

Contributor

Do these flags only have effect at compile time, or should they be used at link time too?

mshinwell Mar 11, 2016

Contributor

Theoretically they have an effect at link time because they apply to the code in the startup file. However in practice that's probably not relevant.

mshinwell Mar 11, 2016

Contributor

The last two questions are in fact related, so let's address those first. There isn't any support at the moment for saying things like "function foo isn't exposed in the interface of this module and isn't used inside, so we'll delete it before optimisation". It's possible something like that could be thought about in the future though.

I have plans for a whole-program mode which will recompile at link time based on the intermediate representations in the .cmx files (with all .cmx files being required). We could have a flag to suppress normal object file generation when planning to link in that mode. The main advantage this will give is that dead code elimination will happen at the Flambda stage basically automatically. Proposals for other means of large-scale dead code elimination are highly involved.

mshinwell Mar 11, 2016

Contributor

I think the answer about bounds checks is that they won't be eliminated statically since they are expanded after Flambda. (I'm not completely sure about this, but I think it is the case.)

There isn't any special handling based on "assert".
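
To make the bounds-check question concrete, here is a small sketch of the loop edwintorok describes. The function names are hypothetical. Per the (tentative) answer above, the checked version would be expected to keep its per-access check under Flambda; the second version shows the manual alternative, where the programmer, not the compiler, is responsible for the index being in range.

```ocaml
(* [s.[i]] is bounds-checked on every access unless the program is compiled
   with -unsafe. *)
let count_spaces_checked (s : string) : int =
  let n = ref 0 in
  for i = 0 to String.length s - 1 do
    if s.[i] = ' ' then incr n
  done;
  !n

(* [String.unsafe_get] performs no check; it is only correct here because the
   loop bounds keep [i] within the string. *)
let count_spaces_unchecked (s : string) : int =
  let n = ref 0 in
  for i = 0 to String.length s - 1 do
    if String.unsafe_get s i = ' ' then incr n
  done;
  !n
```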

lindig Mar 11, 2016

I agree with @edwintorok, this is impressive. I think adding a section with (automatically generated) benchmark numbers (optimisation, code size, performance) would provide valuable context as it is difficult to imagine the effect of different optimisations.

mshinwell Mar 11, 2016

Contributor

The problem is that it so heavily depends on what the code actually is that generalised benchmark numbers may be misleading. I think we should perhaps leave something like that until 4.04, when we will have received more information on general use cases in the wild.

mshinwell Mar 11, 2016

Contributor

Suggestions received from William via email, which I shall look at:

"20.1 :
please give the command or procedure that tells whether the ocaml we have (through an already existing installation, for example debian) has flambda activated or not

"Detailed descriptions of each flag are given in the sections that follow." : I would replace by : "Terms and concepts used here are better described in the following sections".
Also, I don't know if that would make the reading worse or not, but I would replace each defined term like "call site", "specialised", "inline" by a pointer to the glossary. Similarly, you should reference for each options the related chapters (the page is big, and it is not easy to have a full view of what is going to be explained, and we might also miss some chapters related to an option).

20.2 :
-remove-unused-arguments
"Remove unused function arguments even when the argument is not specialised." :
do you mean "remove unused arguments that are not specialised, in addition to unused arguments that are specialised"?
If so, what is the trade-off behind this option? Why is it not activated by default?

-unbox-closures
add pointers to the glossary and to the related chapters (20.9.3).

"advanced options" : do you mean they are even less used than the "less commonly-used options"?

"The set of command line flags relating to optimisation should typically be specified to be the same across an entire project." Why not give a typical ocamlbuild command? Then again, I see you are from Jane Street, so you would not do that as you use other tools ...

"Flambda-specific flags are silently accepted " : as it is silent only for benchmarking purposes, why not emit warnings by default, and add an option to silence them when benchmarking?

"-rounds" : refer to the related chapter 20.2.1

"If the first form is used, the value will apply to all rounds. If the second form is used, zero-based round integers specify values which are to be used only for those rounds." : what are the first and second forms? Maybe reformulating this with examples would be clearer.

Glossary
"call site" : I guess it is the point where the function is used, rather than the point where the function is defined
"closure" : This is really hard to understand. Maybe more explanations, or use of less technical terms ?
"specialised argument" : you should refer to the "specialised" chapter/section."

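On the "first form / second form" question, the sketch below models the semantics the quoted sentence describes. It assumes the first form is a bare integer (for example `50`) and the second a comma-separated list of `round=value` pairs (for example `0=20,2=40`); that syntax is not shown in this thread. The type and function names are hypothetical, rounds not listed are assumed to fall back to a default, and this is not the compiler's own argument parser.

```ocaml
(* Hypothetical model of a per-round optimisation parameter. *)
type spec =
  | All_rounds of int               (* first form: one value for every round *)
  | Per_round of (int * int) list   (* second form: (round, value) pairs *)

let parse_spec (s : string) : spec =
  match int_of_string_opt s with
  | Some n -> All_rounds n
  | None ->
    let parse_pair p =
      match String.split_on_char '=' p with
      | [round; value] -> (int_of_string round, int_of_string value)
      | _ -> invalid_arg ("bad round specification: " ^ p)
    in
    Per_round (List.map parse_pair (String.split_on_char ',' s))

(* Rounds are zero-based; a round absent from the second form is assumed to
   use [default]. *)
let value_for_round ~(default : int) (spec : spec) (round : int) : int =
  match spec with
  | All_rounds n -> n
  | Per_round pairs ->
    (match List.assoc_opt round pairs with
     | Some n -> n
     | None -> default)
```

With these definitions, `parse_spec "50"` yields the same value for every round, whereas `parse_spec "0=20,2=40"` yields 20 for round 0, 40 for round 2, and the default for round 1.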

that developers experiment to determine whether the option is beneficial
for their code. (It is expected that in the future it will be possible
for the performance degradation to be removed.)

bluddy Mar 11, 2016

Question: any idea why this regression happens? This should be a very beneficial optimization.

mshinwell Mar 14, 2016

Contributor

This is a source of great frustration. We don't fully know the answer to this question yet. I have suspected for a while that higher register pressure may be something to do with it. One thing might be that, since there are no callee-save registers, it means that the extra arguments added by this pass have to be spilled and reloaded across any calls in the function. This (plus other shuffling of registers if pressure is high) may on average be worse than fishing the values out from a single closure argument. We need to do some more experiments.

Another problem, which we have a concrete example of, relates to the Flambda representation not handling well the case where a closure is constant but some of the arguments are specialised (i.e. known to have the same value as some other variable in scope). The specialisation prevents them being lifted and assigned to symbols (references to which are direct; they are not captured by closures) until Cmmgen. References to such closures are transformed by unbox-closures into extra arguments, which can kind of cascade if there is a series of such functions each referencing the previous one(s). In fact, all such arguments are unnecessary, since the accesses should be via symbols. We have some ideas as to how to fix this, but it's not entirely straightforward.
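
As a source-level analogy only (the real transformation happens on the Flambda intermediate representation, and the function names here are invented), the difference between reading values out of a closure and receiving them as extra arguments can be pictured like this:

```ocaml
(* Closure form: [a] and [b] are free variables of [go], so [go] reads them
   from its closure each time it runs. *)
let sum_scaled a b xs =
  let rec go acc = function
    | [] -> acc
    | x :: rest -> go (acc + (a * x) + b) rest
  in
  go 0 xs

(* Hand-written analogue of the unboxed form: the former free variables are
   passed as ordinary extra arguments, so [go] has no free variables, but
   every call now carries two more arguments. *)
let sum_scaled_unboxed a b xs =
  let rec go a b acc = function
    | [] -> acc
    | x :: rest -> go a b (acc + (a * x) + b) rest
  in
  go a b 0 xs
```

The second version is where the register-pressure concern described above would arise: the extra arguments occupy registers and, without callee-save registers, may need to be spilled and reloaded around calls.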

bluddy Mar 14, 2016

If it's a register pressure issue, it should be much worse on 32 bit OCaml than on 64 bit, right? Do you observe this?

Interesting. While I have your attention: since we're already specializing by value, do you think it will be possible to extend Flambda to specialize by type, i.e. stamp out functions that handle more specific, unboxed types (so long as they don't trip up the GC, of course)?

mshinwell Mar 30, 2016

Contributor

We haven't checked this on 32-bit yet.
Type specialisation has been covered in the AMA pull request, I think.

@Drup Drup referenced this pull request Mar 19, 2016

Closed

AMA about Flambda #517

Drup Mar 19, 2016

Contributor

Remarks:

  • I'm not really fond of how the description of options is at the top, disconnected from the section that actually explains what they do. When you read the explanation, you have to backtrack to see the name of the options. I really think you should at least add many forward/back references.
  • Description of options should say what they take as arguments. I had to look at the default to know those were mostly integers.
  • In one of the code examples:
let bar x =
    x * 3
  [@@inline never]

can also be written :

let[@inline never] bar x =
    x * 3

Same thing for the module. I'll let you decide if it's better or not. :)

  • It looks like flambda can detect if the argument of a function is not used at all in the compilation unit. Would this trigger a warning if the function is not exported ? It sounds like a useful warning.
  • About non-escaping references, would it be useful to have an attribute for that, [@noescape], that would produce an error if it's not respected ? (It sounds useful, but you may know better).
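
For reference, the two attribute placements Drup mentions can be put side by side in a compilable form. The second function is renamed `bar'` here only so that both definitions can coexist in one file; that name is not from the manual.

```ocaml
(* Attribute written after the definition, as in the manual's example. *)
let bar x =
  x * 3
  [@@inline never]

(* The same attribute placed directly after [let]. *)
let[@inline never] bar' x =
  x * 3

(* Both definitions compute the same function; the attribute only affects
   the inliner's decisions. *)
let () = assert (bar 14 = bar' 14)
```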

@mshinwell mshinwell merged commit bbb1f5e into ocaml:4.03 Apr 1, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed