Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do we really need objects in ocamlopt ? #11034

Closed
wants to merge 2 commits into from

Conversation

lthls
Copy link
Contributor

@lthls lthls commented Feb 19, 2022

(Note: this is a PR for discussion, not reviewing)

Context

I tried earlier this week to compile the Flambda 2 compiler with aggressive inlining (to try to find bugs in our code), and found a bug with the compilation of objects (long story short, not enough opaque identities), which resulted in all of the compiler building but ocamlopt.opt failing on startup.
This made me think that maybe we don't really need to have objects in the compiler's code, and today I took some time to see what would happen if we removed them.

The offending code

The native compiler currently uses objects (and classes) in the native backend to allow a generic implementation that performs all the work that is similar between architectures, and each architecture only needs to provide implementations for the non-generic parts (plus some architecture specific tweaks to the generic parts).
Concretely, the Selectgen, Reloadgen and CSEgen modules provides the generic implementations as classes, with virtual methods for the parts that cannot be generic, and each architecture implements Selection, Reload and CSE by inheriting from the generic class.

What I propose

I'm proposing to replace the generic classes with records of functions, each function taking as first argument such a record (classical implementation of open recursion, like what we have for the AST iterators for example). In addition, since the classes in Selectgen and Reloadgen contain mutable instance variables not exposed in the interface, this will be reflected by a mutable field whose type will be exported as abstract (there are other alternatives that would work too, but this one looked like a minimal amount of code changes).

The consequences

Not having objects in the compiler means that we're not potentially generating wrong code because we messed up some tricky part of the compilation of objects. But on the other hand, sometimes it helps to have our own code break if we mess up, as it makes it less likely that a bug will go unfixed (or even unnoticed) for a long period of time.

The performance impact is probably irrelevant here. The object-free code is going to be obviously much faster than the object code, but it's not in a part of the compiler that is usually a bottleneck so I doubt we will notice any significant difference.
As an anecdote, when running ocamlopt.opt on typing/typecore.ml, it takes around 1.4s to compile, with 0.017s spent in Selection (according to -dprofile). The same command without this patch spends 0.024s in Selection.

What this PR contains

So far, I've patched only Selectgen and amd64/selection.ml (this still passes the Github CI apparently, maybe we should restore check_all_arches). I think it is enough to get an idea of what the code would look like if we decide to go this way.
I'm expecting to keep this PR around as a draft until we reach a decision, at which point I'll either close it or write the remaining code.

Comments welcome!

method insert_op env op rs rd =
self#insert_op_debug env op Debuginfo.none rs rd
let default_insert_op self env op rs rd =
self.insert_op_debug self env op Debuginfo.none rs rd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the fact that some operations are called like insert_debug self env ..., while others are called like self.insert_op_debug self env .... In the object version you don't have to know whether a method is meant for overriding or not when you call it, and this makes the code more regular. (It also makes it easier to decide to override more functions later, etc.)

I may be missing something, but I think that you could either:

  1. define let <foo> self = self.<foo> self for all fields <foo>, and consistently use <foo> self
  2. or consistently put all self-taking functions in the record, and consistently use self.<foo> self

((1) works well with your choice of defining let default_<foo> self = ... functions for functions meant to be overriden.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was making a distinction in my mind between functions that can be overrided and functions that can't. The default_* naming scheme was actually so that I never called an overridable function directly by mistake.
But I agree that your suggestion (1) makes sense, I'll update the code to use that.

I think that putting all self-taking functions is the record would be a shame. In particular for the recursive functions, going through the record would turn direct calls into indirect calls, which has a sizeable impact on performance. If I get enough complaints that it makes it harder to assess whether the code has changed or not, I might reluctantly go back to a more mechanical transformation. But note that some of these functions were only exported because of technical details (see comments in selectgen.mli), so this looked like a good opportunity to fix that.

@nojb
Copy link
Contributor

nojb commented Feb 20, 2022

Personally I'm not in favor of this change (but won't object merging it if there is a consensus to do so), mostly due to

But on the other hand, sometimes it helps to have our own code break if we mess up, as it makes it less likely that a bug will go unfixed (or even unnoticed) for a long period of time.

We already make sure to avoid objects (and some other advanced features if I remember correctly) inside the bootstrap loop (ie ocamlc), but otherwise I think that keeping objects in the compiler is good as a way to test them in a real-life example (and note that we don't any "fancy" features of objects in the compiler, just the basic stuff).

@lthls
Copy link
Contributor Author

lthls commented Feb 20, 2022

Thanks for your comment. I'd just like to nitpick on the last detail:

(and note that we don't any "fancy" features of objects in the compiler, just the basic stuff).

This is a bit of a stretch. This code uses classes and inheritance, private methods, virtual methods, functional object copying... The only "fancy" feature I'm aware of that we don't use is multiple inheritance.

@nojb
Copy link
Contributor

nojb commented Feb 20, 2022

This is a bit of a stretch. This code uses classes and inheritance, private methods, virtual methods, functional object copying... The only "fancy" feature I'm aware of that we don't use is multiple inheritance.

Point taken.

@xavierleroy
Copy link
Contributor

You could question why the ocamlopt back-end is programmed with late binding, overriding, and open recursion. I would say mostly to handle i386-specific horrors, and once we get rid of i386 we could consider a simpler structure for the back-end.

But if we agree we need late binding, overriding, and open recursion, it would be silly not to use the OCaml language features that precisely support these, namely the class language. What's the point of rolling up your own little objects when the language has them?

Because of this, and because our time right now would be better spent on fixing everything that needs fixing in OCaml 5.0, I'm going to close this PR. No hard feelings, just a reminder that we have other things to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants