partially remove polymorphic compare for constructor_tag #406

dwightguth · 2016-01-09T00:22:25Z

compare_val is quite slow and was accounting for a significant
fraction of the time spent compiling my program when profiled with
perf. Hopefully this will be modestly faster.

gasche · 2016-01-09T01:15:15Z

Did you observe any improvement in compilation time on your program after applying this change?

dwightguth · 2016-01-09T01:29:23Z

I didn't have time to test it yet because I would have to rebuild my entire application and it was already the end of the workday. But perf reported about 20% total time spent in compare_val of which over 95% was traced back to this call site. My experience typically has been that compare_val is always slower than an explicit comparison function. But I'll get you numbers on Monday.

Drup · 2016-01-09T01:34:25Z

typing/types.ml

+    | c -> c)
+  | Cstr_constant _, _ -> 1
+  | Cstr_block _, _ -> 1
+  | _ -> -1


That seems wrong: compare_tag (Cstr_constant _) (Cstr_block _) = compare_tag (Cstr_block _) (Cstr_constant _) = 1.
Also, while using subtraction as comparison is OK for those small integers, it's not really faster than the correct version.

You only need equal_tag anyway.

Oh, good catch. I've written these functions before, but I seem to have forgotten a case. Alright, I'll rewrite it as an equality function.
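For illustration, here is a minimal sketch of the antisymmetry problem and of an equality-only alternative. The type `tag` is a hypothetical stand-in for `Types.constructor_tag` (it uses `string` where the real type uses `Path.t`), and `compare_tag_buggy` is only loosely reconstructed from the visible diff fragment, so both are assumptions rather than the patch itself.

```ocaml
(* Hypothetical stand-in for Types.constructor_tag; the real type uses Path.t. *)
type tag =
  | Cstr_constant of int
  | Cstr_block of int
  | Cstr_extension of string * bool

(* Shape of the problem: both mixed cases return 1, so swapping the
   arguments does not flip the sign of the result. *)
let compare_tag_buggy t1 t2 =
  match t1, t2 with
  | Cstr_constant i1, Cstr_constant i2 -> Int.compare i1 i2
  | Cstr_block i1, Cstr_block i2 -> Int.compare i1 i2
  | Cstr_extension (p1, b1), Cstr_extension (p2, b2) ->
      (match String.compare p1 p2 with
       | 0 -> Bool.compare b1 b2
       | c -> c)
  | Cstr_constant _, _ -> 1
  | Cstr_block _, _ -> 1
  | _ -> -1

(* Equality is all the call site needs, and there is no ordering to get wrong. *)
let equal_tag t1 t2 =
  match t1, t2 with
  | Cstr_constant i1, Cstr_constant i2 -> Int.equal i1 i2
  | Cstr_block i1, Cstr_block i2 -> Int.equal i1 i2
  | Cstr_extension (p1, b1), Cstr_extension (p2, b2) ->
      String.equal p1 p2 && Bool.equal b1 b2
  | (Cstr_constant _ | Cstr_block _ | Cstr_extension _), _ -> false
```

The `Int` and `Bool` modules used here assume OCaml 4.08 or later; on older compilers, type-annotated `=` and `compare` on `int` and `bool` would serve the same purpose.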

let-def · 2016-01-09T14:30:47Z

It would be very interesting to know how much faster your build gets with this optimization.

If it is significant, maybe the native code generator should inline the first level for comparison of sum types. Does anybody know if this has been considered before?

murmour · 2016-01-09T15:17:28Z

But perf reported about 20% total time spent in compare_val of which over 95% was traced back to this call site. My experience typically has been that compare_val is always slower than an explicit comparison function.

And my guess is that the true bottleneck is the structural comparison of Path.t, not a couple of redundant branches (tag checks) that you've shaved off prematurely.

20% of whole compilation time is insane. It would be interesting to know more details about your particular program. Let me guess further: it is generated code that contains lots of nested modules?

dwightguth · 2016-01-11T15:16:26Z

I timed compilation of my program before and after this change. Compilation time dropped from roughly 600 ms to roughly 475 ms. This is a relatively small, automatically generated program that unmarshals a term from a string literal, performs match-intensive processing on it, prints a small amount of output, and then terminates. It does link against a rather complex runtime, but the compilation of that runtime is not included in these timings.

chambart · 2016-01-12T08:46:11Z

typing/types.ml

+  | Cstr_block i1, Cstr_block i2 -> i2 = i1
+  | Cstr_extension (path1, b1), Cstr_extension (path2, b2) -> path1 = path2 && b1 = b2
+  | _ -> false
+


This should probably be an exhaustive match to catch problems if the type is extended.

Sorry I'm not sure I quite follow what you mean. You mean you want it to exhaustively match the structure of Path.t? Or are you suggesting something else?

@chambart means that there should be no final | _ -> false case. The code should be such that, if we add a new constructor to the input datatype, we get an exhaustivity warning.

I personally structure equality tests as follows to ensure this property:

  | K1 p, K1 q -> ...
  | K1 _, _ | _, K1 _ -> false
  | K2 p, K2 q -> ...
  | K2 _, _ | _, K2 _ -> false
  ...
  | Kn p, Kn q -> ...

(The last case does not have a catch-all sibling because all constructors have been matched.)

I'm not sure whether this generates more or less efficient matching code, so maybe it would make sense to re-run your benchmark if you change this function.

An alternative to omitting the last case is to expand only the left column and group the non-matching constructors together:

  | K1 p, K1 q -> ...
  | K2 p, K2 q -> ...
  ...
  | Kn p, Kn q -> ...
  | (K1 _|K2 _|...|Kn _), _ -> false

This seems to result in slightly better lambda code, but I also find it easier to read.

I've addressed this and did not see any significant impact on performance. I suspect this is because the OCaml compiler already detects that the pattern is exhaustive with the code as it currently exists, and therefore does not need to actually perform the check once it jumps to this case.
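To make the two exhaustive styles discussed above concrete, here is a small self-contained sketch. The type `t` and both function names are hypothetical, chosen only for illustration; neither is part of the patch.

```ocaml
(* Hypothetical three-constructor type standing in for K1 ... Kn. *)
type t = K1 of int | K2 of string | K3

(* Style 1: each constructor is followed by its own mismatch case;
   the last constructor needs none. *)
let equal_pairwise a b =
  match a, b with
  | K1 p, K1 q -> Int.equal p q
  | K1 _, _ | _, K1 _ -> false
  | K2 p, K2 q -> String.equal p q
  | K2 _, _ | _, K2 _ -> false
  | K3, K3 -> true

(* Style 2: matching cases first, then one grouped mismatch case that
   enumerates the left-hand constructors explicitly. *)
let equal_grouped a b =
  match a, b with
  | K1 p, K1 q -> Int.equal p q
  | K2 p, K2 q -> String.equal p q
  | K3, K3 -> true
  | (K1 _ | K2 _ | K3), _ -> false
```

In both versions, adding a constructor `K4` to `t` should trigger the non-exhaustive match warning, because no wildcard covers the new constructor on the left-hand side. A final `| _ -> false` would silence that warning and silently treat two `K4` values as unequal.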

alainfrisch · 2016-01-15T13:19:05Z

If it is significant, maybe the native code generator should inline the first level for comparison of sum types.

Why only sum types? Records and tuples would also be useful. And one could go further and generate specialized comparison code for the known structure of the type (probably generating recursive functions for each required type node in each unit).
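As a rough picture of what "specialized comparison code for the known structure of the type" could look like if written by hand, here is a sketch. The types `point` and `shape` and both functions are hypothetical; this is not something the compiler generates today, only the kind of code such a feature might emit.

```ocaml
(* Hypothetical types: a record and a sum type built on it. *)
type point = { x : int; y : int }
type shape = Dot of point | Segment of point * point

(* One specialized function per type node; each compares fields in
   declaration order, as the polymorphic comparison does. *)
let compare_point a b =
  match Int.compare a.x b.x with
  | 0 -> Int.compare a.y b.y
  | c -> c

let compare_shape a b =
  match a, b with
  | Dot p, Dot q -> compare_point p q
  | Segment (p1, p2), Segment (q1, q2) ->
      (match compare_point p1 q1 with
       | 0 -> compare_point p2 q2
       | c -> c)
  (* Earlier-declared constructors sort first. *)
  | Dot _, Segment _ -> -1
  | Segment _, Dot _ -> 1
```

The point of such code is that every call is statically typed: there is no runtime inspection of block tags beyond the constructor dispatch, and no generic traversal as in compare_val.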

mshinwell · 2016-12-28T07:15:47Z

typing/types.ml

@@ -304,6 +304,12 @@ and constructor_tag =
  | Cstr_extension of Path.t * bool     (* Extension constructor
                                           true if a constant false if a block*)

+let equal_tag t1 t2 = match (t1, t2) with


Stylistic comment: "match" should be on the next line.

mshinwell · 2016-12-28T07:16:05Z

typing/types.ml

+let equal_tag t1 t2 = match (t1, t2) with
+  | Cstr_constant i1, Cstr_constant i2 -> i2 = i1
+  | Cstr_block i1, Cstr_block i2 -> i2 = i1
+  | Cstr_extension (path1, b1), Cstr_extension (path2, b2) -> path1 = path2 && b1 = b2


You should use Path.same here I think.

mshinwell · 2016-12-28T07:17:33Z

I think efforts to remove uses of polymorphic comparison are to be encouraged, so I move in favour of merging this patch, after a couple of comments have been addressed.

dwightguth · 2017-02-06T23:22:03Z

I believe this should address the remaining issues, but if you have more feedback let me know. I can also do a history cleanup but if you want me to do that I would prefer to wait until all other feedback is addressed so I'm not doing it multiple times.

Note that I also did some profiling on code recently and found that the time spent in ocamlc on one of my programs decreased from 3 minutes to 1 minute if I replaced the List.exists call in this PR with a Hashtbl. Would you like that included here or as a separate PR subsequent to the merging of this one?
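A minimal sketch of the kind of change described here, replacing a linear `List.exists` scan with a hash table lookup. The `tag` type is a hypothetical stand-in for `Types.constructor_tag` (with `string` in place of `Path.t`), and `TagTbl`, `mem_list`, `table_of_list`, and `mem_table` are illustrative names, not the code in the patch.

```ocaml
(* Hypothetical stand-in for Types.constructor_tag. *)
type tag =
  | Cstr_constant of int
  | Cstr_block of int
  | Cstr_unboxed
  | Cstr_extension of string * bool

let equal_tag t1 t2 =
  match t1, t2 with
  | Cstr_constant i1, Cstr_constant i2 -> Int.equal i1 i2
  | Cstr_block i1, Cstr_block i2 -> Int.equal i1 i2
  | Cstr_unboxed, Cstr_unboxed -> true
  | Cstr_extension (p1, b1), Cstr_extension (p2, b2) ->
      String.equal p1 p2 && Bool.equal b1 b2
  | (Cstr_constant _ | Cstr_block _ | Cstr_unboxed | Cstr_extension _), _ ->
      false

(* Hash table keyed on tags, using the explicit equality above. *)
module TagTbl = Hashtbl.Make (struct
  type t = tag
  let equal = equal_tag
  let hash = Hashtbl.hash
end)

(* Linear scan: cost grows with the number of tags on every query. *)
let mem_list tags t = List.exists (equal_tag t) tags

(* Build the table once; each later query is a single lookup. *)
let table_of_list tags =
  let tbl = TagTbl.create (List.length tags) in
  List.iter (fun t -> TagTbl.replace tbl t ()) tags;
  tbl

let mem_table tbl t = TagTbl.mem tbl t
```

One caveat under these assumptions: `Hashtbl.hash` is structural, so it is consistent with `equal_tag` here only because the stand-in type compares strings structurally. With the real `Path.t` and `Path.same`, the hash function would need to agree with whatever equality is used.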

bobzhang · 2017-02-07T01:22:36Z

typing/types.ml

+  | Cstr_extension (path1, b1), Cstr_extension (path2, b2) -> 
+      Path.same path1 path2 && b1 = b2
+  | (Cstr_constant _|Cstr_block _|Cstr_unboxed|Cstr_extension _), _ -> false
+


Actually I find this more readable; it should compile fast and generate efficient code:

let equal_tag t1 t2 =
  match t1 with
  | Cstr_constant i1 ->
      begin match t2 with
      | Cstr_constant i2 -> i1 = i2
      | _ -> false
      end
  | ..

Isn't it the compiler's job to make it fast and efficient?

Regarding readability and maintainability: you lose exhaustiveness checks this way.

dwightguth · 2017-02-07T02:17:31Z

I don't know what goes on under the hood of ocaml's optimizer obviously but I would actually expect the code as written now to be faster than matching each argument individually because the compiler won't be able to create jump instructions as effectively. Am I missing something?

mshinwell · 2017-02-21T16:13:35Z

@dwightguth You might as well put the Hashtbl change into this patch. Can you do a bit more benchmarking to make sure it isn't also a regression on average?

dwightguth · 2017-02-21T16:45:44Z

Yeah, I can do that. I don't have a lot of experience benchmarking compilers though, do you have any advice on what type of programs you'd like to see as part of the benchmark?

mshinwell · 2017-03-07T10:19:42Z

For this one maybe just check the time spent building the compiler distribution to make sure it hasn't degraded noticeably. I think it should be fine.

dwightguth · 2017-03-07T17:08:34Z

I ran the following command twice, once before bootstrapping the compiler and once after, and these are the results:

$ time sh -c './configure && make world && make bootstrap && make world.opt' # before bootstrapping
<snipped>
real	6m32.920s
user	5m50.812s
sys	0m16.232s

$ time sh -c './configure && make world && make bootstrap && make world.opt' # after bootstrapping
<snipped>
real	6m20.863s
user	5m46.392s
sys	0m15.668s

So it looks like if there is an effect on the average case performance, it is small but positive.

damiendoligez · 2017-03-29T07:16:09Z

Changes

@@ -378,7 +378,11 @@ Next version (4.05.0):
  (Alain Frisch, report by Anil Madhavapeddy)

 - PR#7443, GPR#990: spurious unused open warning with local open in patterns
-  (Florian Angeletti, report by Gabriel Scherer)
+  (Florian Angeletti, report by Gabriel Scherer


you removed a parenthesis by mistake

damiendoligez · 2017-03-29T07:18:26Z

Changes

+
+- GPR#406: remove polymorphic comparison for Types.constructor_tag in compiler
+  (Dwight Guth,
+   review by Gabriel Radanne, Pierre Chambart, Mark Shinwell)


Please respect the format and put it all on one line.

I think that multi-line credits are acceptable, and in particular may be used to respect the 80-columns rule. I'm not sure we care about which wrapping strategy is used.

I adjusted it to wrap only at 80 characters. If you want it all on one line even though it now would be longer than 80 characters on one line, let me know and I can do that as well.

damiendoligez · 2017-03-29T07:26:06Z

typing/types.ml

+  | Cstr_extension (path1, b1), Cstr_extension (path2, b2) -> 
+      Path.same path1 path2 && b1 = b2
+  | (Cstr_constant _|Cstr_block _|Cstr_unboxed|Cstr_extension _), _ -> false
+


For readability, I prefer the original version. I find @bobzhang's version harder to read.

mshinwell · 2017-04-28T14:31:05Z

Now that @damiendoligez has approved I'm pretty sure this is ok to merge, even though AppVeyor failed for some reason.

partially remove polymorphic compare for constructor_tag

e73e2ce

Drup reviewed Jan 9, 2016

Dwight Guth added 2 commits January 11, 2016 08:25

update to use equal instead of compare

e7b2edb

Merge remote-tracking branch 'origin/trunk' into compare_tag

0573a3f

chambart reviewed Jan 12, 2016

damiendoligez added this to the 4.04-or-later milestone Jan 12, 2016

use exhaustive match for equal_tag

6b5847f

dwightguth pushed a commit to runtimeverification/k that referenced this pull request Jan 20, 2016

add compiler patch (cf ocaml/ocaml#406)

162ae3d

dwightguth added a commit to runtimeverification/k that referenced this pull request Jan 20, 2016

Merge pull request #5 from runtimeverification/ocaml

04f9927

Merge branch 'trunk' into compare_tag

ee119f4

Conflicts: typing/parmatch.ml

damiendoligez removed this from the 4.04 milestone Aug 3, 2016

mshinwell requested changes Dec 28, 2016

dwightguth added 5 commits December 28, 2016 07:54

adjust style of equal_tag and use Path.same for path comparison

3a33197

Merge remote-tracking branch 'origin/trunk' into compare_tag

f4df62d

add Cstr_unboxed case to Types.equal_tag

fdf076b

add changelog entry

7af0ec8

break up long lines

039b0d1

bobzhang mentioned this pull request Feb 7, 2017

backport #polymorphic comparison rescript-lang/rescript-compiler#1177

Closed

bobzhang reviewed Feb 7, 2017

bobzhang added a commit to rescript-lang/ocaml that referenced this pull request Feb 7, 2017

backport gpr ocaml#406

c29feb0

dwightguth added 2 commits February 24, 2017 15:50

add a Hashtbl to check set membership for constructor_tag instead of List.exists

50b10ab

Merge branch 'trunk' into compare_tag

48e2f8e

mshinwell added the approved label Mar 8, 2017

damiendoligez approved these changes Mar 29, 2017

dwightguth added 2 commits March 29, 2017 09:31

incorporate feedback on Changes file

fc20a70

Merge branch 'trunk' into compare_tag

80c2a8e

mshinwell approved these changes Apr 28, 2017

mshinwell added no-change-entry-needed and removed no-change-entry-needed labels Apr 28, 2017

mshinwell merged commit 7d48407 into ocaml:trunk Apr 28, 2017

EduardoRFS pushed a commit to EduardoRFS/ocaml that referenced this pull request Dec 6, 2020

Merge pull request ocaml#406 from ctk21/remove_ephemeron_rpc

a49675c

Remove ephemeron usage of RPC

poechsel added a commit to poechsel/ocaml that referenced this pull request Apr 16, 2021

Always rebuild the term inside subfunctions (ocaml#406)

2559d9b

* Always rebuild the term inside subfunctions * Wording and clarify comment

dwightguth commented Jan 9, 2016

gasche commented Jan 9, 2016

dwightguth commented Jan 9, 2016

Drup Jan 9, 2016

dwightguth Jan 9, 2016

let-def commented Jan 9, 2016

murmour commented Jan 9, 2016

dwightguth commented Jan 11, 2016

chambart Jan 12, 2016

dwightguth Jan 12, 2016

gasche Jan 12, 2016

yallop Jan 12, 2016

dwightguth Jan 12, 2016

alainfrisch commented Jan 15, 2016

mshinwell Dec 28, 2016

mshinwell Dec 28, 2016

mshinwell commented Dec 28, 2016 •

edited

dwightguth commented Feb 6, 2017

bobzhang Feb 7, 2017

murmour Feb 7, 2017

damiendoligez Mar 29, 2017

dwightguth commented Feb 7, 2017 via email

mshinwell commented Feb 21, 2017

dwightguth commented Feb 21, 2017

mshinwell commented Mar 7, 2017

dwightguth commented Mar 7, 2017

damiendoligez Mar 29, 2017

damiendoligez Mar 29, 2017

gasche Mar 29, 2017

dwightguth Mar 29, 2017

damiendoligez Mar 29, 2017

mshinwell commented Apr 28, 2017
