Optimize ReorderGlobals ordering #6625

kripken · 2024-05-22T20:05:14Z

The old ordering in that pass did a topological sort while sorting by uses
both within topological groups and between them. That could be suboptimal
in some cases, however, and on J2CL output this pass actually made the
binary larger, which is how we noticed this.

The problem is that such a topological sort keeps topological groups in
place, but it can be useful to interleave them sometimes. Imagine this:

     $c - $a
    /
  $e
    \
     $d - $b

Here $e depends on $c, etc. The optimal order may interleave the two
arms here, e.g. $a, $b, $d, $c, $e. That is because the dependencies define
a partial order, and so the arms here are actually independent.

Sorting by topological depth first might help in some cases, but it is also
not optimal in general, as we may want to mix topological depths:
$a, $c, $b, $d, $e does so, and it may be the best ordering.

This PR implements a natural greedy algorithm that picks the global with
the highest use count at each step, out of the set of possible globals, which
is the set of globals that have no unresolved dependencies. So we start by
picking the first global with no dependencies and add it at the front; then
that unlocks anything that depended on it and we pick from that set, and
so forth.
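
To make the greedy step concrete, here is a minimal sketch of it. This is not the code in the PR: `greedySort`, `deps`, and `counts` are hypothetical names for illustration, and the linear scan for the best available global is only for brevity (a heap would be the natural choice in practice). Ties go to the lower original index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not Binaryen's API. deps[i] lists the globals that
// global i depends on (they must be emitted before i). counts[i] is the
// weight used to choose among the globals that are currently available.
std::vector<std::size_t>
greedySort(const std::vector<std::vector<std::size_t>>& deps,
           const std::vector<double>& counts) {
  const std::size_t n = deps.size();
  assert(counts.size() == n);

  // unresolved[i]: how many dependencies of i have not been emitted yet.
  // dependents[d]: the globals that get closer to available once d is emitted.
  std::vector<std::size_t> unresolved(n);
  std::vector<std::vector<std::size_t>> dependents(n);
  for (std::size_t i = 0; i < n; i++) {
    unresolved[i] = deps[i].size();
    for (std::size_t d : deps[i]) {
      dependents[d].push_back(i);
    }
  }

  std::vector<bool> emitted(n, false);
  std::vector<std::size_t> order;
  while (order.size() < n) {
    // Pick the available global with the highest count. The strict
    // comparison keeps the lowest original index on ties.
    std::size_t best = n;
    for (std::size_t i = 0; i < n; i++) {
      if (emitted[i] || unresolved[i] != 0) {
        continue;
      }
      if (best == n || counts[i] > counts[best]) {
        best = i;
      }
    }
    assert(best != n && "dependency cycle");
    emitted[best] = true;
    order.push_back(best);
    // Emitting a global may unlock the globals that depend on it.
    for (std::size_t d : dependents[best]) {
      unresolved[d]--;
    }
  }
  return order;
}
```

With the five globals from the diagram numbered $a=0 through $e=4 and made-up counts of 1, 5, 10, 2, 0, this sketch emits $b, $d, $a, $c, $e, which interleaves the two arms.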

This may also not be optimal, but it is easy to make it more flexible by
customizing the counts, and we consider 4 sorts here:

1. Set all counts to 0. This means we only take into account dependencies,
   and we break ties by the original order, so this is as close to the original
   order as we can be.
2. Use the actual use counts. This is the simple greedy algorithm.
3. Set the count of each global to also contain the counts of its children,
   so the count is the total that might be unlocked. This gives more weight
   to globals that can unlock more later, so it is less greedy.
4. As 3, but weight children's counts lower in an exponential way, which
   makes sense as they may depend on other globals too.
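
As a rough illustration of the sum and exponential variants, the sketch below folds each global's count into the globals it depends on. The name `propagateCounts` and the `factor` parameter are hypothetical, not the PR's code, and the sketch assumes the input is already in a valid order (every dependency has a lower index than its user). A factor of 1.0 gives the plain sum, a factor below 1.0 gives the exponentially decaying version, and a factor of 0 leaves the actual use counts unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not Binaryen's API. deps[i] lists the globals that
// global i depends on. Returns counts in which each global also carries the
// scaled counts of the globals that depend on it, directly or indirectly.
std::vector<double>
propagateCounts(const std::vector<std::vector<std::size_t>>& deps,
                const std::vector<double>& counts,
                double factor) {
  assert(counts.size() == deps.size());
  std::vector<double> result = counts;
  // Walk from the last global to the first, so that a global's own total is
  // final before it is added to the globals it depends on.
  for (std::size_t i = deps.size(); i-- > 0;) {
    for (std::size_t dep : deps[i]) {
      // Assumption of this sketch: dependencies come earlier in the input.
      assert(dep < i);
      // Each extra level of indirection multiplies by factor once more,
      // which is the exponential decay when factor < 1.
      result[dep] += factor * result[i];
    }
  }
  return result;
}
```

For the diagram's globals with made-up counts 1, 1, 2, 2, 4 for $a through $e, a factor of 1.0 yields 7, 7, 6, 6, 4 and a factor of 0.5 yields 3, 3, 4, 4, 4. Fractional factors produce non-integer weights in general, which is one reason a floating-point count type is convenient in a sketch like this.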

In practice it is simple to generate cases where 1, 2, or 3 is optimal (see
new tests), but on real-world J2CL I see that 4 (with a particular exponential
coefficient) is best, so the pass computes all 4 and picks the best. As a
result it will never worsen the size and it has a good chance of
improving.
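
One way to picture "computes all 4 and picks the best" is sketched below. The cost model is an assumption of this sketch rather than something stated above: it charges each use of a global the number of bytes its new index takes as an unsigned LEB128, which is how Wasm encodes indices. `lebSize`, `orderingCost`, and `pickBest` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Bytes needed to encode an index as an unsigned LEB128 (7 bits per byte).
std::size_t lebSize(std::size_t index) {
  std::size_t bytes = 1;
  while (index >= 128) {
    index >>= 7;
    bytes++;
  }
  return bytes;
}

// order[pos] is the original global placed at new index pos. uses[g] is the
// actual use count of original global g. The cost is the total number of
// bytes spent encoding global indices across all uses.
double orderingCost(const std::vector<std::size_t>& order,
                    const std::vector<double>& uses) {
  assert(order.size() == uses.size());
  double total = 0;
  for (std::size_t pos = 0; pos < order.size(); pos++) {
    total += uses[order[pos]] * static_cast<double>(lebSize(pos));
  }
  return total;
}

// Returns the position of the cheapest candidate ordering. Ties go to the
// earliest candidate, so listing the original order first means the result
// is never worse than leaving the globals alone.
std::size_t pickBest(const std::vector<std::vector<std::size_t>>& candidates,
                     const std::vector<double>& uses) {
  assert(!candidates.empty());
  std::size_t best = 0;
  double bestCost = orderingCost(candidates[0], uses);
  for (std::size_t i = 1; i < candidates.size(); i++) {
    double cost = orderingCost(candidates[i], uses);
    if (cost < bestCost) {
      best = i;
      bestCost = cost;
    }
  }
  return best;
}
```

Note that the candidates are always scored with the actual use counts, even when a sort was produced from modified counts. Under this model, orderings only differ in cost once there are at least 128 globals, since every index below 128 fits in a single byte.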

The differences between these are small, so in theory we could pick any
of them, but given they are all modifications of a single algorithm it is
very easy to compute them all with little code complexity.

Some data on J2CL:

| Method | Size (less is better) |
| --- | --- |
| Greedy | 4338066 |
| Original order | 4336535 |
| Old (before this PR) | 4336518 |
| Sum | 4336503 |
| Exponential (0.5) | 4336373 |
| Exponential (PR's value) | 4336127 |

There is a slight runtime cost to this: J2CL goes from 0.9666 to 1.1351
seconds. As this is one of our faster passes the slight slowdown seems
worth it in return for the guarantee to never increase size, and the small
improvement.

gkdn · 2024-05-22T23:34:57Z

Interesting problem! (cc @rluble)

The differences in the numbers here are small; it makes me wonder whether magic-import, which pushes indices higher, would benefit more.

tlively

Was there a test for which the exponential sort was optimal?

tlively · 2024-05-29T22:55:09Z

src/passes/ReorderGlobals.cpp

+  // each one moves, which is logically a mapping between indices.
+  using IndexIndexMap = std::vector<Index>;
+
+  // We will also track counts of uses for each global.


It might be worth briefly explaining why we use a double here.


kripken · 2024-05-30T21:13:48Z

> Was there a test for which the exponential sort was optimal?

Sadly no. It's very hard to make such a test, as it would need very many deep dependency chains (on which it pays to take them into account a little, but not too much).

(I would not include code for that in the pass, except that it ends up as just another constant value, and we do measure the sizes, so it seems safe.)


kripken added 30 commits May 16, 2024 14:16

start

b308805

fix

8c55b6f

fix

835622f

show problem

d09f4a0

work

1c52de6

fix

927897c

test

de1d20d

work

6cf9fb3

exllore

322bc71

fail

0b2f8a9

try

407bad3

work

a88c8c4

Merge remote-tracking branch 'origin/main' into globses

7de4b4d

undo

08e07cb

heapify

38ac562

heapify

ad78406

fix

c4da577

work

f90da11

todo

8d54784

work

6000256

fix

0baee6b

undo

ddc8f47

undo

52ebfae

format

fac33e4

fancy

2fd4ec0

notes

dbf7673

test

c1bd7c6

test

d325aba

test

c8f6c2d

test

84b0bc8

kripken added 9 commits May 22, 2024 10:19

work

9e3d467

indices

c17d1b3

actually indices

4bf17c6

actually indices

8834d84

clean

e189d66

format

838b3b3

comments

567e25d

fix

57eb5df

fix comment

280cfd7

kripken requested a review from tlively May 22, 2024 20:05

tlively reviewed May 30, 2024


kripken and others added 12 commits May 30, 2024 14:17

Update src/passes/ReorderGlobals.cpp

dda7b3d

Co-authored-by: Thomas Lively <tlively123@gmail.com>

comment

ff792bc

typo

055f296

use topological sort utility

bd77630

Merge remote-tracking branch 'myself/globses' into globses

b68359e

typo

8ffc896

optimize

507dff8

Update src/passes/ReorderGlobals.cpp

3353220

Co-authored-by: Thomas Lively <tlively123@gmail.com>

Update src/passes/ReorderGlobals.cpp

7da3b03

Co-authored-by: Thomas Lively <tlively123@gmail.com>

missing work

4ee047b

Merge remote-tracking branch 'myself/globses' into globses

35405ca

format

9cc3be5

tlively approved these changes May 30, 2024


kripken and others added 2 commits May 31, 2024 09:20

clarify sort

044a7cf

Update src/passes/ReorderGlobals.cpp

d191ce1

Co-authored-by: Thomas Lively <tlively123@gmail.com>

kripken merged commit f8086ad into WebAssembly:main May 31, 2024
13 checks passed

kripken deleted the globses branch May 31, 2024 21:56
