Permalink
Newer
Older
100644 258 lines (199 sloc) 8.22 KB
1
Ordering the compiler's passes
2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3
4
Change notes
5
~~~~~~~~~~~~
6
1 Nov 94 * NB: if float-out is done after strictness, remember to
7
switch off demandedness flags on floated bindings!
8
13 Oct 94 * Run Float Inwards once more after strictness-simplify [andre]
9
4 Oct 94 * Do simplification between float-in and strictness [andre]
10
* Ignore-inline-pragmas flag for final simplification [andre]
11
12
Aug 94 Original: Simon, Andy, Andre
13
14
15
16
17
This ordering obeys all the constraints except (5)
18
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19
20
full laziness
21
simplify with foldr/build
22
float-in
23
simplify
24
strictness
25
float-in
26
27
[check FFT2 still gets benefits with this ordering]
28
29
=================================
30
Constraints
31
=================================
32
33
1. float-in before strictness.
34
Reason: floating inwards moves definitions inwards to a site at which
35
the binding might well be strict.
36
37
Example let x = ... in
38
y = x+1
39
in
40
...
41
===>
42
let y = let x = ... in x+1
43
in ...
44
45
The strictness analyser will do a better job of the latter
46
than the former.
47
48
2. Don't simplify between float-in and strictness,
49
unless you disable float-let-out-of-let, otherwise
50
the simiplifier's local floating might undo some
51
useful floating-in.
52
53
Example let f = let y = .. in \x-> x+y
54
in ...
55
===>
56
let y = ...
57
f = \x -> x+y
58
in ...
59
60
This is a bad move, because now y isn't strict.
61
In the pre-float case, the binding for y is strict.
62
Mind you, this isn't a very common case, and
63
it's easy to disable float-let-from-let.
64
65
3. Want full-laziness before foldr/build.
66
Reason: Give priority to sharing rather than deforestation.
67
68
Example \z -> let xs = build g
69
in foldr k z xs
70
===>
71
let xs = build g
72
in \x -> foldr k z xs
73
74
In the post-full-laziness case, xs is shared between all
75
applications of the function. If we did foldr/build
76
first, we'd have got
77
78
\z -> g k z
79
80
and now we can't share xs.
81
82
83
4. Want strictness after foldr/build.
84
Reason: foldr/build makes new function definitions which
85
can benefit from strictness analysis.
86
87
Example: sum [1..10]
88
===> (f/b)
89
let g x a | x > 10 = a
90
| otherwise = g (x+1) (a+x)
91
92
Here we clearly want to get strictness analysis on g.
93
94
95
5. Want full laziness after strictness
96
Reason: absence may allow something to be floated out
97
which would not otherwise be.
98
99
Example \z -> let x = f (a,z) in ...
100
===> (absence anal + inline wrapper of f)
101
\z -> let x = f.wrk a in ...
102
===> (full laziness)
103
let x= f.wrk a in \z -> ...
104
105
TOO BAD. This doesn't look a common case to me.
106
107
108
6. Want float-in after foldr/build.
109
Reason: Desugaring list comprehensions + foldr/build
110
gives rise to new float-in opportunities.
111
112
Example ...some list comp...
113
==> (foldr/build)
114
let v = h xs in
115
case ... of
116
[] -> v
117
(y:ys) -> ...(t v)...
118
==> (simplifier)
119
let v = h xs in
120
case ... of
121
[] -> h xs
122
(y:ys) -> ...(t v)...
123
124
Now v could usefully be floated into the second branch.
125
126
7. Want simplify after float-inwards.
127
[Occurred in the prelude, compiling ITup2.hs, function dfun.Ord.(*,*)]
128
This is due to the following (that happens with dictionaries):
129
130
let a1 = case v of (a,b) -> a
131
in let m1 = \ c -> case c of I# c# -> case c# of 1 -> a1 5
132
2 -> 6
133
in let m2 = \ c -> case c of I# c# ->
134
case c# +# 1# of cc# -> let cc = I# cc#
135
in m1 cc
136
in (m1,m2)
137
138
floating inwards will push the definition of a1 into m1 (supposing
139
it is only used there):
140
141
in let m1 = let a1 = case v of (a,b) -> a
142
in \ c -> case c of I# c# -> case c# of 1 -> a1 5
143
2 -> 6
144
in let m2 = \ c -> case c of I# c# ->
145
case c# +# 1# of cc# -> let cc = I# cc#
146
in m1 cc
147
in (m1,m2)
148
149
if we do strictness analysis now we will not get a worker-wrapper
150
for m1, because of the "let a1 ..." (notice that a1 is not strict in
151
its body).
152
153
Not having this worker wrapper might be very bad, because it might
154
mean that we will have to rebox arguments to m1 if they are
155
already unboxed, generating extra allocations, as occurs with m2 (cc)
156
above.
157
158
To solve this problem we have decided to run the simplifier after
159
float-inwards, so that lets whose body is a HNF are floated out,
160
undoing the float-inwards transformation in these cases.
161
We are then back to the original code, which would have a worker-wrapper
162
for m1 after strictness analysis and would avoid the extra let in m2.
163
164
What we lose in this case are the opportunities for case-floating
165
that could be presented if, for example, a1 would indeed be demanded (strict)
166
after the floating inwards.
167
168
The only way of having the best of both is if we have the worker/wrapper
169
pass explicitly called, and then we could do with
170
171
float-in
172
strictness analysis
173
simplify
174
strictness analysis
175
worker-wrapper generation
176
177
as we would
178
a) be able to detect the strictness of m1 after the
179
first call to the strictness analyser, and exploit it with the simplifier
180
(in case it was strict).
181
b) after the call to the simplifier (if m1 was not demanded)
182
it would be floated out just like we currently do, before stricness
183
analysis II and worker/wrapperisation.
184
185
The reason to not do worker/wrapperisation twice is to avoid
186
generating wrappers for wrappers which could happen.
187
188
189
8. If full laziness is ever done after strictness, remember to switch off
190
demandedness flags on floated bindings! This isn't done at the moment.
191
192
193
Ignore-inline-pragmas flag for final simplification
194
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
195
[Occurred in the prelude, compiling ITup2.hs, function dfun.Ord.(*,*)]
196
Sometimes (e.g. in dictionary methods) we generate
197
worker/wrappers for functions but the wrappers are never
198
inlined. In dictionaries we often have
199
200
dict = let f1 = ...
201
f2 = ...
202
...
203
in (f1,f2,...)
204
205
and if we create worker/wrappers for f1,...,fn the wrappers will not
206
be inlined anywhere, and we will have ended up with extra
207
closures (one for the worker and one for the wrapper) and extra
208
function calls, as when we access the dictionary we will be acessing
209
the wrapper, which will call the worker.
210
The simplifier never inlines workers into wrappers, as the wrappers
211
themselves have INLINE pragmas attached to them (so that they are always
212
inlined, and we do not know in advance how many times they will be inlined).
213
214
To solve this problem, in the last call to the simplifier we will
215
ignore these inline pragmas and handle the workers and the wrappers
216
as normal definitions. This will allow a worker to be inlined into
217
the wrapper if it satisfies all the criteria for inlining (e.g. it is
218
the only occurrence of the worker etc.).
219
220
Run Float Inwards once more after strictness-simplify
221
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
222
[Occurred in the prelude, compiling IInt.hs, function const.Int.index.wrk]
223
When workers are generated after strictness analysis (worker/wrapper),
224
we generate them with "reboxing" lets, that simply reboxes the unboxed
225
arguments, as it may be the case that the worker will need the
226
original boxed value:
227
228
f x y = case x of
229
(a,b) -> case y of
230
(c,d) -> case a == c of
231
True -> (x,x)
232
False -> ((1,1),(2,2))
233
234
==> (worker/wrapper)
235
236
f_wrapper x y = case x of
237
(a,b) -> case y of
238
(c,d) -> f_worker a b c d
239
240
f_worker a b c d = let x = (a,b)
241
y = (c,d)
242
in case a == c of
243
True -> (x,x)
244
False -> ((1,1),(2,2))
245
246
in this case the simplifier will remove the binding for y as it is not
247
used (we expected this to happen very often, but we do not know how
248
many "reboxers" are eventually removed and how many are kept), and
249
will keep the binding for x. But notice that x is only used in *one*
250
of the branches in the case, but is always being allocated! The
251
floating inwards pass would push its definition into the True branch.
252
A similar benefit occurs if it is only used inside a let definition.
253
These are basically the advantages of floating inwards, but they are
254
only exposed after the S.A./worker-wrapperisation of the code! As we
255
also have reasons to float inwards before S.A. we have to run it
256
twice.