Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

file 1260 lines (945 sloc) 45.512 kb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900 901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260
------------------------------------
GHCI hacking
------------------------------------

* Don't forget to put deferred-type-decls back into RnIfaces

* Do we want to record a package name in a .hi file?
  Does pi_mod have a ModuleName or a Module?

------------------------------------
Mainly FunDeps (23 Jan 01)
------------------------------------

This commit re-engineers the handling of functional dependencies.
A functional dependency is no longer an Inst; instead, the necessary
dependencies are snaffled out of their Class when necessary.

As part of this exercise I found that I had to re-work how to do generalisation
in a binding group. There is rather exhaustive documentation on the new Plan
at the top of TcSimplify.

******************
WARNING: I have compiled all the libraries with this new compiler
and all looks well, but I have not run many programs.
Things may break. Let me know if so.
******************

The main changes are these:

1. typecheck/TcBinds and TcSimplify have a lot of changes due to the
    new generalisation and context reduction story. There are extensive
    comments at the start of TcSimplify

2. typecheck/TcImprove is removed altogether. Instead, improvement is
    interleaved with context reduction (until a fixpoint is reached).
    All this is done in TcSimplify.

3. types/FunDeps has new exports
* 'improve' does improvement, returning a list of equations
* 'grow' and 'oclose' close a list of type variables wrt a set of
PredTypes, but in slightly different ways. Comments in file.

4. I improved the way in which we check that main::IO t. It's tidier now.

In addition

* typecheck/TcMatches:
a) Tidy up, introducing a common function tcCheckExistentialPat

b) Improve the typechecking of parallel list comprehensions,
which wasn't quite right before. (see comments with tcStmts)

WARNING: (b) is untested! Jeff, you might want to check.

* Numerous other incidental changes in the typechecker

* Manuel found that rules don't fire well when you have partial applications
    from overloading. For example, we may get

f a (d::Ord a) = let m_g = g a d
in
\y :: a -> ...(m_g (h y))...

    The 'method' m_g doesn't get inlined because (g a d) might be a redex.
    Yet a rule that looks like
g a d (h y) = ...
    won't fire because that doesn't show up. One way out would be to make
    the rule matcher a bit less paranoid about duplicating work, but instead
    I've added a flag
-fno-method-sharing
    which controls whether we generate things like m_g in the first place.
    It's not clear that they are a win in the first place.

    The flag is actually consulted in Inst.tcInstId



------------------------------------
Mainly PredTypes (28 Sept 00)
------------------------------------

Three things in this commit:

1. Main thing: tidy up PredTypes
2. Move all Keys into PrelNames
3. Check for unboxed tuples in function args

1. Tidy up PredTypes
~~~~~~~~~~~~~~~~~~~~
The main thing in this commit is to modify the representation of Types
so that they are a (much) better for the qualified-type world. This
should simplify Jeff's life as he proceeds with implicit parameters
and functional dependencies. In particular, PredType, introduced by
Jeff, is now blessed and dignified with a place in TypeRep.lhs:

data PredType = Class Class [Type]
| IParam Name Type

Consider these examples:
f :: (Eq a) => a -> Int
g :: (?x :: Int -> Int) => a -> Int
h :: (r\l) => {r} => {l::Int | r}

Here the "Eq a" and "?x :: Int -> Int" and "r\l" are all called
*predicates*, and are represented by a PredType. (We don't support
TREX records yet, but the setup is designed to expand to allow them.)

In addition, Type gains an extra constructor:

data Type = .... | PredTy PredType

so that PredType is injected directly into Type. So the type
p => t
is represented by
PredType p `FunTy` t

I have deleted the hackish IPNote stuff; predicates are dealt with entirely
through PredTys, not through NoteTy at all.


2. Move Keys into PrelNames
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is just a housekeeping operation. I've moved all the pre-assigned Uniques
(aka Keys) from Unique.lhs into PrelNames.lhs. I've also moved knowKeyRdrNames
from PrelInfo down into PrelNames. This localises in PrelNames lots of stuff
about predefined names. Previously one had to alter three files to add one,
now only one.

3. Unboxed tuples
~~~~~~~~~~~~~~~~~~
Add a static check for unboxed tuple arguments. E.g.
data T = T (# Int, Int #)
is illegal



---------------------------------------
Update in place
---------------------------------------

-funfolding-update-in-place
Switching it on doesn't affect many programs, except these
sphere is because it makes a critical function (vecsub) more inlinable

         sphere 66465k -20.61%
          infer 13390k +1.27%
        parstof 1461k +1.18%
          fluid 3442k +1.61%
           atom 177163k +13.20%
           bspt 4837k +4.85%
       cichelli 33546k +2.69%
      typecheck 146023k +1.47%


---------------------------------------
Simon's tuning changes: early Sept 2000
---------------------------------------

Library changes
~~~~~~~~~~~~~~~
* Eta expand PrelShow.showLitChar. It's impossible to compile this well,
  and it makes a big difference to some programs (e.g. gen_regexps)

* Make PrelList.concat into a good producer (in the foldr/build sense)


Flag changes
~~~~~~~~~~~~
* Add -ddump-hi-diffs to print out changes in interface files. Useful
  when watching what the compiler is doing

* Add -funfolding-update-in-place to enable the experimental optimisation
  that makes the inliner a bit keener to inline if it's in the RHS of
  a thunk that might be updated in place. Sometimes this is a bad idea
  (one example is in spectral/sphere; see notes in nofib/Simon-nofib-notes)


Tuning things
~~~~~~~~~~~~~
* Fix a bug in SetLevels.lvlMFE. (change ctxt_lvl to dest_level)
  I don't think this has any performance effect, but it saves making
  a redundant let-binding that is later eliminated.

* Desugar.dsProgram and DsForeign
  Glom together all the bindings into a single Rec. Previously the
  bindings generated by 'foreign' declarations were not glommed together, but
  this led to an infelicity (i.e. poorer code than necessary) in the modules
  that actually declare Float and Double (explained a bit more in Desugar.dsProgram)

* OccurAnal.shortMeOut and IdInfo.shortableIdInfo
  Don't do the occurrence analyser's shorting out stuff for things which
  have rules. Comments near IdInfo.shortableIdInfo.
  This is deeply boring, and mainly to do with making rules work well.
  Maybe rules should have phases attached too....

* CprAnalyse.addIdCprInfo
  Be a bit more willing to add CPR information to thunks;
  in particular, if the strictness analyser has just discovered that this
  is a strict let, then the let-to-case transform will happen, and CPR is fine.
  This made a big difference to PrelBase.modInt, which had something like
modInt = \ x -> let r = ... -> I# v in
...body strict in r...
  r's RHS isn't a value yet; but modInt returns r in various branches, so
  if r doesn't have the CPR property then neither does modInt

* MkId.mkDataConWrapId
  Arrange that vanilla constructors, like (:) and I#, get unfoldings that are
  just a simple variable $w:, $wI#. This ensures they'll be inlined even into
  rules etc, which makes matching a bit more reliable. The downside is that in
  situations like (map (:) xs), we'll end up with (map (\y ys. $w: y ys) xs.
  Which is tiresome but it doesn't happen much.

* SaAbsInt.findStrictness
  Deal with the case where a thing with no arguments is bottom. This is Good.
  E.g. module M where { foo = error "help" }
  Suppose we have in another module
case M.foo of ...
  Then we'd like to do the case-of-error transform, without inlining foo.


Tidying up things
~~~~~~~~~~~~~~~~~
* Reorganised Simplify.completeBinding (again).

* Removed the is_bot field in CoreUnfolding (is_cheap is true if is_bot is!)
  This is just a tidy up

* HsDecls and others
  Remove the NewCon constructor from ConDecl. It just added code, and nothing else.
  And it led to a bug in MkIface, which though that a newtype decl was always changing!

* IdInfo and many others
  Remove all vestiges of UpdateInfo (hasn't been used for years)

------------------------------
Join points Sept 2000
------------------------------

With Andrew Kennedy, I found out why a few of the join points introduced by
the simplifier end up as *not* let-no-escpaed. Here's an example:

f x y = case (pwr x b) == 1 of
False -> False
True -> pwr x c == 1

This compiles to:
  f = \ @ t w :: Integer ->
let {
$j :: (State# RealWorld -> Bool)
P
$j
= \ w1 :: (State# RealWorld) ->
case pwr w c of wild {
S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse };
J# s d1 ->
case cmpIntegerInt# s d1 1 of wild2 {
0 -> $wTrue; __DEFAULT -> $wFalse
}
}
} in
case pwr w b of wild {
S# i ->
case i of wild1 { 1 -> $j realWorld#; __DEFAULT -> $wFalse };
J# s d1 ->
case cmpIntegerInt# s d1 1 of wild2 {
0 -> $j realWorld#; __DEFAULT -> $wFalse
}
}

Now consider

case (f x) of
True -> False
False -> True

Suppose f is inlined into this case. No new join points are introduced,
because the alternatives are both small. But the consumer
case [.] of {True -> False; False -> True}
will move into the body of f, be duplicated 4 ways, and end up consuming
the result of the four outcomes at the body of f. This yields:
$j :: (State# RealWorld -> Bool)
P
$j
= \ w1 :: (State# RealWorld) ->
case pwr w c of wild {
S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse };
J# s d1 ->
case cmpIntegerInt# s d1 1 of wild2 {
0 -> $wTrue; __DEFAULT -> $wFalse
}
}
} in
case pwr w b of wild {
S# i ->
case i of wild1 { 1 -> case $j realWorld# of {T->F; F->T}
; __DEFAULT -> $wTrue };
J# s d1 ->
case cmpIntegerInt# s d1 1 of wild2 {
0 -> case $j realWorld# of {T->F; F->T}
; __DEFAULT -> $wTrue
}
}

And, voila, the join point $j isn't let-no-escaped any more.
The point is that the consuming context can't "see inside" the join point.
It's a phase ordering thing. If f is inlined before the join points
are built in the first place, then all is well.



-----------------------------
Sept 7 2000
-----------------------------

* Make the simplifier's Stop continuation record whether the expression being
  simplified is the RHS of a thunk, or (say) the body of a lambda or case RHS.
  In the thunk case we want to be a bit keener about inlining if the type of
  the thunk is amenable to update in place.

* SetLevels was being a bit too eager to float things to the top
  level; e.g. _inline_me_ (\a -> e); here e got floated...
  Easily fixed by a change to ltMajLvl

* Make CoreUnfold.calcUnfoldingGuidance a bit less keen to make case expressions
  seem small. The original idea was to make inlined wrappers look small, so that
  when we inline a wrapper it doesn't make call site (much) bigger
  Otherwise we get nasty phase ordering stuff:
-- f x = g x x
-- h y = ...(f e)...
  If we inline g's wrapper, f looks big, and doesn't get inlined
  into h; if we inline f first, while it looks small, then g's
  wrapper will get inlined later anyway. To avoid this nasty
  ordering difference, we make (case a of (x,y) -> ...),
  *where a is one of the arguments* look free.

  BUT (a) It's too eager. We don't want to inline a wrapper into a
context with no benefit.
E.g. \ x. f (x+x) o point in inlining (+) here!

(b) It's ineffective. Once g's wrapper is inlined, its case-expressions
aren't scrutinising arguments any more

  So I've rescinded this idea for now. cases still look fairly small.

* Fix interestingArg, which was being too liberal, and hence doing
  too much inlining.

* Extended CoreUtils.exprIsCheap to make two more things cheap:
    - case (coerce x) of ...
    - let x = y +# z
  This makes a bit more eta expansion happen. It was provoked by
  a program of Marcin's.
  
* The simplifier used to glom together all the top-level bindings into
  a single Rec every time it was invoked. The reason for this is explained
  in SimplCore.lhs, but for at least one simple program it meant that the
  simplifier never got around to unravelling the recursive group into
  non-recursive pieces. So I've put the glomming under explicit flag
  control with a -fglom-binds simplifier pass. A side benefit is
  that because it happens less often, the (expensive) SCC algorithm
  runs less often.
  
* MkIface.ifaceBinds. Make sure that we emit rules for things
  (like class operations) that don't get a top-level binding in the
  interface file. Previously such rules were silently forgotten.

* Move transformRhs to *after* simplification, which makes it a
  little easier to do, and means that the arity it computes is
  readily available to completeBinding. This gets much better
  arities.

* Do coerce splitting in completeBinding. This gets good code for
newtype CInt = CInt Int

test:: CInt -> Int
test x = case x of
1 -> 2
2 -> 4
3 -> 8
4 -> 16
_ -> 0

* Modify the meaning of "arity" so that during compilation it means
  "if you apply this function to fewer args, it will do virtually
  no work". So, for example
f = coerce t (\x -> e)
  has arity at least 1. When a function is exported, it's arity becomes
  the number of exposed, top-level lambdas, which is subtly different.
  But that's ok.

  I removed CoreUtils.exprArity altogether: it looked only at the exposed
  lambdas. Instead, we use exprEtaExpandArity exclusively.

  All of this makes I/O programs work much better.


-----------------------------
Sept 4 2000
-----------------------------

* PrimRep, TysPrim. Add PrimPtrRep as the representation for
  MVars and MutVars. Previously they were given PtrRep, but that
  crashed dataReturnConvPrim! Here's the program the killed it:
     data STRef s a = STRef (MutVar# s a)
     from (STRef x) = x
  
* Make the desugarer use string equality for string literal
  patterns longer than 1 character. And put a specialised
  eqString into PrelBase, with a suitable specialisation rule.
  This makes a huge difference to the size of the code generated
  by deriving(Read) notably in Time.lhs

-----------------------------
Marktoberdorf Commits (Aug 2000)
-----------------------------

1. Tidy up the renaming story for "system binders", such as
dictionary functions, default methods, constructor workers etc. These
are now documented in HsDecls. The main effect of the change, apart
from tidying up, is to make the *type-checker* (instead of the
renamer) generate names for dict-funs and default-methods. This is
good because Sergei's generic-class stuff generates new classes at
typecheck time.


2. Fix the CSE pass so it does not require the no-shadowing invariant.
Keith discovered that the simplifier occasionally returns a result
with shadowing. After much fiddling around (which has improved the
code in the simplifier a bit) I found that it is nearly impossible to
arrange that it really does do no-shadowing. So I gave up and fixed
the CSE pass (which is the only one to rely on it) instead.


3. Fix a performance bug in the simplifier. The change is in
SimplUtils.interestingArg. It computes whether an argment should
be considered "interesting"; if a function is applied to an interesting
argument, we are more likely to inline that function.
Consider this case
let x = 3 in f x
The 'x' argument was considered "uninteresting" for a silly reason.
Since x only occurs once, it was unconditionally substituted, but
interestingArg didn't take account of that case. Now it does.

I also made interestingArg a bit more liberal. Let's see if we
get too much inlining now.


4. In the occurrence analyser, we were choosing a bad loop breaker.
Here's the comment that's now in OccurAnal.reOrderRec

    score ((bndr, rhs), _, _)
| exprIsTrivial rhs = 3 -- Practically certain to be inlined
-- Used to have also: && not (isExportedId bndr)
-- But I found this sometimes cost an extra iteration when we have
-- rec { d = (a,b); a = ...df...; b = ...df...; df = d }
-- where df is the exported dictionary. Then df makes a really
-- bad choice for loop breaker

I also increased the score for bindings with a non-functional type, so that
dictionaries have a better chance of getting inlined early


5. Add a hash code to the InScopeSet (and make it properly abstract)
This should make uniqAway a lot more robust. Simple experiments suggest
that uniqAway no longer gets into the long iteration chains that it used
to.


6. Fix a bug in the inliner that made the simplifier tend to get into
a loop where it would keep iterating ("4 iterations, bailing out" message).
In SimplUtils.mkRhsTyLam we float bindings out past a big lambda, thus:
x = /\ b -> let g = \x -> f x x
in E
becomes
g* = /\a -> \x -> f x x
x = /\ b -> let g = g* b in E

It's essential that we don't simply inling g* back into the RHS of g,
else we will be back to square 1. The inliner is meant not to do this
because there's no benefit to the inlining, but the size calculation
was a little off in CoreUnfold.


7. In SetLevels we were bogus-ly building a Subst with an empty in-scope
set, so a WARNING popped up when compiling some modules. (knights/ChessSetList
was the example that tickled it.) Now in fact the warning wasn't an error,
but the Right Thing to do is to carry down a proper Subst in SetLevels, so
that is what I have now done. It is very little more expensive.



~~~~~~~~~~~~
Apr/May 2000
~~~~~~~~~~~~

This is a pretty big commit! It adds stuff I've been working on
over the last month or so. DO NOT MERGE IT WITH 4.07!

Recompilation checking
~~~~~~~~~~~~~~~~~~~~~~
Substantial improvement in recompilation checking. The version management
is now entirely internal to GHC. ghc-iface.lprl is dead!

The trick is to generate the new interface file in two steps:
  - first convert Types etc to HsTypes etc, and thereby
build a new ParsedIface
  - then compare against the parsed (but not renamed) version of the old
interface file
Doing this meant adding code to convert *to* HsSyn things, and to
compare HsSyn things for equality. That is the main tedious bit.

Another improvement is that we now track version info for
fixities and rules, which was missing before.


Interface file reading
~~~~~~~~~~~~~~~~~~~~~~
Make interface files reading more robust.
  * If the old interface file is unreadable, don't fail. [bug fix]

  * If the old interface file mentions interfaces
    that are unreadable, don't fail. [bug fix]

  * When we can't find the interface file,
    print the directories we are looking in. [feature]


Type signatures
~~~~~~~~~~~~~~~
  * New flag -ddump-types to print type signatures


Type pruning
~~~~~~~~~~~~
When importing
data T = T1 A | T2 B | T3 C
it seems excessive to import the types A, B, C as well, unless
the constructors T1, T2 etc are used. A,B,C might be more types,
and importing them may mean reading more interfaces, and so on.
 So the idea is that the renamer will just import the decl
data T
unless one of the constructors is used. This turns out to be quite
easy to implement. The downside is that we must make sure the
constructors are always available if they are really needed, so
I regard this as an experimental feature.


Elimininate ThinAir names
~~~~~~~~~~~~~~~~~~~~~~~~~
Eliminate ThinAir.lhs and all its works. It was always a hack, and now
the desugarer carries around an environment I think we can nuke ThinAir
altogether.

As part of this, I had to move all the Prelude RdrName defns from PrelInfo
to PrelMods --- so I renamed PrelMods as PrelNames.

I also had to move the builtinRules so that they are injected by the renamer
(rather than appearing out of the blue in SimplCore). This is if anything simpler.

Miscellaneous
~~~~~~~~~~~~~
* Tidy up the data types involved in Rules

* Eliminate RnEnv.better_provenance; use Name.hasBetterProv instead

* Add Unique.hasKey :: Uniquable a => a -> Unique -> Bool
  It's useful in a lot of places

* Fix a bug in interface file parsing for __U[!]


=======================================
To-do
~~~~~
* Try the effect of enhancing update in place with the CPR
  idea in CoreUnfold.calcUnfoldingGuidance

* Check with Simon M re srt on Lit

* Make all primops return a data type so that we can't over-apply a primop
  This makes code gen simpler. Currently the only primops with a polymorphic
  return type are:
raise# :: a -> b
catch# :: a -> (b->a) -> a
tagToEnum# :: Int -> a

  Very strange code for PrelException.catchException! What has STret got
  to do with it?

* Liberate case

* Missing w/w for coerce in go2 functions of fibToList' in fibheaps

* Watch out for re-boxing in workers; sometimes it happens
  and then w/w is a Bad Thing

* Only two uses of mkCompulsoryUnfolding -- try to nuke it

* Note that mkDupAlt makes alts that have binders that
  are guaranteed to appear just once or not at all
(a,b) -> j a
  Same for case binder, but that's harder to take into account.

* max :: Int -> Int -> Int could be CPRd but isn't.

* In mandel2 we do a little less well than 4.04 because we aren't
  inlining point_colour, and that means we have to box up an argument
  before calling it. [This was due to a bug in 4.04]
  There's also a great opportunity for liberateCase
  in check_radius, where it loops around with two lazy F# built each time

* In PrelShow.itos' we find a thunk like:
tpl = case chrzh {(zpzh {(remIntzh {x{-aMf-} 10}) 48})}
of tpl{-X1j-} __D P { __DEFAULT ->
PrelBase.Czh{-62,s-} {tpl{-X1j-}}
}
  This is a pity. The remInt# can't fail because the divisor isn't 0,
  so we could do the sum eagerly and allocate a charcter instead of a thunk.

* It's good to do let-to-case before we wrap up. Consider
  f b xs = let ys = partition isUpper xs
zs = case ys of (a,b) -> a
           in case b of
True -> case ys of
(a,b) -> (zs,[])
False -> case ys of
(a,b) -> (zs ++ xs,[])
  If we don't do let-to-case at all, we get 3 redundant case ys left.
  On the other hand we don't want to do it too early, because it
  prevents inlining into strict arg positions, which is important for
  rules to work.

* Strict dictionaries.

* INLINE functions are not always INLINEd, so it's sad to leave
  stuff in their bodies like constructors that havn't been inlined.

* If let x = e in b is strict, then CPR can use the CPR info from x
  This bites in the mod method of Integral Int

* Inline wrappers if they are the RHS of a let, so that update in place
  can happen?

* Consider doing unboxing on strict constr args in a pattern match,
  as part of w/w.

* In spectral/expert/Search.ask there's a statically visible CSE. Catching this
  depends almost entirely on chance, which is a pity.

* Think about exprEtaExpandArity in WwLib. Perhaps eliminate eta expand in simplify?
  Perhaps use even if no coerces etc, just eta expansion. (e.g. PrelArr.done)

* In knights/KnightHeuristic, we don't find that possibleMoves is strict
  (with important knock-on effects) unless we apply rules before floating
  out the literal list [A,B,C...].
  Similarly, in f_se (F_Cmp ...) in listcompr (but a smaller effect)

* Floating can float the entire body of an INLINE thing out.
  e.g. PrelArr.done
  This is sad, and a bit stupid.

* In spectral/multiplier, we have
    xor = lift21 forceBit f
      where f :: Bit -> Bit -> Bit
f 0 0 = 0
f 0 1 = 1
f 1 0 = 1
f 1 1 = 0
  Trouble is, f is CPR'd, and that means that instead of returning
  the constants I# 0, I# 1, it returns 0,1 and then boxes them.
  So allocation goes up. I don't see a way around this.

* spectral/hartel/parstof ends up saying
case (unpackCString "x") of { c:cs -> ... }
  quite a bit. We should spot these and behave accordingly.

* Try a different hashing algorithms in hashUFM. This might reduce long CSE lists
  as well as making uniqAway faster.

* [I'm not sure this is really important in the end.]
  Don't float out partial applications in lvlMFE. E.g. (in hPutStr defn of shoveString)
\x -> case .. of
[] -> setBufWPtr a b
...
  setBufWPtr has arity 3. Floating it out is plain silly. And in this particular
  case it's harmful, because it ends up preventing eta expansion on the \x.
  That in turn leads to a big extra cost in hPutStr.

  *** Try not doing lvlMFE on the body of a lambda and case alternative ***

* PrelNumExtra.lhs we get three copies of dropTrailing0s. Too much inlining!
  drop0 has cost 21, but gets a discount of 6 (3 * #constrs) for its arg.
  With a keen-neess factor of 2, that makes a discount of 12. Add two for
  the arguments and we get 21-12-2, which is just small enough to inline.
  But that is plainly stupid.

  Add one for cases; and decrease discount for constructors.

* IO.hGetContents still doesn't see that it is strict in the handle.
  Coerces still getting in the way.

* Try not having really_interesting_cont (subsumed by changes in the
way guidance is calculated for inline things?)

* Enumeration types in worker/wrapper for strictness analysis

* This should be reported as an error:
data T k = MkT (k Int#)

* Bogus report of overlapped pattern for
f (R {field = [c]}) = 1
   f (R {}) = 2
  This shows up for TyCon.tyConSingleDataCon_maybe

* > module Main( main ) where

   > f :: String -> Int
   > f "=<" = 0
   > f "=" = 0
   
   > g :: [Char] -> Int
   > g ['=','<'] = 0
   > g ['='] = 0
   
   > main = return ()
   
   For ``f'' the following is reported.
   
   tmp.lhs:4:
    Pattern match(es) are overlapped in the definition of function `f'
            "=" = ...

   There are no complaints for definition for ``g''.

* Without -O I don't think we need change the module version
  if the usages change; I forget why it changes even with -O

* Record selectors for existential type; no good! What to do?
  Record update doesn't make sense either.

  Need to be careful when figuring out strictness, and when generating
  worker-wrapper split.

  Also when deriving.


Jan 2000
~~~~~~~~

A fairly big pile of work originally aimed at
removing the Con form of Core expression, and replacing it with simple
Lit form. However, I wanted to make sure that the resulting thing
performed better than the original, so I ended up making an absolute
raft of other changes.

Removing the Con form of Core expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The big thing is that

  For every constructor C there are now *two* Ids:

C is the constructor's *wrapper*. It evaluates and unboxes arguments
before calling $wC. It has a perfectly ordinary top-level defn
in the module defining the data type.

$wC is the constructor's *worker*. It is like a primop that simply
allocates and builds the constructor value. Its arguments are the
actual representation arguments of the constructor.

  For every primop P there is *one* Id, its (curried) Id

  Neither contructor worker Id nor the primop Id have a defminition anywhere.
  Instead they are saturated during the core-to-STG pass, and the code generator
  generates code for them directly. The STG language still has saturated
  primops and constructor applications.

* The Const type disappears, along with Const.lhs. The literal part
  of Const.lhs reappears as Literal.lhs. Much tidying up in here,
  to bring all the range checking into this one module.

* I got rid of NoRep literals entirely. They just seem to be too much trouble.

* Because Con's don't exist any more, the funny C { args } syntax
  disappears from inteface files.

* Every constructor, C, comes with a

  *wrapper*, called C, whose type is exactly what it looks like
in the source program. It is an ordinary function,
and it gets a top-level binding like any other function

  *worker*, called $wC, which is the actual data constructor.
Its type may be different to C, because:
- useless dict args are dropped
- strict args may be flattened
It does not have a binding.

  The worker is very like a primop, in that it has no binding,


Parsing
~~~~~~~
* Result type signatures now work
f :: Int -> Int = \x -> x
-- The Int->Int is the type of f

g x y :: Int = x+y
-- The Int is the type of the result of (g x y)


Recompilation checking and make
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* The .hi file for a modules is not touched if it doesn't change. (It used to
  be touched regardless, forcing a chain of recompilations.) The penalty for this
  is that we record exported things just as if they were mentioned in the body of
  the module. And the penalty for that is that we may recompile a module when
  the only things that have changed are the things it is passing on without using.
  But it seems like a good trade.

* -recomp is on by default

Foreign declarations
~~~~~~~~~~~~~~~~~~~~
* If you say
foreign export zoo :: Int -> IO Int
  then you get a C produre called 'zoo', not 'zzoo' as before.
  I've also added a check that complains if you export (or import) a C
  procedure whose name isn't legal C.


Code generation and labels
~~~~~~~~~~~~~~~~~~~~~~~~~~
* Now that constructor workers and wrappers have distinct names, there's
  no need to have a Foo_static_closure and a Foo_closure for constructor Foo.
  I nuked the entire StaticClosure story. This has effects in some of
  the RTS headers (i.e. s/static_closure/closure/g)


Rules, constant folding
~~~~~~~~~~~~~~~~~~~~~~~
* Constant folding becomes just another rewrite rule, attached to the Id for the
  PrimOp. To achieve this, there's a new form of Rule, a BuiltinRule (see CoreSyn.lhs).
  The prelude rules are in prelude/PrelRules.lhs, while simplCore/ConFold.lhs has gone.

* Appending of constant strings now works, using fold/build fusion, plus
  the rewrite rule
unpack "foo" c (unpack "baz" c n) = unpack "foobaz" c n
  Implemented in PrelRules.lhs

* The CCall primop is tidied up quite a bit. There is now a data type CCall,
  defined in PrimOp, that packages up the info needed for a particular CCall.
  There is a new Id for each new ccall, with an big "occurrence name"
{__ccall "foo" gc Int# -> Int#}
  In interface files, this is parsed as a single Id, which is what it is, really.

Miscellaneous
~~~~~~~~~~~~~
* There were numerous places where the host compiler's
  minInt/maxInt was being used as the target machine's minInt/maxInt.
  I nuked all of these; everything is localised to inIntRange and inWordRange,
  in Literal.lhs

* Desugaring record updates was broken: it didn't generate correct matches when
  used withe records with fancy unboxing etc. It now uses matchWrapper.

* Significant tidying up in codeGen/SMRep.lhs

* Add __word, __word64, __int64 terminals to signal the obvious types
  in interface files. Add the ability to print word values in hex into
  C code.

* PrimOp.lhs is no longer part of a loop. Remove PrimOp.hi-boot*


Types
~~~~~
* isProductTyCon no longer returns False for recursive products, nor
  for unboxed products; you have to test for these separately.
  There's no reason not to do CPR for recursive product types, for example.
  Ditto splitProductType_maybe.

Simplification
~~~~~~~~~~~~~~~
* New -fno-case-of-case flag for the simplifier. We use this in the first run
  of the simplifier, where it helps to stop messing up expressions that
  the (subsequent) full laziness pass would otherwise find float out.
  It's much more effective than previous half-baked hacks in inlining.

  Actually, it turned out that there were three places in Simplify.lhs that
  needed to know use this flag.

* Make the float-in pass push duplicatable bindings into the branches of
  a case expression, in the hope that we never have to allocate them.
  (see FloatIn.sepBindsByDropPoint)

* Arrange that top-level bottoming Ids get a NOINLINE pragma
  This reduced gratuitous inlining of error messages.
  But arrange that such things still get w/w'd.

* Arrange that a strict argument position is regarded as an 'interesting'
  context, so that if we see
foldr k z (g x)
  then we'll be inclined to inline g; this can expose a build.

* There was a missing case in CoreUtils.exprEtaExpandArity that meant
  we were missing some obvious cases for eta expansion
  Also improve the code when handling applications.

* Make record selectors (identifiable by their IdFlavour) into "cheap" operations.
[The change is a 2-liner in CoreUtils.exprIsCheap]
  This means that record selection may be inlined into function bodies, which
  greatly improves the arities of overloaded functions.

* Make a cleaner job of inlining "lone variables". There was some distributed
  cunning, but I've centralised it all now in SimplUtils.analyseCont, which
  analyses the context of a call to decide whether it is "interesting".

* Don't specialise very small functions in Specialise.specDefn
  It's better to inline it. Rather like the worker/wrapper case.

* Be just a little more aggressive when floating out of let rhss.
  See comments with Simplify.wantToExpose
  A small change with an occasional big effect.

* Make the inline-size computation think that
case x of I# x -> ...
  is *free*.


CPR analysis
~~~~~~~~~~~~
* Fix what was essentially a bug in CPR analysis. Consider

letrec f x = let g y = let ... in f e1
in
if ... then (a,b) else g x

  g has the CPR property if f does; so when generating the final annotated
  RHS for f, we must use an envt in which f is bound to its final abstract
  value. This wasn't happening. Instead, f was given the CPR tag but g
  wasn't; but of course the w/w pass gives rotten results in that case!!
  (Because f's CPR-ness relied on g's.)

  On they way I tidied up the code in CprAnalyse. It's quite a bit shorter.

  The fact that some data constructors return a constructed product shows
  up in their CPR info (MkId.mkDataConId) not in CprAnalyse.lhs



Strictness analysis and worker/wrapper
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* BIG THING: pass in the demand to StrictAnal.saExpr. This affects situations
  like
f (let x = e1 in (x,x))
  where f turns out to have strictness u(SS), say. In this case we can
  mark x as demanded, and use a case expression for it.

  The situation before is that we didn't "know" that there is the u(SS)
  demand on the argument, so we simply computed that the body of the let
  expression is lazy in x, and marked x as lazily-demanded. Then even after
  f was w/w'd we got

let x = e1 in case (x,x) of (a,b) -> $wf a b

  and hence

let x = e1 in $wf a b

  I found a much more complicated situation in spectral/sphere/Main.shade,
  which improved quite a bit with this change.
 
* Moved the StrictnessInfo type from IdInfo to Demand. It's the logical
  place for it, and helps avoid module loops

* Do worker/wrapper for coerces even if the arity is zero. Thus:
stdout = coerce Handle (..blurg..)
  ==>
wibble = (...blurg...)
stdout = coerce Handle wibble
  This is good because I found places where we were saying
case coerce t stdout of { MVar a ->
...
case coerce t stdout of { MVar b ->
...
  and the redundant case wasn't getting eliminated because of the coerce.



End December
~~~~~~~~~~~~
* Fix a few renamer bugs

* Substantially reorganise the Prelude to eliminate all orphan declarations.
  Details in PrelBase.lhs

* Do a much better job of appending literal strings
   - remove NoRepStr
   - move unpackCString stuff to PrelBase
   - add BuiltinRules to the Rule type
   - add fold/build rules for literal strings

  

Week of Mon 25 Oct
~~~~~~~~~~~~~~~~~~
* Fix a terrible bug in Simplify.mkDupableAlt; we were duplicating a small
  *InAlt*, but doing so invalidated occurrence info, which could lead to
  substantial code duplication.

* Fix a bug in WwLib.mkWWcpr; I was generating CPR wrappers like
I# (case x of ...)
  which is utterly wrong. It should be
case x of ...(I# r)
  (The effect was to make functions stricter than they really are.)

* Try doing no inlining at all in phase 0. This noticeably improved
  spectral/fish (esp Main.hs I think), by improving floating.
  This single change has quite a large effect on some programs (allocation)

   Don't inline Don't inline
wrappers anything
in phase 0 in phase 0
         awards 113k -7.08%
       cichelli 28962k -3.12%
      wave4main 88089k +130.45%
       fibheaps 31731k +19.01%
           fish 8273k -1.64%
      typecheck 148713k +4.91%

  But I found that fish worked much better if we inline *local* things
  in phase 0, but not *imported* things.

* Fix a terrible bug in Simplify.mkLamBndrZapper. It was counting
  type args in one place, but not type binders, so it was sometimes
  inlining into unsaturated lambdas!

* I found that there were some very bad loss-of-arity cases in PrelShow.
  In particular, we had:

showl "" = showChar '"' s
showl ('"':xs) = showString "\\\"" . showl xs
showl (x:xs) = showLitChar x . showl xs

  Trouble is, we get
showl = \xs -> case xs of
...
(x:xs) -> let f = showLitChar x
g = showl xs
in \s -> f (g x)
  which is TERRIBLE. We can't spot that showLitChar has arity 2 because
  it looks like this:

...other eqns...
        showLitChar c = showString ('\\' : asciiTab!!ord c)

  notice that the (asciiTab!!orc c) is outside the \s, so GHC can't rewrite it to

showLitChar c = \s -> showString ('\\' : asciiTab!!ord c) s

  So I've changed PrelShow.showLitChar to use explicit \s. Even then, showl
  doesn't work, because GHC can't see that showl xs can be pushed inside the \s.
  So I've put an explict \s there too.

showl "" s = showChar '"' s
showl ('"':xs) s = showString "\\\"" (showl xs s)
showl (x:xs) s = showLitChar x (showl xs s)

  Net result: imaginary/gen_regexps more than halves in allocation!

  Turns out that the mkLamBndrZapper bug (above) meant that showl was
  erroneously inlining showLitChar x and showl xs, which is why this
  problem hasn't shown up before.
  
* Improve CSE a bit. In ptic
case h x of y -> ...(h x)...
  replaces (h x) by y.

* Inline INLINE things very agressively, even though we get code duplication
  thereby. Reason: otherwise we sometimes call the original un-inlined INLINE
  defns, which have constructors etc still un-inlined in their RHSs. The
  improvement is dramatic for a few programs:

      typecheck 150865k -1.43%
      wave4main 114216k -22.87%
          boyer 28793k -7.86%
       cichelli 33786k -14.28%
            ida 59505k -1.79%
        rewrite 14665k -4.91%
          sched 17641k -4.22%

  Code size increases by 10% which is not so good. There must be a better way.
  Another bad thing showed up in fish/Main.hs. Here we have
(x1,y1) `vec_add` (x2,y2) = (x1+x2, y1+y2)
  which tends to get inlined. But if we first inline (+), it looks big,
  so we don't inline it. Sigh.


* Don't inline constructors in INLINE RHSs. Ever. Otherwise rules don't match.
  E.g. build

* In ebnf2ps/Lexer.uncommentString, it would be a good idea to inline a constructor
  that occurs once in each branch of a case. That way it doesn't get allocated
  in the branches that don't use it. And in fact in this particular case
  something else good happens. So CoreUnfold now does that.

* Reverted to n_val_binders+2 in calcUnfoldingGuidance
  Otherwise wrappers are inlined even if there's no benefit.


Week of Mon 18 Oct
~~~~~~~~~~
* Arrange that simplConArgs works in one less pass than before.
  This exposed a bug: a bogus call to completeBeta.

* Add a top-level flag in CoreUnfolding, used in callSiteInline

* Extend w/w to use etaExpandArity, so it does eta/coerce expansion

* Don't float anything out of an INLINE.
  Don't float things to top level unless they also escape a value lambda.
[see comments with SetLevels.lvlMFE
  Without at least one of these changes, I found that
{-# INLINE concat #-}
concat = __inline (/\a -> foldr (++) [])
  was getting floated to
concat = __inline( /\a -> lvl a )
lvl = ...inlined version of foldr...

  Subsequently I found that not floating constants out of an INLINE
  gave really bad code like
__inline (let x = e in \y -> ...)
  so I now let things float out of INLINE

* Implement inline phases. The meaning of the inline pragmas is
  described in CoreUnfold.lhs

* Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier
  to implement it in SetLevels, and may benefit full laziness too.

Thurs 14 Oct
~~~~~~~~~~~~
* It's a good idea to inline inRange. Consider

index (l,h) i = case inRange (l,h) i of
True -> l+i
False -> error
  inRange itself isn't strict in h, but if it't inlined then 'index'
  *does* become strict in h. Interesting!

* Big change to the way unfoldings and occurrence info is propagated in the simplifier
  The plan is described in Subst.lhs with the Subst type
  Occurrence info is now in a separate IdInfo field than user pragmas

* I found that
(coerce T (coerce S (\x.e))) y
  didn't simplify in one round. First we get to
(\x.e) y
  and only then do the beta. Solution: cancel the coerces in the continuation

* Amazingly, CoreUnfold wasn't counting the cost of a function an application.

Early Oct
~~~~~~~~~
* No commas between for-alls in RULES

* Disable rules in initial simplifier run. Otherwise full laziness
  doesn't get a chance to lift out a MFE before a rule (e.g. fusion)
  zaps it. queens is a case in point

* Improve float-out stuff significantly. The big change is that if we have

\x -> ... /\a -> ...let p = ..a.. in let q = ...p...

  where p's rhs doesn't x, we abstract a from p, so that we can get p past x.
  (We did that before.) But we also substitute (p a) for p in q, and then
  we can do the same thing for q. (We didn't do that, so q got stuck.)
  This is much better. It involves doing a substitution "as we go" in SetLevels,
  though.


Weds 15 Sept
~~~~~~~~~~~~
* exprIsDupable for an application (f e1 .. en) wasn't calling exprIsDupable
  on the arguments!! So applications with few, but large, args were being dupliated.

* sizeExpr on an application wasn't doing a nukeScrutDiscount on the arg of
  an application!! So bogus discounts could accumulate from arguments!

* Improve handling of INLINE pragmas in calcUnfoldingGuidance. It was really
  wrong before

* Substantially improve handling of coerces in worker/wrapper

Tuesday 6 June
~~~~~~~~~~~~~~
* Fix Kevin Atkinson's cant-find-instance bug. Turns out that Rename.slurpSourceRefs
  needs to repeatedly call getImportedInstDecls, and then go back to slurping
  source-refs. Comments with Rename.slurpSourceRefs.

* Add a case to Simplify.mkDupableAlt for the quite-common case where there's
  a very simple alternative, in which case there's no point in creating a
  join-point binding.

* Fix CoreUtils.exprOkForSpeculation so that it returns True of (==# a# b#).
  This lack meant that
case ==# a# b# of { True -> x; False -> x }
  was not simplifying

* Make float-out dump bindings at the top of a function argument, as
  at the top of a let(rec) rhs. See notes with FloatOut.floatRhs

* Make the ArgOf case of mkDupableAlt generate a OneShot lambda.
  This gave a noticeable boost to spectral/boyer2


Monday 5 June
~~~~~~~~~~~~~
Work, using IO.hPutStr as an example, to reduce the number of coerces.
The main idea is in WwLib.mkWWcoerce. The gloss is that we must do
the w/w split even for small non-recursive things. See notes with
WorkWrap.tryWw.


Friday 2 June
~~~~~~~~~~~~~
Study why gen_regexps is slower than before. Problem is in IO.writeLines,
in particular the local defn shoveString. Two things are getting
in the way of arity expansion, which means we build far more function
closures than we should:
shove = \ x -> let lvl = \s -> ...
in \s -> ... lvl ...

The two things are:
a) coerces
b) full laziness floats


Solution to (a): add coerces to the worker/wrapper stuff.
See notes with WwLib.mkWWcoerce.

This further complicated getWorkerId, so I finally bit the bullet and
make the workerInfo field of the IdInfo work properly, including
under substitutions. Death to getWorkerId.



Solution to (b): make all lambdas over realWorldStatePrimTy
into one-shot lambdas. This is a GROSS HACK.

* Also make the occurrence analyser aware of one-shot lambdas.


Thurs 1 June
~~~~~~~~~~~~
Fix SetLevels so that it does not clone top-level bindings, but it
*does* clone bindings that are destined for the top level.

The global invariant is that the top level bindings are always
unique, and never cloned.
Something went wrong with that request. Please try again.