Skip to content
Permalink
Browse files

Major patch to implement the new Demand Analyser

This patch is the result of Ilya Sergey's internship at MSR.  It
constitutes a thorough overhaul and simplification of the demand
analyser.  It makes a solid foundation on which we can now build.
Main changes are

* Instead of having one combined type for Demand, a Demand is
   now a pair (JointDmd) of
      - a StrDmd and
      - an AbsDmd.
   This allows strictness and absence to be though about quite
   orthogonally, and greatly reduces brain melt-down.

* Similarly in the DmdResult type, it's a pair of
     - a PureResult (indicating only divergence/non-divergence)
     - a CPRResult (which deals only with the CPR property

* In IdInfo, the
    strictnessInfo field contains a StrictSig, not a Maybe StrictSig
    demandInfo     field contains a Demand, not a Maybe Demand
  We don't need Nothing (to indicate no strictness/demand info)
  any more; topSig/topDmd will do.

* Remove "boxity" analysis entirely.  This was an attempt to
  avoid "reboxing", but it added complexity, is extremely
  ad-hoc, and makes very little difference in practice.

* Remove the "unboxing strategy" computation. This was an an
  attempt to ensure that a worker didn't get zillions of
  arguments by unboxing big tuples.  But in fact removing it
  DRAMATICALLY reduces allocation in an inner loop of the
  I/O library (where the threshold argument-count had been
  set just too low).  It's exceptional to have a zillion arguments
  and I don't think it's worth the complexity, especially since
  it turned out to have a serious performance hit.

* Remove quite a bit of ad-hoc cruft

* Move worthSplittingFun, worthSplittingThunk from WorkWrap to
  Demand. This allows JointDmd to be fully abstract, examined
  only inside Demand.

Everything else really follows from these changes.

All of this is really just refactoring, so we don't expect
big performance changes, but acutally the numbers look quite
good.  Here is a full nofib run with some highlights identified:

        Program           Size    Allocs   Runtime   Elapsed  TotalMem
--------------------------------------------------------------------------------
         expert          -2.6%    -15.5%      0.00      0.00     +0.0%
          fluid          -2.4%     -7.1%      0.01      0.01     +0.0%
             gg          -2.5%    -28.9%      0.02      0.02    -33.3%
      integrate          -2.6%     +3.2%     +2.6%     +2.6%     +0.0%
        mandel2          -2.6%     +4.2%      0.01      0.01     +0.0%
       nucleic2          -2.0%    -16.3%      0.11      0.11     +0.0%
           para          -2.6%    -20.0%    -11.8%    -11.7%     +0.0%
         parser          -2.5%    -17.9%      0.05      0.05     +0.0%
         prolog          -2.6%    -13.0%      0.00      0.00     +0.0%
         puzzle          -2.6%     +2.2%     +0.8%     +0.8%     +0.0%
        sorting          -2.6%    -35.9%      0.00      0.00     +0.0%
       treejoin          -2.6%    -52.2%     -9.8%     -9.9%     +0.0%
--------------------------------------------------------------------------------
            Min          -2.7%    -52.2%    -11.8%    -11.7%    -33.3%
            Max          -1.8%     +4.2%    +10.5%    +10.5%     +7.7%
 Geometric Mean          -2.5%     -2.8%     -0.4%     -0.5%     -0.4%

Things to note

* Binary sizes are smaller. I don't know why, but it's good.

* Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
  (I checked treejoin, gg, sorting) arise from one place, namely a function
  GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
  which have several arugments.  Not w/w'ing both arguments (which is what
  we did before) has a big effect.  So the big win in actually somewhat
  accidental, gained by removing the "unboxing strategy" code.

* A couple of benchmarks allocate slightly more.  This turns out
  to be due to reboxing (integrate).  But the biggest increase is
  mandel2, and *that* turned out also to be a somewhat accidental
  loss of CSE, and pointed the way to doing better CSE: see Trac
  #7596.

* Runtimes are never very reliable, but seem to improve very slightly.

All in all, a good piece of work.  Thank you Ilya!
  • Loading branch information...
simonpj committed Jan 17, 2013
1 parent aef38d1 commit 0831a12ea2fc73c33652eeec1adc79fa19700578

Large diffs are not rendered by default.

@@ -38,15 +38,15 @@ module Id (
recordSelectorFieldLabel,
-- ** Modifying an Id
setIdName, setIdUnique, Id.setIdType,
setIdExported, setIdNotExported,
globaliseId, localiseId,
setIdName, setIdUnique, Id.setIdType,
setIdExported, setIdNotExported,
globaliseId, localiseId,
setIdInfo, lazySetIdInfo, modifyIdInfo, maybeModifyIdInfo,
zapLamIdInfo, zapDemandIdInfo, zapFragileIdInfo, transferPolyIdInfo,
zapIdStrictness,
-- ** Predicates on Ids
isImplicitId, isDeadBinder,
isImplicitId, isDeadBinder,
isStrictId,
isExportedId, isLocalId, isGlobalId,
isRecordSelector, isNaughtyRecordSelector,
@@ -69,9 +69,7 @@ module Id (
setOneShotLambda, clearOneShotLambda,
-- ** Reading 'IdInfo' fields
idArity,
idDemandInfo, idDemandInfo_maybe,
idStrictness, idStrictness_maybe,
idArity,
idUnfolding, realIdUnfolding,
idSpecialisation, idCoreRules, idHasRules,
idCafInfo,
@@ -82,12 +80,17 @@ module Id (
setIdUnfoldingLazily,
setIdUnfolding,
setIdArity,
setIdDemandInfo,
setIdStrictness, zapIdStrictness,
setIdSpecialisation,
setIdCafInfo,
setIdOccInfo, zapIdOccInfo,
setIdDemandInfo,
setIdStrictness,
idDemandInfo,
idStrictness,
) where
#include "HsVersions.h"
@@ -127,12 +130,14 @@ infixl 1 `setIdUnfoldingLazily`,
`setIdUnfolding`,
`setIdArity`,
`setIdOccInfo`,
`setIdDemandInfo`,
`setIdStrictness`,
`setIdSpecialisation`,
`setInlinePragma`,
`setInlineActivation`,
`idCafInfo`
`idCafInfo`,
`setIdDemandInfo`,
`setIdStrictness`
\end{code}

%************************************************************************
@@ -464,17 +469,14 @@ idRepArity x = typeRepArity (idArity x) (idType x)
isBottomingId :: Id -> Bool
isBottomingId id = isBottomingSig (idStrictness id)
idStrictness_maybe :: Id -> Maybe StrictSig
idStrictness :: Id -> StrictSig
idStrictness_maybe id = strictnessInfo (idInfo id)
idStrictness id = idStrictness_maybe id `orElse` topSig
idStrictness id = strictnessInfo (idInfo id)
setIdStrictness :: Id -> StrictSig -> Id
setIdStrictness id sig = modifyIdInfo (`setStrictnessInfo` Just sig) id
setIdStrictness id sig = modifyIdInfo (`setStrictnessInfo` sig) id
zapIdStrictness :: Id -> Id
zapIdStrictness id = modifyIdInfo (`setStrictnessInfo` Nothing) id
zapIdStrictness id = modifyIdInfo (`setStrictnessInfo` topSig) id
-- | This predicate says whether the 'Id' has a strict demand placed on it or
-- has a type such that it can always be evaluated strictly (e.g., an
@@ -485,8 +487,9 @@ zapIdStrictness id = modifyIdInfo (`setStrictnessInfo` Nothing) id
isStrictId :: Id -> Bool
isStrictId id
= ASSERT2( isId id, text "isStrictId: not an id: " <+> ppr id )
(isStrictDmd (idDemandInfo id)) ||
(isStrictType (idType id))
(isStrictType (idType id)) ||
-- Take the best of both strictnesses - old and new
(isStrictDmd (idDemandInfo id))
---------------------------------
-- UNFOLDING
@@ -508,14 +511,11 @@ setIdUnfoldingLazily id unfolding = modifyIdInfo (`setUnfoldingInfoLazily` unfol
setIdUnfolding :: Id -> Unfolding -> Id
setIdUnfolding id unfolding = modifyIdInfo (`setUnfoldingInfo` unfolding) id
idDemandInfo_maybe :: Id -> Maybe Demand
idDemandInfo :: Id -> Demand
idDemandInfo_maybe id = demandInfo (idInfo id)
idDemandInfo id = demandInfo (idInfo id) `orElse` topDmd
idDemandInfo id = demandInfo (idInfo id)
setIdDemandInfo :: Id -> Demand -> Id
setIdDemandInfo id dmd = modifyIdInfo (`setDemandInfo` Just dmd) id
setIdDemandInfo id dmd = modifyIdInfo (`setDemandInfo` dmd) id
---------------------------------
-- SPECIALISATION
@@ -654,11 +654,11 @@ zapInfo zapper id = maybeModifyIdInfo (zapper (idInfo id)) id
zapLamIdInfo :: Id -> Id
zapLamIdInfo = zapInfo zapLamInfo
zapFragileIdInfo :: Id -> Id
zapFragileIdInfo = zapInfo zapFragileInfo
zapDemandIdInfo :: Id -> Id
zapDemandIdInfo = zapInfo zapDemandInfo
zapFragileIdInfo :: Id -> Id
zapFragileIdInfo = zapInfo zapFragileInfo
\end{code}

Note [transferPolyIdInfo]
@@ -725,11 +725,12 @@ transferPolyIdInfo old_id abstract_wrt new_id
old_inline_prag = inlinePragInfo old_info
old_occ_info = occInfo old_info
new_arity = old_arity + arity_increase
old_strictness = strictnessInfo old_info
new_strictness = fmap (increaseStrictSigArity arity_increase) old_strictness
new_strictness = increaseStrictSigArity arity_increase old_strictness
transfer new_info = new_info `setStrictnessInfo` new_strictness
`setArityInfo` new_arity
transfer new_info = new_info `setArityInfo` new_arity
`setInlinePragInfo` old_inline_prag
`setOccInfo` old_occ_info
`setStrictnessInfo` new_strictness
\end{code}
@@ -25,7 +25,9 @@ module IdInfo (
seqIdInfo, megaSeqIdInfo,
-- ** Zapping various forms of Info
zapLamInfo, zapDemandInfo, zapFragileInfo,
zapLamInfo, zapFragileInfo,
zapDemandInfo,
-- ** The ArityInfo type
ArityInfo,
@@ -82,12 +84,10 @@ import BasicTypes
import DataCon
import TyCon
import ForeignCall
import Demand
import Outputable
import Module
import FastString
import Data.Maybe
import Demand
-- infixl so you can say (id `set` a `set` b)
infixl 1 `setSpecInfo`,
@@ -203,14 +203,10 @@ data IdInfo
inlinePragInfo :: InlinePragma, -- ^ Any inline pragma atached to the 'Id'
occInfo :: OccInfo, -- ^ How the 'Id' occurs in the program
strictnessInfo :: Maybe StrictSig, -- ^ Id strictness information. Reason for Maybe:
-- the DmdAnal phase needs to know whether
-- this is the first visit, so it can assign botSig.
-- Other customers want topSig. So @Nothing@ is good.
strictnessInfo :: StrictSig, -- ^ A strictness signature
demandInfo :: Demand -- ^ ID demand information
demandInfo :: Maybe Demand -- ^ Id demand information. Similarly we want to know
-- if there's no known demand yet, for when we are looking
-- for CPR info
}
-- | Just evaluate the 'IdInfo' to WHNF
@@ -227,20 +223,18 @@ megaSeqIdInfo info
-- some unfoldings are not calculated at all
-- seqUnfolding (unfoldingInfo info) `seq`
seqDemandInfo (demandInfo info) `seq`
seqDemandInfo (demandInfo info) `seq`
seqStrictnessInfo (strictnessInfo info) `seq`
seqCaf (cafInfo info) `seq`
seqLBVar (lbvarInfo info) `seq`
seqOccInfo (occInfo info)
seqStrictnessInfo :: Maybe StrictSig -> ()
seqStrictnessInfo Nothing = ()
seqStrictnessInfo (Just ty) = seqStrictSig ty
seqStrictnessInfo :: StrictSig -> ()
seqStrictnessInfo ty = seqStrictSig ty
seqDemandInfo :: Maybe Demand -> ()
seqDemandInfo Nothing = ()
seqDemandInfo (Just dmd) = seqDemand dmd
seqDemandInfo :: Demand -> ()
seqDemandInfo dmd = seqDemand dmd
\end{code}

Setters
@@ -275,10 +269,10 @@ setCafInfo info caf = info { cafInfo = caf }
setLBVarInfo :: IdInfo -> LBVarInfo -> IdInfo
setLBVarInfo info lb = {-lb `seq`-} info { lbvarInfo = lb }
setDemandInfo :: IdInfo -> Maybe Demand -> IdInfo
setDemandInfo info dd = dd `seq` info { demandInfo = dd }
setDemandInfo :: IdInfo -> Demand -> IdInfo
setDemandInfo info dd = dd `seq` info { demandInfo = dd }
setStrictnessInfo :: IdInfo -> Maybe StrictSig -> IdInfo
setStrictnessInfo :: IdInfo -> StrictSig -> IdInfo
setStrictnessInfo info dd = dd `seq` info { strictnessInfo = dd }
\end{code}

@@ -295,8 +289,8 @@ vanillaIdInfo
lbvarInfo = NoLBVarInfo,
inlinePragInfo = defaultInlinePragma,
occInfo = NoOccInfo,
demandInfo = Nothing,
strictnessInfo = Nothing
demandInfo = topDmd,
strictnessInfo = topSig
}
-- | More informative 'IdInfo' we can use when we know the 'Id' has no CAF references
@@ -363,9 +357,8 @@ type InlinePragInfo = InlinePragma
%************************************************************************

\begin{code}
pprStrictness :: Maybe StrictSig -> SDoc
pprStrictness Nothing = empty
pprStrictness (Just sig) = ppr sig
pprStrictness :: StrictSig -> SDoc
pprStrictness sig = ppr sig
\end{code}


@@ -524,7 +517,7 @@ zapLamInfo info@(IdInfo {occInfo = occ, demandInfo = demand})
| is_safe_occ occ && is_safe_dmd demand
= Nothing
| otherwise
= Just (info {occInfo = safe_occ, demandInfo = Nothing})
= Just (info {occInfo = safe_occ, demandInfo = topDmd})
where
-- The "unsafe" occ info is the ones that say I'm not in a lambda
-- because that might not be true for an unsaturated lambda
@@ -535,16 +528,13 @@ zapLamInfo info@(IdInfo {occInfo = occ, demandInfo = demand})
OneOcc _ once int_cxt -> OneOcc insideLam once int_cxt
_other -> occ
is_safe_dmd Nothing = True
is_safe_dmd (Just dmd) = not (isStrictDmd dmd)
is_safe_dmd dmd = not (isStrictDmd dmd)
\end{code}

\begin{code}
-- | Remove demand info on the 'IdInfo' if it is present, otherwise return @Nothing@
zapDemandInfo :: IdInfo -> Maybe IdInfo
zapDemandInfo info@(IdInfo {demandInfo = dmd})
| isJust dmd = Just (info {demandInfo = Nothing})
| otherwise = Nothing
zapDemandInfo info = Just (info {demandInfo = topDmd})
\end{code}

\begin{code}
@@ -230,7 +230,6 @@ Hence we translate to
-- Coercion from family type to representation type
Co7T a :: T [a] ~ :R7T a


Note [Newtype datacons]
~~~~~~~~~~~~~~~~~~~~~~~
The "data constructor" for a newtype should always be vanilla. At one
@@ -286,10 +285,10 @@ mkDictSelId dflags no_unf name clas
-- to get (say) C a -> (a -> a)
base_info = noCafIdInfo
`setArityInfo` 1
`setStrictnessInfo` Just strict_sig
`setUnfoldingInfo` (if no_unf then noUnfolding
else mkImplicitUnfolding dflags rhs)
`setArityInfo` 1
`setStrictnessInfo` strict_sig
`setUnfoldingInfo` (if no_unf then noUnfolding
else mkImplicitUnfolding dflags rhs)
-- In module where class op is defined, we must add
-- the unfolding, even though it'll never be inlined
-- becuase we use that to generate a top-level binding
@@ -318,10 +317,12 @@ mkDictSelId dflags no_unf name clas
-- where the V depends on which item we are selecting
-- It's worth giving one, so that absence info etc is generated
-- even if the selector isn't inlined
strict_sig = mkStrictSig (mkTopDmdType [arg_dmd] TopRes)
strict_sig = mkStrictSig (mkTopDmdType [arg_dmd] topRes)
arg_dmd | new_tycon = evalDmd
| otherwise = Eval (Prod [ if the_arg_id == id then evalDmd else Abs
| id <- arg_ids ])
| otherwise = mkProdDmd [ if the_arg_id == id then evalDmd else absDmd
| id <- arg_ids ]
tycon = classTyCon clas
new_tycon = isNewTyCon tycon
@@ -384,7 +385,7 @@ mkDataConWorkId wkr_name data_con
wkr_arity = dataConRepArity data_con
wkr_info = noCafIdInfo
`setArityInfo` wkr_arity
`setStrictnessInfo` Just wkr_sig
`setStrictnessInfo` wkr_sig
`setUnfoldingInfo` evaldUnfolding -- Record that it's evaluated,
-- even if arity = 0
@@ -428,9 +429,9 @@ dataConCPR con
, isDataTyCon tycon
, wkr_arity > 0
, wkr_arity <= mAX_CPR_SIZE
= retCPR
= cprRes
| otherwise
= TopRes
= topRes
-- RetCPR is only true for products that are real data types;
-- that is, not unboxed tuples or [non-recursive] newtypes
where
@@ -486,15 +487,15 @@ mkDataConRep dflags fam_envs wrap_name data_con
-- applications are treated as values
`setInlinePragInfo` alwaysInlinePragma
`setUnfoldingInfo` wrap_unf
`setStrictnessInfo` Just wrap_sig
`setStrictnessInfo` wrap_sig
-- We need to get the CAF info right here because TidyPgm
-- does not tidy the IdInfo of implicit bindings (like the wrapper)
-- so it not make sure that the CAF info is sane
wrap_sig = mkStrictSig (mkTopDmdType wrap_arg_dmds (dataConCPR data_con))
wrap_arg_dmds = map mk_dmd (dropList eq_spec wrap_bangs)
mk_dmd str | isBanged str = evalDmd
| otherwise = lazyDmd
| otherwise = topDmd
-- The Cpr info can be important inside INLINE rhss, where the
-- wrapper constructor isn't inlined.
-- And the argument strictness can be important too; we
@@ -891,10 +892,10 @@ mkPrimOpId prim_op
id = mkGlobalId (PrimOpId prim_op) name ty info
info = noCafIdInfo
`setSpecInfo` mkSpecInfo (maybeToList $ primOpRules name prim_op)
`setArityInfo` arity
`setStrictnessInfo` Just strict_sig
`setInlinePragInfo` neverInlinePragma
`setSpecInfo` mkSpecInfo (maybeToList $ primOpRules name prim_op)
`setArityInfo` arity
`setStrictnessInfo` strict_sig
`setInlinePragInfo` neverInlinePragma
-- We give PrimOps a NOINLINE pragma so that we don't
-- get silly warnings from Desugar.dsRule (the inline_shadows_rule
-- test) about a RULE conflicting with a possible inlining
@@ -924,12 +925,12 @@ mkFCallId dflags uniq fcall ty
info = noCafIdInfo
`setArityInfo` arity
`setStrictnessInfo` Just strict_sig
`setStrictnessInfo` strict_sig
(_, tau) = tcSplitForAllTys ty
(arg_tys, _) = tcSplitFunTys tau
arity = length arg_tys
strict_sig = mkStrictSig (mkTopDmdType (replicate arity evalDmd) TopRes)
(_, tau) = tcSplitForAllTys ty
(arg_tys, _) = tcSplitFunTys tau
arity = length arg_tys
strict_sig = mkStrictSig (mkTopDmdType (replicate arity evalDmd) topRes)
\end{code}


0 comments on commit 0831a12

Please sign in to comment.
You can’t perform that action at this time.