Refactor ThroughputModifications #14

Merged 17 commits on Aug 6, 2018
Changes from 10 commits
Commits (17)
9ac6813
wrote up 4 categories in throughputmodifications.hs, reogred speed up…
David-Durst Jul 30, 2018
c207658
commenting lb better, explaining all 4 categories
David-Durst Jul 30, 2018
8218608
speed up documented according to 4 categories
David-Durst Jul 30, 2018
1d9d301
all of slowdown done except decreaseLBPxPerClock
David-Durst Jul 31, 2018
f4198d0
speedup and slowdown finished documenting
David-Durst Jul 31, 2018
223227c
changed throuhgMult/throughDiv to requestedMult/requestedDiv
David-Durst Jul 31, 2018
6f39e90
reflowed text a little, define newPar in comments
David-Durst Jul 31, 2018
c3352a6
renamed numComb to numTokens and reordered vars for AST.hs and Throughp…
David-Durst Jul 31, 2018
aae0239
reordered args to reduce, rename numComb to numTokens
David-Durst Jul 31, 2018
76c0a48
Swap par and numTokens arg in simhlReduce to match new order.
akeley98 Jul 31, 2018
c56fd50
explaining sequence array repack slow down and speed up
David-Durst Aug 6, 2018
5e3f14e
renamed modifiable, non-modifiable to directly/indirectly scalable
David-Durst Aug 6, 2018
700f5d0
using throughput parameters instead of rate in throughput modificatio…
David-Durst Aug 6, 2018
1e46023
rewrote rate to throughput parameter in rest of code, cleaned up comment
David-Durst Aug 6, 2018
879dfb3
defined parent/leaf and directly/indirectly scalable
David-Durst Aug 6, 2018
cc38780
explaining basic type assumption
David-Durst Aug 6, 2018
c6a3295
explain order and value preservation assumption
David-Durst Aug 6, 2018
14 changes: 8 additions & 6 deletions src/Core/Aetherling/Analysis/Latency.hs
@@ -1,6 +1,8 @@
{-|
Module: Aetherling.Analysis.Latency
Description: Determines the initial latency for how long it takes for a
Description: Compute latency of Aetherling ops

Determines the initial latency for how long it takes for a
pipelined module to receive input, and the max combinational path
for the highest latency, single cycle part of the circuit.
-}
@@ -54,11 +56,11 @@ initialLatency (ArrayReshape _ _) = 1
initialLatency (DuplicateOutputs _ _) = 1

initialLatency (MapOp _ op) = initialLatency op
initialLatency (ReduceOp par numComb op) | par `mod` numComb == 0 && isComb op = 1
initialLatency (ReduceOp par numComb op) | par `mod` numComb == 0 = initialLatency op * (ceilLog par)
initialLatency (ReduceOp par numComb op) =
initialLatency (ReduceOp numTokens par op) | par == numTokens && isComb op = 1
initialLatency (ReduceOp numTokens par op) | par == numTokens = initialLatency op * (ceilLog par)
initialLatency (ReduceOp numTokens par op) =
-- pipelining means we only need to wait on the latency of the tree the first time
reduceTreeInitialLatency + (numComb `ceilDiv` par) * (initialLatency op + registerInitialLatency)
reduceTreeInitialLatency + (numTokens `ceilDiv` par) * (initialLatency op + registerInitialLatency)
where
reduceTreeInitialLatency = initialLatency (ReduceOp par par op)
-- op adds nothing if it's combinational, otherwise its CPS
@@ -127,7 +129,7 @@ maxCombPath (MapOp _ op) = maxCombPath op
maxCombPath (ReduceOp _ par op) | isComb op = maxCombPath op * ceilLog par
-- since connecting each op to a copy, and all are duplicates,
-- maxCombPath is either internal to each op, or from combining two of them
maxCombPath (ReduceOp par numComb op) = max (maxCombPath op) maxCombPathFromOutputToInput
maxCombPath (ReduceOp numTokens par op) = max (maxCombPath op) maxCombPathFromOutputToInput
where
-- since same output goes to both inputs, just take max of input comb path
-- plus output path as that is max path
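
Reviewer's sketch (not part of the diff): the three `ReduceOp` latency cases above can be tried out on a toy model. The `CombOp` and `PipelinedOp` constructors, the value `registerInitialLatency = 1`, and the definitions of `ceilDiv` and `ceilLog` are assumptions made for illustration. Aetherling's real definitions are not shown in this hunk and may differ.

```haskell
-- Toy stand-in for Aetherling's Op type; only what this sketch needs.
data Op
  = CombOp              -- hypothetical combinational leaf with latency 1
  | PipelinedOp Int     -- hypothetical non-combinational leaf with a given latency
  | ReduceOp Int Int Op -- ReduceOp numTokens par op (the new argument order)

-- Integer division rounding up (assumed definition).
ceilDiv :: Int -> Int -> Int
ceilDiv x y = (x + y - 1) `div` y

-- Smallest k such that 2^k >= n (assumed definition).
ceilLog :: Int -> Int
ceilLog n = length (takeWhile (< n) (iterate (* 2) 1))

isComb :: Op -> Bool
isComb CombOp = True
isComb (PipelinedOp _) = False
isComb (ReduceOp numTokens par op) = par == numTokens && isComb op

-- Assumed value; the real constant is defined elsewhere in Aetherling.
registerInitialLatency :: Int
registerInitialLatency = 1

initialLatency :: Op -> Int
initialLatency CombOp = 1
initialLatency (PipelinedOp l) = l
initialLatency (ReduceOp numTokens par op)
  -- fully parallel reduce of a combinational op: one clock
  | par == numTokens && isComb op = 1
  -- fully parallel reduce of a pipelined op: one op latency per tree level
  | par == numTokens = initialLatency op * ceilLog par
  -- partially parallel: tree latency once, then one pass per group of par tokens
  | otherwise =
      treeLatency
        + (numTokens `ceilDiv` par) * (initialLatency op + registerInitialLatency)
  where
    treeLatency = initialLatency (ReduceOp par par op)
```

Under these assumptions, `initialLatency (ReduceOp 8 2 CombOp)` evaluates to 9: 1 for the two-input tree plus 4 passes of (1 + 1) clocks each.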
3 changes: 1 addition & 2 deletions src/Core/Aetherling/Analysis/Metrics.hs
@@ -1,7 +1,6 @@
{-|
Module: Aetherling.Analysis.Metrics
Description: This provides helper functions and metrics like space used in the
analysis of ops
Description: Types and helper functions for analyzing Aetherling ops
-}
module Aetherling.Analysis.Metrics where
import Aetherling.Operations.Types
18 changes: 10 additions & 8 deletions src/Core/Aetherling/Analysis/PortsAndThroughput.hs
@@ -1,6 +1,8 @@
{-|
Module: Aetherling.Analysis.PortsAndThroughput
Description: Determines the input and output ports of an op, the clocks per
Description: Compute interfaces of Aetherling ops

Determines the input and output ports of an op, the clocks per
sequence used to process the inputs on those ports, and the resulting throughput
-}
module Aetherling.Analysis.PortsAndThroughput where
@@ -53,12 +55,12 @@ inPorts (DuplicateOutputs _ op) = inPorts op

inPorts (MapOp par op) = renamePorts "I" $ liftPortsTypes par (inPorts op)
-- take the first port of the op and duplicate it par times, don't duplicate both
-- ports of reducer as reducing numComb things in total, not per port
inPorts (ReduceOp par numComb op) = renamePorts "I" $ map scaleSeqLen $
-- ports of reducer as reducing numTokens things in total, not per port
inPorts (ReduceOp numTokens par op) = renamePorts "I" $ map scaleSeqLen $
liftPortsTypes par $ portToDuplicate $ inPorts op
where
scaleSeqLen (T_Port name origSLen tType pct) =
T_Port name (origSLen * (numComb `ceilDiv` par)) tType pct
T_Port name (origSLen * (numTokens `ceilDiv` par)) tType pct
portToDuplicate ((T_Port name sLen tType pct):_) = [T_Port name sLen tType pct]
portToDuplicate [] = []
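
Reviewer's sketch (not part of the diff): how the reducer's input port is derived from the reduced op's ports. The `Port` record here is a simplified, hypothetical stand-in for `T_Port` (name and sequence length only), and the `liftPortsTypes par` step that widens the token type is left out. `ceilDiv` is an assumed definition.

```haskell
-- Integer division rounding up (assumed definition).
ceilDiv :: Int -> Int -> Int
ceilDiv x y = (x + y - 1) `div` y

-- Hypothetical, simplified port: just a name and a sequence length.
data Port = Port { portName :: String, portSeqLen :: Int }
  deriving (Eq, Show)

-- A reducer consumes numTokens tokens, par at a time, so each input
-- sequence is stretched by the number of groups of par tokens.
scaleSeqLen :: Int -> Int -> Port -> Port
scaleSeqLen numTokens par port =
  port { portSeqLen = portSeqLen port * (numTokens `ceilDiv` par) }

-- Only the first port of the reduced op is kept (as portToDuplicate does),
-- because numTokens counts tokens in total, not per port.
reduceInPorts :: Int -> Int -> [Port] -> [Port]
reduceInPorts numTokens par = map (scaleSeqLen numTokens par) . take 1
```

For example, reducing 16 tokens with parallelism 4 over an op whose first port has sequence length 1 gives a single input port of sequence length 4.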

@@ -236,16 +238,16 @@ clocksPerSequence (MapOp _ op) = cps op
-- reduce needs to get a complete sequence. If less than parallel,
-- need to write to register all but last, if fully parallel or more,
-- reduce is combinational
clocksPerSequence (ReduceOp par numComb op) |
isComb op = combinationalCPS * (numComb `ceilDiv` par)
clocksPerSequence (ReduceOp numTokens par op) |
isComb op = combinationalCPS * (numTokens `ceilDiv` par)
-- Why not include tree height? Because we can always pipeline.
-- Putting inputs in every clock where can accept inputs.
-- Just reset register every numComb/par if not fully parallel.
-- Just reset register every numTokens/par if not fully parallel.
-- What does it mean to reduce a linebuffer?
-- can't. Can't reduce anything with a warmup as this will create
-- an asymmetry between inputs and outputs leading to horrific tree
-- structure
clocksPerSequence (ReduceOp par numComb op) = cps op * (numComb `ceilDiv` par)
clocksPerSequence (ReduceOp numTokens par op) = cps op * (numTokens `ceilDiv` par)

clocksPerSequence (NoOp _) = combinationalCPS
clocksPerSequence (Underutil denom op) = denom * cps op
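
Reviewer's sketch (not part of the diff): both `ReduceOp` cases of `clocksPerSequence` multiply the op's clocks per sequence by the number of groups of `par` tokens. The function name `reduceCPS`, the value `combinationalCPS = 1`, and the definition of `ceilDiv` are assumptions made for illustration.

```haskell
-- Integer division rounding up (assumed definition).
ceilDiv :: Int -> Int -> Int
ceilDiv x y = (x + y - 1) `div` y

-- Assumed value: a combinational op finishes a sequence in one clock.
combinationalCPS :: Int
combinationalCPS = 1

-- Hypothetical helper: clocks per sequence of a reduce, given the clocks
-- per sequence of the reduced op. Tree height is not counted because the
-- tree can be pipelined.
reduceCPS :: Int -> Int -> Int -> Int
reduceCPS opCPS numTokens par = opCPS * (numTokens `ceilDiv` par)
```

With these assumptions, `reduceCPS combinationalCPS 10 4` is 3: ten tokens arriving four per clock need three clocks, with the last group only partly filled.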
10 changes: 5 additions & 5 deletions src/Core/Aetherling/Analysis/Space.hs
@@ -69,16 +69,16 @@ space (MapOp par op) = (space op) |* par
-- area of reduce is area of reduce tree, with area for register for partial
-- results and counter for tracking iteration time if input is sequence of more
-- than what is passed in one clock
space (ReduceOp par numComb op) | par == numComb = (space op) |* (par - 1)
space rOp@(ReduceOp par numComb op) =
space (ReduceOp numTokens par op) | par == numTokens = (space op) |* (par - 1)
space rOp@(ReduceOp numTokens par op) =
reduceTreeSpace |+| (space op) |+| (registerSpace $ map pTType $ outPorts op)
|+| (counterSpace $ numComb * (denominator opThroughput) `ceilDiv` (numerator opThroughput))
|+| (counterSpace $ numTokens * (denominator opThroughput) `ceilDiv` (numerator opThroughput))
where
reduceTreeSpace = space (ReduceOp par par op)
-- need to be able to count all clocks in steady state, as that is when
-- will be doing reset every nth
-- thus, divide numComb by throughput in steady state to get clocks for
-- numComb to be absorbed
-- thus, divide numTokens by throughput in steady state to get clocks for
-- numTokens to be absorbed
-- only need throughput from first port as all ports have same throughput
(PortThroughput _ opThroughput) = portThroughput op $ head $ inPorts op
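
Reviewer's sketch (not part of the diff): the value the counter has to reach, following the comment above (divide `numTokens` by the steady-state throughput). The name `clocksToAbsorb` and the definition of `ceilDiv` are assumptions, and throughput is modelled as a plain `Ratio Int` in tokens per clock.

```haskell
import Data.Ratio (Ratio, denominator, numerator, (%))

-- Integer division rounding up (assumed definition).
ceilDiv :: Int -> Int -> Int
ceilDiv x y = (x + y - 1) `div` y

-- Hypothetical helper: clocks needed to absorb numTokens tokens at the
-- given throughput, i.e. ceiling (numTokens / throughput).
clocksToAbsorb :: Int -> Ratio Int -> Int
clocksToAbsorb numTokens throughput =
  (numTokens * denominator throughput) `ceilDiv` numerator throughput

-- Example throughput of 4 tokens per clock.
exampleThroughput :: Ratio Int
exampleThroughput = 4 % 1
```

The sketch parenthesizes the product explicitly. In Haskell, a backticked operator with no fixity declaration defaults to `infixl 9`, which binds tighter than `*`; whether Aetherling declares a fixity for `ceilDiv` is not visible in this diff, so the grouping in the `counterSpace` expression may be worth checking.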

22 changes: 5 additions & 17 deletions src/Core/Aetherling/Operations/AST.hs
@@ -1,26 +1,14 @@
{-|
Module: Aetherling.Operations.AST
Description: Provides the Aetherling Abstract Syntax Tree (AST) and functions
Description: Aetherling's AST

Provides the Aetherling Abstract Syntax Tree (AST) and functions
for identifying errors in that tree.
-}
module Aetherling.Operations.AST where
import Aetherling.Operations.Types

{-|
The Aetherling Operations. These are split into four groups:
1. Leaf, non-modifiable rate - these are arithmetic, boolean logic,
and bit operations that don't contain any other ops and don't have
a parameter for making them run with a larger or smaller throughput.
2. Leaf, modifiable rate - these are ops like linebuffers,
and space-time type reshapers that have a parameter for changing their
throughput and typically are not mapped over to change their throughput.
These ops don't have child ops
3. Parent, non-modifiable rate - these ops like composeSeq and composePar have
child ops that can have their throughputs' modified, but the parent
op doesn't have a parameter that affects throughput
4. Parent, modifiable rate - map is the canonical example. It has child ops
and can have its throughput modified by changing parallelism.
-}
-- | The operations that can be used to create dataflow DAGs in Aetherling
data Op =
-- LEAF OPS
Add TokenType
@@ -77,7 +65,7 @@ data Op =

-- HIGHER ORDER OPS
| MapOp {mapParallelism :: Int, mappedOp :: Op}
| ReduceOp {reduceParallelism :: Int, reduceNumCombined :: Int, reducedOp :: Op}
| ReduceOp {reduceNumTokens :: Int, reduceParallelism :: Int, reducedOp :: Op}

-- TIMING HELPERS
| NoOp [TokenType]
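
Reviewer's sketch (not part of the diff): because both reordered fields are `Int`, a positional call site that still passes `par` first keeps compiling after this change. Building the value with the record field names from the new constructor avoids depending on argument order. The cut-down `Op` type below is a hypothetical stand-in with only two constructors.

```haskell
-- Hypothetical cut-down Op: one leaf plus the reordered ReduceOp.
data Op
  = Add
  | ReduceOp { reduceNumTokens :: Int, reduceParallelism :: Int, reducedOp :: Op }
  deriving (Eq, Show)

-- Positional construction: correct only if the caller knows the new order.
positional :: Op
positional = ReduceOp 8 2 Add

-- Record construction: field order at the call site does not matter.
named :: Op
named = ReduceOp { reduceParallelism = 2, reduceNumTokens = 8, reducedOp = Add }
```

Both values are equal here, but only the second would stay correct if the fields were reordered again.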
52 changes: 50 additions & 2 deletions src/Core/Aetherling/Operations/Properties.hs
@@ -1,6 +1,8 @@
{-|
Module: Aetherling.Operations.Properties
Description: Describes properties that are intrinsic to operators that do not
Description: Properties of Aetherling ops that don't require analysis

Describes properties that are intrinsic to operators that do not
require any analysis, like if the operator has a combinational path from at
least one input port to one output port.
-}
@@ -41,7 +43,7 @@ isComb (ArrayReshape _ _) = True
isComb (DuplicateOutputs _ _) = True

isComb (MapOp _ op) = isComb op
isComb (ReduceOp par numComb op) | par == numComb = isComb op
isComb (ReduceOp numTokens par op) | par == numTokens = isComb op
isComb (ReduceOp _ _ op) = False

isComb (NoOp tTypes) = True
@@ -52,3 +54,49 @@ isComb (Delay _ op) = False
isComb (ComposePar ops) = length (filter isComb ops) > 0
isComb (ComposeSeq ops) = length (filter isComb ops) > 0
isComb (Failure _) = True

hasInternalState :: Op -> Bool
hasInternalState (Add t) = False
hasInternalState (Sub t) = False
hasInternalState (Mul t) = False
hasInternalState (Div t) = False
hasInternalState (Max t) = False
hasInternalState (Min t) = False
hasInternalState (Ashr _ t) = False
hasInternalState (Shl _ t) = False
hasInternalState (Abs t) = False
hasInternalState (Not t) = False
hasInternalState (And t) = False
hasInternalState (Or t) = False
hasInternalState (XOr t) = False
hasInternalState Eq = False
hasInternalState Neq = False
hasInternalState Lt = False
hasInternalState Leq = False
hasInternalState Gt = False
hasInternalState Geq = False
hasInternalState (LUT _) = False

-- this is meaningless for units that don't have both an input and an output
hasInternalState (MemRead _) = True
hasInternalState (MemWrite _) = True
hasInternalState (LineBuffer _ _ _ _ _) = True
hasInternalState (Constant_Int _) = False
hasInternalState (Constant_Bit _) = False

hasInternalState (SequenceArrayRepack _ _ _) = True
hasInternalState (ArrayReshape _ _) = False
hasInternalState (DuplicateOutputs _ _) = False

hasInternalState (MapOp _ op) = hasInternalState op
hasInternalState (ReduceOp numTokens par op) | par == numTokens = hasInternalState op
hasInternalState (ReduceOp _ _ op) = True

hasInternalState (NoOp tTypes) = False
hasInternalState (Underutil denom op) = hasInternalState op
-- since pipelined, this doesn't affect clocks per stream
hasInternalState (Delay _ op) = hasInternalState op

hasInternalState (ComposePar ops) = length (filter hasInternalState ops) > 0
hasInternalState (ComposeSeq ops) = length (filter hasInternalState ops) > 0
hasInternalState (Failure _) = True
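
Reviewer's sketch (not part of the diff): the recursive structure of `hasInternalState` on a hypothetical cut-down `Op` with one stateless leaf, one stateful leaf, and three parent ops. It uses `any` for the composition case, which gives the same result as `length (filter hasInternalState ops) > 0`.

```haskell
-- Hypothetical cut-down Op for illustration only.
data Op
  = Add                 -- stateless leaf
  | LineBuffer          -- stateful leaf (parameters omitted)
  | MapOp Int Op
  | ReduceOp Int Int Op -- ReduceOp numTokens par op
  | ComposeSeq [Op]

hasInternalState :: Op -> Bool
hasInternalState Add = False
hasInternalState LineBuffer = True
hasInternalState (MapOp _ op) = hasInternalState op
hasInternalState (ReduceOp numTokens par op)
  -- a fully parallel reduce is only a tree of copies of op
  | par == numTokens = hasInternalState op
  -- a partially parallel reduce keeps a register for partial results
  | otherwise = True
-- a composition is stateful if any of its children is
hasInternalState (ComposeSeq ops) = any hasInternalState ops
```

For instance, `ReduceOp 8 8 Add` is stateless in this model, while `ReduceOp 8 2 Add` is stateful because of the accumulating register.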
4 changes: 3 additions & 1 deletion src/Core/Aetherling/Operations/Types.hs
@@ -1,6 +1,8 @@
{-|
Module: Aetherling.Operations.Types
Description: Describes Aetherling's type system and the ports of operations
Description: Type system for interfaces of Aetherling's Ops

Describes Aetherling's type system and the ports of operations
that accept/emit tokens of those types. PortThroughput is also defined here
as it is a type used for comparing ports when composing sequences of ops.
-}