# SST Memory Model

There are a few assumptions built into this model:
- There is only one type of component, which has no subcomponents
- There is no use of SST's statistics features
- If this models a parallel run, that run uses `--parallel-load=SINGLE` with a Python configuration file

## Threats to validity and other notes:

- I model the memory used by vectors by multiplying the element size by the number of elements (usually the number of components or links). However, some space may be wasted by unused capacity.
  - Because of demand paging, though, that may not be a large concern.
- This model does not account for storage spent on in-flight events. It only accounts for the cost of the model itself (components and links).
- The model does not account for memory fragmentation.
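
To get a feel for how much space unused capacity could waste, here is a minimal sketch. It assumes a vector that starts empty and doubles its capacity whenever it fills; growth factors differ between standard library implementations, and `capacity_after_pushes` is a hypothetical helper, not part of SST or of this model.

```python
def capacity_after_pushes(n, growth=2):
    """Capacity after n push_back calls, assuming the vector starts empty
    and multiplies its capacity by an integer `growth` each time it fills."""
    capacity = 0
    while capacity < n:
        capacity = max(capacity + 1, capacity * growth)
    return capacity

num_elements = 1000 * 1000
sizeof_ptr = 8  # assumed pointer size in bytes

cap = capacity_after_pushes(num_elements)
slack_bytes = (cap - num_elements) * sizeof_ptr
print("capacity:", cap, "slack bytes:", slack_bytes)
```

Under these assumptions a million-element vector of pointers carries a few hundred kilobytes of slack, which is small next to the per-component costs modeled below.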

## User supplied parameters to the model

In [15]:
numComps = 1000*1000
numLinks = numComps*8
numGhostComps = 0
numCrossPartitionLinks = 0

# Compiler/environment-specific values. You can compute these for your compiler using the following small C++ program:
#
#  #include <cstdio>
#  #include <cstdint>
#  #include <string>
#  #include <vector>
#  #include <map>
#  #define S(LABEL, T) printf("sizeof_" LABEL " = %zu\n", sizeof(T));
#  int main() { typedef std::map<int, int> map; S("int", int); S("float", float); S("double", double); S("ptr", void*);
#    S("uintptr", uintptr_t); S("string", std::string); S("vector", std::vector<int>); S("map", map); return 0; }
sizeof_int              = 4
sizeof_float            = 4
sizeof_double           = 8
sizeof_ptr              = 8
sizeof_uintptr          = 8
sizeof_string           = 32
sizeof_vector           = 24
sizeof_map              = 48

# SST uses several string identifiers stored as std::string objects. These identifiers will be embedded inside of the C++
# string if they're less than the Small String Optimization (SSO) size. You can determine what this is for your compiler
# with the following small program:
#
#  #include <iostream>
#  #include <string>
#  int main() { std::cout << "sso_buf_size = " << std::string().capacity() << std::endl; return 0; }
sso_buf_size = 15

# These values may be a little difficult to estimate. They are the values of `.capacity()` for strings supplied in the user's
# model configuration. If you expect these to all be less than sso_buf_size, you can set them to zero. Later in the script
# we reset any of these values that are less than sso_buf_size to 0.
avgStringCap_componentName = 0
avgStringCap_componentType = 0
avgStringCap_paramValue    = 0
avgStringCap_linkName      = 0
avgStringCap_portName      = 0
avgStringCap_linkLatency   = 0

# We need to account for data in the user-written component itself.
# Look at the header file for your component and sum the sizes of all the data members it stores. Also check whether any data
# is heap allocated (e.g. allocated using `new`) in the constructor. If event handlers are allocated, also account
# for the 40 bytes that each of them takes.
userComponentData = \
    12  * sizeof_int + sizeof_double + sizeof_float + sizeof_string + sizeof_vector + 11 * sizeof_ptr + 624 * sizeof_int + 40

# This is the number of parameters the component takes in the SST_ELI_DOCUMENT_PARAMS macro
numParamsInUserComponent = 13

# Aside from the data stored for the key and value themselves, this value represents how much data std::map stores per element.
# Most implementations are a red-black tree, and each node of the tree might have three pointers (to two children and the parent)
# and a color value. Unfortunately I don't know of a way to easily compute this.
cost_per_element_in_map   = 32

# If you would like to model padding, set these values. To determine them, I take the difference between
# the size reported by C++'s sizeof() operator and the sum of sizeof() over
# each field in the data structure. If you would like to do the same, set all these values to zero, run the notebook,
# and then use the code in the cell at the end of this notebook.
paddingFor_rankInfo         =  0
paddingFor_params           =  7
paddingFor_configStatistics =  0 #22
paddingFor_configComponent  =  54
paddingFor_configLink       =  7
paddingFor_componentInfo    =  18
paddingFor_link             =  0
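
The value chosen for `cost_per_element_in_map` can be sanity-checked with a little arithmetic. The sketch below assumes a red-black tree node header that holds a color field plus parent, left-child, and right-child pointers, rounded up to pointer alignment. The 4-byte color size is an assumption about the implementation, allocator bookkeeping is ignored, and real standard libraries may lay nodes out differently.

```python
sizeof_ptr = 8     # assumed pointer size in bytes
sizeof_color = 4   # assumption: color stored as a 4-byte enum

def round_up(n, alignment):
    """Round n up to the next multiple of alignment."""
    return -(-n // alignment) * alignment

# color + parent + left + right, padded to pointer alignment
node_header = round_up(sizeof_color + 3 * sizeof_ptr, sizeof_ptr)
print("estimated per-node overhead:", node_header)
```

Under these assumptions the estimate comes out to 32 bytes, which agrees with the value used above.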

### Parameterizing to certain benchmarks

In [16]:
# Here are some parameter values for a handful of benchmarks. Set to '' if you don't
# want to change the values from above

model = ''

#model = 'phold'
#model = 'gol'
#model = 'pp'
#model = 'none'

if model == 'phold':
    numComps = 1000*1000
    numLinks = (numComps*9)
    numGhostComps = 0
    numCrossPartitionLinks = 0
    userComponentData = 12 * sizeof_int + sizeof_double + sizeof_float + sizeof_string + sizeof_vector + 11 * sizeof_ptr + 624 * sizeof_int + 40
    numParamsInUserComponent = 13

elif model == 'gol':
    numComps = 1000*1000
    numLinks = (numComps*8)
    numGhostComps = 0
    numCrossPartitionLinks = 0
    userComponentData = sizeof_int*2 + sizeof_ptr * 8 + 9 * 40
    numParamsInUserComponent = 3
  
elif model == 'pp':
    numComps = 1000*1000
    numLinks = (numComps*4)
    numGhostComps = 0
    numCrossPartitionLinks = 0
    userComponentData = 8*4 + sizeof_ptr * 4 + 4 * 40
    numParamsInUserComponent = 4

elif model == 'none':
    numComps = 0
    numLinks = 0
    numGhostComps = 0
    numCrossPartitionLinks = 0
    userComponentData = 0
    numParamsInUserComponent = 0

### Clean up user parameters

In [17]:
def zero_if_under_sso(value):
    return 0 if value < sso_buf_size else value

avgStringCap_componentName = zero_if_under_sso(avgStringCap_componentName)
avgStringCap_componentType = zero_if_under_sso(avgStringCap_componentType)
avgStringCap_paramValue    = zero_if_under_sso(avgStringCap_paramValue)
avgStringCap_linkName      = zero_if_under_sso(avgStringCap_linkName)
avgStringCap_portName      = zero_if_under_sso(avgStringCap_portName)
avgStringCap_linkLatency   = zero_if_under_sso(avgStringCap_linkLatency)
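
If you have a sample of the identifiers your configuration generates, the average heap capacity can be approximated from it. This is a rough sketch: it assumes a heap-allocated string's capacity equals its length (real implementations may allocate more), and `avg_heap_capacity` and `sample_names` are hypothetical names, not part of SST. It also differs slightly from `zero_if_under_sso`, which thresholds the average rather than each string.

```python
sso_buf_size = 15  # same value as assumed earlier in the notebook

def avg_heap_capacity(names, sso=sso_buf_size):
    """Average heap bytes per string; strings that fit in the SSO buffer cost 0."""
    if not names:
        return 0
    heap_bytes = [len(name) if len(name) > sso else 0 for name in names]
    return sum(heap_bytes) / len(heap_bytes)

# Hypothetical component names; only the last one exceeds the SSO buffer
sample_names = ["comp_0", "comp_123456", "component_with_long_name_42"]
print("average heap capacity:", avg_heap_capacity(sample_names))
```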

## The Model Itself

### Misc. utility values and functions used in the remainder of the script:

In [18]:
sizeof_bool            = 1
sizeof_uint8           = 1
sizeof_uint16          = 2
sizeof_uint32          = 4
sizeof_uint64          = 8

avgLinksPerComp = 0 if numComps == 0 else (numLinks/numComps)*2  # multiplied by 2 since links are directional in the sim representation

def sumTupSeconds(iterable):
    return sum(x[1] for x in iterable)

def format_size(value_in_bytes, decimals=2):
    if value_in_bytes == 0:
        return "0 Bytes"
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    power = 0
    while value_in_bytes >= 1024 and power < len(units) - 1:
        value_in_bytes /= 1024
        power += 1
    return f"{value_in_bytes:.{decimals}f} {units[power]}"

### Size of relevant SST data structures

In [19]:
# Fields marked [1] are string-based identifiers that we may want to replace with ints.
# Fields marked [2] are statistics fields that we may want to remove for certain builds
# of SST (when the user doesn't need them).

sizeof_componentId_t   = sizeof_uint64
sizeof_linkId_t        = sizeof_uint32
sizeof_statisticId_t   = sizeof_uint64
sizeof_simTime_t       = sizeof_uint64

fields_rankInfo = [
    ('rank'     , sizeof_uint32),
    ('thread'   , sizeof_uint32),
    ('(vtable)' , sizeof_ptr)]
sizeof_rankInfo = sumTupSeconds(fields_rankInfo) + paddingFor_rankInfo

fields_params = [
    ('my_data'        , sizeof_map),
    ('data'           , sizeof_vector),
    ('allowedKeys'    , sizeof_vector),
    ('verify_enabled' , sizeof_bool)]
sizeof_params = sumTupSeconds(fields_params) + paddingFor_params

fields_configStatistic = [
    ('id'     , sizeof_statisticId_t),
    ('params' , sizeof_params),
    ('shared' , sizeof_bool),
    ('name'   , sizeof_string)]   # [1]
sizeof_configStatistic = sumTupSeconds(fields_configStatistic) + paddingFor_configStatistics

fields_configComponent = [
    ('id'               , sizeof_componentId_t),
    ('graph'            , sizeof_ptr),
    ('name'             , sizeof_string),  # [1]
    ('slot_num'         , sizeof_int),
    ('type'             , sizeof_string),  # [1]
    ('weight'           , sizeof_float),
    ('rank'             , sizeof_rankInfo),
    ('links'            , sizeof_vector),
    ('params'           , sizeof_params),
    ('statLoadLevel'    , sizeof_uint8),   # [2]
    ('portModules'      , sizeof_map),
    ('enabledStatNames' , sizeof_map),     # [2]
    ('enabledAllStats'  , sizeof_bool),    # [2]
    ('allStatConfig'    , sizeof_configStatistic), # [2]
    ('subComponents'    , sizeof_vector),
    ('coords'           , sizeof_vector),
    ('nextSubID'        , sizeof_uint16),
    ('nextStatID'       , sizeof_uint16), # [2]
    ('visited'          , sizeof_bool),
    ('statistics'       , sizeof_map),    # [2]
    ('(vtable)'         , sizeof_ptr)]
sizeof_configComponent = sumTupSeconds(fields_configComponent) + paddingFor_configComponent

fields_configLink = [
    ('id'          , sizeof_linkId_t),
    ('name'        , sizeof_string),  # [1]
    ('component'   , 2*sizeof_componentId_t),
    ('port'        , 2*sizeof_string), # [1]
    ('latency'     , 2*sizeof_simTime_t),
    ('latency_str' , 2*sizeof_string),
    ('order'       , sizeof_linkId_t),
    ('no_cut'      , sizeof_bool),
    ('(vtable)'    , sizeof_ptr)]
sizeof_configLink = sumTupSeconds(fields_configLink) + paddingFor_configLink

fields_componentInfo = [
    ('id'                  , sizeof_componentId_t),
    ('parent_info'         , sizeof_ptr),
    ('name'                , sizeof_string), # [1]
    ('type'                , sizeof_string), # [1]
    ('link_map'            , sizeof_ptr),
    ('component'           , sizeof_ptr),
    ('subComponents'       , sizeof_map),
    ('params'              , sizeof_ptr),
    ('defaultTimeBase'     , sizeof_ptr),
    ('portModules'         , sizeof_ptr),
    ('stat_configs_'       , sizeof_ptr),   # [2]
    ('enabled_stat_names_' , sizeof_ptr),   # [2]
    ('enabled_all_stats_'  , sizeof_bool),  # [2]
    ('all_stat_config_'    , sizeof_ptr),   # [2]
    ('statLoadLevel'       , sizeof_uint8), # [2]
    ('coordinates'         , sizeof_vector),
    ('subIDIndex'          , sizeof_uint64),
    ('slot_name'           , sizeof_string),
    ('slot_num'            , sizeof_int),
    ('share_flags'         , sizeof_uint64)]
sizeof_componentInfo = sumTupSeconds(fields_componentInfo) + paddingFor_componentInfo

fields_link = [
    ('send_queue'      , sizeof_ptr),
    ('delivery_info'   , sizeof_uintptr),
    ('defaultTimeBase' , sizeof_simTime_t),
    ('latency'         , sizeof_simTime_t),
    ('pair_link'       , sizeof_ptr),
    ('current_time'    , sizeof_ptr),
    ('type'            , sizeof_uint16),
    ('mode'            , sizeof_uint16),
    ('tag'             , sizeof_linkId_t),
    ('attached_tools'  , sizeof_ptr)]
sizeof_link = sumTupSeconds(fields_link) + paddingFor_link

fields_baseComponent = [
    ('my_info'                        , sizeof_ptr),
    ('sim_'                           , sizeof_ptr),
    ('isExtension'                    , sizeof_bool),
    ('clock_handlers'                 , sizeof_vector),
    ('portModules'                    , sizeof_vector),
    ('m_explicitlyEnabledSharedStats' , sizeof_map),
    ('m_explicitlyEnabledUniqueStats' , sizeof_map),
    ('m_enabled_all_stats_'           , sizeof_map),
    ('(vtable)'                       , sizeof_ptr)]
sizeof_baseComponent = sumTupSeconds(fields_baseComponent)

print("Size of Params:          ", sizeof_params)
print("Size of ConfigStatistic: ", sizeof_configStatistic)
print("Size of ConfigComponent: ", sizeof_configComponent)
print("Size of ConfigLink:      ", sizeof_configLink)
print("Size of ComponentInfo:   ", sizeof_componentInfo)
print("Size of Link:            ", sizeof_link)
print("Size of BaseComponent:   ", sizeof_baseComponent)

Size of Params:           104
Size of ConfigStatistic:  145
Size of ConfigComponent:  638
Size of ConfigLink:       216
Size of ComponentInfo:    288
Size of Link:             64
Size of BaseComponent:    217


### Pointed-to data from key structures

In [20]:
pointedToFieldsAvgs_params = [
    ('my_data' , 0),

    # my_data is a map of (integer) keys to string values. We model the data for these values here
    ('my_data.string' , 0),

    # What this data field is used for is a bit of a mystery to me. It may somehow
    # relate to subcomponents. For ping pong we only see one entry for this, so I
    # model that as the general case.
    ('data', 1 * sizeof_ptr),

    # I'm assuming each pointed to map in data is unique so this accounts for the cost of allocating
    # those maps
    ('data.maps', 1 * sizeof_map + sizeof_string + cost_per_element_in_map),

    # and this represents the cost of the strings within those maps
    ('data.maps.strings',  numParamsInUserComponent * avgStringCap_paramValue),
    
    ('allowedKeys', 0)]
heapDataAvgFor_params = sumTupSeconds(pointedToFieldsAvgs_params)

pointedToFieldsAvgs_configComp = [
    ('name',   avgStringCap_componentName),
    ('type',   avgStringCap_componentType),
    ('links',  avgLinksPerComp * sizeof_uint32),
    ('params', heapDataAvgFor_params),
    ('coords', 3 * sizeof_uint64)]
heapDataAvgFor_configComp = sumTupSeconds(pointedToFieldsAvgs_configComp)

pointedToFieldsAvgs_configLink = [
    ('name',        avgStringCap_linkName),
    ('port',        2 * avgStringCap_portName),
    ('latency_str', 2 * avgStringCap_linkLatency)]
heapDataAvgFor_configLink = sumTupSeconds(pointedToFieldsAvgs_configLink)

# This accounts for pointed-to data in the ConfigGraph where we have one instance of each per component.
# Note, one threat to validity is that (later on) we should really be multiplying by the capacity of 'comps'
# rather than the number of components.
pointedToFieldsAvgs_perComponent_configGraph = [
    ('comps'       , sizeof_ptr),
    ('compsByName' , sizeof_string + sizeof_componentId_t + cost_per_element_in_map),

    # This accounts for the data stored for the string values used as the keys of `compsByName`
    ('compsByName.string' , avgStringCap_componentName)]
heapDataAvgPerComponentFor_configGraph = sumTupSeconds(pointedToFieldsAvgs_perComponent_configGraph)

# Similarly, this accounts for pointed-to data in the ConfigGraph where we have one instance of
# each per link.
pointedToFieldsAvgs_perLink_configGraph = [
    ('links',       sizeof_ptr)]
heapDataAvgPerLinkFor_configGraph = sumTupSeconds(pointedToFieldsAvgs_perLink_configGraph)

pointedToFieldsAvgs_componentInfo = [
    ('name',             avgStringCap_componentName),
    ('type',             avgStringCap_componentType),
    ('link_map type',    48),
    ('link_map content', 59 + avgLinksPerComp * cost_per_element_in_map),
    ('params',           0),
    ('coordinates',      3 * sizeof_uint64)]
heapDataAvgFor_componentInfo = sumTupSeconds(pointedToFieldsAvgs_componentInfo)

print("Average heap data for params:            ", heapDataAvgFor_params)
print("Average heap data for config components: ", heapDataAvgFor_configComp)
print("Average heap data for config links:      ", heapDataAvgFor_configLink)
print("Average heap data for componentInfos:    ", heapDataAvgFor_componentInfo)
print("Average heap data in config graph per component: ", heapDataAvgPerComponentFor_configGraph)
print("Average heap data in config graph per link:      ", heapDataAvgPerLinkFor_configGraph)

Average heap data for params:             120
Average heap data for config components:  208.0
Average heap data for config links:       0
Average heap data for componentInfos:     643.0
Average heap data in config graph per component:  80
Average heap data in config graph per link:       8


---

## Results from Model

### Graph Construction

In [21]:
gc_mem_per_comp = sizeof_configComponent + heapDataAvgFor_configComp + heapDataAvgPerComponentFor_configGraph
gc_mem_per_link = sizeof_configLink      + heapDataAvgFor_configLink + heapDataAvgPerLinkFor_configGraph

gc_mem_per_ghost_comp = sizeof_configComponent + heapDataAvgPerComponentFor_configGraph
gc_mem_per_ghost_link = sizeof_configLink      + heapDataAvgPerLinkFor_configGraph

print("Expected memory per component: ", gc_mem_per_comp)
print("Expected memory per link:      ", gc_mem_per_link)

Expected memory per component:  926.0
Expected memory per link:       224


### Link Preparation

In [22]:
lp_mem_per_comp = sizeof_componentInfo + heapDataAvgFor_componentInfo
lp_mem_per_link = sizeof_link
lp_mem_per_ghost_link = sizeof_link

print("Expected memory per componentInfo: ", lp_mem_per_comp)
print("Expected memory per link:          ", lp_mem_per_link)

Expected memory per componentInfo:  931.0
Expected memory per link:           64


### Wireup

In [23]:
# This class might be padded; I'm not accounting for that.
memPer_simulatedComponent = sizeof_baseComponent + userComponentData

print('Expected memory per simulated component: ', memPer_simulatedComponent)

Expected memory per simulated component:  2957


## Toy models for each representation

In [24]:
# This cell uses what we built for the model above to boil things down to a simple equation based on number of
# components and links for each representation.

cfgMemPerComp = gc_mem_per_comp
cfgMemPerLink = gc_mem_per_link
cfgMemPerDComp = gc_mem_per_ghost_comp
cfgMemPerDLink = gc_mem_per_ghost_link
cfgEstimate = (cfgMemPerComp * numComps) + (cfgMemPerLink * numLinks) + (cfgMemPerDComp * numGhostComps) + (cfgMemPerDLink * numCrossPartitionLinks)
print("If we just account for config representation:")
print(f"  {cfgMemPerComp:,} * numComps  + {cfgMemPerLink:,} * numLinks + {cfgMemPerDComp:,} * numGComps + {cfgMemPerDLink:,} * numGLinks ({format_size(cfgEstimate)})")
print()

simMemPerComp  = lp_mem_per_comp + memPer_simulatedComponent
simMemPerLink  = lp_mem_per_link*2
simMemPerDComp = 0
simMemPerDLink = lp_mem_per_ghost_link*2
simEstimate = (simMemPerComp * numComps) + (simMemPerLink * numLinks) + (simMemPerDComp * numGhostComps) + (simMemPerDLink * numCrossPartitionLinks)
print("If we just account for simulation representation:")
print(f"  {simMemPerComp:,} * numComps  + {simMemPerLink:,} * numLinks + {simMemPerDComp:,} * numGComps + {simMemPerDLink:,} * numGLinks ({format_size(simEstimate)})")
print()

If we just account for config representation:
  926.0 * numComps  + 224 * numLinks + 718 * numGComps + 224 * numGLinks (2.53 GB)

If we just account for simulation representation:
  3,888.0 * numComps  + 128 * numLinks + 0 * numGComps + 128 * numGLinks (4.57 GB)



## Overall toy model

In [25]:
memPerComp  = cfgMemPerComp  + simMemPerComp
memPerLink  = cfgMemPerLink  + simMemPerLink
memPerDComp = cfgMemPerDComp + simMemPerDComp
memPerDLink = cfgMemPerDLink + simMemPerDLink
estimate    = (memPerComp * numComps) + (memPerLink * numLinks) + (memPerDComp * numGhostComps) + (memPerDLink * numCrossPartitionLinks)

print()
print("# EQUATION FOR MODELING SST MEMORY HIGH-WATER MARK:\n")
print(f"    {memPerComp:,} * numComps  + {memPerLink:,} * numLinks + {memPerDComp:,} * numGComps + {memPerDLink:,} * numGLinks ({format_size(estimate)})")
print()


# EQUATION FOR MODELING SST MEMORY HIGH-WATER MARK:

    4,814.0 * numComps  + 352 * numLinks + 718 * numGComps + 352 * numGLinks (7.11 GB)
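
The printed equation can be wrapped in a small function to try other model sizes. The coefficients below are copied from the output above, so they only hold for the parameter values used in this run; `estimate_high_water_mark` is a hypothetical helper name.

```python
def estimate_high_water_mark(num_comps, num_links, num_ghost_comps=0, num_ghost_links=0):
    """Estimated bytes, using the coefficients printed by this run of the notebook."""
    return (4814 * num_comps
            + 352 * num_links
            + 718 * num_ghost_comps
            + 352 * num_ghost_links)

# Same configuration as the run above: one million components, eight links each
estimate_bytes = estimate_high_water_mark(1000 * 1000, 8 * 1000 * 1000)
print(f"{estimate_bytes:,} bytes ({estimate_bytes / 1024**3:.2f} GiB)")
```

Changing any of the user-supplied parameters (string capacities, padding, `userComponentData`) changes the coefficients, so rerun the notebook rather than reusing these numbers for a different component.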



---

# How to compute padding

In [26]:
# Find the main function in `src/sst/core/main.cc` and add the following code:
#
#   std::cout << "real_sizeof_rankInfo        = " << sizeof(RankInfo)        << std::endl;
#   std::cout << "real_sizeof_params          = " << sizeof(Params)          << std::endl;
#   std::cout << "real_sizeof_configStatistic = " << sizeof(ConfigStatistic) << std::endl;
#   std::cout << "real_sizeof_configComponent = " << sizeof(ConfigComponent) << std::endl;
#   std::cout << "real_sizeof_configLink      = " << sizeof(ConfigLink)      << std::endl;
#   std::cout << "real_sizeof_componentInfo   = " << sizeof(ComponentInfo)   << std::endl;
#   std::cout << "real_sizeof_link            = " << sizeof(Link)            << std::endl;
#   return 0;
#
# Then recompile SST and run it. Use the printed output to populate the "real" values below:
real_sizeof_rankInfo        = 16
real_sizeof_params          = 104
real_sizeof_configStatistic = 160
real_sizeof_configComponent = 624
real_sizeof_configLink      = 216
real_sizeof_componentInfo   = 288
real_sizeof_link            = 64

# Copy the output of the following lines and paste it into the top of the script
print("paddingFor_rankInfo         = ", real_sizeof_rankInfo        - sumTupSeconds(fields_rankInfo))
print("paddingFor_params           = ", real_sizeof_params          - sumTupSeconds(fields_params))
print("paddingFor_configStatistics = ", real_sizeof_configStatistic - sumTupSeconds(fields_configStatistic))
print("paddingFor_configComponent  = ", real_sizeof_configComponent - sumTupSeconds(fields_configComponent))
print("paddingFor_configLink       = ", real_sizeof_configLink      - sumTupSeconds(fields_configLink))
print("paddingFor_componentInfo    = ", real_sizeof_componentInfo   - sumTupSeconds(fields_componentInfo))
print("paddingFor_link             = ", real_sizeof_link            - sumTupSeconds(fields_link))

paddingFor_rankInfo         =  0
paddingFor_params           =  7
paddingFor_configStatistics =  15
paddingFor_configComponent  =  40
paddingFor_configLink       =  7
paddingFor_componentInfo    =  18
paddingFor_link             =  0
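
As a rough cross-check on where these padding values come from, the sketch below lays out a structure under natural-alignment rules. It assumes members are placed in the order listed in `fields_params` and that each member's alignment is as given here; both are assumptions drawn from the field list earlier in this notebook, not from SST's headers. `struct_layout` is a hypothetical helper.

```python
def round_up(n, alignment):
    """Round n up to the next multiple of alignment."""
    return -(-n // alignment) * alignment

def struct_layout(fields):
    """fields: list of (name, size, alignment) in assumed declaration order.
    Returns (total size, total padding) under natural-alignment rules."""
    offset = 0
    max_align = 1
    for _name, size, align in fields:
        offset = round_up(offset, align) + size  # pad to alignment, then place member
        max_align = max(max_align, align)
    total = round_up(offset, max_align)          # tail padding
    return total, total - sum(size for _name, size, _align in fields)

# Sizes from the notebook; alignments are assumed
params_fields = [
    ('my_data',        48, 8),
    ('data',           24, 8),
    ('allowedKeys',    24, 8),
    ('verify_enabled',  1, 1)]
size, padding = struct_layout(params_fields)
print("Params size:", size, "padding:", padding)
```

For `Params` this gives a size of 104 bytes with 7 bytes of padding, matching the measured values above. Structures whose real member order differs from the field lists in this notebook will not necessarily match.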
