In [8]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np

As explained in the report, we hardcode the latencies measured at the critical points and define a function that returns the CPU latency between core #0 and core #p. The function is a simple step interpolation: the latency found at a critical point is held constant until the next critical point is reached.

In [9]:
latencies = {
  "1": 0.14,   # CCX
  "4": 0.35,   # CCD
  "8": 0.35,   # NUMA
  "32": 0.41,  # socket 
  "64": 0.65,  # node 
  "128": 1.82, # cluster
}
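def get_latency(p):
  # Walk the critical points from highest to lowest (the dict keeps insertion order)
  # and return the latency of the first one that does not exceed p.
  for k in list(latencies)[::-1]:
    if int(k) <= p:
      return latencies[k]
  # p lies below the first critical point (p = 0): fall back to the first entry.
  return latencies[list(latencies.keys())[0]]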
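
The step lookup described above can also be sketched with a binary search over the sorted critical points. This is only an illustrative alternative, not the code used for the results in this notebook; `THRESHOLDS`, `VALUES` and `step_latency` are hypothetical names, and the values are copied from the dictionary above.

```python
from bisect import bisect_right

# Critical core indices and their latencies, copied from the cell above
# (hypothetical names, kept as two parallel sorted lists).
THRESHOLDS = [1, 4, 8, 32, 64, 128]
VALUES = [0.14, 0.35, 0.35, 0.41, 0.65, 1.82]

def step_latency(p):
    """Return the latency of the last critical point that is <= p."""
    i = bisect_right(THRESHOLDS, p) - 1
    # Below the first critical point (p = 0), fall back to the first value.
    return VALUES[max(i, 0)]

# Sample the function around each critical point.
samples = {p: step_latency(p) for p in (0, 1, 3, 4, 31, 32, 127, 128, 255)}
print(samples)
```

The binary search only pays off when there are many critical points; with six entries, the linear scan in `get_latency` is just as adequate.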

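Build a DataFrame from the list of CPU cores (here meaning the processes used by the MPI benchmark) and the latency values generated by the function above. Process counts start at 1 while core indices start at 0, so the row for n processes uses the latency between core #0 and core #(n-1).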

In [10]:
df = pd.DataFrame(data={
  "cpu": range(1, 256 + 1),
  "avg_latency": [get_latency(i) for i in range(256)]
}).dropna()

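Load the measured values for the Barrier collective with the Linear algorithm and compute an estimate from the empirical formula explained in the report: twice the latency at the current process count, plus 0.15 times the sum of the latencies of all process counts up to and including the current one.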

In [11]:
actual = pd.read_csv("data/20240221_090626_epyc_barrier1_msize2.txt")
df["actual_barrier"] = actual
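# .loc slices by label and includes both endpoints, so rows 0..cpu-1 yield the running sum of latencies.
df["estimate_barrier"] = df.apply(lambda e: 2 * e["avg_latency"] + 0.15 * df.loc[0 : e["cpu"] - 1]["avg_latency"].sum(), axis=1)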
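
Because the sum in the estimate always starts at the first row, it is a running total, so the formula can be evaluated in a single pass instead of re-summing for every row. The sketch below checks this on the first five latencies of this notebook using only the standard library; `estimate_direct` and `estimate_running` are hypothetical helper names, not part of the original analysis. In pandas, the same running total is available as `Series.cumsum()`.

```python
from itertools import accumulate

# First five latencies from this notebook (process counts 1..5).
lat = [0.14, 0.14, 0.14, 0.14, 0.35]

def estimate_direct(lat):
    """Evaluate 2*l(n) + 0.15*sum(l(1..n)), re-summing for every n (quadratic)."""
    return [2 * lat[n - 1] + 0.15 * sum(lat[:n]) for n in range(1, len(lat) + 1)]

def estimate_running(lat):
    """Same formula, using one running sum over the latencies (linear)."""
    return [2 * l + 0.15 * s for l, s in zip(lat, accumulate(lat))]

direct = estimate_direct(lat)
running = estimate_running(lat)
print([round(v, 4) for v in running])
```

Both forms agree up to floating-point rounding, and the values for these five rows match the `estimate_barrier` column shown in the table below.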

In [12]:
df

index,cpu,avg_latency,actual_barrier,estimate_barrier
0,1,0.14,0.19,0.3010
1,2,0.14,0.52,0.3220
2,3,0.14,0.46,0.3430
3,4,0.14,0.90,0.3640
4,5,0.35,0.96,0.8365
...,...,...,...,...
251,252,1.82,45.00,47.2540
252,253,1.82,40.26,47.5270
253,254,1.82,38.89,47.8000
254,255,1.82,13.33,48.0730


In [13]:
fig = px.scatter(df, x="cpu", y="actual_barrier")
fig.add_trace(go.Scatter(x=df["cpu"], y=df["estimate_barrier"], mode="lines", name="est_barr"))