# Evaluate the Performance and Accuracy as a Function of Runtime
Longer MD simulations produce more accurate estimates of stability and spread the fixed cost of task startup over more timesteps.
However, each simulation takes longer to finish.
We explore the tradeoff between these effects in this notebook.

In [49]:
%matplotlib inline
from matplotlib import pyplot as plt
from pathlib import Path
import pandas as pd

Configuration

In [50]:
target_system = 'polaris'

## Load Runtime Results
We saved the results of tests with different MOFs and different system configurations in `runtimes.json`, with one JSON record per line.

In [51]:
runtimes = pd.read_json('runtimes.json', lines=True)
print(f'Loaded {len(runtimes)} experiments')

Loaded 718 experiments
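As a rough illustration of the line-delimited layout that `lines=True` expects, the sketch below builds two records in memory and reads them back. The column names are the ones used later in this notebook; every value, including the host name and executable path, is invented for the example.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical records: column names follow this notebook, values are made up
records = [
    {'host': 'polaris-demo', 'mof': 'mof-a', 'timesteps': 1000,
     'runtime': 4.0, 'strain': 0.08,
     'lammps_cmd': ['/opt/lammps/build-kokkos-nompi/lmp']},
    {'host': 'polaris-demo', 'mof': 'mof-a', 'timesteps': 100000,
     'runtime': 150.0, 'strain': 0.10,
     'lammps_cmd': ['/opt/lammps/build-kokkos-nompi/lmp']},
]

# JSON lines: one complete JSON object per line, no enclosing array
text = '\n'.join(json.dumps(r) for r in records)

demo = pd.read_json(StringIO(text), lines=True)
print(f'Loaded {len(demo)} experiments')
```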


Get a short name for the LAMMPS executable

In [52]:
runtimes['build'] = runtimes.lammps_cmd.apply(lambda x: x[0]).apply(lambda x: Path(x).parent.name[6:])
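The slice `[6:]` drops the first six characters of the directory that holds the executable. A minimal sketch of what that does, assuming build directories are named with a `build-` prefix (an assumption; the actual paths are not shown in this notebook, and the ones below are hypothetical):

```python
from pathlib import Path

# Hypothetical command lines; the first entry is the path to the executable
named_cmd = ['/opt/lammps/build-kokkos-nompi/lmp', '-sf', 'kk']
plain_cmd = ['/opt/lammps/build/lmp']

# Directory name minus its first six characters ("build-" under our assumption)
named = Path(named_cmd[0]).parent.name[6:]
plain = Path(plain_cmd[0]).parent.name[6:]

print(repr(named))
print(repr(plain))
```

Under this naming assumption, a directory called just `build` yields an empty string, which is the case the `build.str.len() > 0` filter in the next step removes.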

Keep only the runs from the target system that have a named LAMMPS build

In [53]:
runtimes = runtimes[runtimes.host.str.startswith(target_system)]
runtimes = runtimes[runtimes.build.str.len() > 0]
print(f'Downselected to {len(runtimes)} experiments')

Downselected to 718 experiments


Compute the simulation rate in timesteps per second

In [54]:
runtimes['rate'] = runtimes['timesteps'] / runtimes['runtime']

## Plot Strain Over Timesteps
See how much the measured strain changes with run length. For each MOF, compute the relative difference between each strain and the strain from the run with the maximum timestep count.

In [55]:
def error_from_best_estimate(group):
    best_est = group[group.timesteps == group['timesteps'].max()]['strain'].mean()
    return (1 - group['strain'] / best_est) * 100

In [56]:
runtimes['error'] = runtimes.groupby('mof', group_keys=False).apply(error_from_best_estimate)
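To make the error definition concrete, here is a small worked example on invented data. The MOF names and strain values are hypothetical, and the groups are looped over explicitly rather than passed to `apply`, purely to keep the sketch independent of pandas version differences.

```python
import pandas as pd

# Hypothetical toy data: two MOFs, each simulated at three run lengths
toy = pd.DataFrame({
    'mof': ['a', 'a', 'a', 'b', 'b', 'b'],
    'timesteps': [1000, 10000, 100000] * 2,
    'strain': [0.08, 0.09, 0.10, 0.22, 0.21, 0.20],
})


def error_from_best_estimate(group):
    # Best estimate: mean strain over the runs with the most timesteps
    best_est = group[group.timesteps == group['timesteps'].max()]['strain'].mean()
    # Signed relative difference from the best estimate, in percent
    return (1 - group['strain'] / best_est) * 100


# Compute the error for each MOF, then stitch the pieces back together by index
parts = [error_from_best_estimate(g) for _, g in toy.groupby('mof')]
toy['error'] = pd.concat(parts)
print(toy)
```

The error is signed: a strain below the best estimate gives a positive value and a strain above it gives a negative one, while the longest run of each MOF is zero by construction.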

In [57]:
avg_by_length = runtimes.groupby('timesteps')[['error']].agg(['mean', 'std'])

In [58]:
fig, ax = plt.subplots(figsize=(3.5, 2.5))

ax.errorbar(avg_by_length.index, avg_by_length['error']['mean'], fmt='--o', yerr=avg_by_length['error']['std'])
ax.set_xscale('log')


ax.set_xlabel('Timesteps')
ax.set_ylabel('Error, Relative (%)')
fig.tight_layout()
fig.savefig('timestep-comparison.png', dpi=320)

## Compare Builds
Compare the builds using the runs of $10^6$ timesteps

In [59]:
subset = runtimes.query('timesteps == 1000000')

In [60]:
summary = subset.groupby('build')['rate'].agg(['mean', 'std'])
summary

| build | mean | std |
|---|---|---|
| kokkos-nompi | 716.562612 | 40.063568 |


In [61]:
fig, ax = plt.subplots(figsize=(3.5, 2.))

ax.bar(summary.index, summary['mean'], yerr=summary['std'])

ax.set_ylabel('Rate (steps/s)')

Text(0, 0.5, 'Rate (steps/s)')

In [62]:
best_build = summary['mean'].idxmax()
runtimes.query(f'build=="{best_build}"', inplace=True)
print(f'Filtered to only {best_build}')

Filtered to only kokkos-nompi


## Plot Timestep Rate vs Timestep Count
We should see higher timestep rates for runs with more timesteps, because the cost of task startup is spread over a longer simulation

In [63]:
avg_by_length = runtimes.groupby('timesteps')[['runtime', 'rate']].mean()

In [64]:
fig, ax = plt.subplots(figsize=(3.5, 2.5))

ax.semilogx(avg_by_length.index, avg_by_length['rate'], '--o')

ax.set_xlabel('Timesteps')
ax.set_ylabel('Rate (steps/s)')

Text(0, 0.5, 'Rate (steps/s)')

We need at least $10^5$ steps per simulation on Polaris to get full performance on a GPU.
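One way to reason about a threshold like this is a simple amortization model, in which every run pays a fixed startup cost before stepping at a constant peak rate. This is an assumed model, not one fitted to the data in this notebook, and the peak rate and startup time below are hypothetical placeholders.

```python
def effective_rate(timesteps, peak_rate, startup_s):
    """Observed steps/s when a fixed startup time is added to the stepping time."""
    return timesteps / (startup_s + timesteps / peak_rate)


def steps_for_fraction(fraction, peak_rate, startup_s):
    """Timesteps needed for the observed rate to reach `fraction` of the peak rate."""
    return fraction / (1 - fraction) * startup_s * peak_rate


# Hypothetical values, NOT measured: choose your own from a fit to real data
peak_rate = 700.0  # steps/s
startup_s = 10.0   # seconds of startup per task

for n in (10**3, 10**4, 10**5, 10**6):
    frac = effective_rate(n, peak_rate, startup_s) / peak_rate
    print(f'{n:>8d} steps: {frac:.1%} of peak rate')

print(f'Steps for 90% of peak: {steps_for_fraction(0.9, peak_rate, startup_s):.0f}')
```

In this model the fraction of peak performance is $N / (N + t_0 r)$ for $N$ timesteps, startup time $t_0$, and peak rate $r$, so the run length needed to approach full performance grows linearly with the startup cost.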