# POSIX Consistency Performance

Analyze the results of an ensemble of IOR tests that demonstrate the performance loss at small transfer sizes caused by enforcing strict POSIX consistency.  The IOR jobs ran under the following conditions:

1. DataWarp (lock-free, no cache) shared file
2. DataWarp (lock-free, no cache) file per process
3. Lustre (locking, cache) shared file
4. Lustre (locking, cache) file per process

Both DataWarp configurations are lock-free, cache-free implementations of strong consistency, and their results are similar.  We use the Lustre shared-file case to illustrate lock contention for transfers smaller than the lock granularity (1 MiB), and we contrast it with Lustre file per process to show the benefit of caching in the absence of lock contention.  Although Lustre file per process still uses locking, there is no contention, so we can use it to approximate the behavior of fully lockless I/O.
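
As a rough illustration of why transfers smaller than the lock granularity contend, the sketch below counts how many transfers fit inside a single 1 MiB lock unit. It assumes, hypothetically, that neighboring transfers in the shared file come from different processes, so every transfer inside one lock unit competes for the same lock. `LOCK_GRANULARITY` and `transfers_per_lock_unit` are illustrative names and are not part of IOR or Lustre.

```python
# Lock granularity quoted in the text above (1 MiB)
LOCK_GRANULARITY = 2**20

def transfers_per_lock_unit(xfer_size, lock_granularity=LOCK_GRANULARITY):
    """Return how many transfers of xfer_size bytes fall within one lock unit.

    Transfers at least as large as the lock unit never share it, so the
    result is clamped to a minimum of 1.
    """
    return max(1, lock_granularity // xfer_size)

# Hypothetical transfer sizes, chosen only for illustration
for xfer_size in (2**12, 2**16, 2**20, 2**22):
    print(xfer_size, transfers_per_lock_unit(xfer_size))
```

Under this assumption, the number of writers sharing a lock unit grows as the transfer size shrinks, which is the regime where the shared-file case is expected to suffer.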

All of the jobs that this script analyzes were run on the Cori TDS (Gerty) system at NERSC using eight Haswell nodes, two DataWarp nodes, and four Lustre OSSes.

In [None]:
%matplotlib inline

In [None]:
import re
import glob
import pandas
import matplotlib
import matplotlib.pyplot as plt
matplotlib.rcParams.update({'font.size': 18})

In [None]:
### Use raw strings so that backslash escapes such as \s reach the regex engine intact
REX_RESULT = re.compile(r'^write.*(MPIIO|POSIX)')
REX_USE_DW = re.compile(r'^\s*test filename\s*=.*cray.*dws')
FILE_MODES = [ 'ssf', 'fpp' ]

In [None]:
### Parse all IOR output files
results = []
for output_file in glob.glob('out.*'):
    target_fs = 'lustre'
    with open(output_file, 'r') as fp:
        for line in fp:
            ### extract the file system
            match = REX_USE_DW.search(line)
            if match is not None:
                target_fs = 'datawarp'
                continue
            ### extract the performance, ssf/fpp, and transfer size
            match = REX_RESULT.search(line)
            if match is not None:
                fields = line.split()
                mean_rate = float(fields[3])
                file_mode = int(fields[10])
                xfer_size = int(fields[17])
                results.append((target_fs, FILE_MODES[file_mode], xfer_size, mean_rate))

In [None]:
### Convert list of tuples into a multiindexed dataframe
index = pandas.MultiIndex.from_tuples([i[0:3] for i in results], names=['target_fs', 'file_mode', 'xfer_size'])
df = pandas.DataFrame([i[3] for i in results], index=index, columns=['performance'])
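
To make the shape of the resulting dataframe concrete, here is a minimal sketch that builds the same three-level index from a handful of made-up tuples and then pulls out one series. The numbers in `sample_results` are synthetic placeholders, not measurements from these runs, and the `sample_*` names are hypothetical.

```python
import pandas

# Synthetic (target_fs, file_mode, xfer_size, mean_rate) tuples; NOT real results
sample_results = [
    ('lustre', 'ssf', 4096, 100.0),
    ('lustre', 'ssf', 1048576, 800.0),
    ('lustre', 'fpp', 4096, 600.0),
    ('datawarp', 'fpp', 4096, 300.0),
]

# Same construction as the cell above: first three fields form the index
sample_index = pandas.MultiIndex.from_tuples(
    [r[0:3] for r in sample_results],
    names=['target_fs', 'file_mode', 'xfer_size'])
sample_df = pandas.DataFrame(
    [r[3] for r in sample_results],
    index=sample_index,
    columns=['performance'])

# Select one (file system, file mode) combination; xfer_size remains as the index
lustre_ssf = sample_df.xs(('lustre', 'ssf'), level=['target_fs', 'file_mode'])
print(lustre_ssf)
```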

In [None]:
### Start from None so the first series below creates the frame and sets its index
df_plot = None
for target_fs, sub_df in df.groupby(level='target_fs'):
    for file_mode, sub_sub_df in sub_df.groupby(level='file_mode'):
        tmp_df = sub_sub_df.reset_index()[['xfer_size','performance']] \
                           .sort_values(by='xfer_size') \
                           .set_index('xfer_size')
        label = target_fs + "," + file_mode
        if df_plot is None:
            df_plot = pandas.DataFrame(index=tmp_df.index)
        df_plot[label] = tmp_df['performance'] / tmp_df['performance'].max()
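
Note that each series is divided by its own maximum, so the plotted curves show the fraction of that configuration's peak and are not comparable in absolute terms. A minimal sketch of the normalization, using synthetic values rather than measured rates:

```python
import pandas

# Synthetic rates indexed by transfer size in bytes; NOT real measurements
sample = pandas.Series(
    [100.0, 400.0, 800.0],
    index=[4096, 65536, 1048576],
    name='performance')

# Divide by the series' own peak so the largest value becomes 1.0
fraction_of_peak = sample / sample.max()
print(fraction_of_peak)
```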

In [None]:
fig, ax = plt.subplots()
fig.set_size_inches(10, 6)
fig.suptitle("Performance of POSIX/Non-POSIX Writes")
### Matplotlib 3.3 and later take `base`; older releases used `basex` here instead
ax.set_xscale('log', base=2)

df_plot.plot(ax=ax, marker='o', linestyle='-', linewidth=3, markersize=10)

ax.grid()
ax.set_xlabel("Transfer Size (bytes)")
ax.set_ylabel("Fraction Peak Performance")

### Redo the labels to correspond to the accompanying text
convert_labels = {
    "datawarp,fpp": "Option #1:\nNo Cache/No Locks",
    "lustre,fpp": "Violate POSIX\n(Cache+Lockless)",
    "lustre,ssf": "Option #4:\nCache+Locks"
}
### Also reorder the labels for clarity in the explanation
handles, labels = ax.get_legend_handles_labels()
for idx, label in enumerate(labels):
    ### Fall back to the raw label for any series without an entry above
    labels[idx] = convert_labels.get(label, label)
    ### Also differentiate between POSIX and non-POSIX cases
    if "Violate" not in labels[idx]:
        handles[idx].set_linestyle("--")
    else:
        handles[idx].set_linewidth(4)
        handles[idx].set_markersize(12)
new_order = [i for i, j in sorted(enumerate(labels),
                                  key=lambda x:x[1],
                                  reverse=False)]
### Apply newly minted labels
ax.legend(handles=[handles[i] for i in new_order],
          labels=[labels[i] for i in new_order],
          prop={'size':16})

### Change x axis labels to be human-readable
labels = ax.get_xticks().tolist()
for idx, label in enumerate(labels):
    ### Inclusive comparisons so that exactly 2**20 prints as "1 MiB", not "1024 KiB"
    if label >= 2**30:
        labels[idx] = "%d GiB" % (label / 2**30)
    elif label >= 2**20:
        labels[idx] = "%d MiB" % (label / 2**20)
    elif label >= 2**10:
        labels[idx] = "%d KiB" % (label / 2**10)
    else:
        labels[idx] = "%d" % label
ax.set_xlabel("Transfer Size")

ax.set_xticklabels(labels)

### Legend is big, so give it a little more room
ylim = ax.get_ylim()
ax.set_ylim((-0.1, ylim[1]*1.0))
xlim = ax.get_xlim()
ax.set_xlim((xlim[0], xlim[1]*1.5))
fig.savefig('consistency-performance.png', dpi=200)
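
The tick relabeling can also be factored into a small standalone helper, which makes the unit thresholds easy to check in isolation. This is a sketch rather than part of the original analysis. `humanize_bytes` is a hypothetical name, and it uses inclusive thresholds so that exact powers of two are reported in the larger unit.

```python
def humanize_bytes(nbytes):
    """Format a byte count with the largest binary unit that it reaches."""
    units = ((2**30, 'GiB'), (2**20, 'MiB'), (2**10, 'KiB'))
    for threshold, unit in units:
        if nbytes >= threshold:
            # %d truncates any fractional part, matching the tick labels above
            return "%d %s" % (nbytes / threshold, unit)
    return "%d" % nbytes

# Transfer sizes chosen only to exercise each branch
for size in (512, 2**10, 2**20, 2**31):
    print(size, humanize_bytes(size))
```

One way to apply it would be through `matplotlib.ticker.FuncFormatter`, for example `ax.xaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(lambda value, pos: humanize_bytes(value)))`, which avoids overwriting the tick labels by hand.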