# BuildStream (master) benchmarks of the Debian-like bst project

The following graphs show the time taken and the peak memory usage when invoking `bst show` and `bst build` on the *base-files/base-files.bst* element in the [Debian-like bst project](https://gitlab.com/jennis/debian-stretch-bst/tree/jennis/use_remote_file). Note that the build benchmarks were obtained for 4, 8 and 12 builders (`bst --builders n build`).

These benchmarks were all obtained on the same hardware (a Codethink developer machine configured to be a GitLab runner). The hardware specs are as follows:

* Linux (Debian stable)
* x86_64
* 16 GB RAM
* 500 GB SSD
* Intel i7-3770
* 8 logical cores (4 physical cores with hyper-threading) @ 3.40 GHz

-----

The CI job which generates the data can be found [here](https://gitlab.com/jennis/benchmark_debian).
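
For readers who want a feel for how numbers like these can be collected, here is a minimal sketch of timing a command and reading its peak memory from Python. It is not taken from the CI job linked above; the `measure` helper is hypothetical, and a trivial stand-in command is used so that the sketch runs without BuildStream installed.

```python
import resource
import subprocess
import sys
import time


def measure(command):
    """Run `command` and return (elapsed_seconds, peak_rss)."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start
    # ru_maxrss is the largest peak resident set size among all child
    # processes waited for so far. It is reported in kilobytes on Linux.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, peak_rss


# Stand-in command (assumption). A real run might instead use something like
# ["bst", "--builders", "4", "build", "base-files/base-files.bst"].
elapsed, peak_rss = measure([sys.executable, "-c", "pass"])
print(f"{elapsed:.3f} s, peak RSS {peak_rss} kB")
```

Because `RUSAGE_CHILDREN` accumulates over every child the process has waited for, a harness built this way would need one fresh process per measurement to keep the memory figures independent.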

The Debian-like bst project can be found [here](https://gitlab.com/jennis/debian-stretch-bst/tree/jennis/use_remote_file).

## To view the interactive graphs, select Kernel -> 'Restart & Run All', then confirm with 'Restart and Run All Cells'

In [None]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

In [None]:
# When we read the results from the files, we'll need to convert the
# datetime strings to datetime objects for plotting
from datetime import datetime
def datestring_to_datetime(date_string):
    '''Convert a datestring to a datetime object'''
    date_string = date_string[:-6]  # Strip the trailing UTC offset (e.g. "+00:00"); the result is a naive datetime
    return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%S")
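
As an illustration of what the conversion above does, the sketch below runs it on a sample timestamp and compares the result with `datetime.fromisoformat` (Python 3.7+), which keeps the UTC offset instead of discarding it. The sample string is hypothetical; it assumes the results files hold ISO 8601 timestamps ending in a `+HH:MM` offset, which is what the `[:-6]` slice implies.

```python
from datetime import datetime, timedelta


def datestring_to_datetime(date_string):
    '''Convert a datestring to a naive datetime, dropping the UTC offset'''
    return datetime.strptime(date_string[:-6], "%Y-%m-%dT%H:%M:%S")


sample = "2019-01-15T10:30:00+01:00"  # hypothetical timestamp

naive = datestring_to_datetime(sample)   # offset discarded
aware = datetime.fromisoformat(sample)   # offset preserved

print(naive)              # wall-clock time only
print(aware.utcoffset())  # the offset that the slice throws away
```

Dropping the offset is harmless while every commit timestamp uses the same offset. If they differ, the timezone-aware form would order the points correctly on the date axis.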

In [None]:
# Read the results files and save the data
import csv

# The following data is the same in time_results.csv and memory_results.csv
commits = []
dates = []
mrs = []
branches = []

# The following data is only contained within time_results.csv
show_times = []
build_4_times = []
build_8_times = []
build_12_times = []
show_cached_times = []

# The following data is only contained within memory_results.csv
show_memories = []
build_4_memories = []
build_8_memories = []
build_12_memories = []
show_cached_memories = []

# Read in the results from time_results.csv
with open('time_results.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    
    for i, row in enumerate(readCSV):
        if i == 0:
            # The first line contains the fields, ignore this
            continue
        
        commits.append(row[0])
        dates.append(datestring_to_datetime(row[1]))
        mrs.append(row[2])
        branches.append(row[3])
        show_times.append(float(row[4]))
        build_4_times.append(float(row[5]))
        build_8_times.append(float(row[6]))
        build_12_times.append(float(row[7]))
        show_cached_times.append(float(row[8]))

# Read in the results from memory_results.csv
with open('memory_results.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    
    for i, row in enumerate(readCSV):
        if i == 0:
            # The first line contains the fields, ignore this
            continue
        
        show_memories.append(float(row[4]))  # XXX: Might need to divide this by 1000 for Mbytes
        build_4_memories.append(float(row[5]))
        build_8_memories.append(float(row[6]))
        build_12_memories.append(float(row[7]))
        show_cached_memories.append(float(row[8]))
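
The cell above addresses columns by position (`row[4]`, `row[5]`, ...). A possible alternative, sketched here under assumptions, is `csv.DictReader`, which uses the header row as keys and makes the skip-the-first-line check unnecessary. The header names and the two data rows below are hypothetical; the real files may use different field names.

```python
import csv
import io

# Hypothetical sample standing in for time_results.csv (assumed header names)
sample = io.StringIO(
    "commit,date,mr,branch,show,build_4,build_8,build_12,show_cached\n"
    "abc123,2019-01-15T10:30:00+00:00,!1,master,12.5,300.0,250.0,240.0,3.1\n"
    "def456,2019-01-16T09:00:00+00:00,!2,feature,12.1,310.0,255.0,238.0,3.0\n"
)

# Collect each column into its own list, keyed by header name
columns = {}
for row in csv.DictReader(sample):
    for name, value in row.items():
        columns.setdefault(name, []).append(value)

# Numeric columns still need converting from strings
show_times = [float(value) for value in columns["show"]]
print(columns["branch"], show_times)
```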

In [None]:
import matplotlib.pyplot as plt
import mplcursors
# Note mplcursors is required to show branch info when we click on a data point
# The package can be installed with pip3: pip3 install mplcursors

## Using the interactive features

At the bottom left of each graph is a toolbar providing tools which allow you to interact with the graph. Hovering over each button will provide a brief description (in the bottom right) of what that button does.

Additionally, if you click on a data point, you will see an information box with the MR and branch. Right click on the box containing this information to remove it.

In my experience, the interactivity is a bit clunky, but here is generally how I do things:

* The home button resets to the original view.
* The square is for zooming: click it, then click and drag over the region you wish to zoom into.
* If you then wish to click on a data point, re-click the 'zoom' button (this effectively 'unclicks' it) and then click on the desired data point.
* To remove the information box containing the MR and branch, right click on it.

In [None]:
# Time taken to show Debian's base files
%matplotlib notebook

fig1, ax1 = plt.subplots(figsize=(8, 5), dpi=110, facecolor='w', edgecolor='k')

# Plot date vs time (uncached and cached elements)
ax1.plot(dates, show_times, 'x:', label='Uncached elements')
ax1.plot(dates, show_cached_times, 'x:g', label='Cached elements')
# This is hacky: an extra scatter over both series gives mplcursors
# selectable points, so that a label can be attached to each data point
scatter1 = ax1.scatter(dates + dates, show_times + show_cached_times, marker='x')

# Labelling
ax1.set_title("Time taken to 'bst show' Debian's base-files")
ax1.set_xlabel('Date')
ax1.set_ylabel('Time (s)')
ax1.legend(loc='lower left')
ax1.grid()

# Limit the y axis
ax1.set_ylim([0, max([max(show_times), max(show_cached_times)])+5])
fig1.autofmt_xdate() # Make the xaxis pretty

# Show the branch of a data point when it is clicked (hover=False)
labels = branches + branches
cursor = mplcursors.cursor(scatter1, hover=False)
cursor.connect(
    "add", lambda sel: sel.annotation.set_text(labels[sel.target.index]))
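
The scatter in the cell above is built from `dates + dates`, so point `i` of the scatter corresponds to `labels[i]`, which is why `labels` is `branches + branches`. The sketch below shows the same index mapping with a richer label that also carries the MR and the series name. The sample values and the label layout are hypothetical, and no plotting is involved.

```python
# Hypothetical sample data standing in for the lists read from the CSV
branches = ["master", "feature-x"]
mrs = ["!10", "!11"]
series_names = ["Uncached elements", "Cached elements"]

# One label per scatter point, in the same order as the concatenated data:
# all points of the first series, then all points of the second series.
labels = [
    f"{series}\nMR: {mr}\nBranch: {branch}"
    for series in series_names
    for mr, branch in zip(mrs, branches)
]

# Index 3 is the second data point of the second (cached) series
print(labels[3])
```

A list built this way could be passed to the same `sel.annotation.set_text(labels[sel.target.index])` callback, since only the contents of `labels` change and not its length or ordering.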

In [None]:
# Maximum memory when showing Debian's base-files
fig2, ax2 = plt.subplots(figsize=(8, 5), dpi=110, facecolor='w', edgecolor='k')

# Plot date vs peak memory (uncached and cached elements)
ax2.plot(dates, show_memories, 'x:', label='Uncached elements')
ax2.plot(dates, show_cached_memories, 'x:g', label='Cached elements')
# Hacky scatter so that a label can be attached to each data point
scatter2 = ax2.scatter(dates + dates, show_memories + show_cached_memories, marker='x')

# Labelling
ax2.set_title("Maximum memory when invoking 'bst show' on Debian's base-files")
ax2.set_xlabel('Date')
ax2.set_ylabel('Max memory (Mbytes)')
ax2.legend(loc='lower left')
ax2.grid() 

# Limit the y axis
ax2.set_ylim([0, max([max(show_memories), max(show_cached_memories)]) + 50])
fig2.autofmt_xdate() # Make the xaxis pretty

# Show the branch of a data point when it is clicked (hover=False)
labels = branches + branches
cursor = mplcursors.cursor(scatter2, hover=False)
cursor.connect(
    "add", lambda sel: sel.annotation.set_text(labels[sel.target.index]))


In [None]:
# Time taken to build Debian's base-files
fig3, ax3 = plt.subplots(figsize=(8, 5), dpi=110, facecolor='w', edgecolor='k')

# Plot date vs build time (4, 8 and 12 builders)
ax3.plot(dates, build_4_times, 'x:', label="4 builders")
ax3.plot(dates, build_8_times, 'x:r', label="8 builders")
ax3.plot(dates, build_12_times, 'x:g', label="12 builders")

# Hacky scatter so that a label can be attached to each data point
scatter3 = ax3.scatter(dates + dates + dates, build_4_times + build_8_times + build_12_times, marker='x')

# Labelling
ax3.set_title("Time taken to 'build' Debian's base-files")
ax3.set_xlabel('Date')
ax3.set_ylabel('Time (s)')
ax3.legend(loc='upper right')
ax3.grid()

# Limit the y axis
ax3.set_ylim([0, max([max(build_4_times), max(build_8_times), max(build_12_times)]) + 50])
fig3.autofmt_xdate() # Make the xaxis pretty

# Show the branch of a data point when it is clicked (hover=False)
labels = branches + branches + branches
cursor = mplcursors.cursor(scatter3, hover=False)
cursor.connect(
    "add", lambda sel: sel.annotation.set_text(labels[sel.target.index]))

In [None]:
# Maximum memory when building Debian's base-files
fig4, ax4 = plt.subplots(figsize=(8, 5), dpi=110, facecolor='w', edgecolor='k')

# Plot date vs peak memory (4, 8 and 12 builders)
# Note this is a line graph with a scatter plot on top, so that we can attach info to data points
ax4.plot(dates, build_4_memories, 'x:', label="4 builders")
ax4.plot(dates, build_8_memories, 'x:r', label="8 builders")
ax4.plot(dates, build_12_memories, 'x:g', label="12 builders")

# Hacky scatter so that a label can be attached to each data point
scatter4 = ax4.scatter(dates + dates + dates, build_4_memories + build_8_memories + build_12_memories, marker='x')

# Title and label axis and add grid
ax4.set_title("Max memory usage when 'building' Debian's base-files")
ax4.set_xlabel('Date')
ax4.set_ylabel('Memory (Mbytes)')
ax4.legend(loc='lower left')
ax4.grid()

ax4.set_ylim([0, max([max(build_4_memories), max(build_8_memories), max(build_12_memories)]) + 50]) # Limit the y axis
fig4.autofmt_xdate() # Make the xaxis pretty

# Show the branch of a data point when it is clicked (hover=False)
labels = branches + branches + branches
cursor = mplcursors.cursor(scatter4, hover=False)
cursor.connect(
    "add", lambda sel: sel.annotation.set_text(labels[sel.target.index]))
