## Behavior Tests
This notebook tests the behavior (not the performance) of the prototype. It runs two tests: a crypto-only baseline and file sharing via IPFS. Both tests write their results to a .csv file, which we then plot.

### Generate test files containing random data

In [72]:
print('Generating...')
!head -c 1048576 </dev/urandom >testfile_1_MiB.bin
!head -c 10485760 </dev/urandom >testfile_10_MiB.bin
!head -c 104857600 </dev/urandom >testfile_100_MiB.bin
!head -c 524288000 </dev/urandom >testfile_500_MiB.bin
!head -c 1073741824 </dev/urandom >testfile_1_GiB.bin
#!head -c 5368709120 </dev/urandom >testfile_5_GiB.bin
!sha256sum testfile_1_MiB.bin|cut -d' ' -f1 > testfile_1_MiB.bin.sha256sum
!sha256sum testfile_10_MiB.bin|cut -d' ' -f1 > testfile_10_MiB.bin.sha256sum
!sha256sum testfile_100_MiB.bin|cut -d' ' -f1 > testfile_100_MiB.bin.sha256sum
!sha256sum testfile_500_MiB.bin|cut -d' ' -f1 > testfile_500_MiB.bin.sha256sum
!sha256sum testfile_1_GiB.bin|cut -d' ' -f1 > testfile_1_GiB.bin.sha256sum
#!sha256sum testfile_5_GiB.bin|cut -d' ' -f1 > testfile_5_GiB.bin.sha256sum
print('Done!')

Generating...
Done!
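
Where `head` and `sha256sum` are not available, the same files and checksum sidecars could be produced from Python. The following is a minimal sketch using only the standard library; `write_random_file` and `write_checksum_sidecar` are hypothetical helper names, and the demo writes a small 3000-byte file into a temporary directory instead of the full-size test files.

```python
import hashlib
import os
import tempfile


def write_random_file(path, size_bytes, block_size=1024 * 1024):
    """Write size_bytes of random data to path and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    remaining = size_bytes
    with open(path, 'wb') as out:
        while remaining > 0:
            # Generate and hash the data block by block to keep memory use low
            block = os.urandom(min(block_size, remaining))
            out.write(block)
            digest.update(block)
            remaining -= len(block)
    return digest.hexdigest()


def write_checksum_sidecar(path, hex_digest):
    """Store the digest in '<path>.sha256sum', the layout used by the cell above."""
    with open(path + '.sha256sum', 'w') as sidecar:
        sidecar.write(hex_digest + '\n')


# Demo on a small file in a temporary directory (cleaned up automatically)
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, 'testfile_demo.bin')
    demo_digest = write_random_file(demo_path, 3000, block_size=1024)
    write_checksum_sidecar(demo_path, demo_digest)
    demo_size = os.path.getsize(demo_path)
    with open(demo_path + '.sha256sum') as sidecar:
        stored_digest = sidecar.read().strip()

print(demo_size, stored_digest == demo_digest)
```

Hashing while writing avoids reading each file a second time, which matters for the larger test files.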


### Generate baseline crypto results
This test will encrypt/decrypt the various test files using multiple ciphers via the IPFS client Python module. This gives us a baseline to compare the IPFS results against.

In [73]:
import jcipfsclient as ipfs
import time

# Test configuration
files = ['testfile_1_MiB.bin','testfile_10_MiB.bin','testfile_100_MiB.bin','testfile_500_MiB.bin']
ciphers = ['plain','ChaCha20','Salsa20','AES_256_CTR']
rounds = 10

print('Processing crypto baseline...')

with open('baseline_crypto_duration_results.csv', 'w') as results:
  # .csv header
  delimiter = ';'
  results.write('File;SHA256;Cipher;Round;Time_Encrypt_Wall_Start;Time_Encrypt_Wall_Stop;Time_Decrypt_Wall_Start;Time_Decrypt_Wall_Stop;Time_Encrypt_Duration_Wall;Time_Decrypt_Duration_Wall;Time_Encrypt_Duration_Cpu;Time_Decrypt_Duration_Cpu;Time_Total_Duration_Wall;Time_Total_Duration_Cpu;Match\n')

  # Run the test
  for cipherMode in ciphers:
    for file in files:
      for round in range(0, rounds):
        chunkSize = 1024*1024*10
        base64Key = ipfs.genKey(cipherMode)
        
        # Encrypt file
        filenameEncrypted = file + '.encrypted'
        with open(file, 'rb') as fileOriginal:
          with open(filenameEncrypted, 'wb') as fileEncrypted:
            timestampEncryptWallStart = time.time()
            timestampEncryptCpuStart = time.process_time()
            for chunk in ipfs.encrypt(fileOriginal, base64Key, chunkSize, cipherMode):
              fileEncrypted.write(chunk)
            timestampEncryptCpuStop = time.process_time()
            timestampEncryptWallStop = time.time()
            timestampEncryptCpuDuration = timestampEncryptCpuStop - timestampEncryptCpuStart
            timestampEncryptWallDuration = timestampEncryptWallStop - timestampEncryptWallStart
        
        # Decrypt file
        filenameDecrypted = file + '.decrypted'
        with open(filenameEncrypted, 'rb') as fileEncrypted:
          with open(filenameDecrypted, 'wb') as fileDecrypted:
            timestampDecryptWallStart = time.time()
            timestampDecryptCpuStart = time.process_time()
            for chunk in ipfs.decrypt_from_file(fileEncrypted, base64Key, chunkSize, cipherMode):
              fileDecrypted.write(chunk)
            timestampDecryptCpuStop = time.process_time()
            timestampDecryptWallStop = time.time()
            timestampDecryptCpuDuration = timestampDecryptCpuStop - timestampDecryptCpuStart
            timestampDecryptWallDuration = timestampDecryptWallStop - timestampDecryptWallStart
      
        # Compare decrypted file to original (hash has to be the same)
        same = '?'
        hashFileDecrypted = !sha256sum $filenameDecrypted|cut -d' ' -f1
        hashFileDecrypted = hashFileDecrypted.nlstr.rstrip()
        with open(file + '.sha256sum', 'r') as fileOriginalHash:
          hashOriginal = fileOriginalHash.readlines()
          hashOriginal = hashOriginal[0].rstrip()
          if hashFileDecrypted == hashOriginal:
            same = 'yes'
          else:
            same = 'no'
            print('Warning: hash mismatch between original and decrypted (file: \'' + file + '\', cipher: ' + cipherMode + ')!')
        
        # Write results to .csv file and clean up test files / storage
        results.write(file + delimiter + hashOriginal + delimiter + cipherMode + delimiter + str(round) + delimiter + str(timestampEncryptWallStart) + delimiter + str(timestampEncryptWallStop) + delimiter + str(timestampDecryptWallStart) + delimiter + str(timestampDecryptWallStop) + delimiter + str(timestampEncryptWallDuration) + delimiter + str(timestampDecryptWallDuration) + delimiter + str(timestampEncryptCpuDuration) + delimiter + str(timestampDecryptCpuDuration) + delimiter + str(timestampEncryptWallDuration + timestampDecryptWallDuration) + delimiter + str(timestampEncryptCpuDuration + timestampDecryptCpuDuration) + delimiter + same + '\n')
        !rm $filenameEncrypted $filenameDecrypted

print('Done!')

Processing crypto baseline...
Done!
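
The cell above records wall-clock time and CPU time around each encrypt and decrypt step. The same pattern can be packaged as a context manager, sketched below under two assumptions: `measure` is a hypothetical helper that is not part of `jcipfsclient`, and it uses `time.perf_counter()` for durations instead of `time.time()`, so it does not record the absolute start and stop timestamps that the .csv file contains.

```python
import time
from contextlib import contextmanager


@contextmanager
def measure(timings, label):
    """Record wall-clock and CPU durations of the enclosed block in timings[label]."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        timings[label] = {
            'wall': time.perf_counter() - wall_start,
            'cpu': time.process_time() - cpu_start,
        }


timings = {}
with measure(timings, 'sleep'):
    time.sleep(0.05)  # waiting costs wall time but almost no CPU time
with measure(timings, 'busy'):
    sum(i * i for i in range(200_000))  # computation costs both

for label, values in timings.items():
    print(label, values)
```

The gap between the two values is what makes the distinction useful: a step that waits on disk or network I/O shows a wall duration well above its CPU duration.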


### Generate inter-notebook results
This test will exchange the various test files between two JupyterLab instances using IPFS (and encryption/decryption) via the IPFS client Python module. We will launch a web server on a second JupyterLab instance that allows for automated testing (i.e. remote control of the second instance / IPFS peer node). Note that we also [have to join](./IPFS.ipynb#Join-the-IPFS-network) our IPFS nodes to the IPFS private network.
**SECURITY WARNING:** Do not expose this web server directly to the Internet (i.e. use a secure network / tunnel / VPN)!
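
One way to reduce exposure is to bind the web server to the loopback interface and reach it through an SSH tunnel or VPN; whether that is possible depends on how the two instances are connected, so treat it as an option and not as the setup used here. The sketch below is self-contained and uses only the standard library. `LoopbackHandler` and `fetch` are hypothetical names, and the single `/hello` route stands in for the real endpoints.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class LoopbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Decide status and payload first, then send headers exactly once
        if self.path in ('/', '/hello'):
            status, payload = 200, {'Hello': 'world'}
        else:
            status, payload = 404, {'error': 'unknown path: ' + self.path}
        body = json.dumps(payload).encode('utf-8')
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the notebook output quiet


# Ignore proxy settings from the environment for local requests
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url):
    """Return (status, decoded JSON body) for a GET request."""
    try:
        with opener.open(url, timeout=5) as reply:
            return reply.status, json.loads(reply.read())
    except urllib.error.HTTPError as error:
        return error.code, json.loads(error.read())


# Bind to loopback only and let the OS pick a free port (port 0)
server = HTTPServer(('127.0.0.1', 0), LoopbackHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = 'http://127.0.0.1:' + str(server.server_address[1])

hello = fetch(base_url + '/hello')
missing = fetch(base_url + '/nope')

server.shutdown()
server.server_close()
print(hello, missing)
```

Running the server in a daemon thread also keeps the notebook cell from blocking, which makes a quick smoke test like this possible from the same instance.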

#### Second JupyterLab instance (web server)

In [None]:
from http.server import BaseHTTPRequestHandler, HTTPServer
from socket import getfqdn
from os import getenv
import jcipfsclient as ipfs
import json as JSON
import time

address = '0.0.0.0'
port = 4000

# Local IPFS peer node address
node = getenv('IPFS_NODE')
nodeApiUrl = 'http://' + node + ':5001'

# Web server endpoints
class RequestHandler(BaseHTTPRequestHandler):
  def do_GET(self):
    # Unknown paths get a 404 instead of raising a NameError on 'response'
    status = 404
    response = {'error': 'unknown path'}
    if self.path == "/" or self.path == "/hello":
      status = 200
      response = {'Hello': str(getfqdn())}
    elif self.path == "/hash":
      status = 200
      hashFileDownloaded = !sha256sum testfile.download|cut -d' ' -f1
      hashFileDownloaded = hashFileDownloaded.nlstr.rstrip()
      response = {'hashFileDownloaded': hashFileDownloaded}
    elif self.path == "/garbagecollect":
      status = 200
      !rm testfile.download
      ipfs.collectGarbage(nodeApiUrl)
      response = {'collectGarbage': 'complete'}
    self.send_response(status)
    self.send_header("Content-type", "application/json")
    self.end_headers()
    self.wfile.write(bytes(JSON.dumps(response), 'utf-8'))

  def do_POST(self):
    # A missing Content-Length header is treated as an empty body
    length = int(self.headers.get('Content-Length', 0))
    body = self.rfile.read(length)
    body = body.decode("utf-8")
    # Unknown paths get a 404 instead of raising a NameError on 'response'
    status = 404
    response = {'error': 'unknown path'}
    if self.path == "/download":
      status = 200
      metadata = JSON.loads(body)
      timestampDownloadWallStart = time.time()
      ipfs.getFile(nodeApiUrl, metadata['cid'], 'testfile.download', metadata['base64Key'], int(metadata['chunkSize']), metadata['cipherMode'])
      timestampDownloadWallStop = time.time()
      timestampDownloadWallDuration = timestampDownloadWallStop - timestampDownloadWallStart
      response = {'timestampDownloadWallStart': str(timestampDownloadWallStart), 'timestampDownloadWallStop': str(timestampDownloadWallStop), 'timestampDownloadWallDuration': str(timestampDownloadWallDuration)}
    self.send_response(status)
    self.send_header('Content-Type', 'application/json')
    self.end_headers()
    self.wfile.write(bytes(JSON.dumps(response), 'utf-8'))

# Launch the web server
server = HTTPServer((address, port), RequestHandler)
print('Web server started at http://' + address + ':' + str(port))
try:
  server.serve_forever()
except KeyboardInterrupt:
  pass
finally:
  server.server_close()

print('Web server stopped')

#### First JupyterLab instance (test)

In [None]:
import jcipfsclient as ipfs
import time
from os import getenv
import requests

# Test configuration
files = ['testfile_1_MiB.bin','testfile_10_MiB.bin','testfile_100_MiB.bin','testfile_500_MiB.bin']
ciphers = ['plain','ChaCha20','Salsa20','AES_256_CTR']
rounds = 10
remoteHostUrl = 'http://notebook.jupyter-2.localhost:4000'

# Local IPFS peer node address
ipfsnode = getenv('IPFS_NODE')
nodeApiUrl = 'http://' + ipfsnode + ':5001'

print('Processing IPFS file sharing...')

with open('inter_notebook_file_sharing_duration_results.csv', 'w') as results:
  # .csv header
  delimiter = ';'
  results.write('File;SHA256;Cipher;Round;Time_Upload_Wall_Start;Time_Upload_Wall_Stop;Time_Download_Wall_Start;Time_Download_Wall_Stop;Time_Upload_Duration_Wall;Time_Download_Duration_Wall;Time_Total_Duration_Wall;Match\n')
  
  # Run the test
  for cipherMode in ciphers:
    for file in files:
      for round in range(0, rounds):
        chunkSize = 1024*3
        
        # Upload (and encrypt) file to local IPFS node
        timestampUploadWallStart = time.time()
        metadata = ipfs.addFile(nodeApiUrl=nodeApiUrl, file=file, base64Key=None, chunkSize=chunkSize, cipherMode=cipherMode)
        timestampUploadWallStop = time.time()
      
        # Instruct remote host to download (and decrypt) file from private IPFS network
        response = requests.post(remoteHostUrl + '/download', json = metadata, timeout=None)
        response = response.json()
        timestampDownloadWallStart = response['timestampDownloadWallStart']
        timestampDownloadWallStop = response['timestampDownloadWallStop']
        timestampDownloadWallDuration = response['timestampDownloadWallDuration']
        
        # Compare downloaded (plaintext) file to original (hash has to be the same)
        same = '?'
        response = requests.get(remoteHostUrl + '/hash', timeout=None)
        response = response.json()
        hashFileDownloaded = response['hashFileDownloaded']
        with open(file + '.sha256sum', 'r') as fileOriginalHash:
          hashOriginal = fileOriginalHash.readlines()
          hashOriginal = hashOriginal[0].rstrip()
          if hashFileDownloaded == hashOriginal:
            same = 'yes'
          else:
            same = 'no'
            print('Warning: hash mismatch between original and downloaded (file: \'' + file + '\', cipher: ' + cipherMode + ')!')
        
        # Write results to .csv file and clean up test files / storage
        results.write(file + delimiter + hashOriginal + delimiter + cipherMode + delimiter + str(round) + delimiter + str(timestampUploadWallStart) + delimiter + str(timestampUploadWallStop) + delimiter + timestampDownloadWallStart + delimiter + timestampDownloadWallStop  + delimiter + str(timestampUploadWallStop - timestampUploadWallStart) + delimiter + timestampDownloadWallDuration + delimiter + str(float(timestampUploadWallStop - timestampUploadWallStart) + float(timestampDownloadWallDuration)) + delimiter + same + '\n')
        ipfs.rmPin(nodeApiUrl, metadata['cid'])
        ipfs.collectGarbage(nodeApiUrl)
        requests.get(remoteHostUrl + '/garbagecollect', timeout=None)

print('Done!')
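
The result rows above are assembled by string concatenation, with the download values arriving as strings from the JSON response and the upload values as floats. As a sketch of an alternative, the standard `csv` module can handle the delimiter and the conversions. `write_result_row` is a hypothetical helper, and `HEADER` is a shortened subset of the real column list, used here only to keep the example small.

```python
import csv
import io

# Shortened, illustrative subset of the columns written by the test cell
HEADER = ['File', 'Cipher', 'Round', 'Time_Upload_Duration_Wall',
          'Time_Download_Duration_Wall', 'Time_Total_Duration_Wall', 'Match']


def write_result_row(writer, file, cipher, round_index,
                     upload_duration, download_duration, match):
    """Write one result row; durations may be floats or numeric strings."""
    upload_duration = float(upload_duration)
    download_duration = float(download_duration)
    writer.writerow([file, cipher, round_index, upload_duration,
                     download_duration, upload_duration + download_duration, match])


# Demo: write to an in-memory buffer instead of a file
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', lineterminator='\n')
writer.writerow(HEADER)
# The download duration is passed as a string, as it would arrive via JSON
write_result_row(writer, 'testfile_1_MiB.bin', 'ChaCha20', 0, 1.5, '0.25', 'yes')

rows = buffer.getvalue().splitlines()
print(rows[1])
```

Converting every duration with `float()` in one place keeps the total column consistent regardless of which side produced the value.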

### Generate plots from generated .csv files
We will use the pandas, matplotlib, and ipywidgets Python libraries to visualize our measurements (all are pre-installed in our JupyterLab Docker image). The plots will be saved as .svg files.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import ipywidgets as widget
!mkdir -p plots

# Read results
crypto = pd.read_csv('baseline_crypto_duration_results.csv', sep=';')
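# Note: this rebinds the name 'ipfs' from the jcipfsclient module to a DataFrame;
# re-run 'import jcipfsclient as ipfs' before repeating the tests in the same kernel
ipfs = pd.read_csv('inter_notebook_file_sharing_duration_results.csv', sep=';')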

# Configure boxplot
x_label_order_file = ['testfile_1_MiB.bin','testfile_10_MiB.bin','testfile_100_MiB.bin','testfile_500_MiB.bin', 'testfile_1_GiB.bin']
x_label_order_cipher = ['plain','ChaCha20','Salsa20','AES_256_CTR']
figsizeFile = (11,4)
figsizeCipher = (8,4)
config = {
  'x_label_order': x_label_order_cipher,
  'fontsize': 12,
  'figsize': figsizeCipher,
  'grid': False,
  'boxprops': {
    "linestyle": "-",
    "linewidth": "1",
    "color":"black"
  },
  'whiskerprops': {
    "linestyle": "-",
    "linewidth": "1",
    "color":"black"
  },
  'medianprops': {
    "linestyle": "--",
    "linewidth": "1",
    "color":"black"
  },
  'capprops': {
    "linestyle": "-",
    "linewidth": "1",
    "color":"black"
  },
  'flierprops': {
    "linestyle": "-",
    "linewidth": "1",
    "color":"black"
  }
}

# Plot <column> per file
def create_boxplots_file(testname, df, column, config):
  print(testname + ': ' + column)
  outputs = []
  #max = df[column].max() * 1.05
  output = widget.Output(layout={'margin': '12px'})
  with output:
    df['File'] = pd.Categorical(df['File'], config['x_label_order'])
    plot = df.boxplot(column=column, by='File', grid=config['grid'], fontsize=config['fontsize'], figsize=config['figsize'], boxprops=config['boxprops'], whiskerprops=config['whiskerprops'], medianprops=config['medianprops'], capprops=config['capprops'], flierprops=config['flierprops'])
    plot.set_title('')
    plot.get_figure().suptitle('')
    plot.set_xlabel('File', fontsize=config['fontsize'])
    plot.set_ylabel('Total Duration (sec)', fontsize=config['fontsize'])
    plot.set_ylim(bottom=0)
    plot.get_figure().savefig('./plots/' + str(testname) + '-' + column + '.svg')
    plt.show()
  outputs.append(output)
  return outputs

# Plot <column> per cipher per file
def create_boxplots_cipher(testname, df, column, config):
  print(testname + ': ' + column)
  outputs = []
  #max = df[column].max() * 1.05
  # Iterate over the files actually present in the results (skips unused categories)
  for file in df['File'].dropna().unique():
    output = widget.Output(layout={'margin': '12px'})
    with output:
      df['Cipher'] = pd.Categorical(df['Cipher'], config['x_label_order'])
      plot = df.loc[df['File'] == file][['Cipher', column]].boxplot(column=column, by='Cipher', grid=config['grid'], fontsize=config['fontsize'], figsize=config['figsize'], boxprops=config['boxprops'], whiskerprops=config['whiskerprops'], medianprops=config['medianprops'], capprops=config['capprops'], flierprops=config['flierprops'])
      plot.set_title(str(file), fontsize=config['fontsize'])
      plot.get_figure().suptitle('')
      plot.set_xlabel('Cipher', fontsize=config['fontsize'])
      plot.set_ylabel('Total Duration (sec)', fontsize=config['fontsize'])
      plot.set_ylim(bottom=0)
      plot.get_figure().savefig('./plots/' + str(testname) + '-' + column + '-' + str(file) + '.svg')
      plt.show()
    outputs.append(output)
  return outputs

#crypto.head()
#ipfs.head()
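
As a text-only cross-check of what the box plots display, the quartiles per group can be computed directly from the semicolon-separated results. This is a sketch using only the standard library; `quartiles_by_group` is a hypothetical helper, and the sample rows are synthetic values for illustration, not measurements. The `inclusive` method is chosen on the assumption that it corresponds to the linear interpolation matplotlib uses by default for box edges.

```python
import csv
import io
import statistics
from collections import defaultdict


def quartiles_by_group(rows, group_column, value_column):
    """Return {group: {'q1', 'median', 'q3'}} for the given value column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(float(row[value_column]))
    summary = {}
    for name, values in groups.items():
        q1, median, q3 = statistics.quantiles(values, n=4, method='inclusive')
        summary[name] = {'q1': q1, 'median': median, 'q3': q3}
    return summary


# Synthetic values for illustration only, not measurements
sample_csv = (
    'File;Cipher;Time_Total_Duration_Wall\n'
    'demo.bin;plain;1.0\n'
    'demo.bin;plain;2.0\n'
    'demo.bin;plain;3.0\n'
    'demo.bin;plain;4.0\n'
    'demo.bin;plain;5.0\n'
    'demo.bin;ChaCha20;2.0\n'
    'demo.bin;ChaCha20;4.0\n'
    'demo.bin;ChaCha20;6.0\n'
    'demo.bin;ChaCha20;8.0\n'
    'demo.bin;ChaCha20;10.0\n'
)

reader = csv.DictReader(io.StringIO(sample_csv), delimiter=';')
summary = quartiles_by_group(reader, 'Cipher', 'Time_Total_Duration_Wall')
for cipher, stats in summary.items():
    print(cipher, stats)
```

To run it against real results, replace the in-memory sample with an open file handle on one of the .csv files written by the tests.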

#### Plot baseline crypto results per file

In [None]:
# Plot Time_Total_Duration_Wall
config['figsize'] = figsizeFile
config['x_label_order'] = x_label_order_file
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Total_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Total_Duration_Cpu
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Total_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Encrypt_Duration_Wall
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Encrypt_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Encrypt_Duration_Cpu
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Encrypt_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Decrypt_Duration_Wall
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Decrypt_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Decrypt_Duration_Cpu
widget.HBox(create_boxplots_file('crypto-file', crypto, 'Time_Decrypt_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

#### Plot baseline crypto results per cipher

In [None]:
# Plot Time_Total_Duration_Wall
config['figsize'] = figsizeCipher
config['x_label_order'] = x_label_order_cipher
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Total_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Total_Duration_Cpu
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Total_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Encrypt_Duration_Wall
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Encrypt_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Encrypt_Duration_Cpu
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Encrypt_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Decrypt_Duration_Wall
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Decrypt_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Decrypt_Duration_Cpu
widget.HBox(create_boxplots_cipher('crypto-cipher', crypto, 'Time_Decrypt_Duration_Cpu', config), layout=widget.Layout(flex_flow='row wrap'))

#### Plot IPFS file sharing results per file

In [None]:
# Plot Time_Total_Duration_Wall
config['figsize'] = figsizeFile
config['x_label_order'] = x_label_order_file
widget.HBox(create_boxplots_file('ipfs-file', ipfs, 'Time_Total_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Upload_Duration_Wall
widget.HBox(create_boxplots_file('ipfs-file', ipfs, 'Time_Upload_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Download_Duration_Wall
widget.HBox(create_boxplots_file('ipfs-file', ipfs, 'Time_Download_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

#### Plot IPFS file sharing results per cipher

In [None]:
# Plot Time_Total_Duration_Wall
config['figsize'] = figsizeCipher
config['x_label_order'] = x_label_order_cipher
widget.HBox(create_boxplots_cipher('ipfs-cipher', ipfs, 'Time_Total_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Upload_Duration_Wall
widget.HBox(create_boxplots_cipher('ipfs-cipher', ipfs, 'Time_Upload_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))

In [None]:
# Plot Time_Download_Duration_Wall
widget.HBox(create_boxplots_cipher('ipfs-cipher', ipfs, 'Time_Download_Duration_Wall', config), layout=widget.Layout(flex_flow='row wrap'))