# Thin Partition

## Broadcast Operation

The file `thin_bcast_results.csv` contains the results of the broadcast operation run on the `THIN` partition.

The file has 4 columns:
* Algorithm: the algorithm used to perform the broadcast operation (pipeline, chain, basic_linear);
* Processes: the number of processes used (from 2 to 48 on this partition); every process count is run for each message size (from 2 bytes to 1048576 bytes, i.e. 1 MiB);
* Size: the size of the message in bytes;
* Latency: the latency of the broadcast operation in microseconds.

So, for every algorithm and every process count, we have the latency of the broadcast operation for each message size.
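
As a rough illustration of this layout, the sketch below builds a tiny made-up table with the same four columns and checks that each algorithm and process count has one latency per message size. The values are placeholders, not measurements from `thin_bcast_results.csv`, and the names `sample`, `sizes_per_run` and `wide` are assumptions introduced only for this example.

```python
import io
import pandas as pd

# Hypothetical miniature sample with the same four columns as
# thin_bcast_results.csv; the latency values are placeholders, not measurements.
sample_csv = io.StringIO(
    "Algorithm,Processes,Size,Latency\n"
    "basic_linear,2,2,1.0\n"
    "basic_linear,2,4,1.1\n"
    "chain,2,2,0.9\n"
    "chain,2,4,1.0\n"
    "pipeline,2,2,0.8\n"
    "pipeline,2,4,0.9\n"
)
sample = pd.read_csv(sample_csv)

# One row is expected per (Algorithm, Processes, Size) combination.
assert not sample.duplicated(["Algorithm", "Processes", "Size"]).any()

# Number of distinct message sizes recorded for each algorithm and process count.
sizes_per_run = sample.groupby(["Algorithm", "Processes"])["Size"].nunique()
print(sizes_per_run)

# Wide view: one row per (Algorithm, Processes), one column per message size.
wide = sample.pivot_table(
    index=["Algorithm", "Processes"], columns="Size", values="Latency"
)
print(wide)
```

Running the same two checks on the real file would show quickly whether any combination of algorithm, process count and message size is missing or duplicated.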

## Objective

The objective is to analyse the collected data and compare the baseline algorithm (basic_linear) with the other two algorithms (chain and pipeline), while also trying to infer the performance model behind each algorithm, taking into account the architecture on which they run.

## Starting point 

Since a single sbatch script was used to collect all the results, we first need to separate the data by algorithm.

In [1]:
# load libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [4]:
# import data from csv
df = pd.read_csv('thin_bcast_results.csv')

# split the data into 3 DataFrames, one for each algorithm (basic_linear, chain, pipeline)
df_basic_linear = df[df['Algorithm'] == 'basic_linear']
df_chain = df[df['Algorithm'] == 'chain']
df_pipeline = df[df['Algorithm'] == 'pipeline']

# create new csv files for each algorithm
df_basic_linear.to_csv('thin_bcast_basic_linear.csv', index=False)
df_chain.to_csv('thin_bcast_chain.csv', index=False)
df_pipeline.to_csv('thin_bcast_pipeline.csv', index=False)
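
The same split can also be written with a single `groupby`, which avoids repeating the filter for every algorithm name. This is a minimal sketch on made-up placeholder rows (not measurements); `df_sample` and `per_algorithm` are hypothetical names, and the `to_csv` call is left commented out so the example has no side effects.

```python
import pandas as pd

# Hypothetical placeholder rows (not measurements) with the CSV's columns.
df_sample = pd.DataFrame({
    "Algorithm": ["basic_linear", "chain", "pipeline", "chain"],
    "Processes": [2, 2, 2, 4],
    "Size": [2, 2, 2, 2],
    "Latency": [1.0, 0.9, 0.8, 1.2],
})

# Split once with groupby; each value is an independent copy of that
# algorithm's rows, keyed by the algorithm name.
per_algorithm = {
    name: group.copy() for name, group in df_sample.groupby("Algorithm")
}

# Output names follow the thin_bcast_<algorithm>.csv pattern used above.
for name, group in per_algorithm.items():
    out_name = f"thin_bcast_{name}.csv"
    # group.to_csv(out_name, index=False)  # uncomment to write the file
    print(out_name, len(group))
```

Keeping the pieces in a dictionary makes it easy to loop over the algorithms later without hard-coding three variable names.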
