# Evolution of Computational Energy Efficiency

### This notebook analyzes the evolution of energy efficiency in large-scale computer systems over time using OpenDC with different workloads and system configurations.

### Steps:

1. **Run the Energy Consumption Scripts**:
   - Start by running the energy consumption scripts to simulate systems in OpenDC. These scripts will generate the necessary data for analysis.

   - Bitbrains and SURF Lisa are two different workload traces: `energy_consumption` runs the SURF Lisa trace on the chosen systems, and `energy_consumption_bitbrains` runs the Bitbrains trace on the chosen systems.

   - The results of these simulations will be stored in CSV files.
   
   - Scripts can be executed with the following bash commands:
      ```bash
      ./energy_consumption
      ./energy_consumption_bitbrains
      ```

2. **Load the Simulation Results**:
   - The simulation results for the SURF workload will be saved in `modeling_results_surf.csv`.
   - The simulation results for the Bitbrains workload will be saved in `modeling_results_bitbrains.csv`.

3. **Plot the Results**:
   - Use this notebook to load the CSV files, plot the data, and visualize the results.
   - The plots will help in understanding the energy efficiency and computational performance of the systems.

4. **Analyze the Findings**:
   - Examine the plots to analyze how different systems perform in terms of energy efficiency and computational work done.
   - Compare the results to draw insights into the evolution of computational energy efficiency over time.
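
Before plotting, it can help to sanity-check the result files, since they are read without a header row and most plots use a log-scale axis that needs strictly positive values. The following is a minimal sketch, assuming the three-column `System, Year, Metric` layout that the plotting cells pass to `pd.read_csv`. The function name `check_results` and the sample rows are hypothetical, and only the standard library is used:

```python
import csv
import io


def check_results(lines):
    """Parse headerless 'System,Year,Metric' rows and validate them.

    Returns a list of (system, year, metric) tuples and raises ValueError for
    rows that would break the plots (wrong column count, non-positive metric).
    """
    rows = []
    for lineno, fields in enumerate(csv.reader(lines), start=1):
        if not fields:  # csv.reader yields an empty list for blank lines
            continue
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 columns, got {len(fields)}")
        system = fields[0].strip()
        year = int(float(fields[1]))  # tolerate years written as '1993.0'
        metric = float(fields[2])
        if metric <= 0:
            raise ValueError(f"line {lineno}: metric must be > 0 for a log-scale axis")
        rows.append((system, year, metric))
    return rows


# Made-up rows for illustration only; real values come from the OpenDC runs.
sample = io.StringIO("SystemA,1993,120.5\nSystemB,2005,98000\n")
print(check_results(sample))
```

To check a real file, pass an open file handle instead, for example `with open('modeling_results_surf.csv', newline='') as f: check_results(f)`.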

### Note:

Ensure that you have the required libraries installed (e.g., `pandas`, `matplotlib`, `numpy`). You can install them using `pip` if necessary:

```bash
pip install pandas matplotlib numpy
```


In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as ticker

### Plot energy efficiency metric

In [None]:
 

df = pd.read_csv('modeling_results_surf.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df['Year'] = df['Year'].astype(int)

plt.figure(figsize=(10, 14))
plt.plot(df['Year'], df['PureEnergyMetric'], marker='o', linestyle='-', color='black')
plt.xlabel('Year', fontsize=18)
plt.ylabel('Total Computations per kWh (MFLOPS/kWh)', fontsize=18)

plt.title('Evolution of Computational Energy Efficiency')
plt.yscale('log')

plt.gca().set_ylim(10, 10**10)

# No plt.legend() here: the single series has no label, so the legend would be empty.

plt.yticks(fontsize=18)
plt.xticks(df['Year'], fontsize=18)  # one tick per system year
for i, row in df.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PureEnergyMetric']),
                textcoords="offset points", xytext=(5,5), ha='center', fontsize=13)
    
plt.savefig('surf_intro_log.pdf', format='pdf') # change the file name according to the plot
plt.show()

### Plot energy consumption 

In [None]:


df = pd.read_csv('energy_consumption_bitbrains.csv', header=None, names=['System', 'Year', 'PowerMetric'])
df['Year'] = df['Year'].astype(int)

plt.figure(figsize=(8, 10))
plt.plot(df['Year'], df['PowerMetric'], marker='o', linestyle='-', color='black')
plt.xlabel('Year', fontsize=14)
plt.ylabel('Power (kW)', fontsize=14)
#plt.yscale('log')

plt.yticks(fontsize=14)
plt.xticks(df['Year'], fontsize=13)  # one tick per system year

for i, row in df.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PowerMetric']),
                textcoords="offset points", xytext=(5,5), ha='center', fontsize=13)
plt.savefig('energy_consumption_pre90.pdf', format='pdf')  # change the file name according to the plot
plt.show()


### Plot the impact of task duplication

##### This plot can be used for a sensitivity analysis across scaling methods, for example with and without task duplication.

In [None]:



# CSV holding the results of the experiments with task duplication
df1 = pd.read_csv('modeling_results.csv', header=None, names=['System', 'Year', 'PowerMetric'])
df1['Year'] = df1['Year'].astype(int)

# CSV holding the results of the experiments without task duplication
df2 = pd.read_csv('modeling_results_no_dup.csv', header=None, names=['System', 'Year', 'PowerMetric'])
df2['Year'] = df2['Year'].astype(int)

plt.figure(figsize=(8, 10))

plt.plot(df1['Year'], df1['PowerMetric'], marker='o', linestyle='-', color='darkgreen', label='Task Duplication', linewidth=3)
for i, row in df1.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PowerMetric']),
                textcoords="offset points", xytext=(5,5), ha='center', fontsize=13)

plt.plot(df2['Year'], df2['PowerMetric'], marker='o', linestyle='--', color='red', label='No Task Duplication', linewidth= 2)

ax = plt.gca()
ax.set_ylim(10, 10**10)
plt.xlabel('Year')
plt.ylabel('Computations (MFlops)')
plt.xticks(df1['Year'], fontsize= 13)  
plt.yticks(fontsize= 10)
plt.legend()  
plt.yscale('log')

plt.savefig('comparison_duplication_bitbrains_log.pdf', format='pdf')

plt.show()
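
On a log-scale axis, the vertical gap between the two curves corresponds to a ratio, so one simple way to quantify the sensitivity to task duplication is the per-year ratio of the two metrics. This is a minimal sketch and not part of the original analysis: the helper name `duplication_ratio` and all numbers are hypothetical, and the inputs are plain `{year: metric}` dictionaries (which could be built with, for example, `dict(zip(df1['Year'], df1['PowerMetric']))`).

```python
def duplication_ratio(with_dup, without_dup):
    """Return {year: with_dup / without_dup} for years present in both runs."""
    shared_years = sorted(set(with_dup) & set(without_dup))
    return {year: with_dup[year] / without_dup[year] for year in shared_years}


# Hypothetical metric values keyed by year (not real simulation output).
with_dup = {1993: 2.0e3, 2005: 8.0e5, 2021: 6.0e8}
without_dup = {1993: 1.0e3, 2005: 2.0e5, 2021: 1.0e8}

for year, ratio in duplication_ratio(with_dup, without_dup).items():
    print(f"{year}: x{ratio:.1f}")
```

Years that appear in only one of the two result sets are skipped rather than interpolated.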


### Plot Bitbrains and SURF together


In [None]:
# read the first dataset
df1 = pd.read_csv('modeling_results_surf.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df1['Year'] = df1['Year'].astype(int)

# read the second dataset
df2 = pd.read_csv('modeling_results.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df2['Year'] = df2['Year'].astype(int)

plt.figure(figsize=(8, 10))

# plot the first dataset
plt.plot(df1['Year'], df1['PureEnergyMetric'], marker='o', linestyle='-', color='black', label='SURF Lisa Workload')
for i, row in df1.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PureEnergyMetric']),
                textcoords="offset points", xytext=(5,5), ha='center', fontsize=13, color='black')

# plot the second dataset
plt.plot(df2['Year'], df2['PureEnergyMetric'], marker='o', linestyle='-', color='blue', label='Bitbrains Workload')

plt.xlabel('Year', fontsize=14)
plt.ylabel('Total Computations per kWh (MFLOPS/kWh)', fontsize=14)
plt.yscale('log')
ax = plt.gca()
ax.set_ylim(10, 10**10)

plt.yticks(fontsize=14)
plt.xticks(df1['Year'], fontsize=14)
plt.legend() 
plt.savefig('surf_and_bitbrains_FINAL.pdf', format='pdf')
plt.show()

### Plot energy efficiency for TOP500

In [None]:
df = pd.read_csv('top500_metric.csv', header=None, names=['System', 'Year', 'Rmax', 'Power'])
df['Year'] = df['Year'].astype(int)
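df['PureEnergyMetric'] = df['Rmax'] / df['Power']  # derived efficiency (Rmax per unit of Power); computed but not plotted in this cell
plt.figure(figsize=(8, 10))
# Note: the curve and the annotations below use the raw 'Power' column.
plt.plot(df['Year'], df['Power'], marker='o', linestyle='-', color='black')
plt.xlabel('Year', fontsize=14)
plt.ylabel('kWh', fontsize=14)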

plt.yscale('log')
ax = plt.gca()
ax.set_ylim(10, 10**10)

plt.yticks(fontsize= 14)
plt.xticks(df['Year'], fontsize= 14)  

for i, row in df.iterrows():
    plt.annotate(row['System'], (row['Year'], row['Power']),
                textcoords="offset points", xytext=(5,5), ha='center', fontsize=13)
plt.savefig('top500_energy.pdf', format='pdf')
plt.show()

### Plot Surf Lisa trace with Koomey's Model

In [None]:
efficiency_1953 = 1e2  # 1.E+02 computations per kilowatt-hour
year_1953 = 1953
year_2021 = 2021
doubling_time = 1.57  # Doubling time in years (Koomey's law)

years_difference = year_2021 - year_1953
doubling_periods = years_difference / doubling_time

initial_efficiency_2021 = efficiency_1953 * (2 ** doubling_periods)

df = pd.read_csv('modeling_results_surf.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df['Year'] = df['Year'].astype(int)

plt.figure(figsize=(25, 27))
plt.plot(df['Year'], df['PureEnergyMetric'],label= 'Fixed-Work Trace', marker='o', linestyle='-', color='saddlebrown', linewidth=3)
plt.xlabel('Year', fontsize=30)
plt.ylabel('Total Computations per kWh (MFLOPS/kWh)', fontsize=30)

ax = plt.gca()
ax.set_yscale('log')
ax.set_ylim(10, 10**18)

plt.yticks(fontsize=25)
plt.xticks(df['Year'], fontsize=24)
for i, row in df.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PureEnergyMetric']),
                 textcoords="offset points", xytext=(5,5), ha='center', fontsize=27)

first_year = 1993

growth_rate = np.log(2) / doubling_time

years = np.linspace(year_2021, first_year, 100)

efficiency = initial_efficiency_2021 * np.exp(growth_rate * (years - year_2021))

plt.plot(years, efficiency, label='Koomey\'s Law', linestyle='--', color='blue', linewidth=3)

plt.legend(fontsize=30)

plt.savefig('surf-koomey.pdf', format='pdf')
plt.show()
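
The cell above extrapolates Koomey's law in two steps: it first scales the 1953 baseline forward with `2 ** doubling_periods`, then draws the curve with `np.exp(growth_rate * ...)` anchored at 2021. The sketch below, which uses only the standard library, shows that the two forms describe the same curve. The constants are copied from the cell above, while the function name `koomey_efficiency` is a hypothetical helper introduced for illustration:

```python
import math

BASE_YEAR = 1953
BASE_EFFICIENCY = 1e2   # computations per kWh in 1953, as in the cell above
DOUBLING_TIME = 1.57    # years per doubling, as in the cell above


def koomey_efficiency(year, ref_year=BASE_YEAR, ref_eff=BASE_EFFICIENCY,
                      doubling_time=DOUBLING_TIME):
    """Efficiency at `year`, assuming a constant doubling time from the reference point."""
    return ref_eff * 2 ** ((year - ref_year) / doubling_time)


# Exponential form used for the plotted curve: anchor at 2021, rate ln(2) / T.
growth_rate = math.log(2) / DOUBLING_TIME
eff_2021 = koomey_efficiency(2021)
eff_1993_via_exp = eff_2021 * math.exp(growth_rate * (1993 - 2021))

# Both routes should agree up to floating-point rounding.
assert math.isclose(eff_1993_via_exp, koomey_efficiency(1993), rel_tol=1e-9)
print(f"1993: {koomey_efficiency(1993):.3e}, 2021: {eff_2021:.3e}")
```

Because the two forms are equivalent, the curve can be anchored at either end of the plotted range without changing its shape.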


### Combine all energy efficiency models in one plot


In [None]:
efficiency_1953 = 1e2  # 1.E+02 computations per kilowatt-hour
year_1953 = 1953
year_1993 = 1993
doubling_time = 1.57  # Doubling time in years

years_difference = year_1993 - year_1953
doubling_periods = years_difference / doubling_time

initial_efficiency_1993 = efficiency_1953 * (2 ** doubling_periods)

df_surf = pd.read_csv('modeling_results_surf.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df_surf['Year'] = df_surf['Year'].astype(int)

df_top500 = pd.read_csv('top500_metric.csv', header=None, names=['System', 'Year', 'Rmax', 'Power'])
df_top500['Year'] = df_top500['Year'].astype(int)
df_top500['PureEnergyMetric'] = df_top500['Rmax'] / df_top500['Power']

df_fixed_time = pd.read_csv('modeling_results_bitbrains.csv', header=None, names=['System', 'Year', 'PureEnergyMetric'])
df_fixed_time['Year'] = df_fixed_time['Year'].astype(int)

plt.figure(figsize=(8, 10))

final_year = 2024

growth_rate = np.log(2) / doubling_time

years = np.linspace(year_1993, final_year, 100)

efficiency = initial_efficiency_1993 * np.exp(growth_rate * (years - year_1993))

#plot all energy efficiency models
plt.plot(years, efficiency, label='Koomey\'s Law', linestyle=':', color='blue')

plt.plot(df_surf['Year'], df_surf['PureEnergyMetric'], label='Fixed-Work Trace', marker='o', linestyle='-', color='deeppink')

plt.plot(df_fixed_time['Year'], df_fixed_time['PureEnergyMetric'], label='Fixed-Time Trace', marker='s', linestyle='-.', color='darkgreen')

plt.plot(df_top500['Year'], df_top500['PureEnergyMetric'], label='TOP500', marker='^', linestyle='--', color='darkorange')

plt.xlabel('Year', fontsize=14)
plt.ylabel('Total Computations per kWh (MFLOPS/kWh)', fontsize=14)
ax = plt.gca()
ax.set_yscale('log')
ax.set_ylim(10, 10**18)
plt.yticks(fontsize=14)
plt.xticks(fontsize=14)

for i, row in df_surf.iterrows():
    plt.annotate(row['System'], (row['Year'], row['PureEnergyMetric']),
                 textcoords="offset points", xytext=(5,5), ha='center', fontsize=13)


plt.legend(fontsize=12)

plt.savefig('combined_plot_with_koomey_accessible.pdf', format='pdf')
plt.show()
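
To compare a measured curve against the 1.57-year doubling time numerically rather than only by eye, one option is to fit a straight line to `log2(efficiency)` over time and invert the slope. This is a minimal sketch and not part of the original notebook: `fit_doubling_time` is a hypothetical helper, the data is synthetic, and the fit assumes strictly positive efficiencies that grow over time.

```python
import math


def fit_doubling_time(years, efficiencies):
    """Least-squares fit of log2(efficiency) against year; returns years per doubling."""
    n = len(years)
    logs = [math.log2(e) for e in efficiencies]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx  # doublings per year
    return 1.0 / slope


# Synthetic series that doubles every 2 years (illustration only).
years = [2000, 2002, 2004, 2006]
effs = [1.0, 2.0, 4.0, 8.0]
print(fit_doubling_time(years, effs))
```

With NumPy available, `np.polyfit(years, np.log2(effs), 1)[0]` gives the same slope. A fitted doubling time above 1.57 years would indicate slower efficiency growth than the Koomey curve drawn above.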