# Efficiency Techniques Application

This notebook runs several of our functions in two versions, the original Python implementations and their Cython-optimized counterparts, and compares their running times.

In [1]:
import re
from typing import List

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from scipy.stats import ttest_1samp
from scipy.stats import ttest_ind
from scipy.stats import wilcoxon
from scipy.stats import mannwhitneyu
from sklearn.utils import resample

In [2]:
import efficiency.py_efficiency as ef
import final_func as fn

ModuleNotFoundError: No module named 'efficiency.py_efficiency'
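The error above means Python could not locate `efficiency.py_efficiency`. One possible cause (an assumption, the notebook does not say) is that the Cython extension was not compiled or the `efficiency` package is not on the import path. A minimal sketch for checking availability before running the timing cells, where `module_available` is a hypothetical helper and not part of the project:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located on the current import path.

    Parent packages are imported during the lookup, but the target
    module itself is not executed.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError (incl. ModuleNotFoundError) is raised when a parent
        # package such as `efficiency` is itself missing.
        return False


# Example check for the module that failed to import above.
print(module_available("efficiency.py_efficiency"))
```

If the check returns `False`, the `ef.*` cells below cannot run until the module is built or made importable.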

### Comparison

- Without Cython: `fn` (`final_func`)
- With Cython: `ef` (`efficiency.py_efficiency`)
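The cells below time each pair with `%timeit -r 100 -n 1`. As a sketch of the same comparison outside IPython, the standard-library `timeit` module can time both versions on identical arguments and report a ratio. `compare_runtime` is a hypothetical helper introduced here for illustration; using the median is an assumption about which summary statistic is most useful.

```python
import statistics
import timeit
from typing import Any, Callable, Dict


def compare_runtime(
    before: Callable[..., Any],
    after: Callable[..., Any],
    *args: Any,
    repeat: int = 100,
    **kwargs: Any,
) -> Dict[str, float]:
    """Time two implementations on the same arguments.

    Each implementation is called once per repeat (like `-n 1`),
    `repeat` times (like `-r 100`), and the median time is compared.
    """
    before_times = timeit.repeat(
        lambda: before(*args, **kwargs), repeat=repeat, number=1
    )
    after_times = timeit.repeat(
        lambda: after(*args, **kwargs), repeat=repeat, number=1
    )
    before_median = statistics.median(before_times)
    after_median = statistics.median(after_times)
    # speedup > 1 means `after` is faster; < 1 means it is slower.
    speedup = before_median / after_median if after_median > 0 else float("nan")
    return {
        "before_median_s": before_median,
        "after_median_s": after_median,
        "speedup": speedup,
    }
```

With the notebook's modules loaded, a call such as `compare_runtime(fn.merge_data, ef.merge_data, [pit, results, status])` would give one number per function pair instead of two separate `%timeit` outputs.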

In [None]:
# Load data
pit = pd.read_csv('data/pit_stops.csv')
results = pd.read_csv('data/results.csv')
status = pd.read_csv('data/status.csv')
lap = pd.read_csv("data/lap_times.csv")

### 1. function: merge_data()

In [None]:
# Before
%timeit -r 100 -n 1 fn.merge_data([pit, results, status])

In [None]:
# After
%timeit -r 100 -n 1 ef.merge_data([pit, results, status])

### 2. function: process_data()

In [None]:
merge_df = fn.merge_data([pit, results, status])

In [None]:
# Before
%timeit -r 100 -n 1 fn.process_data(merge_df)

In [None]:
# After
%timeit -r 100 -n 1 ef.process_data(merge_df)

In [None]:
merge_df = fn.process_data(merge_df)

### 3. function: pit_stop_group()

In [None]:
# Before
%timeit -r 100 -n 1 fn.pit_stop_group(merge_df)

In [None]:
# After
%timeit -r 100 -n 1 ef.pit_stop_group(merge_df)

In [None]:
# Before
%timeit -r 100 -n 1 fn.pit_stop_group(merge_df, by='total_stops')

In [None]:
# After
%timeit -r 100 -n 1 ef.pit_stop_group(merge_df, by='total_stops')

In [None]:
df_group = fn.pit_stop_group(merge_df, by='total_stops')

### 4. function: front_back_division()

In [None]:
# Before
%timeit -r 100 -n 1 fn.front_back_division(merge_df, top_num=5)

In [None]:
# After
%timeit -r 100 -n 1 ef.front_back_division(merge_df, top_num=5)

In [None]:
# Before
%timeit -r 100 -n 1 fn.front_back_division(merge_df, select_col='abs_deviation_mean', top_num=5)

In [None]:
# After
%timeit -r 100 -n 1 ef.front_back_division(merge_df, select_col='abs_deviation_mean', top_num=5)

### 5. function: lap_data_process()

In [None]:
# Before
%timeit -r 100 -n 1 fn.lap_data_process(results, lap)

In [None]:
# After
%timeit -r 100 -n 1 ef.lap_data_process(results, lap)

### Conclusion

Our project involves little heavy computation, so the original functions are already fast. After compiling the functions with Cython, we found that the 'optimized' versions are not faster. Instead, many of them are slightly slower than the originals.