## Following trips

If the environmental and loading conditions can be assumed to be the same within the short time window between two trips in the same direction with Aurora and Tycho Brahe, the difference in energy consumption between the ships must depend on how the ships are operated. We cannot change the weather, but we can change how the ships are operated. So if one of the ships performs better on a certain trip, it should also be possible to operate the other ship in the same, more efficient way. This section compares the trips in the same direction that are closest to each other in time for the two vessels.

In [None]:
# %load imports.py
# %load ../imports.py
%matplotlib inline
%load_ext autoreload
%autoreload 2

import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
width=20
height=3
plt.rcParams["figure.figsize"] = (width,height)
sns.set(rc={'figure.figsize':(width,height)})

#import seaborn as sns
import os
from collections import OrderedDict

from IPython.display import display

pd.options.display.max_rows = 999
pd.options.display.max_columns = 999
pd.set_option("display.max_columns", None)

import sys
import os

from sklearn.metrics import r2_score
import seaborn as sns

import statsmodels.api as sm

from d2e2f.visualization import visualize
import re
from myst_nb import glue

In [None]:
ship_names = ['tycho','aurora']

df = pd.DataFrame()
for ship_name in ship_names:
    df_ = catalog.load(f'{ship_name}.trip_statistics_joined_thrusters')
    df_['ship'] = ship_name
    df_['start_time'] = pd.to_datetime(df_['start_time'], utc=True)
    df_['end_time'] = pd.to_datetime(df_['end_time'], utc=True)
    
    df = pd.concat([df, df_], ignore_index=True)  # DataFrame.append was removed in pandas 2.0

df.sort_values(by='start_time', inplace=True)

df['E']/=3600  #--> [kWh]


In [None]:
start = df.iloc[0]['start_time'].date()
end = df.iloc[-1]['start_time'].date()
glue('start',str(start))
glue('end',str(end))

In [None]:
facegrid = sns.relplot(data=df, x='start_time',y='E', hue='ship', height=3, aspect=3);
facegrid.set_ylabels('E (Energy consumption per trip) [kWh]')

glue('fig:tycho-aurora', facegrid.fig, display=False)

{numref}`fig:tycho-aurora` shows the energy consumption for trips with the two ships between {glue:}`start` and {glue:}`end`. The trips are then filtered so that only trips where Aurora and Tycho Brahe both have a trip within a time window of 1 hour remain, as shown in {numref}`fig:tycho-aurora-cut`. A period at the beginning of May, when Tycho Brahe was taken out of service, has therefore been excluded.

```{glue:figure} fig:tycho-aurora
:figwidth: 800px
:name: "fig:tycho-aurora"

Energy consumption per trip for all trips and both ships
```

```{glue:figure} fig:tycho-aurora-cut
:figwidth: 800px
:name: "fig:tycho-aurora-cut"

Energy consumption per trip for both ships, restricted to trips within the 1-hour time window
```
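
The time-window filter described above can be illustrated on a handful of made-up start times. This is a minimal sketch, not the notebook's own cell: the timestamps and the helper `has_partner` are hypothetical, and the mask is applied to both ships here, which is an assumption about the intent of the filter.

```python
import pandas as pd

# Hypothetical start times, for illustration only (not the measured data)
tycho_starts = pd.Series(pd.to_datetime(
    ["2021-05-01 08:00", "2021-05-01 12:00"], utc=True))
aurora_starts = pd.Series(pd.to_datetime(
    ["2021-05-01 08:20", "2021-05-01 10:00", "2021-05-01 12:45"], utc=True))

window = pd.Timedelta(hours=1)


def has_partner(start, other_starts, window):
    """Return True if any trip in other_starts begins within `window` of `start`."""
    return bool(((other_starts - start).abs() < window).any())


# One boolean per trip: does the other ship have a trip close in time?
aurora_keep = aurora_starts.apply(has_partner, args=(tycho_starts, window))
tycho_keep = tycho_starts.apply(has_partner, args=(aurora_starts, window))

print(aurora_starts[aurora_keep])
print(tycho_starts[tycho_keep])
```

In this toy data the 10:00 Aurora trip has no Tycho Brahe trip within one hour and is dropped, while the other trips are kept.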


In [None]:
ships = df.groupby(by='ship')
df_tycho=ships.get_group('tycho')
df_aurora=ships.get_group('aurora')
mask = df_aurora['start_time'].apply(lambda x: ((x-df_tycho['start_time']).abs() < "0 days 01:00:00").any())
df_aurora = df_aurora.loc[mask].copy()
df_cut = pd.concat([df_tycho, df_aurora])

In [None]:
facegrid = sns.relplot(data=df_cut, x='start_time',y='E', hue='ship', height=3, aspect=3);
facegrid.set_ylabels("E (Energy consumption) [kWh]", clear_inner=False)
glue('fig:tycho-aurora-cut', facegrid.fig, display=False)

In [None]:
grid = sns.displot(df_cut, x='sog', hue='ship', kind="kde", bw_adjust=1, height=3, aspect=3)
ax = grid.fig.axes[0]
ax.set_xticks(np.arange(3,6,0.25))

ax.annotate('Tycho higher speeds', (4.45,0.8))
ax.annotate('Aurora higher speeds', (4.80,0.17))

glue('fig:sog_distribution',grid.fig, display=False)

In [None]:
grid = sns.displot(df_cut, x='E', hue='ship', kind="kde", bw_adjust=1, height=3, aspect=3)
glue('fig:energy_distribution',grid.fig, display=False)

In [None]:
df_mean = df_cut.groupby(by='ship')[['sog','E', 'distance']].mean()  # select numeric columns before averaging

df_mean.rename(
    columns={'sog':'sog [m/s]',
            'E':'E [kWh]',
            'distance':'Trip distance [m]',}, inplace=True)

formatter = {
'sog [m/s]' : '{:.2f}',
'E [kWh]' : '{:.0f}',
'Trip distance [m]' : '{:.0f}',
}
glue('tab:mean',df_mean.style.format(formatter))

In [None]:
P_pct_diff = int(np.round((df_tycho['E'].mean() - df_aurora['E'].mean()) / df_aurora['E'].mean()*100))
glue('E_pct_diff', P_pct_diff)

{numref}`fig:sog_distribution` shows the distribution of trip average speed for all the trips with the two ships. Tycho Brahe has a higher average trip speed more often than Aurora. In the higher speed range (above 4.75 m/s), however, Aurora has more trips. Aurora sometimes makes emergency trips, carrying ambulances, which probably explains this.

```{glue:figure} fig:sog_distribution
:name: "fig:sog_distribution"
Distribution of Speed over ground [m/s]
```

Similar trends can also be seen in the distribution of trip energy consumption in {numref}`fig:energy_distribution`.

```{glue:figure} fig:energy_distribution
:name: "fig:energy_distribution"
Distribution of Energy consumption [kWh]
```

Even though Aurora often has a lower average trip speed, the emergency trips balance this, so that the total mean speed is the same for both ships, as seen in {numref}`tab:mean`. This table also shows that the mean energy consumption differs by {glue:}`E_pct_diff`% in favor of Aurora. Aurora has a slightly lower energy consumption, but the difference is quite small.

```{glue:figure} tab:mean
:name: "tab:mean"
Trip mean values
```

The sister ships should therefore have similar possibilities to minimize the energy consumption. Two following trips, one with Aurora and one with Tycho Brahe, in the same direction and closest in time can be paired. These pairs can be compared, and since the time window is now very small (less than 1 hour), the difference in energy consumption should come from the operation of the ships. To further constrain the pairs, only pairs where the mean wind speed recorded onboard the ships differs by less than {glue:}`w_max_diff` m/s between the ships are included in the analysis.
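
The pairing idea can be sketched on toy data with `pandas.merge_asof`. Note that this swaps in a different technique from the one used in this analysis, which picks the nearest Tycho Brahe trip with `idxmin`; the trip values, the 1 m/s wind threshold written as a literal, and the reuse of the column names `E`, `w`, `start_time` and `trip_direction` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical trip statistics, for illustration only
aurora = pd.DataFrame({
    "start_time": pd.to_datetime(
        ["2021-05-01 08:20", "2021-05-01 09:00", "2021-05-01 12:45"], utc=True),
    "trip_direction": ["Helsingør-Helsingborg", "Helsingborg-Helsingør",
                       "Helsingør-Helsingborg"],
    "E": [300.0, 320.0, 280.0],   # energy per trip [kWh]
    "w": [5.0, 6.0, 4.0],         # mean wind speed [m/s]
})
tycho = pd.DataFrame({
    "start_time": pd.to_datetime(
        ["2021-05-01 08:00", "2021-05-01 08:40", "2021-05-01 15:00"], utc=True),
    "trip_direction": ["Helsingør-Helsingborg", "Helsingborg-Helsingør",
                       "Helsingør-Helsingborg"],
    "E": [330.0, 310.0, 290.0],
    "w": [5.5, 8.0, 4.2],
})

# For each Aurora trip, find the nearest Tycho trip in the same direction,
# but only if it started within one hour; otherwise the Tycho columns are NaN.
pairs = pd.merge_asof(
    aurora.sort_values("start_time"),
    tycho.sort_values("start_time"),
    on="start_time",
    by="trip_direction",
    direction="nearest",
    tolerance=pd.Timedelta(hours=1),
    suffixes=("_aurora", "_tycho"),
)
pairs = pairs.dropna(subset=["E_tycho"])

# Keep only pairs with similar recorded wind speed
w_max_diff = 1.0
pairs = pairs[(pairs["w_aurora"] - pairs["w_tycho"]).abs() < w_max_diff]
print(pairs[["trip_direction", "E_aurora", "E_tycho"]])
```

In this toy data only the first Aurora trip survives: the second has a wind difference of 2 m/s and the third has no Tycho Brahe trip in the same direction within one hour.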


In [None]:
df_following = pd.DataFrame()
for trip_direction, df_aurora_ in df_aurora.groupby(by='trip_direction'):
    
    df_tycho_ = df_tycho.groupby(by='trip_direction').get_group(trip_direction)
    
    close_indexes = df_aurora_['start_time'].apply(lambda x: ((x-df_tycho_['start_time']).abs()).idxmin())
    
    df_aurora_.reset_index(inplace=True)
    df_tycho_ = df_tycho_.loc[close_indexes].reset_index()
    
    df = pd.merge(df_aurora_, df_tycho_, how='inner', left_index=True, right_index=True, 
                  suffixes = ('_aurora', '_tycho'))
    df['trip_direction'] = trip_direction    
    df['E_min'] = df[['E_aurora','E_tycho']].min(axis=1)
    df['E_max'] = df[['E_aurora','E_tycho']].max(axis=1)
    df['E_tot'] = df[['E_aurora','E_tycho']].sum(axis=1)
    df['w_diff'] = (df['w_aurora'] - df['w_tycho']).abs()
    df['time_diff'] = (df['start_time_aurora'] - df['start_time_tycho']).abs()
    
    df['E_aurora - E_tycho'] = df['E_aurora'] - df['E_tycho']
    df['sog_aurora - sog_tycho'] = df['sog_aurora'] - df['sog_tycho']
    
    df['energy_saving'] = (df['E_max'] - df['E_min'])
    
    df_following = pd.concat([df_following, df])  # DataFrame.append was removed in pandas 2.0
    

In [None]:
w_max_diff = 1
glue('w_max_diff',w_max_diff)
mask = ((df_following['w_diff'] < w_max_diff) &
        (df_following['time_diff'] <= '0 days 01:00:00'))



df_following = df_following.loc[mask]
for trip_direction, df in df_following.groupby('trip_direction'):
    facegrid = sns.relplot(data=df, x='E_aurora',y='E_tycho', height=3, aspect=3);
    glue(f'fig:E_following_{trip_direction}', facegrid.fig, display=False)

    facegrid = sns.relplot(data=df, x='sog_aurora - sog_tycho',y='E_aurora - E_tycho', height=3, aspect=3);
    glue(f'fig:sog_following_{trip_direction}', facegrid.fig, display=False)  
    
    facegrid = sns.relplot(data=df, x='w_aurora',y='w_tycho', height=3, aspect=3);
    glue(f'fig:w_following_{trip_direction}', facegrid.fig, display=False)  

The energy consumption of the following trip pairs is shown in {numref}`fig:E_following_Helsingborg-Helsingør` and {numref}`fig:E_following_Helsingør-Helsingborg` for the two directions. The energy consumption of Aurora (x-axis) is plotted against that of Tycho Brahe (y-axis). It is very clear that the energy consumption differs a lot between the ships for trips that are very close in time, where the environment etc. should be very similar.

In [None]:
df_energy_saving = df_following.groupby(by='trip_direction')[['energy_saving','E_tot']].sum()
df_energy_saving['energy_saving_pct'] = df_energy_saving['energy_saving']/df_energy_saving['E_tot']*100

df_table = df_energy_saving[['energy_saving_pct']].copy()

df_table.rename(
    columns={'trip_direction':'Direction',
            'energy_saving_pct':'Energy saving [%]',}, inplace=True)

formatter = {
'Energy saving [%]' : '{:.0f}',
}
glue('tab:energy_saving_following',df_table.style.format(formatter))

```{glue:figure} fig:E_following_Helsingborg-Helsingør
:figwidth: 800px
:name: "fig:E_following_Helsingborg-Helsingør"

Comparison of energy consumption between the two ships for trips (Helsingborg-Helsingør) that are closest in time (1 hour maximum).

```

```{glue:figure} fig:E_following_Helsingør-Helsingborg
:figwidth: 800px
:name: "fig:E_following_Helsingør-Helsingborg"

Comparison of energy consumption between the two ships for trips (Helsingør-Helsingborg) that are closest in time (1 hour maximum).

```

The energy saving potential is estimated by calculating the amount of energy that could be saved if, in every pair, both vessels had been operated like the better of the two. The result of this calculation is shown in {numref}`tab:energy_saving_following`.

```{glue:figure} tab:energy_saving_following
:name: "tab:energy_saving_following"
Energy saving potential
```
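
As a worked example of the saving calculation: for each pair the saving is the larger energy minus the smaller one, and the percentage is the summed saving divided by the summed energy of both ships. The three pairs below are made-up numbers, used only to show the arithmetic.

```python
import pandas as pd

# Hypothetical energy consumption per paired trip [kWh]
pairs = pd.DataFrame({
    "E_aurora": [300.0, 320.0, 280.0],
    "E_tycho": [330.0, 310.0, 280.0],
})

E_min = pairs[["E_aurora", "E_tycho"]].min(axis=1)   # the better ship in each pair
E_max = pairs[["E_aurora", "E_tycho"]].max(axis=1)   # the worse ship in each pair
E_tot = pairs[["E_aurora", "E_tycho"]].sum(axis=1)   # energy used by both ships

# Energy saved if the worse ship had matched the better one, relative to total use
energy_saving = (E_max - E_min).sum()
energy_saving_pct = energy_saving / E_tot.sum() * 100
print(f"{energy_saving:.0f} kWh saved, {energy_saving_pct:.1f}% of total")
```

Here the savings are 30, 10 and 0 kWh out of a total of 1820 kWh, which gives about 2.2%. The measure is symmetric: it does not matter which ship is the better one in a given pair.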

The difference in speed (x-axis) is plotted against the difference in energy consumption (y-axis) for the trip pairs in {numref}`fig:sog_following_Helsingborg-Helsingør` and {numref}`fig:sog_following_Helsingør-Helsingborg`. These figures show that the difference in energy consumption is highly correlated with the speed difference, which therefore seems to be the main explanation for why the energy consumption differs. It is, however, not the only explanation, which is investigated further in the Outliers section below.

```{glue:figure} fig:sog_following_Helsingborg-Helsingør
:figwidth: 800px
:name: "fig:sog_following_Helsingborg-Helsingør"

Comparison of energy consumption difference and mean speed difference between the two ships for trips (Helsingborg-Helsingør) that are closest in time (1 hour maximum).

```

```{glue:figure} fig:sog_following_Helsingør-Helsingborg
:figwidth: 800px
:name: "fig:sog_following_Helsingør-Helsingborg"

Comparison of energy consumption difference and mean speed difference between the two ships for trips (Helsingør-Helsingborg) that are closest in time (1 hour maximum).

```
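
One way to put a number on the relation between speed difference and energy difference is a Pearson correlation coefficient together with a straight-line fit. The sketch below uses made-up differences for five pairs; it is not a result from the measured data, and a linear fit is an assumption that is only reasonable for small speed differences.

```python
import numpy as np

# Hypothetical differences (Aurora minus Tycho Brahe) for five trip pairs
d_sog = np.array([-0.4, -0.2, 0.0, 0.1, 0.3])      # speed difference [m/s]
d_E = np.array([-60.0, -25.0, 5.0, 10.0, 50.0])    # energy difference [kWh]

# Pearson correlation between the two differences
r = np.corrcoef(d_sog, d_E)[0, 1]

# Least-squares line d_E = slope * d_sog + intercept
slope, intercept = np.polyfit(d_sog, d_E, deg=1)

# What the speed difference does not explain
residuals = d_E - (slope * d_sog + intercept)

print(f"r = {r:.3f}, slope = {slope:.0f} kWh per m/s")
print("residuals [kWh]:", np.round(residuals, 1))
```

Pairs with large residuals are the ones where something other than the speed difference drives the difference in energy consumption.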


The analysis of paired trips with Aurora and Tycho Brahe indicates an energy saving potential of at least 7%.

## Outliers

In [None]:
group_directions = df_following.groupby('trip_direction')
df_ = group_directions.get_group('Helsingør-Helsingborg').copy()
mask = df_['sog_aurora - sog_tycho'].abs() < 0.5
df_ = df_.loc[mask].copy()

df_['outlier_factor'] = (df_['E_aurora - E_tycho']/df_['sog_aurora - sog_tycho']**(1)).abs()
df_ = df_.sort_values(by='outlier_factor', ascending=False)

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
df_.plot(x='sog_aurora - sog_tycho', y='E_aurora - E_tycho', style='b.', label='pairs', ax=ax)

outlier_1 = df_.iloc[0]
outlier_2 = df_.iloc[7]

df_.loc[[outlier_1.name]].plot(x='sog_aurora - sog_tycho', y='E_aurora - E_tycho', style='ro', label='outlier 1', ax=ax)
df_.loc[[outlier_2.name]].plot(x='sog_aurora - sog_tycho', y='E_aurora - E_tycho', style='go', label='outlier 2', ax=ax)

ax.set_xlabel('sog_aurora - sog_tycho')
ax.set_ylabel('E_aurora - E_tycho')
ax.legend();
glue("fig:outliers", fig, display=False)

Some of the pairs have similar speeds but still differ in energy consumption. Two of these "outliers" have been selected according to {numref}`fig:outliers`.

```{glue:figure} fig:outliers
:name: "fig:outliers"
:figwidth: 800px
Two outliers where Energy consumption differs a lot but speed does not differ.
```
A closer look at the time series of these pairs will be conducted below.
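
The selection can be illustrated with a ranking on the ratio between energy difference and speed difference, restricted to pairs with a small speed difference. The numbers below are hypothetical, and the column names `d_sog` and `d_E` are shorthand introduced only for this sketch.

```python
import pandas as pd

# Hypothetical differences (Aurora minus Tycho Brahe) for four trip pairs
pairs = pd.DataFrame({
    "d_sog": [0.02, -0.30, 0.10, 0.80],    # speed difference [m/s]
    "d_E": [40.0, -45.0, 12.0, 150.0],     # energy difference [kWh]
})

# Only consider pairs where the two ships had similar average speed
similar = pairs.loc[pairs["d_sog"].abs() < 0.5].copy()

# Large energy difference per unit of speed difference marks a candidate outlier
similar["outlier_factor"] = (similar["d_E"] / similar["d_sog"]).abs()
ranked = similar.sort_values("outlier_factor", ascending=False)
print(ranked)
```

The ratio grows without bound as the speed difference approaches zero, so it favours pairs with nearly identical speed; a pair with exactly zero speed difference would give an infinite factor and needs separate handling.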


### Outlier 1
Outlier 1 is the red dot in {numref}`fig:outliers`, where the energy consumption is lower for Aurora even though the average speed is the same for both ships (sog_aurora - sog_tycho = 0). The track plot is shown below together with time series of power, cumulative energy consumption (energy consumed since the start of the trip) and speed over ground (sog).
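
The cumulative energy consumption is the time integral of the power. A minimal sketch of that integration with the trapezoidal rule, using three made-up samples and assuming power in kW and time in seconds:

```python
import numpy as np

# Hypothetical samples, for illustration only
trip_time = np.array([0.0, 600.0, 1200.0])   # time since start of trip [s]
P = np.array([2000.0, 3000.0, 1000.0])       # power [kW]

# Trapezoid area of each interval [kW*s]
dE = 0.5 * (P[1:] + P[:-1]) * np.diff(trip_time)

# Running sum, starting at zero, converted from kW*s to kWh
E = np.concatenate([[0.0], np.cumsum(dE)]) / 3600.0
print(E)
```

The last element is the total energy for the trip, here 750 kWh, and the division by 3600 is the same seconds-to-hours conversion used for the trip statistics.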

In [None]:
df_tycho_time = catalog.load("tycho.data_with_trip_columns")
df_aurora_time = catalog.load("aurora.data_with_trip_columns")

In [None]:
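# Imported here because it is needed before the Outlier 2 cell;
# newer SciPy releases name this function cumulative_trapezoid
from scipy.integrate import cumulative_trapezoid as cumtrapz

trip_tycho = df_tycho_time.groupby(by='trip_no').get_group(outlier_1.trip_no_tycho)
trip_aurora = df_aurora_time.groupby(by='trip_no').get_group(outlier_1.trip_no_aurora)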

E = cumtrapz(y=trip_tycho['P'], x=trip_tycho['trip_time'])/3600
trip_tycho['E'] = np.concatenate([[0],E])

E = cumtrapz(y=trip_aurora['P'], x=trip_aurora['trip_time'])/3600
trip_aurora['E'] = np.concatenate([[0],E])

display(visualize.plot_map(trip_tycho))
display(visualize.plot_map(trip_aurora))


fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='P', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='P', ax=ax, label='aurora');
ax.set_ylabel('P [kW]')

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='E', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='E', ax=ax, label='aurora');
ax.set_ylabel('E [kWh]');

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='sog', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='sog', ax=ax, label='aurora');
ax.set_ylabel('sog [m/s]');


It can be seen that Aurora has a lower and more stable speed that is maintained for a longer part of the trip, giving a lower energy consumption.

### Outlier 2
Outlier 2 is the green dot in {numref}`fig:outliers`, where the energy consumption is higher for Aurora even though the average speed is the same for both ships (sog_aurora - sog_tycho = 0). The track plot is shown below together with time series of power, cumulative energy consumption (energy consumed since the start of the trip) and speed over ground (sog).

In [None]:
from scipy.integrate import cumulative_trapezoid as cumtrapz  # current SciPy name for cumtrapz

In [None]:
trip_tycho = df_tycho_time.groupby(by='trip_no').get_group(outlier_2.trip_no_tycho)
trip_aurora = df_aurora_time.groupby(by='trip_no').get_group(outlier_2.trip_no_aurora)

E = cumtrapz(y=trip_tycho['P'], x=trip_tycho['trip_time'])/3600
trip_tycho['E'] = np.concatenate([[0],E])

E = cumtrapz(y=trip_aurora['P'], x=trip_aurora['trip_time'])/3600
trip_aurora['E'] = np.concatenate([[0],E])


display(visualize.plot_map(trip_tycho))
display(visualize.plot_map(trip_aurora))

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='P', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='P', ax=ax, label='aurora');
ax.set_ylabel('P [kW]')

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='E', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='E', ax=ax, label='aurora');
ax.set_ylabel('E [kWh]');

fig,ax=plt.subplots()
fig.set_size_inches(15,6)
trip_tycho.plot(x='trip_time', y='sog', ax=ax, label='tycho');
trip_aurora.plot(x='trip_time', y='sog', ax=ax, label='aurora');
ax.set_ylabel('sog [m/s]');

In this case Tycho Brahe has a more stable speed, while Aurora makes a large detour, which gives a higher energy consumption.