# 02_1 Engine Model
Due to a non-disclosure agreement (NDA), no data can be displayed.

This notebook presents a power prediction model. It is based on CFD calculations for a specific hull shape with varying draft, trim and speed.  
The data is provided as a prepared .csv file.  

Although the model actually predicts the theoretical power demand of the vessel, it is called the "Engine Model".

### Imports

In [None]:
# Package and library import
import pandas as pd
import numpy as np
import math

import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px

from sklearn.model_selection import train_test_split

from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

from sklearn.metrics import mean_squared_error

### Read data from "Engine_Model_Data"

In [None]:
# Read data
df = pd.read_csv('../data/Engine_Model_Data.csv')

# Replace whitespace in the column names by '_'
df.columns = df.columns.str.replace(' ', '_', regex=False)

In [None]:
# Correct the misplaced decimal point in Power [kW] (scale by 1000)
df['Power_(PD)_[kW]'] *= 1000

### Visualize the data

In [None]:
px.scatter(df, x='Power_(PD)_[kW]', y='Speed_[kn]', color = 'Power_(PD)_[kW]', color_continuous_scale=['#ff6600','#ff6600'])

In [None]:
px.scatter(df, x='Trim_[m]', y='Mean_Draft_[m]')

### Drop features

In [None]:
lst_drop = ['Dynamic_draft_AP_[m]', 'Dynamic_mean_draft_[m]', 'Dynamic_draft_FP_[m]', 'Dynamic_trim_[m]', 'Volume_[m^3]', 'Draft_AP_[m]', 'Draft_FP_[m]']
df = df.drop(lst_drop, axis = 1)

In [None]:
df.info(verbose=True)

### Correlation matrix

In [None]:
plt.figure(figsize = (10,10))
sns.heatmap(df.corr(), annot = True, cmap = 'RdYlGn')

### Model Details

In [None]:
# Define features (X) and target (y)
X = df.drop(['Power_(PD)_[kW]'], axis = 1)
y = df['Power_(PD)_[kW]']

In [None]:
# Train-Test-Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 42)

The train-test split holds out 25% of the data for testing, which is a common choice.
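
As a minimal sketch of what the split produces, the following cell uses a small synthetic table. The values and the 8-row size are invented for illustration; only the column names are taken from this notebook. With `test_size = 0.25`, a quarter of the rows are held out for testing.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the confidential data (all values are invented)
demo = pd.DataFrame({
    'Mean_Draft_[m]': [6.0, 6.0, 6.5, 6.5, 7.0, 7.0, 7.5, 7.5],
    'Trim_[m]': [0.0, 0.5, 0.0, 0.5, 0.0, 0.5, 0.0, 0.5],
    'Speed_[kn]': [8.0, 10.0, 12.0, 14.0, 8.0, 10.0, 12.0, 14.0],
    'Power_(PD)_[kW]': [900.0, 1700.0, 3000.0, 4800.0, 1000.0, 1900.0, 3300.0, 5200.0],
})

X_demo = demo.drop(['Power_(PD)_[kW]'], axis=1)
y_demo = demo['Power_(PD)_[kW]']

# 25% of the 8 rows (2 rows) go to the test set, the remaining 6 to training
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=42)

print(X_tr.shape, X_te.shape)
```

Fixing `random_state` makes the split reproducible, so the same rows end up in the test set on every run.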

In [None]:
# Fit RandomForest Regression model
model = RandomForestRegressor()
model.fit(X_train, y_train)

In [None]:
# Predict
y_predict = model.predict(X_test)

### MSE

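The mean squared error (MSE) is used as the model metric. The root mean squared error (RMSE) is reported as well, since it is expressed in the same unit as the target (kW).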
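
For reference, the MSE and its square root, the RMSE, can be computed by hand. The sketch below uses made-up numbers (not project data) to show that the MSE is the mean of the squared residuals.

```python
import math

# Invented example values in kW (not project data)
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3200.0, 4000.0]

# MSE: mean of the squared differences between true and predicted values
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse_demo = sum(squared_errors) / len(squared_errors)

# RMSE: square root of the MSE, expressed in the unit of the target (kW)
rmse_demo = math.sqrt(mse_demo)

print("MSE: %.2f" % mse_demo)
print("RMSE: %.2f" % rmse_demo)
```

Because the errors are squared, a single large deviation weighs more heavily than several small ones.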

In [None]:
mse = mean_squared_error(y_test, y_predict)
rmse = math.sqrt(mse)
# Print the error metrics
print("Mean Squared Error: %.2f" % mse)
print("Root Mean Squared Error: %.2f" % rmse)

In [None]:
# Plot predicted against actual power
plt.scatter(y_test, y_predict, color = 'blue')

# Specify title and labels
plt.title('Power Prediction: Theoretical Ship Engine Model (RandomForestRegressor)')
plt.xlabel('Power [kW]')
plt.ylabel('Predicted power [kW]')
plt.show()

## Model Evaluation
The model is evaluated with a hand-picked "test" input: a set of values is passed to the model and the predicted outcome is checked.
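
A hand-picked input can also be passed as a DataFrame that reuses the training column names. This keeps the feature order explicit and, in recent scikit-learn versions, avoids the warning about missing feature names. The sketch below is an illustration only: it trains on invented data and assumes that the remaining feature columns are `Mean_Draft_[m]`, `Trim_[m]` and `Speed_[kn]`, in that order.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Invented training data; the feature set and its order are assumptions
X_demo = pd.DataFrame({
    'Mean_Draft_[m]': [6.0, 6.0, 7.0, 7.0, 8.0, 8.0],
    'Trim_[m]': [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    'Speed_[kn]': [8.0, 12.0, 8.0, 12.0, 8.0, 12.0],
})
y_demo = pd.Series([900.0, 3000.0, 1000.0, 3300.0, 1100.0, 3600.0])

demo_model = RandomForestRegressor(n_estimators=50, random_state=42)
demo_model.fit(X_demo, y_demo)

# Build the query with the same columns the model was fitted on
query = pd.DataFrame([[6.8, 0.5, 10.0]], columns=X_demo.columns)
prediction = demo_model.predict(query)

# predict returns an array with one value per input row
print('Predicted power: %.1f kW' % prediction[0])
```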


In [None]:
# Input order: Draft [m], Trim [m], Speed [kn] 
Value_set = [[6.8, 5.8, 8.0]]

In [None]:
Value_predict = model.predict(Value_set)

# predict returns an array; print its single element
print('Test value output: %f' % Value_predict[0])

The validation of the model shows the following:

DecisionTreeRegressor: The values are predicted precisely when a data point from the data set is given as input. For inputs between the given model points, the result is rounded either up or down to a neighboring data point, with a threshold of 0.5.

RandomForestRegressor: The predicted values are close to the given data points in the target. The differences are small, hence this model is selected for further use to predict the theoretical power required from the engine to move the vessel.
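
The difference between the two regressors can be reproduced on a toy problem. The sketch below uses an invented one-dimensional speed-power curve (not project data). A fully grown decision tree reproduces the training points exactly and returns the target of a neighboring training point in between, because its predictions are piecewise constant.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Invented speed-power curve (cubic in speed), not project data
speed = np.array([[8.0], [10.0], [12.0], [14.0], [16.0]])
power = 2.0 * speed.ravel() ** 3

tree = DecisionTreeRegressor(random_state=42).fit(speed, power)
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(speed, power)

# A training point and a point between two training points
queries = np.array([[12.0], [12.8]])

tree_pred = tree.predict(queries)
forest_pred = forest.predict(queries)

# The tree returns training targets only (piecewise-constant prediction);
# the forest averages many trees, so its output can lie between them
print('Tree  :', tree_pred)
print('Forest:', forest_pred)
```

Both models stay within the range of the training targets, so neither extrapolates beyond the speeds, drafts and trims covered by the data.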

### Use pickle to save model 
The model is saved with the help of pickle and made available in the "Featureengineering" notebook, where it predicts the power from the draft, trim and speed of the ship.

In [None]:
#import pickle

# save model
#RandForestReg_EngineModel = '../models/RFReg_Engine_Model.sav'
#pickle.dump(model, open(RandForestReg_EngineModel, 'wb'))

Saving the model is disabled (commented out) so that no unwanted models and data are generated.
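
When saving is switched on, a round trip could look like the sketch below. It writes to a temporary directory instead of `../models/`, and the tiny model and its data are placeholders, so nothing in the project folder is touched.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestRegressor

# Placeholder model trained on invented data
X_demo = [[6.0, 0.0, 8.0], [7.0, 0.5, 10.0], [8.0, 1.0, 12.0]]
y_demo = [900.0, 1900.0, 3600.0]
demo_model = RandomForestRegressor(n_estimators=10, random_state=42).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file location inside a temporary directory
    model_path = os.path.join(tmp_dir, 'RFReg_Engine_Model.sav')

    # Save the fitted model; the context manager closes the file
    with open(model_path, 'wb') as f:
        pickle.dump(demo_model, f)

    # Load it again, as a downstream notebook would
    with open(model_path, 'rb') as f:
        loaded_model = pickle.load(f)

# The restored model gives the same predictions as the original
print(loaded_model.predict(X_demo) == demo_model.predict(X_demo))
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that was used to save them.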

## Summary
The idea is to predict the power with a model that was trained on different, external data. This might improve or simplify the model, but this has to be tested and proven in the modeling part.