# Model Prototyping Notebook

This notebook prototypes a baseline machine learning model for crew fatigue prediction. It covers loading the processed data, training the model, evaluating it, and saving it to disk.

## Steps:
- Load processed crew data.
- Split data into training and testing sets.
- Train a simple linear regression model.
- Evaluate the model using Mean Squared Error (MSE).
- Save the trained model for future inference.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import joblib

# Load processed data
data = pd.read_csv('../datasets/processed/crew_data_processed.csv')
print('Data loaded:', data.shape)

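# For demonstration only: fatigue_score is used as both feature and target.
# The model therefore sees the answer directly, so the resulting MSE will be
# close to zero and says nothing about real predictive quality.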
X = data[['fatigue_score']]
y = data['fatigue_score']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print('Training set:', X_train.shape, '| Test set:', X_test.shape)
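
Because the cell above uses `fatigue_score` as both feature and target, the split and fit only exercise the pipeline. The sketch below shows the effect on synthetic data: the same column as feature and target gives an error of essentially zero, while separate feature columns give an error that reflects the noise in the data. The columns `hours_on_duty` and `rest_hours`, the helper `fit_and_score`, and the generated values are all hypothetical and are not taken from `crew_data_processed.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the processed crew data. The column names
# hours_on_duty and rest_hours are hypothetical, not from the real file.
rng = np.random.default_rng(42)
n = 200
demo = pd.DataFrame({
    'hours_on_duty': rng.uniform(4, 14, n),
    'rest_hours': rng.uniform(4, 12, n),
})
# Made-up relationship plus noise, purely for illustration
demo['fatigue_score'] = (
    0.5 * demo['hours_on_duty']
    - 0.3 * demo['rest_hours']
    + rng.normal(0, 0.5, n)
)


def fit_and_score(feature_cols):
    """Fit a linear regression on the given columns and return the test MSE."""
    X = demo[feature_cols]
    y = demo['fatigue_score']
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    reg = LinearRegression().fit(X_tr, y_tr)
    return mean_squared_error(y_te, reg.predict(X_te))


# Target used as its own feature: the fit is trivially perfect
leaky_mse = fit_and_score(['fatigue_score'])
# Independent features: the error reflects the noise in the data
honest_mse = fit_and_score(['hours_on_duty', 'rest_hours'])

print('MSE with target as feature:', leaky_mse)
print('MSE with separate features:', honest_mse)
```

Swapping in real feature columns from the processed dataset, once they are known, would follow the same pattern.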

In [2]:
# Train linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate the model
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)
print('Mean Squared Error:', mse)
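
MSE is the mean of the squared differences between true and predicted values. As a small illustration, the sketch below computes it by hand on toy numbers, checks the result against `mean_squared_error`, takes the square root (RMSE) to express the error in the units of the target, and compares against a baseline that always predicts the mean. The arrays are made up for illustration and are not results from the crew data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Toy values for illustration only; not results from the crew data
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 6.0])

# MSE by definition: mean of squared residuals
mse_manual = np.mean((y_true - y_pred) ** 2)
mse_sklearn = mean_squared_error(y_true, y_pred)

# RMSE is in the same units as the target
rmse = np.sqrt(mse_sklearn)

# Baseline that always predicts the mean. Here the mean of y_true is used
# for brevity; in practice it would come from the training targets.
baseline_pred = np.full_like(y_true, y_true.mean())
baseline_mse = mean_squared_error(y_true, baseline_pred)

print('Manual MSE:', mse_manual)
print('sklearn MSE:', mse_sklearn)
print('RMSE:', rmse)
print('Mean-baseline MSE:', baseline_mse)
```

An MSE is easier to interpret next to a baseline like this: a model is only useful if its error is clearly lower than that of always predicting the mean.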

In [3]:
# Save the trained model (assumes the ../models directory already exists)
joblib.dump(model, '../models/crew_fatigue_model.pkl')
print('Model saved to ../models/crew_fatigue_model.pkl')
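
For future inference, a saved model can be read back with `joblib.load` and used like the original object. The following is a minimal round-trip sketch, assuming a toy model and a temporary directory rather than the notebook's `../models` path; the file name `demo_model.pkl` and the training numbers are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model trained on made-up numbers, standing in for the notebook's model
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([2.0, 4.0, 6.0, 8.0])
demo_model = LinearRegression().fit(X_demo, y_demo)

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / 'demo_model.pkl'  # hypothetical file name
    joblib.dump(demo_model, str(model_path))
    reloaded = joblib.load(str(model_path))

# The reloaded model should reproduce the original predictions
new_samples = np.array([[5.0], [6.0]])
original_pred = demo_model.predict(new_samples)
reloaded_pred = reloaded.predict(new_samples)
print('Predictions match:', np.allclose(original_pred, reloaded_pred))
```

Pickle-based files such as this one should only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.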