# Model Training Notebook

## Introduction
In this notebook, we train a linear regression model to predict `target` from `feature1` and `feature2` in the processed dataset. We then evaluate the model on a held-out test set and save the fitted model to disk.

## Import Libraries

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import joblib

## Load Processed Data

In [2]:
# Load the processed data
data = pd.read_csv('../data/processed_data.csv')

# Display the first few rows of the dataset
data.head()

   feature1  feature2  target
0      1.0      2.0     10
1      1.5      2.5     15
2      2.0      3.0     20
3      2.5      3.5     25
4      3.0      4.0     30
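Before training, it can help to confirm that the loaded table has the columns and types the later cells expect. The sketch below is only an illustration: `csv_text` is a hypothetical in-memory copy of the five rows shown above, used so the check runs without the real file, and the names `expected_columns`, `missing_columns`, `null_counts`, and `all_numeric` are our own.

```python
import io
import pandas as pd

# Hypothetical CSV text mirroring the five rows displayed above;
# in the notebook this would be the DataFrame read from processed_data.csv.
csv_text = """feature1,feature2,target
1.0,2.0,10
1.5,2.5,15
2.0,3.0,20
2.5,3.5,25
3.0,4.0,30
"""
demo = pd.read_csv(io.StringIO(csv_text))

expected_columns = ['feature1', 'feature2', 'target']

# Columns the training cells rely on but that are absent from the file
missing_columns = [c for c in expected_columns if c not in demo.columns]

# Number of missing values per column
null_counts = demo[expected_columns].isna().sum()

# LinearRegression needs numeric inputs
all_numeric = all(pd.api.types.is_numeric_dtype(demo[c]) for c in expected_columns)

print('Missing columns:', missing_columns)
print('Null counts:', null_counts.to_dict())
print('All numeric:', all_numeric)
```

The same three checks can be run on `data` directly by replacing `demo` with `data`.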

## Prepare Data for Training

In [3]:
# Define features and target variable
X = data[['feature1', 'feature2']]  # Replace with actual feature names
y = data['target']  # Replace with the actual target variable name

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
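To see what `test_size=0.2` and `random_state=42` do, here is a minimal sketch on a hypothetical 10-row stand-in for the dataset (the `demo` frame and its values are made up for illustration). It shows that 20% of the rows go to the test set, that the two sets do not overlap, and that repeating the call with the same seed gives the same split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the processed data: 10 rows, same column names.
demo = pd.DataFrame({
    'feature1': [1.0 + 0.5 * i for i in range(10)],
    'feature2': [2.0 + 0.5 * i for i in range(10)],
    'target': [10 + 5 * i for i in range(10)],
})
X_demo = demo[['feature1', 'feature2']]
y_demo = demo['target']

# Same arguments as the cell above
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Repeat the split with the same seed to check reproducibility
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print('Train shape:', X_tr.shape)
print('Test shape:', X_te.shape)
print('Disjoint:', set(X_tr.index).isdisjoint(X_te.index))
print('Reproducible:', X_te.index.equals(X_te2.index))
```

Fixing `random_state` keeps the reported metrics comparable between runs; changing it gives a different split and, on small datasets, can noticeably change the scores.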

## Train the Model

In [4]:
# Initialize the model
model = LinearRegression()

# Fit the model to the training data
model.fit(X_train, y_train)
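After fitting, the learned weights are available as `coef_` and `intercept_`. The sketch below fits a separate `demo_model` on hypothetical noise-free data generated from `target = 3 * feature1 + 2 * feature2 + 1`, so we can see that the fit recovers those numbers; the data, the seed, and the `demo_` names are assumptions made for illustration, not values from our dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data: two independent features and an exactly linear target.
rng = np.random.default_rng(0)
demo_X = pd.DataFrame({
    'feature1': rng.uniform(0, 5, 50),
    'feature2': rng.uniform(0, 5, 50),
})
demo_y = 3.0 * demo_X['feature1'] + 2.0 * demo_X['feature2'] + 1.0

demo_model = LinearRegression().fit(demo_X, demo_y)

# One coefficient per feature, in column order, plus the intercept
for name, coef in zip(demo_X.columns, demo_model.coef_):
    print(f'{name}: {coef:.3f}')
print(f'intercept: {demo_model.intercept_:.3f}')
```

The same loop works on `model` and `X_train`. One caveat: in the five rows displayed earlier, `feature2` is always `feature1 + 1`. If that relationship held across the whole dataset, the two features would be perfectly collinear and the individual coefficients would not be uniquely determined, even though the predictions would be unaffected.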

## Evaluate the Model

In [5]:
# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate performance metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Round to two decimals so the printed metrics are easy to read
print(f'Mean Squared Error: {mse:.2f}')
print(f'R^2 Score: {r2:.2f}')

Mean Squared Error: 0.25
R^2 Score: 0.99
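To make the two metrics concrete: MSE is the mean of the squared residuals, and R² is `1 - SS_res / SS_tot`, where `SS_res` is the sum of squared residuals and `SS_tot` is the sum of squared deviations of the true values from their mean. The sketch below computes both by hand and compares them with scikit-learn. The arrays `y_true_demo` and `y_pred_demo` are hypothetical values chosen for illustration, not our test set.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true values and predictions, for illustration only.
y_true_demo = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
y_pred_demo = np.array([11.0, 14.0, 20.0, 26.0, 29.0])

residuals = y_true_demo - y_pred_demo

# MSE: mean of squared residuals; RMSE puts it back in the target's units
mse_manual = np.mean(residuals ** 2)
rmse_manual = np.sqrt(mse_manual)

# R^2: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

mse_sklearn = mean_squared_error(y_true_demo, y_pred_demo)
r2_sklearn = r2_score(y_true_demo, y_pred_demo)

print(f'MSE  manual: {mse_manual:.3f}  sklearn: {mse_sklearn:.3f}')
print(f'RMSE manual: {rmse_manual:.3f}')
print(f'R^2  manual: {r2_manual:.3f}  sklearn: {r2_sklearn:.3f}')
```

Because MSE is in squared target units, taking its square root (RMSE) often gives a more interpretable error size.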

## Save the Model

In [6]:
# Save the trained model to a file (the ../models directory must already exist)
model_path = '../models/price_prediction_model.pkl'
joblib.dump(model, model_path)
print(f"Model saved to {model_path}")

Model saved to ../models/price_prediction_model.pkl
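A saved model can be restored with `joblib.load` and used for prediction right away. The sketch below demonstrates the round trip on a hypothetical `toy_model` written to a temporary directory, so it runs without touching the real model file; the toy data and the `toy_` names are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy training data with the same column names as the notebook.
toy_X = pd.DataFrame({
    'feature1': [1.0, 2.0, 3.0, 4.0],
    'feature2': [2.0, 1.0, 4.0, 3.0],
})
toy_y = pd.Series([10.0, 12.0, 20.0, 22.0])
toy_model = LinearRegression().fit(toy_X, toy_y)

# Save to a temporary directory, then load the model back
with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, 'toy_model.pkl')
    joblib.dump(toy_model, toy_path)
    reloaded = joblib.load(toy_path)

# New rows must use the same column names and order as the training data
new_rows = pd.DataFrame({'feature1': [2.5], 'feature2': [3.5]})
same = np.allclose(toy_model.predict(new_rows), reloaded.predict(new_rows))
print('Predictions match after reload:', same)
```

Files written by joblib use pickle under the hood, so they should only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.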

## Conclusion
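In this notebook, we trained a linear regression model to predict the target variable from `feature1` and `feature2`, holding out 20% of the rows for testing. On that test set, the model reached a Mean Squared Error of 0.25 and an R² Score of 0.99. The trained model was saved with joblib to `../models/price_prediction_model.pkl` for future use.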