# 🚀 Assignment 1: Predicting Vehicle Fuel Efficiency

**Objective**: Use your new Neural Network skills to predict a car's fuel efficiency (MPG) based on its characteristics (Horsepower, Weight, etc.).

## 🛠️ Step 0: Setup & Data Loading

We will use the famous "Auto MPG" dataset, loaded through Seaborn's `load_dataset`.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score, mean_absolute_error
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout  # Dropout is used in the last experiment

# Load dataset from Seaborn
df = sns.load_dataset('mpg')

# Quick look at the data
print(df.head())

# Cleaning: Remove rows with missing values
df = df.dropna()

# We want to predict 'mpg' using these features:
features = ['cylinders', 'displacement', 'horsepower', 'weight', 'acceleration']
X = df[features]
y = df['mpg']

print("\n✅ Data Loaded and Cleaned!")

## 🧹 Step 1: Pre-processing (The "Cleaning" Phase)

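Neural Networks are very sensitive to the scale of their inputs. If one column has values around 2000 (Weight) and another around 15 (Acceleration), the large-valued column dominates the weight updates and training becomes slow or unstable. We need to **Normalize** the data; here we standardize each feature to zero mean and unit variance with `StandardScaler`.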
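To make the scaling step concrete, here is a minimal pure-Python sketch of standardization. The helper `standardize` and the sample weights are made up for illustration; it assumes the population standard deviation (the convention `StandardScaler` uses) and takes the statistics from the training column only, so nothing about the test set leaks into the scaling.

```python
from statistics import fmean, pstdev


def standardize(train_col, test_col):
    """Scale both columns with the mean and std of the training column only."""
    mean = fmean(train_col)
    std = pstdev(train_col)  # population std (ddof=0)
    train_scaled = [(v - mean) / std for v in train_col]
    test_scaled = [(v - mean) / std for v in test_col]
    return train_scaled, test_scaled


# Made-up values for illustration, not rows from the dataset
weight_train = [2000.0, 3000.0, 4000.0]
weight_test = [3500.0]

train_scaled, test_scaled = standardize(weight_train, weight_test)
print(train_scaled)  # centered on 0 with unit spread
print(test_scaled)   # expressed in training-set standard deviations
```

After scaling, a weight of 3500 and an acceleration of 15 would both be small numbers near zero, so neither feature dominates just because of its units.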

### 📝 Your Task:

Split the data into Training and Testing sets, then scale them.

In [None]:
# 1. Split the data (80% Train, 20% Test)
# train_test_split was imported in Step 0
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. Scaling (Normalization)
scaler = StandardScaler()

# TODO: Fit the scaler on training data and transform both sets
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("✅ Data Pre-processed!")

## 🧠 Step 2: Build Your Brain (Neural Network)

Now, design a Neural Network. Since this is a **Regression** problem (predicting a continuous number), your last layer should have 1 neuron with no activation, so the output is not squashed into a fixed range.
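As a rough illustration of why the output layer has no activation, the sketch below runs a forward pass through a tiny network by hand. The helpers `relu`, `dense`, and `predict` and all weights are hypothetical and hand-picked; this is not how Keras stores or computes its layers, only the same arithmetic in miniature.

```python
def relu(values):
    """ReLU activation: negative values become 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the input weights of neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output
hidden_w, hidden_b = [[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]
out_w, out_b = [[2.0, -1.0]], [0.5]


def predict(x):
    hidden = relu(dense(x, hidden_w, hidden_b))
    return dense(hidden, out_w, out_b)[0]  # linear output: no activation applied


print(predict([1.0, 0.0]))
print(predict([0.0, 2.0]))
```

Because the last layer is linear, the prediction can be any real number. A bounded activation such as a sigmoid would cap every prediction between 0 and 1, which cannot represent MPG values.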

### 📝 Your Task:

Build a model with at least two hidden layers.

In [None]:
# TODO: Define the Sequential model
model = Sequential([
    # Hidden Layer 1: Try 64 neurons with 'relu' activation
    Dense(64, activation='relu', input_shape=(len(features),)),

    # Hidden Layer 2
    Dense(32, activation='relu'),

    # Output Layer
    Dense(1)
])

# Compile model
model.compile(
    optimizer='adam',
    loss='mse',       # mean squared error, the usual regression loss
    metrics=['mae']   # mean absolute error, reported in MPG
)

print("✅ Model Constructed!")
model.summary()


## 🚀 Step 3: Training the Model

### 📝 Your Task:

Train the model for 100 epochs.
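The training cell passes `validation_split=0.2`, which holds back part of the training set to monitor the model after each epoch. The sketch below works out how many rows end up in each group. It assumes 392 rows remain after `dropna()` and that the rounding follows my understanding of scikit-learn (test size rounded up) and Keras (fitting size rounded down); the helper `split_sizes` is hypothetical.

```python
import math


def split_sizes(n_rows, test_size=0.2, validation_split=0.2):
    """Return (rows used for weight updates, validation rows, test rows)."""
    n_test = math.ceil(n_rows * test_size)      # held out by train_test_split
    n_train_full = n_rows - n_test              # rows passed to model.fit
    n_fit = math.floor(n_train_full * (1.0 - validation_split))
    n_val = n_train_full - n_fit                # held out by validation_split
    return n_fit, n_val, n_test


# Assumed row count after dropping missing values
print(split_sizes(392))
```

Under these assumptions the model updates its weights on roughly 64% of the data, so the validation curve in `history` is computed on rows it has never been trained on.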

In [None]:
# TODO: Train the model using .fit()
# Remember to use the SCALED features (X_train_scaled)
history = model.fit(
    X_train_scaled,
    y_train,
    epochs=100,
    validation_split=0.2,
    verbose=1
)

print("✅ Training Complete!")

## 📊 Step 4: Evaluation

Let's see if your AI is actually a good "Engineer."

### 📝 Your Task:

Generate predictions and calculate the scores.
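To show what the two scores measure, here is a small pure-Python sketch of the standard formulas. The function names and the sample values are made up for illustration; in the notebook itself, `r2_score` and `mean_absolute_error` from scikit-learn do this work.

```python
def mean_absolute_error_manual(actual, predicted):
    """Average size of the prediction error, in the target's own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score_manual(actual, predicted):
    """1 minus (squared error of the model / squared error of predicting the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up MPG values for illustration
actual = [20.0, 30.0, 40.0]
predicted = [22.0, 29.0, 37.0]

print(f"MAE: {mean_absolute_error_manual(actual, predicted):.2f} MPG")
print(f"R2:  {r2_score_manual(actual, predicted):.4f}")
```

An R2 of 1 means perfect predictions, 0 means no better than always guessing the mean MPG, and negative values mean worse than that baseline.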

In [None]:
# Predictions
predictions = model.predict(X_test_scaled).flatten()

# Metrics
r2 = r2_score(y_test, predictions)
mae = mean_absolute_error(y_test, predictions)

print(f"Final R2 Score: {r2:.4f}")
print(f"Mean Absolute Error: {mae:.2f} MPG")

# Visualization: Predicted vs Actual
plt.figure(figsize=(8,6))
plt.scatter(y_test, predictions, alpha=0.5)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
plt.xlabel('Actual MPG')
plt.ylabel('Predicted MPG')
plt.title('How well did we do?')
plt.show()

In [None]:
# Experiment: a wider network (128 and 64 neurons) to compare against the Step 2 model
model_big = Sequential([
    Dense(128, activation='relu', input_shape=(len(features),)),
    Dense(64, activation='relu'),
    Dense(1)
])

model_big.compile(optimizer='adam', loss='mse')

model_big.fit(X_train_scaled, y_train, epochs=100, validation_split=0.2, verbose=0)

pred_big = model_big.predict(X_test_scaled).flatten()
print("R2 with Bigger Model:", r2_score(y_test, pred_big))


In [None]:
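# Experiment: the wider network again, with Dropout(0.3) after each hidden layer
# (randomly zeroes 30% of activations during training to reduce overfitting)
model_dropout = Sequential([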
    Dense(128, activation='relu', input_shape=(len(features),)),
    Dropout(0.3),
    Dense(64, activation='relu'),
    Dropout(0.3),
    Dense(1)
])

model_dropout.compile(optimizer='adam', loss='mse')

history_dropout = model_dropout.fit(
    X_train_scaled,
    y_train,
    epochs=100,
    validation_split=0.2,
    verbose=0
)

pred_dropout = model_dropout.predict(X_test_scaled).flatten()
print("R2 with Dropout:", r2_score(y_test, pred_dropout))
