# Decision Tree Regression - Taxi Fare Prediction

- Dataset: NYC Yellow Taxi Trip Data (March 2016)

### Exercise: Build a Decision Tree Model

In this exercise, you will build a Decision Tree Regressor to predict taxi fare amounts. Fill in the blanks to complete the code.

- Dataset URL: https://www.kaggle.com/datasets/elemento/nyc-yellow-taxi-trip-data?select=yellow_tripdata_2016-03.csv


# What Uber uses for Dynamic Pricing
- https://www.uber.com/blog/research/dynamic-pricing-and-matching-in-ride-hailing-platforms/
- https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3258234&uclick_id=3356e8d9-7bcd-442d-967c-86ae22ef8e57

* **Import Libraries:** This cell imports the necessary libraries for our model - `pandas` for data manipulation, and `sklearn` for machine learning.

**Hints:**
- Import `train_test_split` from sklearn's model_selection module
- Import `DecisionTreeRegressor` from sklearn's tree module
- Import evaluation metrics from sklearn's metrics module

**Documentation:**
- [train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)
- [DecisionTreeRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html)
- [mean_absolute_error](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_error.html)
- [r2_score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html)

In [1]:
# Install dependencies as needed:
# pip install kagglehub[pandas-datasets]
import kagglehub
from kagglehub import KaggleDatasetAdapter

# Set the path to the file you'd like to load
file_path = "yellow_tripdata_2016-03.csv"

# Load the latest version
df = kagglehub.load_dataset(
  KaggleDatasetAdapter.PANDAS,
  "elemento/nyc-yellow-taxi-trip-data",
  file_path,
  # Provide any additional arguments like
  # sql_query or pandas_kwargs. See the
  # documentation for more information:
  # https://github.com/Kaggle/kagglehub/blob/main/README.md#kaggledatasetadapterpandas
)

print("First 5 records:", df.head())

Using Colab cache for faster access to the 'nyc-yellow-taxi-trip-data' dataset.
First 5 records:    VendorID tpep_pickup_datetime tpep_dropoff_datetime  passenger_count  \
0         1  2016-03-01 00:00:00   2016-03-01 00:07:55                1   
1         1  2016-03-01 00:00:00   2016-03-01 00:11:06                1   
2         2  2016-03-01 00:00:00   2016-03-01 00:31:06                2   
3         2  2016-03-01 00:00:00   2016-03-01 00:00:00                3   
4         2  2016-03-01 00:00:00   2016-03-01 00:00:00                5   

   trip_distance  pickup_longitude  pickup_latitude  RatecodeID  \
0           2.50        -73.976746        40.765152           1   
1           2.90        -73.983482        40.767925           1   
2          19.98        -73.782021        40.644810           1   
3          10.78        -73.863419        40.769814           1   
4          30.43        -73.971741        40.792183           3   

  store_and_fwd_flag  dropoff_longitude  dropoff_

In [10]:
import sklearn.tree  # import the submodule explicitly so that sklearn.tree resolves
dir(sklearn.tree)

['BaseDecisionTree',
 'DecisionTreeClassifier',
 'DecisionTreeRegressor',
 'ExtraTreeClassifier',
 'ExtraTreeRegressor',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '_classes',
 '_criterion',
 '_export',
 '_partitioner',
 '_reingold_tilford',
 '_splitter',
 '_tree',
 '_utils',
 'export_graphviz',
 'export_text',
 'plot_tree']

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error, r2_score

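* **Load Dataset & Extract Features:** With the taxi trip data already loaded (via `kagglehub` above; `pd.read_csv()` does the same for a local CSV file), these cells preview the data, remove duplicate rows, fill missing values with 0, and extract datetime features from the pickup timestamp.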

**Hints:**
- Use `pd.read_csv()` to load CSV files
- Use `.drop_duplicates()` to remove duplicate rows
- Use `.fillna()` to fill missing values
- Use `pd.to_datetime()` to parse datetime strings
- Extract `.dt.hour`, `.dt.dayofweek`, `.dt.month` from datetime

**Documentation:**
- [pd.read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html)
- [DataFrame.drop_duplicates](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html)
- [DataFrame.fillna](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html)
- [pd.to_datetime](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html)
- [DatetimeIndex.hour](https://pandas.pydata.org/docs/reference/api/pandas.DatetimeIndex.hour.html)

In [3]:
df.head()

Unnamed: 0,VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,pickup_longitude,pickup_latitude,RatecodeID,store_and_fwd_flag,dropoff_longitude,dropoff_latitude,payment_type,fare_amount,extra,mta_tax,tip_amount,tolls_amount,improvement_surcharge,total_amount
0,1,2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.5,-73.976746,40.765152,1,N,-74.004265,40.746128,1,9.0,0.5,0.5,2.05,0.0,0.3,12.35
1,1,2016-03-01 00:00:00,2016-03-01 00:11:06,1,2.9,-73.983482,40.767925,1,N,-74.005943,40.733166,1,11.0,0.5,0.5,3.05,0.0,0.3,15.35
2,2,2016-03-01 00:00:00,2016-03-01 00:31:06,2,19.98,-73.782021,40.64481,1,N,-73.974541,40.67577,1,54.5,0.5,0.5,8.0,0.0,0.3,63.8
3,2,2016-03-01 00:00:00,2016-03-01 00:00:00,3,10.78,-73.863419,40.769814,1,N,-73.96965,40.757767,1,31.5,0.0,0.5,3.78,5.54,0.3,41.62
4,2,2016-03-01 00:00:00,2016-03-01 00:00:00,5,30.43,-73.971741,40.792183,3,N,-74.17717,40.695053,1,98.0,0.0,0.0,0.0,15.5,0.3,113.8


In [4]:
df = df.drop_duplicates()
df = df.fillna(0)


df['pickup_datetime'] = pd.to_datetime(df['tpep_pickup_datetime'])

df['pickup_hour'] = df['pickup_datetime'].dt.hour
df['pickup_day'] = df['pickup_datetime'].dt.dayofweek
df['pickup_month'] = df['pickup_datetime'].dt.month

print(f"Samples: {len(df)}")

Samples: 12210951
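
The datetime extraction can be tried on a tiny frame first. This is a minimal sketch on two made-up timestamps (the `toy` frame is hypothetical, not part of the dataset); it only uses `pd.to_datetime()` and the `.dt` accessor from the hints.

```python
import pandas as pd

# Hypothetical two-row frame using the dataset's timestamp string format.
toy = pd.DataFrame(
    {"tpep_pickup_datetime": ["2016-03-01 00:07:55", "2016-03-04 18:30:00"]}
)

# Parse the strings once, then read components through the .dt accessor.
pickup = pd.to_datetime(toy["tpep_pickup_datetime"])
toy["pickup_hour"] = pickup.dt.hour
toy["pickup_day"] = pickup.dt.dayofweek  # Monday=0 ... Sunday=6
toy["pickup_month"] = pickup.dt.month

print(toy[["pickup_hour", "pickup_day", "pickup_month"]])
```

Because the file covers March 2016 only, `pickup_month` is the same for every row, which is consistent with its 0.0000 score in the feature importance output later on.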


## Quick Data Insights

* **Explore Data:** This cell displays basic statistics about fare amounts and trip distances.

**Hints:**
- Use `.min()`, `.max()`, `.mean()` for statistics
- Use `.nunique()` to count unique values

**Documentation:**
- [Series.min](https://pandas.pydata.org/docs/reference/api/pandas.Series.min.html)
- [Series.max](https://pandas.pydata.org/docs/reference/api/pandas.Series.max.html)
- [Series.mean](https://pandas.pydata.org/docs/reference/api/pandas.Series.mean.html)
- [DataFrame.head](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html)

In [5]:
# Calculate and display basic statistics for the 'fare_amount' column
print("Fare Amount Stats:")
print(f"   Min: ${df['fare_amount'].min():.2f}")
print(f"   Max: ${df['fare_amount'].max():.2f}")
print(f"   Avg: ${df['fare_amount'].mean():.2f}")

# Calculate and display basic statistics for the 'trip_distance' column
print(f"\nTrip Distance Stats:")
print(f"   Min: {df['trip_distance'].min():.2f} miles")
print(f"   Max: {df['trip_distance'].max():.2f} miles")
print(f"   Avg: {df['trip_distance'].mean():.2f} miles")

# Count and display the number of unique vendors
print(f"\nUnique Vendors: {df['VendorID'].nunique()}")

Fare Amount Stats:
   Min: $-376.00
   Max: $429496.72
   Avg: $12.80

Trip Distance Stats:
   Min: 0.00 miles
   Max: 19072628.80 miles
   Avg: 6.13 miles

Unique Vendors: 2


* **Payment Type Analysis:** This cell shows the distribution of payment types.

**Hints:**
- Use `.value_counts()` to count occurrences

**Documentation:**
- [Series.value_counts](https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html)

In [6]:
# Analyze the distribution of payment types in the dataset
# Payment types: 1=Credit card, 2=Cash, 3=No charge, 4=Dispute
print("Payment Type Distribution:")
print(df['payment_type'].value_counts().head())

Payment Type Distribution:
payment_type
1    8127391
2    4020407
3      46913
4      16240
Name: count, dtype: int64
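
Raw counts are easier to compare as shares. A small sketch, assuming a made-up five-row series rather than the real column, showing that `value_counts(normalize=True)` returns proportions instead of counts:

```python
import pandas as pd

# Hypothetical payment codes; the real column has millions of rows.
payment_type = pd.Series([1, 1, 2, 1, 3], name="payment_type")

# normalize=True divides each count by the total number of rows.
shares = payment_type.value_counts(normalize=True)
print(shares)
```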


## Data Cleaning

* **Filter Invalid Data:** This cell removes rows with invalid fare amounts or trip distances.

**Hints:**
- Use boolean indexing to filter rows
- Filter out negative fares and zero/negative distances

**Documentation:**
- [DataFrame indexing](https://pandas.pydata.org/docs/user_guide/indexing.html)

In [29]:
df.columns

Index(['VendorID', 'tpep_pickup_datetime', 'tpep_dropoff_datetime',
       'passenger_count', 'trip_distance', 'pickup_longitude',
       'pickup_latitude', 'RatecodeID', 'store_and_fwd_flag',
       'dropoff_longitude', 'dropoff_latitude', 'payment_type', 'fare_amount',
       'extra', 'mta_tax', 'tip_amount', 'tolls_amount',
       'improvement_surcharge', 'total_amount', 'pickup_datetime',
       'pickup_hour', 'pickup_day', 'pickup_month'],
      dtype='object')

In [9]:
# Remove rows with invalid fare amounts and trip distances
original_len = len(df)  # Store the original number of rows in the dataset

# Keep rows whose fare is non-negative and below 200 (drops negative fares and fares of 200 or more)
df = df[(df['fare_amount'] >= 0) & (df['fare_amount'] < 200)]

# Keep rows whose trip distance is positive and below 100 miles (drops zero, negative, and 100+ mile trips)
df = df[(df['trip_distance'] > 0) & (df['trip_distance'] < 100)]

# Print the number of removed and remaining records
print(f"Removed {original_len - len(df)} invalid records")
print(f"Remaining samples: {len(df)}")

Removed 75371 invalid records
Remaining samples: 12135580
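
The same boolean-indexing pattern can be checked on a handful of rows. This sketch uses a hypothetical `toy` frame with one negative fare, one extreme fare, and one zero-distance trip; naming the masks makes it possible to count how many rows each rule rejects.

```python
import pandas as pd

# Hypothetical rows: index 1 has a negative fare, index 3 an extreme fare,
# index 4 a zero-distance trip.
toy = pd.DataFrame(
    {
        "fare_amount": [9.0, -3.5, 54.5, 429496.72, 11.0],
        "trip_distance": [2.5, 1.0, 19.98, 3.0, 0.0],
    }
)

# Each mask is a boolean Series aligned with the frame's index.
valid_fare = (toy["fare_amount"] >= 0) & (toy["fare_amount"] < 200)
valid_distance = (toy["trip_distance"] > 0) & (toy["trip_distance"] < 100)

# Combine the masks with & and index the frame once.
cleaned = toy[valid_fare & valid_distance]

removed_by_fare = int((~valid_fare).sum())
removed_by_distance = int((~valid_distance).sum())
print(f"Kept {len(cleaned)} of {len(toy)} rows")
print(f"Rejected by fare rule: {removed_by_fare}, by distance rule: {removed_by_distance}")
```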


## Feature Selection and Model Training

* **Select Features:** This cell selects relevant features for predicting fare amount, including temporal and location features.

**Hints:**
- Select numeric features that influence fare: trip distance, passenger count, rate code, payment type
- Include datetime features: pickup hour, day of week, month
- Include location features: pickup/dropoff coordinates
- Use double brackets `df[[...]]` to select multiple columns

**Documentation:**
- [DataFrame column selection](https://pandas.pydata.org/docs/user_guide/indexing.html#basics)

In [10]:
X = df[["trip_distance", "passenger_count" , "RatecodeID", "payment_type" ,
      "pickup_hour", "pickup_day", "pickup_month" ,
        "pickup_longitude", "pickup_latitude","dropoff_longitude" , "dropoff_latitude"]]

y = df['fare_amount']

print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")
print(f"Features to use: trip_distance, passenger_count, RatecodeID, payment_type, pickup_hour, pickup_day, pickup_month, pickup_longitude, pickup_latitude, dropoff_longitude, dropoff_latitude")

Features shape: (12135580, 11)
Target shape: (12135580,)
Features to use: trip_distance, passenger_count, RatecodeID, payment_type, pickup_hour, pickup_day, pickup_month, pickup_longitude, pickup_latitude, dropoff_longitude, dropoff_latitude


* **Train-Test Split:** This cell splits the data into training and testing sets.

**Hints:**
- Use `train_test_split()` function
- Set `test_size=0.2` for 80-20 split
- Set `random_state=42` for reproducibility

**Documentation:**
- [train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)

In [12]:
# Split the dataset into training and testing sets
# Use an 80-20 split and set a random state for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Print the number of samples in the training and testing sets
print(f"Training samples: {len(X_train)}")
print(f"Testing samples: {len(X_test)}")

Training samples: 9708464
Testing samples: 2427116
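
To see what `test_size` and `random_state` do, the split can be run on ten synthetic rows. This is an illustrative sketch with made-up arrays; it checks the 80-20 sizes, that no row lands in both sets, and that repeating the call with the same seed gives the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten hypothetical samples with two features each; y doubles as a row id.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same seed, same inputs: the split is reproduced exactly.
_, X_test_again, _, y_test_again = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f"Train rows: {len(X_train)}, test rows: {len(X_test)}")
print(f"Overlap between sets: {set(y_train) & set(y_test)}")
print(f"Reproducible: {np.array_equal(y_test, y_test_again)}")
```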


* **Train Decision Tree:** This cell creates and trains the Decision Tree Regressor model.

**Hints:**
- Create model with `max_depth=10` to prevent overfitting
- Use `.fit()` method to train the model

**Documentation:**
- [DecisionTreeRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html)
- [DecisionTreeRegressor.fit](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor.fit)
- [DecisionTreeRegressor.predict](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor.predict)

In [13]:
# Train a Decision Tree Regressor model
# Initialize the model with a maximum depth of 10 to prevent overfitting
model = DecisionTreeRegressor(max_depth=10, random_state=42)

# Train the model using the training data
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print a confirmation message
print("Model trained!")

Model trained!
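
The effect of `max_depth` is easier to see on a small synthetic problem. The sketch below assumes a made-up linear fare rule with noise (not the real data) and compares a depth-limited tree with an unrestricted one using `get_depth()` and `get_n_leaves()`.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic, hypothetical data: fare grows roughly linearly with distance.
rng = np.random.default_rng(0)
distance = rng.uniform(0.5, 20.0, size=500)
fare = 2.5 + 2.5 * distance + rng.normal(0.0, 1.0, size=500)
X = distance.reshape(-1, 1)

# A depth-3 tree can have at most 2**3 = 8 leaves.
shallow = DecisionTreeRegressor(max_depth=3, random_state=42).fit(X, fare)

# Without a depth limit the tree keeps splitting until leaves are pure.
deep = DecisionTreeRegressor(random_state=42).fit(X, fare)

print(f"max_depth=3: depth {shallow.get_depth()}, leaves {shallow.get_n_leaves()}")
print(f"no limit:    depth {deep.get_depth()}, leaves {deep.get_n_leaves()}")
```

An unrestricted tree tends to memorize the noise in the training rows, which is the reason the exercise caps the depth.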


* **Evaluate Model:** This cell calculates and displays the model's performance metrics.

**Hints:**
- MAE (Mean Absolute Error) is the average absolute difference between predicted and actual fares, in dollars (lower is better)
- R² score is the fraction of variance in the actual fares that the model explains (1.0 is perfect)

**Documentation:**
- [mean_absolute_error](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_error.html)
- [r2_score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html)

In [14]:
# Evaluate the model's performance using MAE and R² metrics
# Calculate the Mean Absolute Error (MAE) between actual and predicted values
mae = mean_absolute_error(y_test, y_pred)

# Calculate the R² score to measure the model's variance explanation
r2 = r2_score(y_test, y_pred)

# Print the evaluation metrics
print(f"MAE: ${mae:.2f}")
print(f"R²:  {r2:.4f}")

MAE: $1.25
R²:  0.9541
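
Both metrics are short formulas, and working them by hand on four made-up fares shows what the numbers mean. This is a plain-Python sketch of the standard definitions, MAE as the mean of |y − ŷ| and R² as 1 − SS_res / SS_tot; the values in `y_true` and `y_pred` are hypothetical.

```python
# Hypothetical actual and predicted fares, in dollars.
y_true = [10.0, 12.0, 8.0, 20.0]
y_pred = [11.0, 12.0, 9.0, 17.0]
n = len(y_true)

# MAE: average size of the errors, ignoring their sign.
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

# R^2: 1 minus (squared error of the model / squared error of always
# predicting the mean of y_true).
mean_true = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(f"MAE: ${mae:.2f}")  # MAE: $1.25
print(f"R2:  {r2:.4f}")    # R2:  0.8675
```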


* **Make Prediction:** This cell demonstrates how to make a prediction for a new sample.

**Hints:**
- Create a DataFrame with the same columns as training data
- Use `model.predict()` to make predictions

**Documentation:**
- [pd.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
- [DecisionTreeRegressor.predict](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor.predict)

In [None]:
sample = pd.DataFrame(
    {
        'trip_distance': [5.0],
        'passenger_count': [2],
        'RatecodeID': [1],
        'payment_type': [1],
        'pickup_hour': [18],
        'pickup_day': [4],
        'pickup_month': [3],
        'pickup_longitude': [-73.98],
        'pickup_latitude': [40.75],
        'dropoff_longitude': [-73.95],
        'dropoff_latitude': [40.78]
    }
)
prediction = model.predict(sample)[0]
print(f"Predicted fare for 5-mile trip: ${prediction:.2f}")

## Feature Importance

* **Analyze Feature Importance:** This cell shows which features are most important for predicting fare.

**Hints:**
- Use `model.feature_importances_` to get importance scores
- Higher values indicate more important features

**Documentation:**
- [DecisionTreeRegressor.feature_importances_](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html)

In [15]:
# Analyze the importance of each feature in the Decision Tree model
# Get the list of feature names from the dataset
feature_names = X.columns.tolist()

# Retrieve the feature importance scores from the trained model
importances = model.feature_importances_

# Print the importance of each feature
print("Feature Importances:")
for name, importance in zip(feature_names, importances):
    print(f"   {name}: {importance:.4f}")

Feature Importances:
   trip_distance: 0.9474
   passenger_count: 0.0000
   RatecodeID: 0.0378
   payment_type: 0.0028
   pickup_hour: 0.0053
   pickup_day: 0.0013
   pickup_month: 0.0000
   pickup_longitude: 0.0010
   pickup_latitude: 0.0007
   dropoff_longitude: 0.0016
   dropoff_latitude: 0.0022
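
Sorting the scores makes the ranking easier to read. The sketch below copies the rounded scores printed above into a dictionary (in a live notebook, `dict(zip(feature_names, importances))` would give the unrounded values) and prints them in descending order with a running total.

```python
# Rounded scores copied from the output above.
importances = {
    "trip_distance": 0.9474,
    "passenger_count": 0.0000,
    "RatecodeID": 0.0378,
    "payment_type": 0.0028,
    "pickup_hour": 0.0053,
    "pickup_day": 0.0013,
    "pickup_month": 0.0000,
    "pickup_longitude": 0.0010,
    "pickup_latitude": 0.0007,
    "dropoff_longitude": 0.0016,
    "dropoff_latitude": 0.0022,
}

# Sort by score, highest first.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

cumulative = 0.0
for name, score in ranked:
    cumulative += score
    print(f"{name:<18} {score:.4f}  (cumulative {cumulative:.4f})")

top_two = ranked[0][1] + ranked[1][1]
```

On these numbers, `trip_distance` and `RatecodeID` together account for roughly 0.985 of the total importance.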


# How is Dynamic Pricing Implemented in Real Taxi Services?

## Real-world taxi fare prediction systems consider many more factors:
- **Time of day** - Rush hour surge pricing
- **Weather conditions** - Rain/snow increases demand
- **Special events** - Concerts, sports games
- **Real-time demand** - Supply and demand in the area
- **Traffic conditions** - Estimated time of arrival

## Here is an article about how Uber implements dynamic pricing:
https://www.uber.com/us/en/marketplace/pricing/

# PyTorch Neural Network - Taxi Fare Prediction

### Exercise: Build a Neural Network with PyTorch

In this exercise, you will build a simple Neural Network using PyTorch to predict taxi fare amounts. Fill in the blanks to complete the code.

* **Import Libraries:** This cell imports the necessary libraries including PyTorch for deep learning.

**Hints:**
- Import `torch.nn` as `nn` for neural network layers
- Import `torch.optim` for optimizers
- Import `DataLoader` and `TensorDataset` from torch.utils.data

**Documentation:**
- [torch.nn](https://pytorch.org/docs/stable/nn.html)
- [torch.optim](https://pytorch.org/docs/stable/optim.html)
- [DataLoader](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader)
- [TensorDataset](https://pytorch.org/docs/stable/data.html#torch.utils.data.TensorDataset)

In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, r2_score
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

import warnings
warnings.filterwarnings('ignore')

* **Load and Prepare Data:** This cell loads the dataset, cleans it, extracts datetime features, and selects features including temporal and location data.

**Hints:**
- Use `.dropna()` to remove missing values
- Use `.drop_duplicates()` to remove duplicates
- Filter invalid fare and distance values
- Extract datetime features (hour, day of week, month)
- Include location coordinates in feature selection

**Documentation:**
- [pd.read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html)
- [DataFrame.dropna](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html)
- [pd.to_datetime](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html)

In [2]:
import kagglehub
from kagglehub import KaggleDatasetAdapter

# Set the path to the file you'd like to load
file_path = "yellow_tripdata_2016-03.csv"

# Load the latest version
df = kagglehub.load_dataset(
  KaggleDatasetAdapter.PANDAS,
  "elemento/nyc-yellow-taxi-trip-data",
  file_path,
  # Provide any additional arguments like
  # sql_query or pandas_kwargs. See the
  # documentation for more information:
  # https://github.com/Kaggle/kagglehub/blob/main/README.md#kaggledatasetadapterpandas
)

print("First 5 records:", df.head())

Using Colab cache for faster access to the 'nyc-yellow-taxi-trip-data' dataset.
First 5 records:    VendorID tpep_pickup_datetime tpep_dropoff_datetime  passenger_count  \
0         1  2016-03-01 00:00:00   2016-03-01 00:07:55                1   
1         1  2016-03-01 00:00:00   2016-03-01 00:11:06                1   
2         2  2016-03-01 00:00:00   2016-03-01 00:31:06                2   
3         2  2016-03-01 00:00:00   2016-03-01 00:00:00                3   
4         2  2016-03-01 00:00:00   2016-03-01 00:00:00                5   

   trip_distance  pickup_longitude  pickup_latitude  RatecodeID  \
0           2.50        -73.976746        40.765152           1   
1           2.90        -73.983482        40.767925           1   
2          19.98        -73.782021        40.644810           1   
3          10.78        -73.863419        40.769814           1   
4          30.43        -73.971741        40.792183           3   

  store_and_fwd_flag  dropoff_longitude  dropoff_

In [4]:

# Row count before any filtering is applied
print(f"Samples before filtering: {len(df)}")

df = df[(df['fare_amount'] > 0) & (df['fare_amount'] < 200)]
df = df[(df['trip_distance'] > 0) & (df['trip_distance'] < 100)]

df['pickup_datetime'] = pd.to_datetime(df['tpep_pickup_datetime'])
df['pickup_hour'] = df['pickup_datetime'].dt.hour
df['pickup_day'] = df['pickup_datetime'].dt.dayofweek
df['pickup_month'] = df['pickup_datetime'].dt.month

X = df[['trip_distance', 'passenger_count', 'RatecodeID', 'payment_type',
        'pickup_hour', 'pickup_day', 'pickup_month',
        'pickup_longitude', 'pickup_latitude', 'dropoff_longitude', 'dropoff_latitude']]
y = df['fare_amount']

print(f"Features: {X.shape[1]}")

Samples before filtering: 12210952
Features: 11


* **Scale and Convert to Tensors:** This cell scales the features using StandardScaler and converts data to PyTorch tensors.

**Hints:**
- Use `scaler.fit_transform()` for training data
- Use `scaler.transform()` for test data (no fitting!)
- Use `torch.FloatTensor()` to create tensors

**Documentation:**
- [StandardScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html)
- [StandardScaler.fit_transform](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler.fit_transform)
- [torch.FloatTensor](https://pytorch.org/docs/stable/tensors.html)
- [Tensor.reshape](https://pytorch.org/docs/stable/generated/torch.Tensor.reshape.html)

In [5]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert to PyTorch tensors
X_train_tensor = torch.FloatTensor(X_train_scaled)
y_train_tensor = torch.FloatTensor(y_train.values).reshape(-1, 1)
X_test_tensor = torch.FloatTensor(X_test_scaled)
y_test_tensor = torch.FloatTensor(y_test.values).reshape(-1, 1)

print(f"Train: {X_train_tensor.shape[0]} | Test: {X_test_tensor.shape[0]}")

Train: 9706778 | Test: 2426695
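
The "no fitting on test data" hint can be checked on a one-column example. In this sketch (toy arrays, not the taxi data) the scaler learns its mean and standard deviation from the training rows only, and the test row is transformed with those same statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single-feature data.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean and std from train
X_test_scaled = scaler.transform(X_test)        # reuses the train statistics

print(f"Learned mean: {scaler.mean_[0]:.4f}, std: {scaler.scale_[0]:.4f}")
print(f"Scaled train mean: {X_train_scaled.mean():.4f}")
print(f"Scaled test value: {X_test_scaled[0, 0]:.4f}")
```

The test value is scaled as (4 − 2) / std of the training column, so it falls outside the range of the scaled training data, as it should for a value the scaler never saw.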


## Define Neural Network

* **Create Model Class:** This cell defines the neural network architecture with 2 hidden layers.

**Hints:**
- Use `nn.Linear(in_features, out_features)` for fully connected layers
- Use `nn.ReLU()` or `nn.Sigmoid()` for activation functions
- The network structure is: Input -> 64 neurons -> 32 neurons -> 1 output

**Documentation:**
- [nn.Module](https://pytorch.org/docs/stable/generated/torch.nn.Module.html)
- [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)
- [nn.ReLU](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)
- [nn.Sigmoid](https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html)

In [6]:
# Simple Neural Network
class TaxiFareNN(nn.Module):
    def __init__(self, input_size):
        super(TaxiFareNN, self).__init__()
        self.fc1 = nn.Linear(input_size, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 1)
        self.activation = nn.ReLU()

    def forward(self, x):
        x = self.activation(self.fc1(x))
        x = self.activation(self.fc2(x))
        x = self.fc3(x)
        return x

# Initialize model
input_size = X_train_tensor.shape[1]
model = TaxiFareNN(input_size)
print(model)

TaxiFareNN(
  (fc1): Linear(in_features=11, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=32, bias=True)
  (fc3): Linear(in_features=32, out_features=1, bias=True)
  (activation): ReLU()
)
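
The size of this network can be worked out from the printed layer shapes. Each `nn.Linear(n_in, n_out)` with `bias=True` holds `n_in * n_out` weights plus `n_out` biases; the sketch below does that arithmetic in plain Python for the 11 -> 64 -> 32 -> 1 structure.

```python
# Layer widths taken from the printed model: input, two hidden layers, output.
layer_sizes = [11, 64, 32, 1]

# Weights (n_in * n_out) plus biases (n_out) for each fully connected layer.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [768, 2080, 33]
print(total_params)      # 2881
```

In a live notebook, `sum(p.numel() for p in model.parameters())` should report the same total.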


## Train the Model

* **Training Loop:** This cell sets up the loss function, optimizer, and runs the training loop.

**Hints:**
- Use `nn.MSELoss()` for regression problems
- Use `optim.Adam()` as the optimizer
- Call `optimizer.zero_grad()` before each backward pass
- Call `loss.backward()` to compute gradients
- Call `optimizer.step()` to update weights

**Documentation:**
- [nn.MSELoss](https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html)
- [optim.Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)
- [optimizer.zero_grad](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html)
- [Tensor.backward](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html)
- [optimizer.step](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.step.html)
- [model.train](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.train)

In [None]:
# Loss and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

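# Number of passes over the training data
epochs = 100

# Create DataLoader (shuffle=True reshuffles the training rows every epoch)
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)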

# Training loop
for epoch in range(epochs):
    model.train()
    epoch_loss = 0
    for X_batch, y_batch in train_loader:
        optimizer.zero_grad()
        outputs = model(X_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()

    if (epoch + 1) % 20 == 0:
        print(f"Epoch {epoch+1}/{epochs} - Loss: {epoch_loss/len(train_loader):.4f}")

print("Training complete!")
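
Before running this loop, it helps to estimate how many optimizer steps it performs. The sketch below is plain arithmetic, assuming the training-set size reported earlier (9,706,778 rows) and the `DataLoader` default of keeping the final partial batch.

```python
import math

train_samples = 9_706_778  # from the "Train:" output of the tensor cell
batch_size = 32
epochs = 100

# With the final partial batch kept, len(train_loader) is the ceiling
# of samples / batch_size.
batches_per_epoch = math.ceil(train_samples / batch_size)
total_steps = batches_per_epoch * epochs
last_batch_size = train_samples - (batches_per_epoch - 1) * batch_size

print(f"Batches per epoch: {batches_per_epoch:,}")   # 303,337
print(f"Optimizer steps in total: {total_steps:,}")  # 30,333,700
print(f"Rows in the last batch: {last_batch_size}")  # 26
```

At roughly 30 million steps, the full run is likely to be slow on a CPU; a larger batch size, fewer epochs, or a subsample of the rows are common ways to shorten it while experimenting.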

## Evaluate Model

* **Test the Model:** This cell evaluates the model on the test set.

**Hints:**
- Use `model.eval()` to set evaluation mode
- Use `torch.no_grad()` context to disable gradient computation
- Use `.numpy()` to convert tensor to numpy array

**Documentation:**
- [model.eval](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval)
- [torch.no_grad](https://pytorch.org/docs/stable/generated/torch.no_grad.html)
- [Tensor.numpy](https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html)

In [None]:
# Evaluate on test set
model.eval()
with torch.no_grad():
    y_pred = model(X_test_tensor).numpy()

mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("PyTorch Neural Network Results:")
print(f"   MAE: ${mae:.2f}")
print(f"   R2:  {r2:.4f}")