# Predicting Taxi Trip Duration

In this notebook, we use our custom neural network library (the `src` package) to build a model that predicts taxi trip durations from the NYC taxi dataset. We load and preprocess the data, construct the model, and outline how it will be trained and evaluated.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from src.models.sequential import Sequential
from src.layers.linear import Linear
from src.layers.sigmoid import Sigmoid
from src.layers.relu import ReLU
from src.layers.binary_cross_entropy import BinaryCrossEntropy

# Load the dataset (a pickled dict of arrays, hence allow_pickle=True and .item())
dataset = np.load("../data/nyc_taxi_data.npy", allow_pickle=True).item()
X_train, y_train = dataset["X_train"], dataset["y_train"]
X_test, y_test = dataset["X_test"], dataset["y_test"]

# Display the shapes of the datasets
print(f"Training data shape: {X_train.shape}, Training labels shape: {y_train.shape}")
print(f"Test data shape: {X_test.shape}, Test labels shape: {y_test.shape}")

## Data Preprocessing

We preprocess the data so that it is suitable for training a neural network. Preprocessing can include normalization, handling missing values, and selecting relevant features; the cell below only standardizes the features to zero mean and unit variance.

In [2]:
# Example preprocessing steps
from sklearn.preprocessing import StandardScaler

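# Standardize the features: fit the scaler on the training set only,
# then apply the same transform to the test set to avoid data leakage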
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert to DataFrame for easier manipulation
X_train_df = pd.DataFrame(X_train_scaled)
X_test_df = pd.DataFrame(X_test_scaled)
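
The preprocessing cell above assumes the features contain no missing values. If they do, one option is to impute them before scaling. The following is a minimal NumPy sketch, not part of the original pipeline: `impute_with_train_median` is a hypothetical helper name, median imputation is one assumed strategy among several, and the toy arrays are made-up values.

```python
import numpy as np


def impute_with_train_median(X_train, X_test):
    """Replace NaNs in both splits with per-column medians of the training split.

    Hypothetical helper for illustration; not part of the src library.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Medians come from the training data only, to avoid test-set leakage.
    medians = np.nanmedian(X_train, axis=0)
    # np.where broadcasts the (n_features,) medians across all rows.
    X_train_filled = np.where(np.isnan(X_train), medians, X_train)
    X_test_filled = np.where(np.isnan(X_test), medians, X_test)
    return X_train_filled, X_test_filled


# Toy data standing in for the taxi features (made-up values).
toy_train = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])
toy_test = np.array([[np.nan, 5.0]])
toy_train_filled, toy_test_filled = impute_with_train_median(toy_train, toy_test)
print(toy_train_filled)
print(toy_test_filled)
```

The imputed arrays could then be passed to `StandardScaler` in place of the raw `X_train` and `X_test`.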

## Model Construction

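We construct a neural network with the `Sequential` class from our library. The cell below builds one baseline architecture: a hidden layer of 10 ReLU units followed by a one-unit output layer. Other architectures can be assembled the same way by adding or resizing layers.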

In [3]:
# Constructing a simple neural network model
model = Sequential()
model.add(Linear(input_dim=X_train.shape[1], output_dim=10))  # Hidden layer with 10 nodes
model.add(ReLU())
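model.add(Linear(input_dim=10, output_dim=1))  # Output layer: one predicted value per trip
# Caution: Sigmoid limits outputs to (0, 1), so keep it only if y_train is scaled to that range
model.add(Sigmoid())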

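# Compile the model (not implemented in this example, just a placeholder)
# Caution: BinaryCrossEntropy is a classification loss; a regression loss such as
# mean squared error is the usual choice for predicting durations
# model.compile(optimizer='adam', loss=BinaryCrossEntropy())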
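
Because trip duration is a continuous target, a regression setup typically pairs an unbounded (linear) output with a loss such as mean squared error. The sketch below illustrates that idea in plain NumPy rather than with the `src` library, since the library's regression losses are not shown in this notebook. The names `forward` and `mse_loss` are hypothetical, and the weights, inputs, and durations are made-up toy values.

```python
import numpy as np


def forward(X, W1, b1, W2, b2):
    """Linear -> ReLU -> Linear, with no activation on the output."""
    hidden = np.maximum(0.0, X @ W1 + b1)  # ReLU hidden layer
    return hidden @ W2 + b2  # unbounded output, suitable for durations


def mse_loss(y_pred, y_true):
    """Mean squared error and its gradient with respect to y_pred."""
    diff = y_pred - y_true
    loss = np.mean(diff ** 2)
    grad = 2.0 * diff / diff.size
    return loss, grad


rng = np.random.default_rng(0)
X_toy = rng.normal(size=(4, 3))  # 4 samples, 3 features
y_toy = np.array([[300.0], [1200.0], [45.0], [900.0]])  # seconds (made up)
W1, b1 = rng.normal(size=(3, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 1)), np.zeros(1)

y_pred = forward(X_toy, W1, b1, W2, b2)
loss, grad = mse_loss(y_pred, y_toy)
print(y_pred.shape, loss)
```

The gradient returned by `mse_loss` is what a backward pass would propagate through the output layer during training.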

## Model Training

We train the model on the training data and evaluate its performance on the test data. The calls in the cell below are placeholders and are left commented out.

In [4]:
# Placeholder for training the model
# model.fit(X_train_scaled, y_train, epochs=100, batch_size=32)

# Evaluate the model
# predictions = model.predict(X_test_scaled)
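# Evaluate performance with regression metrics (e.g., RMSE, MAE)
# evaluate_model is not defined in this notebook; it stands in for a metrics helper
# performance = evaluate_model(y_test, predictions)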
# print(performance)
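
The `evaluate_model` call above is not defined in this notebook. One possible version, assuming it should return RMSE and MAE for arrays of true and predicted durations, is sketched below. The function body and the dict return format are assumptions for illustration, not the library's API, and the sample values are made up.

```python
import numpy as np


def evaluate_model(y_true, y_pred):
    """Return RMSE and MAE as a dict (assumed interface, for illustration)."""
    # Flatten both inputs so (n, 1) predictions and (n,) labels do not
    # broadcast into an (n, n) error matrix.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same number of elements")
    errors = y_pred - y_true
    return {
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
        "mae": float(np.mean(np.abs(errors))),
    }


# Toy values (made up) to show the call pattern.
metrics = evaluate_model([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of whether a few very long trips dominate the error.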

## Conclusion

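In this notebook, we loaded and standardized the NYC taxi data and constructed a baseline neural network for predicting taxi trip durations. Training and evaluation are still placeholders. The next steps are to implement them and to experiment with different architectures and hyperparameters to improve performance.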