# Capstone Project: Predicting NYC Taxi Fare

Imagine stepping off a plane at JFK Airport, tired and eager to get to Midtown Manhattan. As you hail a taxi, you might wonder: how much will this ride cost?

This project aims to answer that question by predicting NYC taxi fares using trip data. It includes:
- An **interactive map** that visualizes key patterns, such as pickup and dropoff locations, high-demand areas like Midtown Manhattan and the Financial District, and popular landmarks.
- Two predictive models, **Linear Regression** and **Random Forest Regression**, built to estimate taxi fares from trip details such as distance, passenger count, and time of day.

Through data exploration, modeling, and visualization, this project highlights the factors influencing fare variability and provides practical tools for decision-making. 


## Interactive Map
To visualize the data, an interactive map was created with the following features:
- Pickup and dropoff locations on separate layers that can be toggled on and off.
- Polygons that highlight high-demand areas such as Midtown Manhattan and the Financial District.
- Landmarks such as JFK Airport, Times Square, and Central Park.

```python
import pandas as pd
import folium

# Load the dataset
file_path = 'data/taxi_trip_map_data.csv'  # Replace with your actual file path
data = pd.read_csv(file_path)

# Calculate the average latitude and longitude for the map's initial center
center_lat = data['pickup_latitude'].mean()
center_lon = data['pickup_longitude'].mean()

# Initialize the map
nyc_map = folium.Map(location=[center_lat, center_lon], zoom_start=12)

# Create separate layers for pickup and dropoff points
pickup_layer = folium.FeatureGroup(name='Pickup Locations')
dropoff_layer = folium.FeatureGroup(name='Dropoff Locations')

# Add pickup points (blue) to the pickup layer
for _, row in data.iterrows():
    folium.CircleMarker(
        location=[row['pickup_latitude'], row['pickup_longitude']],
        radius=2,
        color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.6,
        popup=f"Pickup: {row['pickup_datetime']}"
    ).add_to(pickup_layer)

# Add dropoff points (green) to the dropoff layer.
# Only pickup_datetime is referenced in this cell, so the popup labels it
# as the pickup time rather than implying it is the dropoff time.
for _, row in data.iterrows():
    folium.CircleMarker(
        location=[row['dropoff_latitude'], row['dropoff_longitude']],
        radius=2,
        color='green',
        fill=True,
        fill_color='green',
        fill_opacity=0.6,
        popup=f"Dropoff (picked up: {row['pickup_datetime']})"
    ).add_to(dropoff_layer)

# Add layers to the map
pickup_layer.add_to(nyc_map)
dropoff_layer.add_to(nyc_map)

# Add a layer control to toggle between layers
folium.LayerControl().add_to(nyc_map)

# Save the map to an HTML file
# nyc_map.save("nyc_interactive_map_with_layers.html")
# print("Map saved as 'nyc_interactive_map_with_layers.html'. Open it in your browser to view.")
```


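The cell above builds only the pickup and dropoff layers. The high-demand polygons and landmarks from the feature list could be described as a GeoJSON feature collection, sketched below with the standard library only. The rectangle bounds are hypothetical outlines chosen for illustration, not official neighborhood boundaries, and the landmark coordinates are approximate. The helper names `box_to_ring` and `build_feature_collection` are also assumptions, not part of the original notebook.

```python
import json

# Approximate landmark coordinates as (latitude, longitude); illustrative values.
LANDMARKS = {
    "JFK Airport": (40.6413, -73.7781),
    "Times Square": (40.7580, -73.9855),
    "Central Park": (40.7829, -73.9654),
}

# Hypothetical rectangular outlines as (south, west, north, east).
# These are rough boxes for illustration, not official boundaries.
HIGH_DEMAND_AREAS = {
    "Midtown Manhattan": (40.745, -74.000, 40.765, -73.970),
    "Financial District": (40.700, -74.020, 40.712, -74.000),
}


def box_to_ring(south, west, north, east):
    """Return a closed, counterclockwise ring of [longitude, latitude] positions."""
    return [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # repeat the first position to close the ring
    ]


def build_feature_collection(areas, landmarks):
    """Build a GeoJSON FeatureCollection with one Polygon per area and one Point per landmark."""
    features = []
    for name, box in areas.items():
        features.append({
            "type": "Feature",
            "properties": {"name": name, "kind": "high_demand_area"},
            "geometry": {"type": "Polygon", "coordinates": [box_to_ring(*box)]},
        })
    for name, (lat, lon) in landmarks.items():
        features.append({
            "type": "Feature",
            "properties": {"name": name, "kind": "landmark"},
            # GeoJSON stores positions as [longitude, latitude], the reverse of folium's order.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        })
    return {"type": "FeatureCollection", "features": features}


overlays = build_feature_collection(HIGH_DEMAND_AREAS, LANDMARKS)
print(json.dumps(overlays["features"][0], indent=2))

# With folium available, the collection could be added as one toggleable layer:
# folium.GeoJson(overlays, name="High-Demand Areas and Landmarks").add_to(nyc_map)
```

If this layer is used, it should be added to the map before `folium.LayerControl()` so that it appears in the layer toggle. Note the coordinate order: folium markers take `[latitude, longitude]`, while GeoJSON positions are `[longitude, latitude]`.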

## Next Steps: Project Overview and Data Analysis

With the interactive map providing a visual understanding of NYC taxi trip patterns, we now turn to the Project Overview. This section introduces the dataset, highlights its key features, and explores important trends through data analysis. Understanding how factors like trip distance, passenger count, and time of day relate to the fare guides feature selection and helps improve model performance.
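As a minimal sketch of how two of these factors might be derived from the columns used in the map cell (`pickup_latitude`, `pickup_longitude`, `dropoff_latitude`, `dropoff_longitude`, `pickup_datetime`): the code below computes a straight-line trip distance with the haversine formula and extracts the hour and weekday from the pickup timestamp. The source does not say how distance was computed, so the haversine choice, the timestamp format, and the function names are all assumptions.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def time_features(pickup_datetime, fmt="%Y-%m-%d %H:%M:%S"):
    """Extract hour of day and weekday (Monday = 0) from a timestamp string.

    The default format is an assumption; adjust it to match the dataset.
    """
    ts = datetime.strptime(pickup_datetime, fmt)
    return {"hour": ts.hour, "weekday": ts.weekday()}


# Hypothetical trip from JFK Airport to Times Square (approximate coordinates).
trip = {
    "pickup_latitude": 40.6413,
    "pickup_longitude": -73.7781,
    "dropoff_latitude": 40.7580,
    "dropoff_longitude": -73.9855,
    "pickup_datetime": "2024-01-15 08:30:00",
}

distance_km = haversine_km(
    trip["pickup_latitude"], trip["pickup_longitude"],
    trip["dropoff_latitude"], trip["dropoff_longitude"],
)
features = {"distance_km": distance_km, **time_features(trip["pickup_datetime"])}
print(features)
```

Haversine distance is a straight-line measure, so it underestimates the distance actually driven on the street grid. It is still a common first feature because it needs only the four coordinate columns.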
