# Car Recommendation Engine - Python Demo

This notebook demonstrates how to use the car recommendation engine that imports functionality from `notebooks/rec_engine.ipynb` and uses data from `data/data.csv`.

In [1]:
# Import the car recommendation system
from car_recommendation_system import CarRecommendationEngine, create_recommendation_system, run_example_recommendations

import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

## Quick Start - Run All Examples

This will load the data and run multiple example scenarios.

In [2]:
# Run comprehensive examples (this will show budget, performance, and luxury recommendations)
rec_engine = run_example_recommendations('data/data.csv')

Car Recommendation Engine - Example Usage
Loading original data from data/data.csv...
Original dataset shape: (11914, 16)
Vehicles dataset not found at vehicles_dataset.csv
Combined dataset shape: (11914, 16)
Estimating HP for 69 vehicles...
Selected 16 features for recommendation engine
Data preprocessing completed successfully!
Final dataset shape: (11914, 19)
Feature matrix shape: (11914, 122)
Numerical features used: ['Year', 'Engine HP', 'Engine Cylinders', 'highway MPG', 'city mpg', 'Efficiency', 'Age', 'MSRP', 'Number of Doors', 'Price_per_HP', 'price_to_efficiency']
Categorical features used: ['Market Category', 'Vehicle Size', 'Vehicle Style', 'Transmission Type', 'Driven_Wheels', 'Engine Fuel Type']

1. Dataset Information:
Available Features:

Numerical Features:
 Year: 1990.00 - 2017.00 (avg: 2010.38)
 Engine HP: 55.00 - 1001.00 (avg: 248.70)
 Engine Cylinders: 0.00 - 16.00 (avg: 5.63)
 highway MPG: 12.00 - 354.00 (avg: 26.64)
 city mpg: 7.00 - 137.00 (avg: 19.73)
 Efficien

## Manual Setup and Usage

If you prefer to set up the system manually:

In [3]:
# Manual initialization
rec_engine = create_recommendation_system('data/data.csv', 'data/vehicles_dataset.csv')

if rec_engine:
    print("Car Recommendation Engine initialized successfully!")
    print(f"Dataset contains {len(rec_engine.df_rec)} cars")
else:
    print("Failed to initialize. Check that 'data/data.csv' exists.")

Loading original data from data/data.csv...
Original dataset shape: (11914, 16)
Loading vehicles dataset: (1002, 18)
Processed vehicles dataset: (1002, 16)
Combined dataset shape: (12916, 16)
Estimating HP for 944 vehicles...
Selected 16 features for recommendation engine
Data preprocessing completed successfully!
Final dataset shape: (12916, 19)
Feature matrix shape: (12916, 138)
Numerical features used: ['Year', 'Engine HP', 'Engine Cylinders', 'highway MPG', 'city mpg', 'Efficiency', 'Age', 'MSRP', 'Number of Doors', 'Price_per_HP', 'price_to_efficiency']
Categorical features used: ['Market Category', 'Vehicle Size', 'Vehicle Style', 'Transmission Type', 'Driven_Wheels', 'Engine Fuel Type']
Car Recommendation Engine initialized successfully!
Dataset contains 12916 cars


In [4]:
# Display dataset information
if rec_engine:
    rec_engine.get_feature_info()

Available Features:

Numerical Features:
 Year: 1990.00 - 2025.00 (avg: 2011.43)
 Engine HP: 55.00 - 1001.00 (avg: 243.80)
 Engine Cylinders: 0.00 - 16.00 (avg: 5.57)
 highway MPG: 0.00 - 9711.00 (avg: 29.81)
 city mpg: 0.00 - 8254.35 (avg: 22.65)
 Efficiency: 0.00 - 4368.77 (avg: 13.88)
 Age: 0.00 - 35.00 (avg: 13.57)
 MSRP: 2000.00 - 2065902.00 (avg: 41307.84)
 Number of Doors: 2.00 - 5.00 (avg: 3.48)
 Price_per_HP: 7.14 - 2307.69 (avg: 153.18)
 price_to_efficiency: 0.00 - 1207.94 (avg: 12.75)

Categorical Features:
 Market Category: 87 unique values
 Options: Factory Tuner,Luxury,High-Performance, Luxury,Performance, Luxury,High-Performance, Luxury, Performance...
 Vehicle Size: 3 unique values
 Options: Compact, Midsize, Large
 Vehicle Style: 16 unique values
 Options: Coupe, Convertible, Sedan, Wagon, 4dr Hatchback...
 Transmission Type: 5 unique values
 Options: MANUAL, AUTOMATIC, AUTOMATED_MANUAL, DIRECT_DRIVE, UNKNOWN
 Driven_Wheels: 4 unique values
 Options: rear wheel drive, 

In [5]:
# Show some sample cars from the dataset
if rec_engine:
    rec_engine.display_sample_data(5)

Sample data (5 random cars):

Audi R8 (2015)
 Price: $153,900.0
 Engine: 525.0 HP, 10.0 cylinders
 MPG: 19.0 highway, 12.0 city
 Category: Luxury,High-Performance
 Size: Compact

Dodge Viper (2017)
 Price: $87,895.0
 Engine: 645.0 HP, 10.0 cylinders
 MPG: 19.0 highway, 12.0 city
 Category: Exotic,High-Performance
 Size: Compact

Nissan Truck (1997)
 Price: $2,837.0
 Engine: 134.0 HP, 4.0 cylinders
 MPG: 23.0 highway, 19.0 city
 Category: Unknown
 Size: Compact

Ford Escape (2024)
 Price: $31,985.0
 Engine: 100.0 HP, 3.0 cylinders
 MPG: 2.0 highway, 1.7 city
 Category: Unknown
 Size: Midsize

GMC Sierra 1500 Classic (2007)
 Price: $21,465.0
 Engine: 195.0 HP, 6.0 cylinders
 MPG: 20.0 highway, 14.0 city
 Category: Flex Fuel
 Size: Large


## Custom Recommendations

Use the exact same function signature as in the original notebook:

In [6]:
# Example 1: Using the exact notebook function signature
user_preferences = {
    'Vehicle Style': '4dr Hatchback',
    'Vehicle Size': 'Compact',
    'Engine HP': 100,
    'Year': 2015,
    'Transmission Type': 'Automatic',
    'Market Category': 'Hatchback,Hybrid'
}

if rec_engine:
    print("Using exact notebook function signature:")
    recommendations = rec_engine.get_recommendations_by_preference(
        user_preferences, 
        rec_engine.preprocessor, 
        rec_engine.feature_matrix, 
        rec_engine.df_rec, 
        top_n=10
    )
    
    print(f"\nFound {len(recommendations)} recommendations:")
    display(recommendations)

Using exact notebook function signature:

Found 10 recommendations:


Unnamed: 0,Make,Model,Year,Market Category,Transmission Type,Vehicle Size,Engine HP,Engine Fuel Type,MSRP,Efficiency,Price_per_HP,price_to_efficiency,similarity_score
7692,Toyota,Prius,2015,"Hatchback,Hybrid",AUTOMATIC,Compact,134.0,regular unleaded,30005.0,36.940299,223.91791,12.311381,-0.232206
7694,Toyota,Prius,2015,"Hatchback,Hybrid",AUTOMATIC,Compact,134.0,regular unleaded,28435.0,36.940299,212.201493,12.991137,-0.304937
7677,Toyota,Prius Prime,2017,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,33100.0,44.628099,273.553719,13.482809,-0.665999
7695,Toyota,Prius,2015,"Hatchback,Hybrid",AUTOMATIC,Compact,134.0,regular unleaded,26985.0,36.940299,201.380597,13.689197,-0.932834
7700,Toyota,Prius,2016,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,30000.0,42.975207,247.933884,14.325069,-1.518764
7704,Toyota,Prius,2017,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,30015.0,42.975207,248.057851,14.31791,-1.536887
7691,Toyota,Prius,2015,"Hatchback,Hybrid",AUTOMATIC,Compact,134.0,regular unleaded,25765.0,36.940299,192.276119,14.337395,-1.617196
7705,Toyota,Prius,2017,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,29135.0,42.975207,240.785124,14.750371,-1.961519
7699,Toyota,Prius,2016,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,28650.0,42.975207,236.77686,15.000072,-2.172215
7702,Toyota,Prius,2016,"Hatchback,Hybrid",AUTOMATIC,Compact,121.0,regular unleaded,28100.0,42.975207,232.231405,15.293668,-2.472332


In [7]:
# Example 2: Simplified usage (parameters auto-filled)
my_preferences = {
    'Make': 'Toyota',
    'Vehicle Size': 'Compact',
    'MSRP': 30000,  # Maximum budget
    'highway MPG': 35,  # Minimum highway MPG
    'Transmission Type': 'AUTOMATIC'
}

if rec_engine:
    print("Simplified usage:")
    print(f"Preferences: {my_preferences}")
    
    simple_recs = rec_engine.get_recommendations_by_preference(my_preferences, top_n=5)
    
    if not simple_recs.empty:
        print(f"\nTop 5 matches:")
        display(simple_recs)
    else:
        print("No matches found. Try adjusting preferences.")

Simplified usage:
Preferences: {'Make': 'Toyota', 'Vehicle Size': 'Compact', 'MSRP': 30000, 'highway MPG': 35, 'Transmission Type': 'AUTOMATIC'}

Top 5 matches:


Unnamed: 0,Make,Model,Year,Market Category,Transmission Type,Vehicle Size,Engine HP,Engine Fuel Type,MSRP,Efficiency,Price_per_HP,price_to_efficiency,similarity_score
5428,Volkswagen,Golf SportWagen,2015,Unknown,AUTOMATIC,Compact,170.0,regular unleaded,29345.0,17.647059,172.617647,6.013651,-0.031154
5436,Volkswagen,Golf SportWagen,2016,Unknown,AUTOMATIC,Compact,170.0,regular unleaded,29385.0,17.647059,172.852941,6.005465,-0.069759
5435,Volkswagen,Golf SportWagen,2016,Unknown,AUTOMATIC,Compact,170.0,regular unleaded,27025.0,17.647059,158.970588,6.529902,-0.369739
5458,Volkswagen,Golf,2016,Hatchback,AUTOMATIC,Compact,170.0,regular unleaded,27425.0,17.941176,161.323529,6.541906,-0.402996
5452,Volkswagen,Golf,2015,Hatchback,AUTOMATIC,Compact,170.0,regular unleaded,27395.0,17.941176,161.147059,6.54907,-0.40503


## Interactive Widget Interface

Create an interactive interface with sliders and dropdowns (requires ipywidgets):

In [8]:
# Create interactive interface (only works in Jupyter with ipywidgets installed)
if rec_engine:
    try:
        rec_engine.create_interactive_interface()
    except ImportError:
        print("Interactive interface requires ipywidgets. Install with: pip install ipywidgets")
    except Exception as e:
        print(f"Interactive interface not available: {e}")

Creating interactive interface...
Note: This requires Jupyter notebook with ipywidgets installed


interactive(children=(IntSlider(value=1990, continuous_update=False, description='Year:', max=2025, min=1990, …

## Advanced Usage Examples

Different types of car buyers and their preferences:

In [9]:
# Family buyer preferences
family_preferences = {
    'Vehicle Size': 'Midsize',
    'Number of Doors': 4,
    'MSRP': 35000,
    'highway MPG': 25,
    'Vehicle Style': 'SUV'
}

if rec_engine:
    print("Family-Friendly Cars:")
    family_recs = rec_engine.get_recommendations_by_preference(family_preferences, top_n=5)
    
    if not family_recs.empty:
        for idx, car in family_recs.iterrows():
            mpg = car.get('highway MPG', 'N/A')
            price = car.get('MSRP', 0)
            print(f"  • {car['Make']} {car['Model']} ({car.get('Year', 'N/A')}) - ${price:,} - {mpg} hwy MPG")

Family-Friendly Cars:
  • Ford Edge (2016) - $33,785.0 - N/A hwy MPG
  • Subaru Forester (2017) - $34,295.0 - N/A hwy MPG
  • Ford Edge (2015) - $33,495.0 - N/A hwy MPG
  • Nissan Murano (2015) - $32,620.0 - N/A hwy MPG
  • Dodge Durango (2024) - $31,200.0 - N/A hwy MPG


In [10]:
# Eco-friendly buyer preferences
eco_preferences = {
    'highway MPG': 40,
    'Market Category': 'Hybrid',
    'Vehicle Size': 'Compact'
}

if rec_engine:
    print("Eco-Friendly Cars:")
    eco_recs = rec_engine.get_recommendations_by_preference(eco_preferences, top_n=5)
    
    if not eco_recs.empty:
        for idx, car in eco_recs.iterrows():
            mpg = car.get('highway MPG', 'N/A')
            efficiency = car.get('Efficiency', 'N/A')
            print(f"  • {car['Make']} {car['Model']} - {mpg} hwy MPG - Efficiency: {efficiency:.2f}" if isinstance(efficiency, (int, float)) else f"  • {car['Make']} {car['Model']} - {mpg} hwy MPG")

Eco-Friendly Cars:
  • Lexus CT 200h - N/A hwy MPG - Efficiency: 30.97
  • Lexus CT 200h - N/A hwy MPG - Efficiency: 30.97
  • Lexus CT 200h - N/A hwy MPG - Efficiency: 30.97
  • Toyota Prius - N/A hwy MPG - Efficiency: 36.94
  • Toyota Prius - N/A hwy MPG - Efficiency: 36.94


In [11]:
# Sports car buyer preferences
sports_preferences = {
    'Engine HP': 300,
    'Market Category': 'Performance',
    'Vehicle Style': 'Coupe',
    'Transmission Type': 'MANUAL'
}

if rec_engine:
    print("Sports Cars:")
    sports_recs = rec_engine.get_recommendations_by_preference(sports_preferences, top_n=5)
    
    if not sports_recs.empty:
        for idx, car in sports_recs.iterrows():
            hp = car.get('Engine HP', 'N/A')
            price = car.get('MSRP', 0)
            score = car.get('similarity_score', 0)
            print(f"  • {car['Make']} {car['Model']} - {hp} HP - ${price:,} - Score: {score:.3f}")

Sports Cars:
  • Ford Mustang - 310.0 HP - $29,645.0 - Score: -0.039
  • Ford Mustang - 310.0 HP - $29,300.0 - Score: -0.046
  • Ford Mustang - 310.0 HP - $29,645.0 - Score: -0.073
  • Toyota Supra - 320.0 HP - $22,001.0 - Score: -0.150
  • Chevrolet Camaro - 323.0 HP - $26,005.0 - Score: -0.162


## Data Export

Save your recommendations to CSV files:

In [12]:
""" # Save recommendations to CSV
if rec_engine:
    my_final_preferences = {
        'Vehicle Size': 'Compact',
        'MSRP': 30000,
        'highway MPG': 30,
        'Year': 2015
    }
    
    final_recs = rec_engine.get_recommendations_by_preference(my_final_preferences, top_n=10)
    
    if not final_recs.empty:
        # Save to CSV
        final_recs.to_csv('my_car_recommendations.csv', index=False)
        print(f"Saved {len(final_recs)} recommendations to 'my_car_recommendations.csv'")
        
        # Display the saved recommendations
        print("\nSaved recommendations:")
        display(final_recs)
    else:
        print("No recommendations to save.") """

' # Save recommendations to CSV\nif rec_engine:\n    my_final_preferences = {\n        \'Vehicle Size\': \'Compact\',\n        \'MSRP\': 30000,\n        \'highway MPG\': 30,\n        \'Year\': 2015\n    }\n    \n    final_recs = rec_engine.get_recommendations_by_preference(my_final_preferences, top_n=10)\n    \n    if not final_recs.empty:\n        # Save to CSV\n        final_recs.to_csv(\'my_car_recommendations.csv\', index=False)\n        print(f"Saved {len(final_recs)} recommendations to \'my_car_recommendations.csv\'")\n        \n        # Display the saved recommendations\n        print("\nSaved recommendations:")\n        display(final_recs)\n    else:\n        print("No recommendations to save.") '

## Access Raw Data

Access the underlying preprocessed data for further analysis:

In [13]:
# Access the raw data
if rec_engine:
    print("Dataset Overview:")
    print(f"- Original data shape: {rec_engine.df_original.shape if rec_engine.df_original is not None else 'N/A'}")
    print(f"- Processed data shape: {rec_engine.df_rec.shape}")
    print(f"- Feature matrix shape: {rec_engine.feature_matrix.shape}")
    
    print("\nSample of processed data:")
    display(rec_engine.df_rec.head())
    
    print("\nFeature summary:")
    print(f"Numerical features: {len(rec_engine.numerical_features)}")
    print(f"Categorical features: {len(rec_engine.categorical_features)}")

Dataset Overview:
- Original data shape: (12916, 19)
- Processed data shape: (12916, 19)
- Feature matrix shape: (12916, 138)

Sample of processed data:


Unnamed: 0,Year,Engine HP,Engine Cylinders,highway MPG,city mpg,Efficiency,Age,MSRP,Number of Doors,Price_per_HP,Market Category,Vehicle Size,Vehicle Style,Transmission Type,Driven_Wheels,Engine Fuel Type,Make,Model,price_to_efficiency
0,2011,335.0,6.0,26.0,19.0,6.716418,14,46135.0,2.0,137.716418,"Factory Tuner,Luxury,High-Performance",Compact,Coupe,MANUAL,rear wheel drive,premium unleaded (required),BMW,1 Series M,1.455818
1,2011,300.0,6.0,28.0,19.0,7.833333,14,40650.0,2.0,135.5,"Luxury,Performance",Compact,Convertible,MANUAL,rear wheel drive,premium unleaded (required),BMW,1 Series,1.927019
2,2011,300.0,6.0,28.0,20.0,8.0,14,36350.0,2.0,121.166667,"Luxury,High-Performance",Compact,Coupe,MANUAL,rear wheel drive,premium unleaded (required),BMW,1 Series,2.200825
3,2011,230.0,6.0,28.0,18.0,10.0,14,29450.0,2.0,128.043478,"Luxury,Performance",Compact,Coupe,MANUAL,rear wheel drive,premium unleaded (required),BMW,1 Series,3.395586
4,2011,230.0,6.0,28.0,18.0,10.0,14,34500.0,2.0,150.0,Luxury,Compact,Convertible,MANUAL,rear wheel drive,premium unleaded (required),BMW,1 Series,2.898551



Feature summary:
Numerical features: 11
Categorical features: 6
