FareSpot is a Django web application that predicts taxi fares using a LightGBM regression model. It estimates a fare from trip details such as pickup and dropoff locations, time of day, and passenger count.
One of the biggest challenges in this project was dealing with problematic real-world data.
The pickup and dropoff heatmaps (`Heatmaps/Pickup_Heatmap.png` and `Heatmaps/Dropoff_Heatmap.png`) illustrate one of our major data quality issues: many coordinates place pickups and dropoffs in impossible locations such as oceans and lakes, even though the dataset is from the USA.
Other significant data challenges included:
- Extreme outliers in fare amounts
- Missing values across multiple features
- Need for sophisticated scaling and encoding techniques
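The issues listed above could be handled with a row-level filter along the following lines. This is a minimal sketch, not the project's actual pipeline: the field names (`fare_amount`, `pickup_latitude`, and so on), the bounding box, and the fare limits are all illustrative assumptions. A bounding box alone will not catch points that fall on lakes or coastal water inside the box; that would need a polygon or land-mask check.

```python
from typing import Optional

# Hypothetical bounding box roughly covering the contiguous USA (assumed values).
LAT_MIN, LAT_MAX = 24.0, 50.0
LON_MIN, LON_MAX = -125.0, -66.0

# Hypothetical fare limits used to drop extreme outliers (assumed values).
FARE_MIN, FARE_MAX = 0.0, 500.0

# Assumed field names for the two coordinate pairs of a trip.
COORD_KEYS = (
    ("pickup_latitude", "pickup_longitude"),
    ("dropoff_latitude", "dropoff_longitude"),
)


def in_bounds(lat: Optional[float], lon: Optional[float]) -> bool:
    """Return True when a coordinate pair is present and inside the box."""
    if lat is None or lon is None:
        return False
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def is_clean(trip: dict) -> bool:
    """Keep a trip only if its fare and both coordinate pairs look plausible."""
    fare = trip.get("fare_amount")
    if fare is None or not (FARE_MIN < fare <= FARE_MAX):
        return False
    return all(in_bounds(trip.get(lat), trip.get(lon)) for lat, lon in COORD_KEYS)


def clean_trips(trips: list) -> list:
    """Drop rows with missing values, out-of-box coordinates, or outlier fares."""
    return [trip for trip in trips if is_clean(trip)]
```

Here rows with missing values are dropped rather than imputed, which is the simplest choice for a sketch; imputation may be preferable when missing values are common.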
After data cleaning and preprocessing, the model achieved the following performance metrics:
| Metric | Value |
|---|---|
| R² Score | 0.7846 |
| MSE | 0.0429 |
| RMSE | 0.2070 |
| MAE | 0.1447 |
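For reference, the four metrics in the table can be computed from true and predicted values as sketched below. This is a plain-Python illustration of the standard definitions, not the evaluation code from the notebooks; the function name `regression_metrics` is hypothetical. RMSE is the square root of MSE, which matches the table (√0.0429 ≈ 0.207). MSE, RMSE, and MAE are expressed in the units of the target as it was evaluated, which this README does not state.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R², MSE, RMSE, and MAE for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)      # residual sum of squares
    mse = ss_res / n                         # mean squared error
    mae = sum(abs(e) for e in errors) / n    # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined when the targets are constant; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")

    return {"r2": r2, "mse": mse, "rmse": math.sqrt(mse), "mae": mae}
```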
- Accurate Fare Prediction: Uses machine learning to provide reliable fare estimates
- User-Friendly Interface: Simple web interface for entering trip details
- Geospatial Analysis: Incorporates location data for improved predictions
- Time-Based Factors: Accounts for time of day and date in predictions
- Robust Model: Handles various input combinations with consistent results
FareSpot/
├─ fare_prediction/ # Main application directory
│ ├─ models/ # Trained ML models
│ │ ├─ lgb_model.pkl # LightGBM model
│ │ └─ scaler.pkl # StandardScaler preprocessor
│ ├─ templates/ # HTML templates
│ ├─ views.py # View controllers
│ └─ urls.py # URL routing
├─ FareSpot/ # Django project settings
├─ Heatmaps/ # Geospatial visualization
│ ├─ Pickup_Heatmap.png # Pickup location density map
│ └─ Dropoff_Heatmap.png # Dropoff location density map
├─ Fare_Spot.png # Project logo
├─ fare_prediction_pipeline.ipynb # Model training notebook
├─ preprocessing_pipeline.ipynb # Data preprocessing notebook
└─ manage.py # Django management script
- Django: Web framework for building the application interface
- LightGBM Regressor: Gradient boosting framework for the prediction model
- StandardScaler: Feature preprocessing for improved model performance
- Jupyter Notebooks: For data exploration and model development
- Matplotlib/Seaborn: For generating geospatial heatmaps
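At prediction time, the saved scaler and model would typically be combined as in the sketch below. Several things here are assumptions: that the `.pkl` files were written with Python's `pickle` module (they may instead have been saved with `joblib`), that the scaler is applied to every input feature, and that the caller supplies features in the training column order, which this README does not list. The helper names `load_artifacts` and `predict_fare` are hypothetical. If the target was transformed during training, the prediction would also need the inverse transform.

```python
import pickle
from pathlib import Path

# Paths taken from the project structure; adjust if the layout differs.
MODEL_DIR = Path("fare_prediction") / "models"


def load_artifacts(model_dir=MODEL_DIR):
    """Load the pickled scaler and model. Only unpickle files you trust."""
    model_dir = Path(model_dir)
    with open(model_dir / "scaler.pkl", "rb") as f:
        scaler = pickle.load(f)
    with open(model_dir / "lgb_model.pkl", "rb") as f:
        model = pickle.load(f)
    return scaler, model


def predict_fare(scaler, model, features):
    """Scale one feature row and return the model's prediction as a float.

    `features` must follow the column order used during training
    (an assumption here, since that order is not documented).
    """
    scaled = scaler.transform([features])  # scalers expect a 2-D input
    return float(model.predict(scaled)[0])
```

Loading the artifacts once at startup and reusing them across requests avoids re-reading the files on every prediction.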
- Data Collection: Gathered extensive taxi trip data
- Exploratory Data Analysis: Identified patterns and anomalies
- Data Cleaning: Addressed geolocation inconsistencies and outliers
- Feature Engineering: Created relevant time-based and distance features
- Preprocessing: Applied scaling and encoding techniques
- Model Selection: Evaluated multiple regression models
- Hyperparameter Tuning: Optimized LightGBM parameters
- Model Validation: Ensured robust performance across various scenarios
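As an illustration of the feature engineering step, the sketch below derives a distance feature and a few time-based features from raw trip inputs. It assumes the distance feature is a great-circle (haversine) distance and that hour, weekday, month, and a weekend flag are useful time features; the notebooks may use different features, and the function names are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def time_features(pickup: datetime) -> dict:
    """Extract simple calendar features from the pickup timestamp."""
    weekday = pickup.weekday()  # Monday = 0, Sunday = 6
    return {
        "hour": pickup.hour,
        "day_of_week": weekday,
        "month": pickup.month,
        "is_weekend": int(weekday >= 5),
    }
```

Haversine distance ignores the street network, so it underestimates actual driving distance, but it is cheap to compute and needs no external routing data.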
- Clone the repository:

      git clone https://github.com/Spafic/FareSpot.git
      cd FareSpot

- Create a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Run migrations:

      python manage.py migrate

- Start the server:

      python manage.py runserver

- Access the application: open your browser and go to http://127.0.0.1:8000/
- Navigate to the home page
- Enter trip details:
  - Pickup location (latitude/longitude)
  - Dropoff location (latitude/longitude)
  - Time and date
  - Number of passengers
- Click "Predict Fare"
- View the predicted fare amount
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License.
For any inquiries, please contact:


