mahadev19/Uber-Data-Science

🚖 Uber Data Science Analysis

This project explores Uber ride booking data to gain insights into ride patterns, user behavior, and trends using Python data science libraries. It also applies Machine Learning techniques for predictive modeling. The dataset is sourced from Kaggle and analyzed step by step in a Jupyter Notebook.


📂 Dataset


🛠️ Technologies & Libraries

  • Python
  • Pandas – Data manipulation
  • NumPy – Numerical computations
  • Matplotlib & Seaborn – Data visualization
  • Scikit-learn – Machine Learning models
  • KaggleHub – To download dataset

📊 Analysis Performed

1. Data Preprocessing

  • Loaded CSV data
  • Handled missing values
  • Converted datetime columns into features (day, month, weekday, hour)
  • Encoded categorical variables
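
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`booking_datetime`, `booking_status`, `payment_method`) and the sample rows are hypothetical, and the real Kaggle dataset may use different names and need different missing-value handling.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns may differ.
raw = pd.DataFrame({
    "booking_datetime": ["2024-01-05 08:15:00", "2024-02-17 18:40:00",
                         None, "2024-03-03 23:05:00"],
    "booking_status": ["Completed", "Cancelled", "Completed", "Incomplete"],
    "payment_method": ["Card", None, "Cash", "Card"],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Parse timestamps, handle missing values, and derive time features."""
    out = df.copy()

    # Parse the timestamp column; unparseable or missing values become NaT.
    out["booking_datetime"] = pd.to_datetime(out["booking_datetime"], errors="coerce")

    # Drop rows without a usable timestamp (one possible policy among several).
    out = out.dropna(subset=["booking_datetime"])

    # Fill missing categorical values with an explicit placeholder.
    out["payment_method"] = out["payment_method"].fillna("Unknown")

    # Derive day, month, weekday (Monday=0), and hour from the timestamp.
    dt = out["booking_datetime"].dt
    out["day"] = dt.day
    out["month"] = dt.month
    out["weekday"] = dt.weekday
    out["hour"] = dt.hour

    # One-hot encode the categorical payment column.
    return pd.get_dummies(out, columns=["payment_method"], prefix="pay")


features = preprocess(raw)
print(features[["day", "month", "weekday", "hour"]])
```

Dropping rows with missing timestamps keeps the sketch short; imputing or flagging them instead may suit the real data better.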

2. Exploratory Data Analysis (EDA)

  • Ride distribution by hours, weekdays, and months
  • Pickup vs Drop location trends
  • Ride status distribution (Completed, Cancelled, Incomplete)
  • Payment method preferences
  • Correlation heatmaps and count plots
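
As a rough illustration of the distributions listed above, the sketch below builds the underlying count tables with Pandas. The `hour` and `booking_status` column names and the sample values are assumptions made for this example, not taken from the dataset.

```python
import pandas as pd

# Hypothetical, already-preprocessed rides with an extracted "hour" feature.
rides = pd.DataFrame({
    "hour": [8, 8, 18, 18, 18, 23],
    "booking_status": ["Completed", "Cancelled", "Completed",
                       "Completed", "Incomplete", "Completed"],
})

# Number of rides per hour of day, ordered by hour.
rides_per_hour = rides["hour"].value_counts().sort_index()

# Share of each ride status across all rides.
status_share = rides["booking_status"].value_counts(normalize=True)

# Ride status counts broken down by hour.
status_by_hour = pd.crosstab(rides["hour"], rides["booking_status"])

print(rides_per_hour)
print(status_share)
print(status_by_hour)
```

Tables like these can then be passed to Matplotlib or Seaborn for plotting, for example with `rides_per_hour.plot(kind="bar")`.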

3. Machine Learning Models

  • Feature Engineering: Extracted time-based and categorical features
  • Classification Models (predicting ride status):
    • Logistic Regression
    • Random Forest
    • Decision Tree
  • Clustering: KMeans for grouping rides by locations and patterns
  • Evaluation Metrics: Accuracy, Precision, Recall, F1-score
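
The classification and evaluation steps might look something like the sketch below. It trains a Random Forest on synthetic data, since the real features and labels are not shown here: the `hour` and `weekday` features, the "late-night rides are cancelled more often" rule, and the 10% label noise are all invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered features (assumption, not real data).
rng = np.random.default_rng(0)
n = 400
hour = rng.integers(0, 24, size=n)
weekday = rng.integers(0, 7, size=n)

# Invented rule: late-night rides are cancelled (label 1), with 10% label noise.
cancelled = ((hour >= 22) | (hour <= 4)).astype(int)
flip = rng.random(n) < 0.1
y = np.where(flip, 1 - cancelled, cancelled)
X = np.column_stack([hour, weekday])

# Hold out 25% of the rides for evaluation, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy, plus precision, recall, and F1 for the positive (cancelled) class.
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Swapping in `LogisticRegression` or `DecisionTreeClassifier` only changes the model line; the split and the metric calls stay the same, which makes the three classifiers easy to compare.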

🚀 How to Run

  1. Clone this repository:
    git clone https://github.com/mahadev19/Uber-Data-Science.git

About

End to End Project On Uber
