praneeth7-bot/Vehicle-Recall-Analysis


NHTSA Vehicle Field Reliability Analysis

A data analysis project examining 6,000 vehicle safety complaints from the National Highway Traffic Safety Administration (NHTSA) to identify field failure patterns, high-risk components, and manufacturer safety trends.


Project Overview

This project simulates the kind of analysis performed by field reliability data teams — specifically the work of identifying which vehicle components fail most frequently, at what mileage, and under what conditions. The dataset mirrors the structure of real NHTSA complaint data available at nhtsa.gov.

Key questions answered:

  • Which vehicle components generate the most field complaints?
  • Which manufacturers have the highest crash-related failure rates?
  • How have complaint volumes trended year-over-year (2018–2023)?
  • What is the average mileage at first failure per component?
  • Which model + component combinations carry the highest crash risk?
  • Which states file the most complaints?
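As an illustration of how one of these questions could be answered, the sketch below computes a crash rate per manufacturer with GROUP BY and HAVING. The table name `complaints`, the columns `manufacturer`, `component`, and `crash`, the minimum-volume threshold, and the toy rows are all assumptions made for this example; they are not taken from the project's notebooks or data.

```python
import sqlite3

# Toy rows under a hypothetical schema: (manufacturer, component, crash flag).
# These are illustrative values, not records from the project dataset.
rows = [
    ("Maker A", "Brakes", 1),
    ("Maker A", "Electrical System", 0),
    ("Maker A", "Airbag", 1),
    ("Maker B", "Electrical System", 0),
    ("Maker B", "Brakes", 0),
    ("Maker B", "Airbag", 1),
    ("Maker C", "Brakes", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE complaints (manufacturer TEXT, component TEXT, crash INTEGER)"
)
conn.executemany("INSERT INTO complaints VALUES (?, ?, ?)", rows)

# Crash rate = crash-flagged complaints / all complaints, per manufacturer.
# HAVING drops manufacturers with too few complaints to give a stable rate
# (the threshold of 2 is an arbitrary choice for this toy data).
query = """
SELECT manufacturer,
       COUNT(*) AS complaints,
       ROUND(100.0 * SUM(crash) / COUNT(*), 1) AS crash_rate_pct
FROM complaints
GROUP BY manufacturer
HAVING COUNT(*) >= 2
ORDER BY crash_rate_pct DESC, manufacturer
"""
crash_rates = conn.execute(query).fetchall()
conn.close()

for manufacturer, complaints, rate in crash_rates:
    print(f"{manufacturer}: {rate}% of {complaints} complaints involved a crash")
```

Filtering on a minimum complaint count keeps a manufacturer with a single crash-related complaint from appearing as a 100% crash rate.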

Key Findings

| Finding | Detail |
|---|---|
| Top failure component | Electrical System (18% of all complaints) |
| Highest crash-rate components | Airbag and Brakes |
| Earliest average failure | Battery system (~48k miles on average) |
| Peak complaint period | 2021–2022 |
| Highest-volume states | CA, TX, FL (top 3 combined account for ~30% of all complaints) |
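Figures such as a component's share of complaints and its average mileage at failure could be derived along the following lines. This is a minimal sketch on toy data; the table name `complaints` and the columns `component` and `mileage` are hypothetical, and the numbers are not the project's results.

```python
import sqlite3

# Toy rows under a hypothetical schema: (component, mileage at failure).
rows = [
    ("Electrical System", 60000),
    ("Electrical System", 80000),
    ("Battery", 40000),
    ("Brakes", 90000),
    ("Brakes", 70000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (component TEXT, mileage INTEGER)")
conn.executemany("INSERT INTO complaints VALUES (?, ?)", rows)

# share_pct: this component's complaints as a percentage of all complaints.
# avg_mileage: mean mileage at failure; sorting ascending puts the
# earliest-failing component first.
query = """
SELECT component,
       COUNT(*) AS complaints,
       100.0 * COUNT(*) / (SELECT COUNT(*) FROM complaints) AS share_pct,
       AVG(mileage) AS avg_mileage
FROM complaints
GROUP BY component
ORDER BY avg_mileage
"""
summary = conn.execute(query).fetchall()
conn.close()

for component, complaints, share_pct, avg_mileage in summary:
    print(f"{component}: {share_pct:.0f}% of complaints, avg {avg_mileage:,.0f} miles")
```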

Charts

Top 10 Components by Complaint Volume

Chart 1

Year-over-Year Complaint Volume vs Crash Events

Chart 2

Average Mileage at Failure by Component

Chart 3

Crash Rate by Manufacturer

Chart 4


Project Structure

vehicle-recall-analysis/
├── README.md
├── data/
│   ├── nhtsa_complaints_raw.csv        ← original dataset (6,000 records)
│   ├── nhtsa_complaints_clean.csv      ← cleaned & validated dataset
│   ├── q1_components.csv               ← component complaint counts
│   ├── q2_manufacturers.csv            ← manufacturer crash rates
│   ├── q3_yearly_trend.csv             ← year-over-year trends
│   ├── q4_mileage.csv                  ← mileage at failure by component
│   ├── q5_high_risk.csv                ← high-risk model/component combos
│   └── q6_states.csv                   ← state-level complaint volume
├── notebooks/
│   ├── 01_data_cleaning.ipynb          ← data loading, QA, validation
│   ├── 02_sql_analysis.ipynb           ← SQL queries (CTEs, window functions)
│   └── 03_visualizations.ipynb         ← Matplotlib charts + findings
└── charts/
    ├── chart1_top_components.png
    ├── chart2_yearly_trend.png
    ├── chart3_mileage_at_failure.png
    ├── chart4_manufacturer_crash_rate.png
    └── chart5_top_states.png
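The QA and validation step listed for 01_data_cleaning.ipynb might resemble the sketch below, which covers a null check, an outlier flag, and a simple logic check. The column names, the 0 to 500,000 mile range, and the one-year model-year allowance are assumptions chosen for illustration; the notebook's actual rules may differ.

```python
import pandas as pd

# Toy frame with hypothetical column names; the real CSV schema may differ.
df = pd.DataFrame(
    {
        "component": ["Brakes", "Airbag", None, "Battery"],
        "mileage": [52000, -10, 61000, 900000],
        "model_year": [2019, 2020, 2021, 2022],
        "complaint_year": [2021, 2018, 2022, 2023],
    }
)

# Null check: number of missing values in each column.
null_counts = df.isna().sum()

# Outlier flag: mileage outside an assumed plausible range (inclusive bounds).
df["mileage_outlier"] = ~df["mileage"].between(0, 500_000)

# Logic validation: a complaint filed more than one year before the model
# year is treated as a data-entry error (assumed rule).
df["date_logic_error"] = df["complaint_year"] < df["model_year"] - 1

# Keep only rows that pass every check.
clean = df[
    df["component"].notna() & ~df["mileage_outlier"] & ~df["date_logic_error"]
]

print(null_counts)
print(f"{len(clean)} of {len(df)} rows pass all checks")
```

Flagging rows in separate boolean columns, rather than dropping them immediately, keeps a record of why each row was excluded.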

Tools & Skills Demonstrated

| Tool | Usage |
|---|---|
| Python (Pandas) | Data loading, cleaning, transformation, validation |
| SQL (SQLite) | CTEs, window functions (RANK, LAG), GROUP BY, HAVING |
| Matplotlib | Bar charts, dual-axis plots, horizontal bar charts |
| Data QA | Null checks, outlier flagging, logic validation |
| Jupyter Notebooks | Reproducible, documented analysis workflow |
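To show how the SQL techniques named above fit together, here is a small sketch that combines a CTE with the LAG and RANK window functions to build a year-over-year trend. The table name, the `complaint_year` column, and the toy counts are assumptions for this example, not the queries in 02_sql_analysis.ipynb. Window functions require SQLite 3.25 or later.

```python
import sqlite3

# Toy complaint years under a hypothetical one-column schema; not project data.
years = [2019] * 2 + [2020] * 3 + [2021] * 5 + [2022] * 4

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE complaints (complaint_year INTEGER)")
conn.executemany("INSERT INTO complaints VALUES (?)", [(y,) for y in years])

# The CTE aggregates complaints per year. LAG then compares each year with
# the previous one, and RANK orders years by complaint volume.
query = """
WITH yearly AS (
    SELECT complaint_year, COUNT(*) AS complaints
    FROM complaints
    GROUP BY complaint_year
)
SELECT complaint_year,
       complaints,
       complaints - LAG(complaints) OVER (ORDER BY complaint_year) AS yoy_change,
       RANK() OVER (ORDER BY complaints DESC) AS volume_rank
FROM yearly
ORDER BY complaint_year
"""
trend = conn.execute(query).fetchall()
conn.close()

for year, complaints, yoy_change, volume_rank in trend:
    print(year, complaints, yoy_change, volume_rank)
```

The first year has no predecessor, so LAG returns NULL there and `yoy_change` comes back as `None` in Python.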

How to Run

  1. Clone this repository
  2. Open any notebook in Jupyter Lab or Google Colab
  3. Run cells top to bottom. The notebooks rely on pandas and Matplotlib, which are third-party packages (preinstalled on Google Colab) and must be installed for a local run; SQLite is available through Python's built-in sqlite3 module

git clone https://github.com/YOUR_USERNAME/vehicle-recall-analysis
cd vehicle-recall-analysis
jupyter notebook notebooks/01_data_cleaning.ipynb
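Before running the notebooks locally, a quick check like the one below can confirm that the libraries named in the tools list are importable. The package list is an assumption based on that list, not on a requirements file in the repository.

```python
import importlib.util

# Packages assumed from the tools list; sqlite3 ships with Python itself.
required = ["pandas", "matplotlib", "sqlite3"]

# find_spec returns None when a top-level package cannot be located.
availability = {
    name: importlib.util.find_spec(name) is not None for name in required
}
missing = [name for name, found in availability.items() if not found]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Anything reported as missing can usually be installed with `pip install pandas matplotlib`.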

Data Source

Dataset structure mirrors NHTSA vehicle safety complaints:
https://www.nhtsa.gov/vehicle-safety/complaints


Built as part of a data analytics portfolio focused on field reliability and vehicle quality analysis.
