
This project analyzes the 'Auto MPG' dataset to investigate the impact of automobile characteristics like vehicle weight on fuel efficiency, using Python tools such as Seaborn and Matplotlib for data visualization and analysis.


subbu112/Predicting-Fuel-Economy-Python


Predicting-Fuel-Economy-Python



Fuel Economy Analysis Project

Project Overview

This project explores how automobile characteristics relate to fuel efficiency, with a focus on how vehicle weight affects miles per gallon (MPG). Using the 'Auto MPG' dataset, the study combines exploratory data analysis, data cleaning, feature engineering, and regression modeling.

Dataset

The 'Auto MPG' dataset is included in this GitHub repository. It contains automobile attributes such as make, model, weight, horsepower, and MPG, and it is the foundation for the analysis and modeling.

Methodology

Data Preprocessing

The first phase of the project cleaned the data by handling missing values and outliers. Missing values, most notably in the 'horsepower' field, were imputed with the mean horsepower of the dataset so that the affected rows could be kept in the analysis.
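The imputation step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the tiny DataFrame is made-up sample data, the column names are assumed, and it assumes missing horsepower entries are marked with "?" as in the UCI distribution of Auto MPG (the copy in this repository may differ).

```python
import pandas as pd

# Hypothetical miniature sample; the real dataset has more rows and columns.
df = pd.DataFrame({
    "mpg": [18.0, 25.0, 21.0, 34.5],
    "weight": [3504, 2046, 2875, 2320],
    "horsepower": ["130", "?", "100", "?"],  # "?" assumed to mark missing values
})

# Convert to numeric; anything non-numeric (such as "?") becomes NaN.
df["horsepower"] = pd.to_numeric(df["horsepower"], errors="coerce")
n_missing = int(df["horsepower"].isna().sum())

# Fill the gaps with the mean of the observed horsepower values.
hp_mean = df["horsepower"].mean()
df["horsepower"] = df["horsepower"].fillna(hp_mean)
```

Mean imputation keeps every row but shrinks the variance of the imputed column, so it is worth checking how many values were filled (`n_missing`) before relying on horsepower as a predictor.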

Exploratory Data Analysis (EDA)

We explored the data with Pandas, Matplotlib, and Seaborn to understand its underlying patterns. Histograms, pair plots, and correlation matrices were used to examine the relationships between variables, with particular attention to the effect of weight on MPG.
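A correlation matrix of the kind mentioned above might be produced like this. The sketch uses a handful of made-up rows and assumed column names rather than the repository's data, and it selects a non-interactive Matplotlib backend so that it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 26.0, 31.0, 22.0, 36.0],
    "weight": [3504, 4341, 2372, 1985, 2833, 1800],
    "horsepower": [130, 198, 95, 67, 100, 58],
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()

# Draw the matrix as an annotated heatmap on a fixed -1..1 colour scale.
fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation matrix")
fig.tight_layout()
plt.close(fig)  # call fig.savefig(...) before this line to keep the image
```

On the full dataset, `sns.pairplot(df)` gives the matching pair-plot view of the same columns.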

Feature Engineering

To improve the predictive performance of the models, we engineered new features. These included polynomial features such as 'weight squared' to capture non-linear effects. Categorical variables such as 'origin' were one-hot encoded so that they could be used in the regression models.
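The two transformations described above could look roughly like this. The column names, the new `weight_sq` name, and the integer coding of 'origin' (1, 2, 3) are assumptions for illustration; dropping the first dummy column is a choice made in this sketch to avoid perfect collinearity with the intercept, not something the project is known to do.

```python
import pandas as pd

# Hypothetical sample rows; 'origin' is assumed to be an integer region code.
df = pd.DataFrame({
    "weight": [3504, 2046, 2875],
    "origin": [1, 3, 2],
})

# Polynomial feature: squared weight, to let a linear model bend with weight.
df["weight_sq"] = df["weight"] ** 2

# One-hot encode 'origin'; drop_first removes the redundant baseline column.
features = pd.get_dummies(
    df, columns=["origin"], prefix="origin", drop_first=True, dtype=int
)
```

With this encoding, the coefficients on `origin_2` and `origin_3` are read as differences from the dropped baseline category.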

Predictive Modeling

We applied multiple regression models to predict fuel efficiency:

  • Linear Regression: A simple starting model using only 'weight', which showed a strong inverse correlation with MPG.
  • Multiple Linear Regression: An extended model that adds further features to account for effects that weight alone does not explain.
  • Ridge Regression: A regularized model that addresses multicollinearity among the features and improves generalization.

Each model was validated with a train-test split and cross-validation to check the robustness of the findings.
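The three models and the validation scheme above can be sketched with scikit-learn as follows. The data here is synthetic, generated with an arbitrary weight and model-year effect purely so the example runs on its own; the `evaluate` helper, the 80/20 split, the 5 folds, and `alpha=1.0` are illustrative assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the Auto MPG values).
rng = np.random.default_rng(0)
n = 200
weight = rng.uniform(1600, 5000, n)
model_year = rng.integers(70, 83, n).astype(float)
mpg = 45.0 - 0.007 * weight + 0.6 * (model_year - 70) + rng.normal(0, 2.0, n)

X_simple = weight.reshape(-1, 1)                    # weight only
X_multi = np.column_stack([weight, model_year])     # weight plus model year


def evaluate(model, X, y):
    """Return (hold-out R^2, mean 5-fold cross-validated R^2)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    model.fit(X_train, y_train)
    test_r2 = model.score(X_test, y_test)
    cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    return test_r2, cv_r2


results = {
    "simple": evaluate(LinearRegression(), X_simple, mpg),
    "multiple": evaluate(LinearRegression(), X_multi, mpg),
    # Ridge penalizes coefficient size, so features are standardized first.
    "ridge": evaluate(make_pipeline(StandardScaler(), Ridge(alpha=1.0)), X_multi, mpg),
}

for name, (test_r2, cv_r2) in results.items():
    print(f"{name:>8}: test R^2 = {test_r2:.3f}, CV R^2 = {cv_r2:.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, which keeps information from the validation fold out of the preprocessing.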

Results and Discussion

The analysis produced three main findings:

  • Weight Impact: Heavier vehicles consistently showed lower MPG, which confirms that weight has a major effect on fuel efficiency.
  • Model Improvement: Adding features in the multiple regression models improved predictive performance, as evidenced by higher R² scores.
  • Technological Trends: The model year coefficients suggested that fuel efficiency improved over the years, reflecting advances in automotive technology.
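Reading a trend from a model year coefficient, as in the last finding, works roughly as shown below. The data is synthetic with a positive year effect deliberately built in, so the sketch only demonstrates how the fitted coefficients are inspected; it does not reproduce or confirm the project's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data with a built-in +0.6 MPG per model year effect.
rng = np.random.default_rng(1)
n = 300
weight = rng.uniform(1600, 5000, n)
model_year = rng.integers(70, 83, n).astype(float)
mpg = 45.0 - 0.007 * weight + 0.6 * (model_year - 70) + rng.normal(0, 1.0, n)

# Fit MPG on weight and model year, then pair each coefficient with its name.
X = np.column_stack([weight, model_year])
model = LinearRegression().fit(X, mpg)
coef = dict(zip(["weight", "model_year"], model.coef_))

# A positive model_year coefficient means newer cars get more MPG at equal weight.
print(coef)
```

Each coefficient is the expected change in MPG per unit of that feature with the other features held fixed, which is why the year effect can be separated from the weight effect.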

Conclusion

The project successfully demonstrated the use of statistical modeling to predict and analyze fuel efficiency based on vehicle characteristics. The findings emphasize the importance of considering vehicle weight in designing more fuel-efficient cars. Future work could explore more sophisticated modeling techniques and deeper feature engineering to enhance the predictive accuracy.

Tools and Technologies Used

  • Python: For scripting and automation of data processing.
  • Pandas and NumPy: For data manipulation.
  • Matplotlib and Seaborn: For data visualization.
  • Scikit-learn: For implementing regression models.

This project serves as a practical application of data science in understanding and solving real-world issues related to automotive design and environmental sustainability.

