This repository contains exercises using Pandas to explore and analyze a vehicle dataset (mpg.csv). The goal is to practice data exploration, Boolean operations, arithmetic calculations, and column manipulation in Python.
- Data Exploration
  - Imported the CSV dataset into a Pandas DataFrame.
  - Explored the data using .head(), .tail(), .describe(), .shape, .mean(), .sum(), .value_counts(), .max(), .min(), len(), and .median().
  - Sorted values to inspect specific columns.
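The exploration steps above can be sketched on a small stand-in DataFrame. The rows below are made up for illustration and are not taken from mpg.csv; only the column names cty, hwy, and class come from the code later in this README. Note that shape is an attribute rather than a method, so it takes no parentheses.

```python
import pandas as pd

# Made-up rows for illustration only; the real data lives in mpg.csv.
sample = pd.DataFrame({
    "class": ["compact", "suv", "subcompact", "suv"],
    "cty": [18, 14, 24, 15],
    "hwy": [29, 20, 33, 21],
})

n_rows, n_cols = sample.shape                   # attribute, not a method call
first_rows = sample.head(2)                     # first two rows
summary = sample.describe()                     # summary statistics for numeric columns
class_counts = sample["class"].value_counts()   # number of rows per vehicle class
mean_cty = sample["cty"].mean()
median_hwy = sample["hwy"].median()

# Sort by highway MPG, highest first, to inspect the most efficient rows.
by_hwy = sample.sort_values("hwy", ascending=False)

print(n_rows, n_cols, len(sample))
print(class_counts)
print(by_hwy.head(1))
```

On the real dataset the same calls apply to the mpg DataFrame loaded with pd.read_csv.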
- Creating New Columns
  - Created a Boolean column is_automatic indicating whether a vehicle has an automatic transmission.
  - Added a calculated column fuel_economy as a weighted average of city and highway MPG: fuel_economy = (city_mpg * 0.55) + (highway_mpg * 0.45). Here city_mpg and highway_mpg correspond to the cty and hwy columns of the dataset.
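A minimal sketch of how the two columns might be created. It assumes the transmission is stored in a trans column with values such as auto(l5) or manual(m5), as in the commonly distributed ggplot2 version of the dataset; that column name and the sample rows are assumptions, not something this README confirms. The cty and hwy names match the code further down.

```python
import pandas as pd

# Made-up rows; the "trans" column name and its values are assumptions.
cars = pd.DataFrame({
    "trans": ["auto(l5)", "manual(m5)", "auto(av)"],
    "cty": [18, 21, 16],
    "hwy": [29, 29, 25],
})

# Boolean column: True when the transmission label starts with "auto".
cars["is_automatic"] = cars["trans"].str.startswith("auto")

# Weighted average: 55% city MPG, 45% highway MPG.
cars["fuel_economy"] = cars["cty"] * 0.55 + cars["hwy"] * 0.45

# True counts as 1 when summed, so this is the number of automatics.
automatic_count = cars["is_automatic"].sum()

print(cars)
print(automatic_count)
```

If the real file encodes transmission differently, only the line that builds is_automatic would need to change.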
- Analysis Using Boolean Masking
  - Used is_automatic to count the number of automatic vehicles.
  - Determined the percentage of subcompact vehicles.
  - Filtered vehicles with fuel_economy above the median using Boolean masking.
- Importing and inspecting CSV datasets with Pandas.
- Exploring data with descriptive statistics and summary functions.
- Creating and manipulating new columns using arithmetic operations.
- Filtering and analyzing data using Boolean masking.
- Calculating percentages and weighted averages.
- Summarizing insights from the dataset effectively.
✨ This exercise reinforces foundational Pandas skills for real-world data analysis and lays the groundwork for more advanced data manipulation tasks.

# Load the dataset into a DataFrame and preview the first five rows.
import pandas as pd
mpg = pd.read_csv('mpg.csv')
mpg.head()

Task: Use the is_automatic column to find how many vehicles are automatic.
mpg['is_automatic'].sum()

Task: Determine what percentage of vehicles are subcompacts.
subcompact_percentage = mpg[mpg["class"] == "subcompact"]["class"].count() / len(mpg) * 100
subcompact_percentage
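
An equivalent way to get the same percentage, shown here as a sketch on made-up rows: the mean of a Boolean mask is the fraction of True values, so multiplying it by 100 gives the percentage directly.

```python
import pandas as pd

# Made-up rows for illustration; only the "class" column name comes from the exercise.
cars = pd.DataFrame({"class": ["subcompact", "suv", "compact", "subcompact", "suv"]})

is_subcompact = cars["class"] == "subcompact"   # Boolean mask, one value per row

# Approach used in the exercise: count matching rows, divide by total rows.
pct_by_count = cars[is_subcompact]["class"].count() / len(cars) * 100

# Alternative: the mean of a Boolean mask is the share of True values.
pct_by_mean = is_subcompact.mean() * 100

print(pct_by_count, pct_by_mean)
```

Both expressions give the same result; the second avoids building a filtered DataFrame first.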

Task: Add a fuel_economy column as a weighted average of city and highway MPG (55% city, 45% highway).
mpg['fuel_economy'] = mpg['cty'] * 0.55 + mpg['hwy'] * 0.45

Task: Use Boolean masking to find vehicles with fuel_economy above the median.
high_fe_vehicles = mpg[mpg.fuel_economy > mpg.fuel_economy.median()]
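
To make the masking step concrete, here is a small sketch on made-up fuel_economy values. The comparison yields a Boolean Series, and indexing with it keeps only the rows where the value is True; because the comparison is strict, rows exactly equal to the median are excluded.

```python
import pandas as pd

# Made-up values for illustration, not taken from mpg.csv.
cars = pd.DataFrame({"fuel_economy": [15.0, 20.0, 22.5, 22.5, 30.0]})

median_fe = cars["fuel_economy"].median()   # middle value of the column
mask = cars["fuel_economy"] > median_fe     # Boolean Series, one entry per row
above_median = cars[mask]                   # keeps rows where mask is True

print(median_fe)
print(mask.tolist())
print(len(above_median))
```

Using >= instead of > would also keep the rows that sit exactly at the median, which matters when many vehicles share the same value.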