[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrFranData/PfDA/blob/main/Topic2.ipynb)

# 🚗 Topic 2: Loading and Inspecting Data with Pandas

In this lesson, we'll learn how to load and explore a dataset using **pandas**.

**Dataset:** Auto MPG, which contains fuel-efficiency figures and other features of cars from the 1970s and early 1980s.

### 🧠 Learning Objectives
- Load a dataset from a URL using pandas
- View the first and last rows
- Understand the shape and structure of the data
- Inspect columns and data types
- Perform basic data exploration

## 📥 Step 1: Load the Dataset

In [None]:
import pandas as pd
url = "https://raw.githubusercontent.com/plotly/datasets/master/auto-mpg.csv"
df = pd.read_csv(url)
df.head()
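
`pd.read_csv` accepts a URL, a local path, or a file-like object. As a minimal sketch, the example below reads a small in-memory CSV (made-up rows, not the real dataset) so that it runs offline. The `na_values="?"` argument is an assumption: it only matters for copies of the file that mark missing horsepower with `?`.

```python
import io
import pandas as pd

# Made-up rows standing in for the real file, so the example runs offline.
csv_text = """mpg,cylinders,horsepower
18.0,8,130
25.0,4,?
31.5,4,68
"""

# read_csv accepts a URL, a path, or a file-like object such as StringIO.
# na_values="?" treats "?" as missing (an assumption about how the file marks gaps).
demo = pd.read_csv(io.StringIO(csv_text), na_values="?")

print(demo.head())
print(demo["horsepower"].isna().sum())  # number of missing horsepower values
```

Without `na_values`, a column containing `?` would be read as text rather than numbers, so it is worth checking the data types after loading.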

## 🔍 Step 2: Explore the Dataset

In [None]:
df.shape  # Rows and columns

In [None]:
df.columns  # Column names

In [None]:
df.tail()  # Last 5 rows

In [None]:
df.sample(5)  # Random sample
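
Two details in the cells above are easy to miss: `shape` is an attribute holding a `(rows, columns)` tuple, not a method, and `sample` returns different rows on each run unless you pass `random_state`. Here is a small sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values; column names mirror the lesson's dataset.
toy = pd.DataFrame({
    "mpg": [18.0, 25.0, 31.5, 22.0],
    "cylinders": [8, 4, 4, 6],
    "weight": [3504, 2220, 1990, 2833],
})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = toy.shape
print(n_rows, n_cols)

# Passing random_state makes the "random" sample repeatable between runs.
first = toy.sample(2, random_state=0)
second = toy.sample(2, random_state=0)
print(first.equals(second))
```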

## ℹ️ Step 3: Understand the Columns

In [None]:
df.info()  # Summary info

In [None]:
df.describe()  # Descriptive stats
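
`info()` lists each column's data type and non-null count, while `describe()` summarizes only the numeric columns by default. The sketch below, on a toy frame with made-up values, shows one way to count missing values and to include text columns in the summary:

```python
import pandas as pd

# Toy frame with made-up values, including one missing horsepower entry.
toy = pd.DataFrame({
    "mpg": [18.0, 25.0, 31.5],
    "horsepower": [130.0, None, 68.0],
    "name": ["car a", "car b", "car c"],
})

# Count missing values per column.
missing = toy.isna().sum()
print(missing)

# By default describe() covers numeric columns only.
numeric_summary = toy.describe()

# include="all" adds non-numeric columns such as name.
full_summary = toy.describe(include="all")
print(full_summary)
```

The `count` row of `describe()` excludes missing values, so comparing it with the number of rows is a quick way to spot gaps.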

**Column meanings:**
- `mpg`: Miles per gallon
- `cylinders`: Number of engine cylinders
- `displacement`: Engine size in cubic inches
- `horsepower`: Engine horsepower
- `weight`: Car weight (lbs)
- `acceleration`: Time to accelerate from 0 to 60 mph (seconds)
- `model_year`: Model year, stored as two digits (e.g. 70 = 1970)
- `origin`: Country of origin (1=USA, 2=Europe, 3=Asia)
- `name`: Car name
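
Column names can differ between copies of this dataset (for example, hyphens instead of underscores, or missing columns), so it may help to compare `df.columns` against the list above before relying on it. The sketch below uses a toy frame with made-up values; `origin_label` is a hypothetical column name introduced only for illustration.

```python
import pandas as pd

# Toy frame with made-up values; "model-year" imitates a hyphenated header.
toy = pd.DataFrame({
    "mpg": [18.0, 25.0, 31.5],
    "model-year": [70, 76, 82],
    "origin": [1, 2, 3],
})

# Replace hyphens so the column names match the list above.
toy.columns = toy.columns.str.replace("-", "_")

# Report which expected columns are absent from this copy.
expected = ["mpg", "model_year", "origin", "name"]
absent = [col for col in expected if col not in toy.columns]
print(absent)

# Map numeric origin codes to readable labels (origin_label is hypothetical).
origin_names = {1: "USA", 2: "Europe", 3: "Asia"}
toy["origin_label"] = toy["origin"].map(origin_names)
print(toy)
```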

## ✍️ Exercise 1: Explore the Dataset

In [None]:
# 1. Number of rows
df.shape[0]

In [None]:
# 2. Column names
df.columns.tolist()

In [None]:
# 3. Three most common car names
df['name'].value_counts().head(3)

In [None]:
# 4. Number of unique values in 'origin'
df['origin'].nunique()

In [None]:
# 5. Average MPG
df['mpg'].mean()
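
`nunique()` returns how many distinct values a column has, `unique()` returns the values themselves, and `value_counts()` returns how often each one occurs. A minimal sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values.
toy = pd.DataFrame({
    "origin": [1, 1, 3, 2, 1, 3],
    "mpg": [18.0, 15.0, 31.0, 26.0, 16.0, 35.0],
})

n_distinct = toy["origin"].nunique()       # how many distinct values
distinct = sorted(toy["origin"].unique())  # the distinct values themselves
counts = toy["origin"].value_counts()      # rows per value, most frequent first
mean_mpg = toy["mpg"].mean()               # arithmetic mean of the column

print(n_distinct, distinct)
print(counts)
print(mean_mpg)
```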

## 🧾 Bonus: Slicing and Selecting Data

In [None]:
df['mpg'].head()  # Single column

In [None]:
df[['mpg', 'horsepower', 'weight']].head()  # Multiple columns

In [None]:
df.iloc[0:5]  # First 5 rows by integer position (end excluded)

In [None]:
df.loc[0:4, ['mpg', 'name']]  # Rows labelled 0 through 4 (end included), two columns
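
`iloc` selects by integer position and excludes the end of a slice, while `loc` selects by index label and includes it. With the default index the two look alike, but they diverge once rows have been filtered. A sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values and the default 0..5 index.
toy = pd.DataFrame({
    "mpg": [18.0, 25.0, 31.5, 22.0, 27.0, 19.0],
    "name": ["a", "b", "c", "d", "e", "f"],
})

by_position = toy.iloc[0:4]               # positions 0, 1, 2, 3 (end excluded)
by_label = toy.loc[0:4, ["mpg", "name"]]  # labels 0 through 4 (end included)
print(len(by_position), len(by_label))

# After filtering, the remaining labels no longer start at 0.
subset = toy[toy["mpg"] > 20]
print(subset.index.tolist())

# iloc[0] is the first remaining row, whatever its label is.
first_name = subset.iloc[0]["name"]
print(first_name)
```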

## ✍️ Exercise 2: Try Some Slicing

In [None]:
df['horsepower'].iloc[:10]  # First 10 horsepower values

In [None]:
df.loc[10:15, ['name', 'mpg', 'weight']]  # Rows labelled 10 through 15, selected columns

In [None]:
df[df['mpg'] > 35]  # Cars with mpg above 35
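
The filter above works in two stages: the comparison builds a Series of `True`/`False` values, and indexing with that Series keeps only the `True` rows. Conditions can be combined with `&` (and) or `|` (or), with each one wrapped in parentheses. A sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with made-up values.
toy = pd.DataFrame({
    "mpg": [18.0, 36.0, 31.5, 38.0],
    "weight": [3504, 2100, 1990, 1800],
})

# Stage 1: the comparison yields one boolean per row.
mask = toy["mpg"] > 35
print(mask.tolist())

# Stage 2: indexing with the mask keeps the rows where it is True.
high_mpg = toy[mask]

# Combine conditions with & and wrap each one in parentheses.
light_and_efficient = toy[(toy["mpg"] > 35) & (toy["weight"] < 2000)]
print(light_and_efficient)
```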

## ✅ Summary
- Loaded data from a CSV URL using pandas
- Explored its shape, types, and key features
- Performed basic slicing and filtering
- Practiced simple queries, such as value counts and averages