# Pandas: Reading & Exploring Data

This notebook covers:
1. Reading data from CSV
2. Creating a DataFrame from a dictionary
3. Exporting a DataFrame to CSV / Excel / JSON
4. Basic dataset inspection: head, tail, shape, info, describe
5. Accessing columns and filtering rows

In [None]:
import pandas as pd

# Load the dataset
df = pd.read_csv("../Datasets/employees.csv", encoding="latin1")

In [None]:
# Show the dataset (pandas truncates the display of large DataFrames)
df
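
If `employees.csv` is not at hand, the same `read_csv` call can be tried on in-memory text. The sketch below is only an illustration: the two rows are made up, and the column names are borrowed from the ones used later in this notebook.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real employees.csv may have other columns.
csv_text = """First Name,Gender,Salary
Ali,Male,70000
Sara,Female,64000
"""

# read_csv accepts a file path or any file-like object, such as StringIO.
df_demo = pd.read_csv(io.StringIO(csv_text))
print(df_demo)
```
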
    

## Create a DataFrame Manually

Here, we build a small DataFrame from a Python dictionary: each key becomes a column name and each list becomes that column's values. This is useful for testing or building toy examples.

In [None]:
data = {
    "Name": ["Abdullah", "Sarim", "Saif"],
    "Age": [21, 21, 23],
    "City": ["Hirabad", "Latifabad 7 no", "Latifabad 9 no"]
}

df2 = pd.DataFrame(data)
df2

## Export the DataFrame to Different File Formats

We can save our DataFrame to CSV, Excel, or JSON.  
- Pass `index=False` to `to_csv` and `to_excel` to leave the row index out of the exported file.

In [None]:
# Save as CSV
df2.to_csv("output.csv", index=False)

# Save as Excel (needs an Excel writer engine such as openpyxl installed)
df2.to_excel("output.xlsx", index=False)

# Save as JSON, using a list-of-records format
df2.to_json("output.json", orient="records")
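
To see what `index=False` changes, here is a small sketch that returns the exported text as a string instead of writing a file. The two-row `toy` frame is an illustrative stand-in for `df2`.

```python
import pandas as pd

# Illustrative two-row frame standing in for df2.
toy = pd.DataFrame({"Name": ["Abdullah", "Sarim"], "Age": [21, 21]})

# With no path argument, to_csv() returns the CSV text.
with_index = toy.to_csv()                # first column holds the row index
without_index = toy.to_csv(index=False)  # only the data columns

# With no path argument, to_json() returns the JSON text.
records_json = toy.to_json(orient="records")  # one JSON object per row

print(with_index)
print(without_index)
print(records_json)
```

The header line of `with_index` starts with an empty field for the unnamed index column, while `without_index` starts directly with `Name`.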

## Dataset Inspection

We can inspect the dataset with:
- `head()` / `tail()` — to look at the first or last rows  
- `shape` — to see how many rows and columns  
- `info()` — to see column data types and non-null counts  
- `describe()` — to get summary statistics  


In [None]:
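# A cell renders only its last expression automatically, so each result
# is wrapped in display() (available in Jupyter/IPython) or print().

# Display first 10 rows
display(df.head(10))

# Display last 10 rows
display(df.tail(10))

# Get the shape (rows, columns)
print(df.shape)

# Summary of data types & non-null counts (info() prints its own output)
df.info()

# Summary statistics (numerical + categorical)
display(df.describe(include="all"))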
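
The same inspection calls can be checked on a tiny frame where the answers are easy to verify by hand. This is a sketch with made-up data; one `City` value is deliberately missing so the non-null count differs from the row count.

```python
import pandas as pd

# Illustrative data; the missing City is intentional.
toy = pd.DataFrame({
    "Name": ["Abdullah", "Sarim", "Saif"],
    "Age": [21, 21, 23],
    "City": ["Hirabad", None, "Latifabad 9 no"],
})

n_rows, n_cols = toy.shape           # shape is a (rows, columns) tuple
city_non_null = toy["City"].count()  # count() skips missing values

# include="all" adds text columns; statistics that do not apply are NaN.
summary = toy.describe(include="all")

print(n_rows, n_cols, city_non_null)
print(summary)
```

In `summary`, numeric rows such as `mean` are filled only for `Age`, and text rows such as `unique` and `top` are filled only for `Name` and `City`.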

## Access Columns & Filter Rows

- Access single or multiple columns  
- Filter rows using boolean indexing (conditions)  
- Combine conditions with `&`, `|`, and `~` (AND, OR, NOT)

In [None]:
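# A cell renders only its last expression automatically, so each result
# is wrapped in display() (available in Jupyter/IPython).

# Access specific columns
display(df[["First Name", "Gender"]])

# Filter for only male employees
display(df[df["Gender"] == "Male"])

# Filter for male employees with Salary > 65000
# (each condition needs its own parentheses because & binds tighter than == and >)
display(df[(df["Gender"] == "Male") & (df["Salary"] > 65000)])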
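
The `|` and `~` operators from the list above work the same way as `&`. Here is a minimal sketch on a made-up four-row frame; the names and salaries are placeholders, not rows from `employees.csv`.

```python
import pandas as pd

# Hypothetical rows that reuse the notebook's column names.
toy = pd.DataFrame({
    "First Name": ["A", "B", "C", "D"],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Salary": [50000, 72000, 80000, 61000],
})

# Naming each boolean mask keeps combined conditions readable.
is_male = toy["Gender"] == "Male"
high_salary = toy["Salary"] > 65000

male_and_high = toy[is_male & high_salary]  # AND: both conditions hold
male_or_high = toy[is_male | high_salary]   # OR: at least one holds
not_male = toy[~is_male]                    # NOT: inverts the mask

print(male_and_high)
print(male_or_high)
print(not_male)
```

Python's `and`, `or`, and `not` do not work element-wise on a Series and raise a `ValueError`, which is why the bitwise operators are used here.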
