# Python & Pandas Mechanics: The Drill Workbook

## Course Context
Before we can master Data Analytics, we must master the **syntax**. 

This workbook focuses intensely on one question: **"Do I use brackets `()`? Do I put the data inside or use a dot?"**

**Three Core Concepts:**
1.  **Functions**: `func(data)` -> External tools (e.g., `len()`, `type()`).
2.  **Methods**: `data.method()` -> Internal actions (e.g., `.upper()`, `.mean()`).
3.  **Attributes**: `data.attribute` -> Internal properties (No brackets!) (e.g., `.shape`, `.columns`).
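
As a minimal sketch of the three forms side by side (the names `word` and `z` are illustrative, and a built-in complex number stands in for a DataFrame so the snippet runs without pandas):

```python
word = "pandas"

# 1. Function: the data goes inside the brackets.
length = len(word)

# 2. Method: the data comes first, then a dot, then brackets.
shouted = word.upper()

# 3. Attribute: the data comes first, then a dot, and no brackets.
z = 3 + 4j
real_part = z.real

print(length, shouted, real_part)
```

The same pattern carries over to pandas: `len(df)` is a function call, `df.head()` is a method call, and `df.shape` is an attribute lookup.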

---

## Part 1: Python Basics (Strings, Lists, Dictionaries)
**Goal:** Understand when to use a function and when to use a method.

### Exercise 1.1: String Methods
**Task:**
1. Create a variable `city = "  new york  "`.
2. Use the `.strip()` method to remove spaces.
3. Use the `.title()` method to capitalize the first letters.
4. Print the final result.

In [None]:
# YOUR CODE HERE
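
One possible solution, shown as a sketch (the name `clean_city` is illustrative). It relies on the fact that each string method returns a new string, so the calls can be chained:

```python
city = "  new york  "

# .strip() returns a new string without the outer spaces,
# and .title() then acts on that new string.
clean_city = city.strip().title()
print(clean_city)

# Strings are immutable, so the original variable is unchanged.
print(repr(city))
```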

### Exercise 1.2: The `len()` Function
**Task:**
1. Create a list `codes = [101, 102, 103, 104]`.
2. Use the `len()` function to count the items.
3. **Attempt** to use `.len()`. It will fail. Run it to see the error, then comment it out.

In [3]:
# YOUR CODE HERE
codes = [101, 102, 103, 104]
print(len(codes))
# codes.len()  # AttributeError: 'list' object has no attribute 'len'

4
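
If you would rather keep the failing call in the cell than comment it out, one option is to catch the error. This is a sketch; the variable `message` is introduced only for illustration:

```python
codes = [101, 102, 103, 104]

# len() is a built-in function, so the list goes inside the brackets.
print(len(codes))

# Lists have no .len() method; catching the error keeps the cell runnable.
try:
    codes.len()
except AttributeError as err:
    message = str(err)
    print("AttributeError:", message)
```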


### Exercise 1.3: Sorting (Function vs Method)
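This is a classic interview trap: `sorted()` is a function that returns a new sorted list and leaves the original unchanged, while `.sort()` is a method that sorts the list in place and returns `None`.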
**Task:**
1. Create `list_a = [5, 1, 3]`.
2. Create `list_b = [5, 1, 3]`.
3. Apply `sorted(list_a)` and print the result.
4. Apply `list_b.sort()` and print `list_b`.

In [5]:
# YOUR CODE HERE
list_a = [5, 1, 3]
list_b = [5, 1, 3]
print(sorted(list_a))
list_b.sort()
print(list_b)

[1, 3, 5]
[1, 3, 5]
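
The two printed lists look identical, so the difference only shows up when you inspect the return value and the original list. A small sketch, with `result_a` and `result_b` as illustrative names:

```python
list_a = [5, 1, 3]
list_b = [5, 1, 3]

# sorted() is a function: it returns a NEW list and leaves list_a untouched.
result_a = sorted(list_a)
print(result_a, list_a)

# .sort() is a method: it reorders list_b in place and returns None.
result_b = list_b.sort()
print(result_b, list_b)
```

A common slip is writing `list_b = list_b.sort()`, which replaces the list with `None`.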


### Exercise 1.4: Dictionary Methods
**Task:**
1. Create a dictionary `product = {"id": 99, "price": 10.50}`.
2. Use the `.keys()` method to get the keys.
3. Use the `.values()` method to get the values.
4. Use the `.update()` method to add `"in_stock": True`.

In [24]:
# YOUR CODE HERE
product = {"id": 99, "price": 10.50}
print(product.keys())
print(product.values())
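product.update({"in_stock": True})  # .update() takes a dictionary, not a bare key: value pair
print(product)

dict_keys(['id', 'price'])
dict_values([99, 10.5])
{'id': 99, 'price': 10.5, 'in_stock': True}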
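
`.update()` expects a dictionary (or keyword arguments), never a bare `key: value` pair inside the brackets. The sketch below shows three ways to add an entry; the keys `"currency"` and `"discount"` are made up for illustration and are not part of the exercise:

```python
product = {"id": 99, "price": 10.50}

# Form 1: pass a dictionary of the new key-value pairs.
product.update({"in_stock": True})

# Form 2: pass keyword arguments (works only when keys are valid identifiers).
product.update(currency="USD")  # hypothetical key

# Form 3: plain item assignment is often the simplest for a single key.
product["discount"] = 0.0  # hypothetical key

print(list(product.keys()))
```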

--- 
## Part 2: Pandas Mechanics (Attributes vs Methods)
**Goal:** Distinguish between **Properties** (no brackets) and **Actions** (brackets).

### Setup: Create Data
Run this cell first.

In [10]:
import pandas as pd

data = {
    "Employee": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Department": ["HR", "IT", "IT", "Sales", "HR"],
    "Salary": [60000, 80000, 82000, 55000, 62000],
    "YearsExperience": [5, 8, 9, 2, 6]
}

df = pd.DataFrame(data)
print("DataFrame Created.")

DataFrame Created.


### Exercise 2.1: Attributes (No Brackets)
An **attribute** is a fact about the data. 
**Task:**
1. Print `df.shape` (The dimensions).
2. Print `df.columns` (The column names).
3. Print `df.dtypes` (The data types).

In [12]:
# YOUR CODE HERE
df.shape

(5, 4)

In [13]:
df.columns

Index(['Employee', 'Department', 'Salary', 'YearsExperience'], dtype='object')

In [14]:
df.dtypes

Employee           object
Department         object
Salary              int64
YearsExperience     int64
dtype: object
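
To see why attributes take no brackets, it can help to try adding them. The sketch below rebuilds a two-column version of the DataFrame so it runs on its own; `shape_error` is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    "Employee": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Salary": [60000, 80000, 82000, 55000, 62000],
})

# .shape is an attribute holding a plain tuple, so it can be unpacked directly.
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# Adding brackets tries to "call" that tuple, which Python does not allow.
try:
    df.shape()
except TypeError as err:
    shape_error = str(err)
    print("TypeError:", shape_error)
```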

### Exercise 2.2: Methods (Brackets)
A **method** is an action or calculation.
**Task:**
1. Use `df.head(3)` to get the first 3 rows.
2. Use `df.info()` to get the summary.
3. Use `df.describe()` to get statistical summary.

In [15]:
# YOUR CODE HERE
df.head(3)

  Employee Department  Salary  YearsExperience
0    Alice         HR   60000                5
1      Bob         IT   80000                8
2  Charlie         IT   82000                9


In [16]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Employee         5 non-null      object
 1   Department       5 non-null      object
 2   Salary           5 non-null      int64 
 3   YearsExperience  5 non-null      int64 
dtypes: int64(2), object(2)
memory usage: 292.0+ bytes


In [17]:
df.describe()

             Salary  YearsExperience
count           5.0              5.0
mean        67800.0              6.0
std    12336.936411         2.738613
min         55000.0              2.0
25%         60000.0              5.0
50%         62000.0              6.0
75%         80000.0              8.0
max         82000.0              9.0
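
The opposite mistake, leaving the brackets off a method, does not raise an error; it quietly hands back the method itself. A short sketch using Python's built-in `callable()` to tell the two kinds apart (the names `not_called` and `first_three` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Salary": [60000, 80000, 82000, 55000, 62000]})

# Without brackets you get the method object itself, not its result.
not_called = df.head
print(callable(not_called))

# With brackets the method runs and returns a new DataFrame.
first_three = df.head(3)
print(len(first_three))

# Attributes are plain values, so they are not callable.
print(callable(df.shape))
```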


--- 
## Part 3: Simple Pandas Aggregations
**Goal:** Apply mathematical methods to columns.

### Exercise 3.1: Sum and Mean
**Task:**
1. Calculate the total of the `Salary` column using `.sum()`.
2. Calculate the average `YearsExperience` using `.mean()`.
3. **Bonus:** Try using `sum()` on the `Employee` column (Strings). What happens?

In [38]:
# YOUR CODE HERE
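total = df['Salary'].sum()
print(total)
print(df['YearsExperience'].mean())
df.sum()  # Bonus, applied to the whole frame: string columns are concatenated

339000
6.0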


Employee           AliceBobCharlieDavidEve
Department                   HRITITSalesHR
Salary                              339000
YearsExperience                         30
dtype: object
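
For the bonus question, the pandas `.sum()` method and Python's built-in `sum()` function behave differently on text. The sketch below rebuilds a two-column version of the DataFrame so it runs on its own; `joined_names` and `numeric_totals` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Employee": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Salary": [60000, 80000, 82000, 55000, 62000],
})

# The pandas .sum() method "adds" strings by joining them together.
joined_names = df["Employee"].sum()
print(joined_names)

# Python's built-in sum() starts from 0, and 0 + "Alice" is a TypeError.
try:
    sum(df["Employee"])
except TypeError as err:
    print("TypeError:", err)

# numeric_only=True restricts a DataFrame-wide sum to the numeric columns.
numeric_totals = df.sum(numeric_only=True)
print(numeric_totals)
```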


### Exercise 3.2: Counting Categories
The `.value_counts()` method is extremely useful for text columns. It counts how many times each distinct value appears, much like a count in a spreadsheet Pivot Table.

**Task:**
1. Use `.value_counts()` on the `Department` column to see how many employees are in each.

In [39]:
# YOUR CODE HERE
dept_counts = df['Department'].value_counts()
print(dept_counts)

Department
HR       2
IT       2
Sales    1
Name: count, dtype: int64
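
The result of `.value_counts()` is itself a Series, so individual categories can be looked up by label. The sketch below also uses the `normalize=True` option to get proportions instead of counts; the names `departments`, `counts`, and `shares` are illustrative:

```python
import pandas as pd

departments = pd.Series(["HR", "IT", "IT", "Sales", "HR"], name="Department")

counts = departments.value_counts()
shares = departments.value_counts(normalize=True)

# Labels can be looked up much like dictionary keys.
print(counts["IT"])
print(shares["Sales"])
```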


### Exercise 3.3: Max and Min
**Task:**
1. Find the maximum salary using `.max()`.
2. Find the minimum years of experience using `.min()`.

In [43]:
# YOUR CODE HERE
maximum_salary = df['Salary'].max()
print(maximum_salary)

82000


In [44]:
minimum_experience = df['YearsExperience'].min()
print(minimum_experience)

2
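
Calling `.max()` or `.min()` on a whole DataFrame works column by column, and for text columns it returns the alphabetically last or first value, which has nothing to do with who earns the most. A sketch of the column-level approach, with `.idxmax()` added to find the matching employee (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Employee": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Salary": [60000, 80000, 82000, 55000, 62000],
    "YearsExperience": [5, 8, 9, 2, 6],
})

# Select the column first, then call the method on that single column.
top_salary = df["Salary"].max()
least_experience = df["YearsExperience"].min()
print(top_salary, least_experience)

# .idxmax() returns the row label of the maximum, which .loc can then look up.
top_earner = df.loc[df["Salary"].idxmax(), "Employee"]
print(top_earner)
```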


--- 
## Part 4: The Mixed Challenge

**Task:**
1. Create a Python list: `scores = [88, 92, 75]`.
2. Use the `.append()` method to add `100` to the list.
3. Create a Pandas Series from that list.
4. Calculate the `.mean()` of that Series.

In [1]:
# YOUR CODE HERE
import pandas as pd
scores = [88, 92, 75]
scores.append(100)
print(scores)
my_series = pd.Series(scores)
print(my_series.mean())

[88, 92, 75, 100]
88.75
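
The challenge mixes both worlds: `.append()` is a list method, but a plain list has no `.mean()` method, which is why the data is converted to a Series first. The sketch below contrasts the two routes, using the standard-library `statistics.mean()` function as the non-pandas alternative (`list_mean` and `series_mean` are illustrative names):

```python
import statistics

import pandas as pd

scores = [88, 92, 75]
scores.append(100)  # list method: changes the list in place

# A plain list has no .mean() method, so a function is needed instead.
print(hasattr(scores, "mean"))
list_mean = statistics.mean(scores)

# A pandas Series does have .mean() as a method.
series_mean = pd.Series(scores).mean()

print(list_mean, series_mean)
```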
