# Python Dictionaries and Pandas Basics

This notebook introduces:

1. The basics of **Python dictionaries**
2. The basics of the **pandas** library, including:
   - Creating a `DataFrame`
   - Displaying tables
   - Changing headers (column names)
   - Changing specific data in a `DataFrame`
   - Filtering rows
   - Creating `DataFrame`s from:
     - Python dictionaries
     - Lists
     - NumPy arrays


## 1. Python Dictionaries – Basics

A **dictionary** in Python is a collection of key–value pairs.  
You can think of it as a real dictionary: each *word* (key) has a *meaning* (value).

- Dictionaries are created using `{}` (curly braces)
- Each item has the form `key: value`
- Keys must be **unique** and of an immutable type (for example strings, numbers, or tuples)
- Values can be of any type (numbers, strings, lists, other dictionaries, ...)
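
As a small illustration of the rules above, the sketch below stores values of several types in one dictionary and shows what happens when a key is repeated. The `course` dictionary and its contents are made up for this example.

```python
# A dictionary whose values have different types, including a nested dictionary
course = {
    "title": "Python Basics",          # string
    "credits": 3,                      # integer
    "topics": ["dicts", "pandas"],     # list
    "teacher": {"name": "Dr. Rossi"},  # nested dictionary (hypothetical data)
}

# Repeating a key in a literal does not raise an error; the last value is kept
duplicate = {"a": 1, "a": 2}

print(course["teacher"]["name"])  # Dr. Rossi
print(len(course))                # 4
print(duplicate)                  # {'a': 2}
```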


In [21]:
# Creating a simple dictionary

student = {
    "name": "Alice",
    "age": 21,
    "major": "Computer Science"
}



### Accessing Values

You can access values by their key using square brackets `[]` or the `.get()` method.


In [22]:
# Accessing values by key

name = student["name"]          # using []
age = student.get("age")        # using .get()

print("Name:", name)
print("Age:", age)

Name: Alice
Age: 21
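
The two access styles differ when a key is missing. The following sketch, which uses a hypothetical `"email"` key that is not in the dictionary, shows that `[]` raises a `KeyError` while `.get()` returns `None` or a default value of your choice.

```python
student = {"name": "Alice", "age": 21, "major": "Computer Science"}

# .get() returns None (or the default you pass) when the key is missing
email = student.get("email")
email_or_default = student.get("email", "unknown")

# [] raises KeyError for a missing key
try:
    student["email"]
    key_error_raised = False
except KeyError:
    key_error_raised = True

# The `in` operator checks whether a key exists
has_age = "age" in student

print(email)             # None
print(email_or_default)  # unknown
print(key_error_raised)  # True
print(has_age)           # True
```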


### Adding and Updating Items

You can add a new key–value pair or update an existing one simply by assignment.


In [23]:
# Adding a new key–value pair
student["gpa"] = 3.8

# Updating an existing key
student["age"] = 22

student

{'name': 'Alice', 'age': 22, 'major': 'Computer Science', 'gpa': 3.8}

### Deleting Items

You can delete items using `del` or `.pop()`.


In [24]:
# Deleting a key–value pair

# Using del (left commented out here so that "major" stays in the dictionary)
# del student["major"]

# Using pop (also returns the removed value)
gpa = student.pop("gpa")

print("Removed GPA:", gpa)
student

Removed GPA: 3.8


{'name': 'Alice', 'age': 22, 'major': 'Computer Science'}
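
For comparison, here is a minimal sketch of `del` in action, together with `.pop()` called with a default value so that it does not fail when the key is already gone. It works on its own copy of the data, not on the `student` dictionary from the cell above.

```python
student_copy = {"name": "Alice", "age": 22, "major": "Computer Science"}

# del removes the pair and returns nothing
del student_copy["major"]

# pop with a default returns that default instead of raising KeyError
removed = student_copy.pop("major", None)

print(removed)       # None
print(student_copy)  # {'name': 'Alice', 'age': 22}
```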

### Looping Through a Dictionary

You can loop through keys, values, or key–value pairs.


In [25]:
# Looping through a dictionary

person = {
    "name": "Bob",
    "city": "Turin",
    "job": "Engineer"
}

print("Keys:")
for key in person.keys():
    print(" -", key)

print("\nValues:")
for value in person.values():
    print(" -", value)

print("\nItems (key, value):")
for key, value in person.items():
    print(f" - {key}: {value}")

Keys:
 - name
 - city
 - job

Values:
 - Bob
 - Turin
 - Engineer

Items (key, value):
 - name: Bob
 - city: Turin
 - job: Engineer
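
Two related patterns are sketched below: looping over the dictionary directly (which yields its keys) and building a new dictionary while looping over `.items()`. The upper-casing step is only an illustrative transformation.

```python
person = {"name": "Bob", "city": "Turin", "job": "Engineer"}

# Iterating over the dictionary itself yields its keys, in insertion order
keys_direct = [key for key in person]

# Build a new dictionary from the key-value pairs of an existing one
upper_person = {key: value.upper() for key, value in person.items()}

print(keys_direct)   # ['name', 'city', 'job']
print(upper_person)  # {'name': 'BOB', 'city': 'TURIN', 'job': 'ENGINEER'}
```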


---
## 2. Pandas Basics

`pandas` is a powerful Python library for working with **tabular data** (data in rows and columns).  
The main objects are:

- **Series**: 1D labeled array (like one column)
- **DataFrame**: 2D labeled data structure (like a table in Excel)


In [28]:
# Importing pandas (and NumPy for later examples)

import pandas as pd
import numpy as np

pd.__version__  # show the pandas version

'2.2.2'
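
To make the two main objects concrete, the sketch below builds a small `Series` and a small `DataFrame` and checks their types. The variable names and values are chosen only for this example.

```python
import pandas as pd

# A Series: one column of values with an index
ages = pd.Series([21, 25, 23], name="age")

# A DataFrame: several columns sharing the same index
table = pd.DataFrame({"name": ["Alice", "Bob", "Charlie"], "age": [21, 25, 23]})

print(type(ages).__name__)          # Series
print(type(table).__name__)         # DataFrame
print(type(table["age"]).__name__)  # Series (each column of a DataFrame is a Series)
print(table.shape)                  # (3, 2)
```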

### 2.1 Creating a DataFrame from a Dictionary

A very common way to create a `DataFrame` is from a **dictionary of lists**.  
Each key becomes a **column name**, and each list becomes the data in that column.


In [29]:
# Creating a DataFrame from a dictionary of lists

data = {
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "age": [21, 25, 23, 22],
    "city": ["Turin", "Milan", "Rome", "Naples"]
}

df = pd.DataFrame(data)
df

| | name | age | city |
|---|---|---|---|
| 0 | Alice | 21 | Turin |
| 1 | Bob | 25 | Milan |
| 2 | Charlie | 23 | Rome |
| 3 | Diana | 22 | Naples |
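
A quick way to check the result is to look at its shape, column names, and index. The sketch below also shows that the lists must all have the same length; the name `df_check` is used here to avoid touching `df`.

```python
import pandas as pd

data = {
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "age": [21, 25, 23, 22],
    "city": ["Turin", "Milan", "Rome", "Naples"],
}
df_check = pd.DataFrame(data)

print(df_check.shape)             # (4, 3): 4 rows, 3 columns
print(df_check.columns.tolist())  # ['name', 'age', 'city']
print(df_check.index.tolist())    # [0, 1, 2, 3]

# Lists of different lengths cannot form a table
try:
    pd.DataFrame({"name": ["Alice", "Bob"], "age": [21]})
    length_error = False
except ValueError:
    length_error = True

print(length_error)  # True
```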


### 2.2 Displaying the Table

In a Jupyter notebook, writing a variable name (like `df`) on the last line of a cell displays the table.  
You can also use methods like `.head()` to see only the first few rows.


In [31]:
# Display the first 3 rows (without an argument, head() shows 5)

df.head(3)

| | name | age | city |
|---|---|---|---|
| 0 | Alice | 21 | Turin |
| 1 | Bob | 25 | Milan |
| 2 | Charlie | 23 | Rome |
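
The default of 5 rows is easier to see on a longer table. The sketch below uses a made-up ten-row `DataFrame` and also shows `.tail()`, the counterpart of `.head()` for the last rows.

```python
import pandas as pd

# A hypothetical table with 10 rows
numbers = pd.DataFrame({"n": range(10)})

first_five = numbers.head()   # no argument: first 5 rows
last_two = numbers.tail(2)    # last 2 rows

print(len(first_five))         # 5
print(last_two["n"].tolist())  # [8, 9]
```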


### 2.3 Creating a DataFrame from a List of Dictionaries

Each dictionary represents **one row**.  
The keys of the dictionaries become **column names**.


In [32]:
# List of dictionaries

students = [
    {"name": "Emily", "age": 20, "city": "Florence"},
    {"name": "Frank", "age": 24, "city": "Bologna"},
    {"name": "Grace", "age": 22, "city": "Genoa"}
]

df_from_list = pd.DataFrame(students)
df_from_list

| | name | age | city |
|---|---|---|---|
| 0 | Emily | 20 | Florence |
| 1 | Frank | 24 | Bologna |
| 2 | Grace | 22 | Genoa |
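
The dictionaries do not need to have exactly the same keys. As a sketch, the second record below deliberately omits `"city"`, and pandas fills the gap with a missing value (`NaN`).

```python
import pandas as pd

records = [
    {"name": "Emily", "age": 20, "city": "Florence"},
    {"name": "Frank", "age": 24},  # no "city" key in this row
]

df_missing = pd.DataFrame(records)

print(df_missing.columns.tolist())         # ['name', 'age', 'city']
print(df_missing["city"].isna().tolist())  # [False, True]
```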


### 2.4 Creating a DataFrame from a NumPy Array

You can also create a `DataFrame` from a NumPy array.  
In this case, you usually provide column names and optionally row index labels.


In [33]:
# Creating a NumPy array

array_data = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

df_from_array = pd.DataFrame(
    array_data,
    columns=["A", "B", "C"],      # column (header) names
    index=["row1", "row2", "row3"]  # optional row labels
)

df_from_array

| | A | B | C |
|---|---|---|---|
| row1 | 1 | 2 | 3 |
| row2 | 4 | 5 | 6 |
| row3 | 7 | 8 | 9 |
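
The same pattern works with a plain Python list of lists, where each inner list is one row. The sketch below also shows what happens when no column names are given: pandas numbers the columns starting from 0. The data is made up for this example.

```python
import pandas as pd

rows = [
    ["Alice", 21],
    ["Bob", 25],
]

# With explicit column names
df_rows = pd.DataFrame(rows, columns=["name", "age"])

# Without column names, the columns are labeled 0, 1, ...
df_default = pd.DataFrame(rows)

print(df_rows.columns.tolist())     # ['name', 'age']
print(df_default.columns.tolist())  # [0, 1]
```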


In [41]:
# Save the DataFrame to a CSV file; the row labels are written as the first column
df_from_array.to_csv("test.csv")
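
Reading such a file back deserves a short note. The sketch below keeps everything in memory (using `io.StringIO` instead of a real file, which is a choice made for this example) and shows that `index_col=0` restores the row labels, while omitting it produces an extra column named `Unnamed: 0`.

```python
import io
import pandas as pd

df_small = pd.DataFrame({"A": [1, 4], "B": [2, 5]}, index=["row1", "row2"])

# Called without a path, to_csv() returns the CSV text instead of writing a file
csv_text = df_small.to_csv()

# Without index_col, the saved row labels come back as an ordinary column
df_plain = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses the first column as the index again
df_restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(df_plain.columns.tolist())   # ['Unnamed: 0', 'A', 'B']
print(df_restored.index.tolist())  # ['row1', 'row2']
```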

### 2.5 Changing Column Names (Headers)

You can change the column names using:
- `df.columns = [...]` to set all column names at once
- `df.rename(columns={...}, inplace=True)` to rename specific columns


In [43]:
# Show original columns
print("Original columns:", df.columns.tolist())

# Method 1: set all column names
df.columns = ["Name", "Age", "City"]
print("After setting all columns:", df.columns.tolist())

# Method 2: rename specific columns
df.rename(columns={"City": "HomeCity"}, inplace=True)
print("After renaming one column:", df.columns.tolist())

df

Original columns: ['name', 'age', 'city']
After setting all columns: ['Name', 'Age', 'City']
After renaming one column: ['Name', 'Age', 'HomeCity']


| | Name | Age | HomeCity |
|---|---|---|---|
| 0 | Alice | 21 | Turin |
| 1 | Bob | 25 | Milan |
| 2 | Charlie | 23 | Rome |
| 3 | Diana | 22 | Naples |
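
Without `inplace=True`, `rename` leaves the original untouched and returns a new `DataFrame`. The sketch below shows this on a small made-up table, and also that assigning to `.columns` requires exactly one name per column.

```python
import pandas as pd

people = pd.DataFrame({"Name": ["Alice"], "City": ["Turin"]})

# rename() returns a renamed copy; `people` itself is unchanged
renamed = people.rename(columns={"City": "HomeCity"})

print(people.columns.tolist())   # ['Name', 'City']
print(renamed.columns.tolist())  # ['Name', 'HomeCity']

# Assigning to .columns with the wrong number of names fails
try:
    people.columns = ["OnlyOneName"]
    mismatch_error = False
except ValueError:
    mismatch_error = True

print(mismatch_error)  # True
```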


### 2.6 Changing Specific Data in a DataFrame

You can change values using:
- `.loc[row_label, column_label]` (label-based)
- `.iloc[row_index, column_index]` (position-based)


In [46]:
# Copy the DataFrame to avoid losing the original data
df_edit = df.copy()

# Change a specific value using .loc (labels)
# Example: change the Age of the row whose index label is 0 (here, the first row)
df_edit.loc[0, "Age"] = 30

# Change a specific value using .iloc (integer positions)
# Example: change the HomeCity of the 2nd row (position 1)
df_edit.iloc[1, df_edit.columns.get_loc("HomeCity")] = "Venice"

df_edit

| | Name | Age | HomeCity |
|---|---|---|---|
| 0 | Alice | 30 | Turin |
| 1 | Bob | 25 | Venice |
| 2 | Charlie | 23 | Rome |
| 3 | Diana | 22 | Naples |
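
The difference between labels and positions is clearer when the index is not `0, 1, 2, ...`. The sketch below uses a made-up table with string row labels, and also shows `.loc` with a condition in place of a single row label.

```python
import pandas as pd

scores = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [21, 25, 23]},
    index=["a", "b", "c"],  # string row labels, chosen for this example
)

scores.loc["b", "Age"] = 26   # label-based: row "b", column "Age"
scores.iloc[0, 1] = 30        # position-based: first row, second column

# .loc also accepts a condition to pick the rows to change
scores.loc[scores["Name"] == "Charlie", "Age"] = 24

print(scores["Age"].tolist())  # [30, 26, 24]
```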


### 2.7 Adding and Dropping Columns

You can easily add a new column or remove an existing one.


In [48]:
# Add a new column based on existing columns
df_edit["AgeNextYear"] = df_edit["Age"] + 1

# Drop a column (drop returns a new DataFrame; df_edit keeps the column)
df_no_age_next = df_edit.drop(columns=["AgeNextYear"])

df_edit, df_no_age_next

(      Name  Age HomeCity  AgeNextYear
 0    Alice   30    Turin           31
 1      Bob   25   Venice           26
 2  Charlie   23     Rome           24
 3    Diana   22   Naples           23,
       Name  Age HomeCity
 0    Alice   30    Turin
 1      Bob   25   Venice
 2  Charlie   23     Rome
 3    Diana   22   Naples)
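
`drop` can be written in more than one way. As a sketch on a small made-up table, the `columns=` keyword and the `axis=1` form give the same result, and `index=` drops rows instead.

```python
import pandas as pd

frame = pd.DataFrame(
    {"Name": ["Alice", "Bob"], "Age": [30, 25], "AgeNextYear": [31, 26]}
)

by_keyword = frame.drop(columns=["AgeNextYear"])  # keyword form
by_axis = frame.drop("AgeNextYear", axis=1)       # equivalent axis form
without_first_row = frame.drop(index=[0])         # drops the row labeled 0

print(by_keyword.equals(by_axis))          # True
print(frame.columns.tolist())              # ['Name', 'Age', 'AgeNextYear'] (unchanged)
print(without_first_row["Name"].tolist())  # ['Bob']
```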

### 2.8 Filtering Rows

Filtering means selecting rows based on a condition.  
For example:

- All rows where `Age > 22`
- All rows where `HomeCity == "Turin"`


In [51]:
# Filter rows where Age > 22

older_than_22 = df_edit[df_edit["Age"] > 22]
older_than_22

| | Name | Age | HomeCity | AgeNextYear |
|---|---|---|---|---|
| 0 | Alice | 30 | Turin | 31 |
| 1 | Bob | 25 | Venice | 26 |
| 2 | Charlie | 23 | Rome | 24 |


In [53]:
# Filter rows where HomeCity is "Turin"

from_turin = df_edit[df_edit["HomeCity"] == "Turin"]
from_turin

| | Name | Age | HomeCity | AgeNextYear |
|---|---|---|---|---|
| 0 | Alice | 30 | Turin | 31 |
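
Conditions can also be combined. The sketch below rebuilds a similar table under the name `people` (so it runs on its own) and uses `&` for "and", `|` for "or", and `.isin()` for membership in a list. Each condition needs its own parentheses.

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Diana"],
    "Age": [30, 25, 23, 22],
    "HomeCity": ["Turin", "Venice", "Rome", "Naples"],
})

# Both conditions must hold (and)
both = people[(people["Age"] > 22) & (people["HomeCity"] != "Turin")]

# At least one condition must hold (or)
either = people[(people["Age"] > 28) | (people["HomeCity"] == "Naples")]

# The value must be one of several options
in_list = people[people["HomeCity"].isin(["Rome", "Naples"])]

print(both["Name"].tolist())     # ['Bob', 'Charlie']
print(either["Name"].tolist())   # ['Alice', 'Diana']
print(in_list["Name"].tolist())  # ['Charlie', 'Diana']
```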


### 2.9 Selecting Columns

You can select one or more columns using square brackets:
- `df["column_name"]` → returns a **Series**
- `df[["col1", "col2"]]` → returns a **DataFrame**


In [55]:
# Select a single column (Series)
names = df_edit["Name"]
print("Type of 'names':", type(names))
display(names)

# Select multiple columns (DataFrame)
name_and_city = df_edit[["Name", "HomeCity"]]
print("Type of 'name_and_city':", type(name_and_city))
name_and_city

Type of 'names': <class 'pandas.core.series.Series'>


| | Name |
|---|---|
| 0 | Alice |
| 1 | Bob |
| 2 | Charlie |
| 3 | Diana |


Type of 'name_and_city': <class 'pandas.core.frame.DataFrame'>


| | Name | HomeCity |
|---|---|---|
| 0 | Alice | Turin |
| 1 | Bob | Venice |
| 2 | Charlie | Rome |
| 3 | Diana | Naples |
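
The number of brackets matters even for a single column. In the sketch below, on a small made-up table, `people["Name"]` gives a one-dimensional `Series`, while `people[["Name"]]` gives a `DataFrame` with one column.

```python
import pandas as pd

people = pd.DataFrame({"Name": ["Alice", "Bob"], "HomeCity": ["Turin", "Venice"]})

as_series = people["Name"]   # single brackets: Series
as_frame = people[["Name"]]  # double brackets: DataFrame with one column

print(type(as_series).__name__, as_series.shape)  # Series (2,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
```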


---
## 3. Summary

In this notebook, you learned:

- How to work with **Python dictionaries** (create, access, update, delete, loop)
- How to use **pandas** to:
  - Create `DataFrame`s from dictionaries, lists, and NumPy arrays
  - Display tables in Jupyter
  - Change column headers
  - Edit specific data using `.loc` and `.iloc`
  - Add and remove columns
  - Filter rows based on conditions
  - Select specific columns

You can now use this notebook as a **template** and play with your own data.
