# pandas - Python Data Analysis Library
## Introduction
**Pandas** is a powerful Python library for data manipulation and analysis. Full documentation is available on the [pandas website](https://pandas.pydata.org).  
It provides two main data structures: `Series` and `DataFrame`.

In this notebook, we will learn how to:
- Create and manipulate Series and DataFrames
- Load data from CSV files
- Perform basic statistical operations
- Filter and visualize data

In [2]:
import sys
!{sys.executable} -m pip install pandas

Collecting pandas
  Downloading pandas-2.3.1-cp313-cp313-macosx_11_0_arm64.whl.metadata (91 kB)
Collecting pytz>=2020.1 (from pandas)
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pandas-2.3.1-cp313-cp313-macosx_11_0_arm64.whl (10.7 MB)
Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
Installing collected packages: pytz, tzdata, pandas
Successfully installed pandas-2.3.1 pytz-2025.2 tzdata-2025.2

[notice] A new release of pip is available: 25.1.1 -> 25.2

In [2]:
import pandas as pd
import numpy as np

## Creating a Series
A Series is a one-dimensional array-like object containing a sequence of values and an associated array of data labels, called its index.

In [3]:
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)

a    10
b    20
c    30
d    40
dtype: int64
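
Because the index labels each value, elements can be selected by label, and arithmetic applies to every element at once. The following is a minimal sketch that rebuilds the Series from the cell above; the names `value_b`, `subset`, and `doubled` are illustrative and not part of the original notebook.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Select a single value by its index label
value_b = s['b']

# Select several labels at once; the result is a new Series
subset = s[['a', 'd']]

# Arithmetic is applied element-wise and the index is preserved
doubled = s * 2

print(value_b)
print(subset)
print(doubled)
```

Passing a list of labels returns a Series rather than a single value, which is why `subset` keeps its own two-element index.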


## Creating a DataFrame
A DataFrame is a two-dimensional, size-mutable table with labeled rows and columns; each column can hold a different data type.

In [4]:
data = {
    'Name': ['Anna', 'Luca', 'Marco', 'Julia'],
    'Age': [17, 18, 16, 17],
    'Class': ['5A', '5B', '4A', '5A']
}

df = pd.DataFrame(data)
df

    Name  Age Class
0   Anna   17    5A
1   Luca   18    5B
2  Marco   16    4A
3  Julia   17    5A
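
The introduction lists loading data from CSV files as one of the goals. As a rough sketch of how that might look, the example below reads the same student table from CSV text with `pd.read_csv`. The CSV content is held in memory through `io.StringIO` so the cell runs without an external file; `csv_text` and `df_csv` are hypothetical names chosen for this illustration.

```python
import io
import pandas as pd

# Stand-in CSV content mirroring the table above (an assumption for this
# example; in practice this would usually be a file on disk)
csv_text = """Name,Age,Class
Anna,17,5A
Luca,18,5B
Marco,16,4A
Julia,17,5A
"""

# read_csv accepts a file path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.shape)
print(df_csv.dtypes)
```

With a real file, the call would take a path instead, for example `pd.read_csv('students.csv')`, where the file name is again only a placeholder.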


## Basic DataFrame operations

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    4 non-null      object
 1   Age     4 non-null      int64 
 2   Class   4 non-null      object
dtypes: int64(1), object(2)
memory usage: 228.0+ bytes


In [6]:
df.describe()

          Age
count   4.0
mean   17.0
std     0.816497
min    16.0
25%    16.75
50%    17.0
75%    17.25
max    18.0


In [7]:
df['Age'].mean()

np.float64(17.0)

In [8]:
df['Class'].value_counts()

Class
5A    2
5B    1
4A    1
Name: count, dtype: int64
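
The mean and the per-class counts shown above can be combined into a single grouped statistic. The sketch below computes the average age within each class using `groupby`; it rebuilds the same DataFrame so the cell is self-contained, and `mean_age_by_class` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Luca', 'Marco', 'Julia'],
    'Age': [17, 18, 16, 17],
    'Class': ['5A', '5B', '4A', '5A'],
})

# Split the rows by class, then average the Age column within each group
mean_age_by_class = df.groupby('Class')['Age'].mean()

print(mean_age_by_class)
```

By default `groupby` sorts the group keys, so the resulting Series is indexed by class name in alphabetical order.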

## Filtering and selecting data

In [9]:
df[df['Age'] > 17]

   Name  Age Class
1  Luca   18    5B


In [10]:
df[df['Class'] == '5A']

    Name  Age Class
0   Anna   17    5A
3  Julia   17    5A
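
The two filters above each test a single condition. Conditions can also be combined with `&` (and) and `|` (or), provided each one is wrapped in parentheses. A minimal sketch on the same data, with `both` and `either` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Luca', 'Marco', 'Julia'],
    'Age': [17, 18, 16, 17],
    'Class': ['5A', '5B', '4A', '5A'],
})

# Rows that satisfy both conditions
both = df[(df['Age'] >= 17) & (df['Class'] == '5A')]

# Rows that satisfy at least one condition
either = df[(df['Age'] > 17) | (df['Class'] == '4A')]

print(both)
print(either)
```

The parentheses are needed because `&` and `|` bind more tightly than comparison operators in Python; the keywords `and` and `or` do not work element-wise on a Series.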


In [11]:
df.loc[0:2, ['Name', 'Class']]

    Name Class
0   Anna    5A
1   Luca    5B
2  Marco    4A
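
Note that `df.loc[0:2, ...]` returned three rows: `loc` selects by label and includes both ends of the slice. Its positional counterpart `iloc` follows normal Python slicing and excludes the end. The sketch below contrasts the two on the same data; `by_label` and `by_position` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Luca', 'Marco', 'Julia'],
    'Age': [17, 18, 16, 17],
    'Class': ['5A', '5B', '4A', '5A'],
})

# Label-based: rows labeled 0, 1 and 2 (end label included)
by_label = df.loc[0:2, ['Name', 'Class']]

# Position-based: rows at positions 0 and 1 (end position excluded),
# columns at positions 0 and 2 (Name and Class)
by_position = df.iloc[0:2, [0, 2]]

print(by_label)
print(by_position)
```

The difference is easy to miss here because the default index labels happen to match the row positions.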


## Exercises

1. Create a DataFrame with at least 5 students, including columns: Name, Surname, Age, Class.
2. Calculate the average age of the students.
3. Display how many students are in each class.
4. Filter only the students in class 5A.

Try to complete these tasks in new code cells below.

In [None]:
## solve the exercise here