Import Pandas and NumPy for the notebook:

In [5]:
import pandas as pd
import numpy as np

---
## Pandas Series

Create a Series from a list, with custom index labels:

In [2]:
ages = pd.Series([25, 30, 35, 40], index=['Alice', 'Bob', 'Charlie', 'David'])
ages

Alice      25
Bob        30
Charlie    35
David      40
dtype: int64

Look up a value in the Series by its index label:

In [4]:
print(ages['Bob'])

30
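
The bracket lookup above uses the index label. As a minimal sketch, the same value can also be reached explicitly by label with `.loc` or by position with `.iloc`; the variable names `bob_by_label`, `bob_by_position` and `subset` are only illustrative:

```python
import pandas as pd

ages = pd.Series([25, 30, 35, 40], index=['Alice', 'Bob', 'Charlie', 'David'])

# Label-based lookup: explicit about using the index label
bob_by_label = ages.loc['Bob']

# Position-based lookup: 'Bob' is the second entry, so position 1
bob_by_position = ages.iloc[1]

# A list of labels returns a new Series instead of a single value
subset = ages.loc[['Alice', 'David']]

print(bob_by_label, bob_by_position)
print(subset)
```

Using `.loc` and `.iloc` makes it clear whether a lookup is by label or by position, which matters once the index itself contains integers.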


---
## Creating Series from other objects

**Tuple:**

In [6]:
t = ('a', 'b', 'c')
s = pd.Series(t)
s

0    a
1    b
2    c
dtype: object

**Set:**

Sets are unordered, so pandas does not accept one directly: passing a set to pd.Series raises a TypeError. Convert the set to a list first. The order of the resulting values is not guaranteed.

In [9]:
st = {100, 200, 300}
s = pd.Series(list(st))
s

0    200
1    100
2    300
dtype: int64
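
If a predictable order matters, one option is to sort the set before building the Series. The sketch below also shows the error pandas raises when the set is passed directly; the names `raised` and `s_sorted` are illustrative:

```python
import pandas as pd

st = {100, 200, 300}

# Passing the set directly is rejected because sets have no defined order
try:
    pd.Series(st)
    raised = False
except TypeError:
    raised = True

# sorted() returns a list in ascending order, so the result is deterministic
s_sorted = pd.Series(sorted(st))

print(raised)
print(s_sorted)
```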

**Dictionary:**

The keys become the index automatically:

In [10]:
d = {'Alice': 25, 'Bob': 30, 'Charlie': 35}
s = pd.Series(d)
s

Alice      25
Bob        30
Charlie    35
dtype: int64

**NumPy arrays:**

In [11]:
arr = np.array([5, 10, 15])
s = pd.Series(arr)
s

0     5
1    10
2    15
dtype: int64

**Random numbers:**

Use NumPy to generate an array of random integers and wrap it in a Series:

In [12]:
s = pd.Series(np.random.randint(1, 100, size=5))
s

0    11
1    67
2     5
3    86
4    76
dtype: int64
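
The values above change on every run. If repeatable results are needed, a possible approach is NumPy's `default_rng` generator with a fixed seed. This is a sketch, and the seed value 42 is an arbitrary choice:

```python
import numpy as np
import pandas as pd

# Two generators created with the same seed produce the same sequence
rng = np.random.default_rng(seed=42)  # 42 is an arbitrary example seed
s1 = pd.Series(rng.integers(1, 100, size=5))  # integers from 1 to 99

rng = np.random.default_rng(seed=42)
s2 = pd.Series(rng.integers(1, 100, size=5))

# True: both Series hold identical values
print(s1.equals(s2))
```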


---
## Pandas DataFrame

Create a DataFrame from a dictionary:

In [13]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

df = pd.DataFrame(data)
df

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston


Show only the rows where Age is greater than 30:

In [14]:
df[df['Age'] > 30]

      Name  Age     City
2  Charlie   35  Chicago
3    David   40  Houston
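
The filter works in two steps: the comparison produces a Series of True/False values (a boolean mask), and indexing the DataFrame with that mask keeps only the True rows. A minimal sketch, where `mask`, `older` and `older_not_houston` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
})

# Step 1: the comparison returns one True/False value per row
mask = df['Age'] > 30

# Step 2: indexing with the mask keeps the rows where it is True
older = df[mask]

# Conditions are combined with & (and) or | (or); each needs its own parentheses
older_not_houston = df[(df['Age'] > 30) & (df['City'] != 'Houston')]

print(mask)
print(older)
print(older_not_houston)
```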


A Pandas DataFrame is essentially a collection of Series objects that share the same index. Each column in a DataFrame is a Series, and the rows are aligned by the DataFrame’s index. Because each column is its own Series, different columns can have different data types, but all values within a single column share one data type.
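
This can be checked directly. The sketch below pulls one column out of a DataFrame and inspects its type, the per-column data types, and the shared index; `age_column` is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
})

# Selecting a single column returns a Series
age_column = df['Age']
print(type(age_column))

# Each column reports its own data type
print(df.dtypes)

# The column carries the same index as the DataFrame it came from
print(age_column.index.equals(df.index))
```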

**Combining series:**

In [15]:
# Create Series
names = pd.Series(['Alice', 'Bob', 'Charlie'])
ages = pd.Series([25, 30, 35])
cities = pd.Series(['New York', 'Los Angeles', 'Chicago'])

# Combine into a DataFrame
df = pd.DataFrame({
    'Name': names,
    'Age': ages,
    'City': cities
})

df

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


**From a 2D list:**

In [16]:
data = [
    [1, 'Alice', 25],
    [2, 'Bob', 30],
    [3, 'Charlie', 35]
]

df = pd.DataFrame(data, columns=['ID', 'Name', 'Age'])
df

   ID     Name  Age
0   1    Alice   25
1   2      Bob   30
2   3  Charlie   35


**From a 2D NumPy array:**

Each row in the array becomes a row in the DataFrame. The column names are specified with the columns argument:

In [17]:
arr = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90]
])

df = pd.DataFrame(arr, columns=['A', 'B', 'C'])
df

    A   B   C
0  10  20  30
1  40  50  60
2  70  80  90


**From a dictionary of lists:**

Keys become column names, values are the column data:

In [18]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)
df

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


**From a list of dictionaries:**

Each dictionary becomes a row. Keys that are missing from a dictionary are filled with NaN (Not a Number), which is why the Age column is displayed as floats:

In [19]:
data = [
    {'Name': 'Alice', 'Age': 25},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'City': 'Chicago'}
]

df = pd.DataFrame(data)
df

      Name   Age         City
0    Alice  25.0          NaN
1      Bob  30.0  Los Angeles
2  Charlie   NaN      Chicago


Column names are inferred from dictionary keys. For lists and NumPy arrays, pandas falls back to integer column labels unless you provide names via columns=[...].
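
As a small sketch of the columns argument with a list of dictionaries: here it selects and orders the columns rather than renaming them. The example also shows the float conversion caused by a missing value; `df_full` and `df_selected` are illustrative names:

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 25},
    {'Name': 'Bob', 'Age': 30, 'City': 'Los Angeles'},
    {'Name': 'Charlie', 'City': 'Chicago'}
]

# Without columns=, every key that appears becomes a column
df_full = pd.DataFrame(data)

# Age holds a missing value, so pandas stores the column as floats
print(df_full['Age'].dtype)

# With columns=, only the listed keys are kept, in the given order
df_selected = pd.DataFrame(data, columns=['City', 'Name'])
print(df_selected)
```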

If no index is specified, pandas creates a default integer index [0, 1, 2, ...].
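
To contrast the default index with a custom one, the sketch below builds the same data twice, once without and once with the index argument; `df_default` and `df_labeled` are illustrative names:

```python
import pandas as pd

# No index given: pandas numbers the rows 0, 1, 2
df_default = pd.DataFrame({'Age': [25, 30, 35]})
print(df_default.index)

# index= supplies one label per row
df_labeled = pd.DataFrame({'Age': [25, 30, 35]}, index=['Alice', 'Bob', 'Charlie'])

# Rows can then be looked up by label
print(df_labeled.loc['Bob', 'Age'])
```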