# Core Data Structures in Pandas

Pandas is built on **two main data structures**:

1. **Series** → One-dimensional (like a single column in Excel)
2. **DataFrame** → Two-dimensional (like a full spreadsheet or SQL table)
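
As a small illustrative sketch (the names `prices` and `table` are made up for this example), the dimensionality of each structure can be checked through the `ndim` and `shape` attributes:

```python
import pandas as pd

# A Series has one axis; a DataFrame has two (rows and columns).
prices = pd.Series([10, 20, 30])
table = pd.DataFrame({"price": [10, 20, 30], "qty": [1, 2, 3]})

print(prices.ndim, prices.shape)  # 1 (3,)
print(table.ndim, table.shape)    # 2 (3, 2)
```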

***

## Series — 1D Labeled Array

A `Series` is like a list in which every value carries a **label (its index)**.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40])
print(s)
```

**Output:**

```
0    10
1    20
2    30
3    40
dtype: int64
```

Notice the **automatic index**: when no labels are given, pandas numbers the rows 0, 1, 2, 3.

You can also define a custom index:

```python
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
```

A `pandas.Series` can look like a Python dictionary because both map labels to values, but a Series does more. It supports fast vectorized operations, aligns values by index label during arithmetic, and represents missing data with `NaN`. It allows both label-based and position-based access, and it is the building block of a DataFrame. A dictionary is a good fit for simple key–value storage; a Series is better suited to data analysis, where performance and built-in functionality matter.
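
The differences from a dictionary can be sketched as follows. The fruit labels and numbers are invented for illustration; only standard pandas calls (`pd.Series`, `.loc`, `.iloc`) are used.

```python
import pandas as pd

# Two Series built from dictionaries; their labels only partly overlap.
prices = pd.Series({"apple": 10, "banana": 20, "cherry": 30})
stock = pd.Series({"banana": 5, "cherry": 2, "durian": 7})

# Vectorized arithmetic: one expression applies to every element.
discounted = prices * 0.9

# Index alignment: values are matched by label, not by position.
# Labels present on only one side ("apple", "durian") produce NaN.
value = prices * stock

# Label-based and position-based access.
by_label = prices.loc["banana"]   # 20
by_position = prices.iloc[0]      # 10

print(value)
```

A plain dictionary would need an explicit loop for the multiplication and a manual check for missing keys.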

***

## DataFrame — 2D Labeled Table

A `DataFrame` is like a **dictionary of Series**: several labeled columns that share one row index.

```python
data = {
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Bangalore"]
}

df = pd.DataFrame(data)
print(df)
```

**Output:**

```
      name  age       city
0    Alice   25      Delhi
1      Bob   30     Mumbai
2  Charlie   35  Bangalore
```

Each column in a `DataFrame` is a `Series`.
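
A quick way to check this, sketched with a trimmed-down copy of the table above (the variable `ages` is introduced only for this example):

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
})

ages = df["age"]                     # selecting one column returns a Series
print(type(ages).__name__)           # Series
print(ages.index.equals(df.index))   # True: the column shares the row index
print(ages.mean())                   # 30.0
```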

***

## Index and Labels

Every Series and DataFrame has an **Index**, a set of labels that helps with:

* Fast lookups
* Aligning data
* Merging & joining
* Time series operations

```python
df.index         # Row labels
df.columns       # Column labels
```
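
To illustrate label-based lookups, here is a minimal sketch that promotes a column to the row index with `set_index` and then reads values with `.loc`. Using `name` as the index is a choice made for this example, not something the table requires.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Bangalore"],
})

# Promote the "name" column to the row index (returns a new DataFrame).
by_name = df.set_index("name")

# Label-based lookup of a single cell: row "Bob", column "age".
bob_age = by_name.loc["Bob", "age"]
print(bob_age)   # 30

# Label-based lookup of a whole row returns a Series.
print(by_name.loc["Alice"])
```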

You can replace them by assigning new labels (each list must have one label per row or column):

```python
df.index = ["a", "b", "c"]
df.columns = ["Name", "Age", "City"]
```
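
When only some labels need to change, `DataFrame.rename` is an alternative to assigning a full list. The sketch below uses a made-up two-row table and also shows that direct assignment rejects a list of the wrong length.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

# rename() maps old labels to new ones and returns a new DataFrame;
# labels not mentioned in the mapping are left unchanged.
renamed = df.rename(columns={"name": "Name"}, index={0: "a"})
print(list(renamed.columns))  # ['Name', 'age']
print(list(renamed.index))    # ['a', 1]

# Direct assignment must supply exactly one label per column.
try:
    df.columns = ["only_one"]
    length_check_failed = False
except ValueError:
    length_check_failed = True
print(length_check_failed)    # True
```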

***

## Why Learn These Well?

Most Pandas operations are built on these foundations:

* Selection
* Filtering
* Merging
* Aggregation

Understanding Series & DataFrames will make everything else easier.
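
The four operations listed above can be sketched in a few lines. The `regions` lookup table and the city assignments are hypothetical data invented for this example.

```python
import pandas as pd

people = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Mumbai"],
})
# Hypothetical lookup table, invented for this example.
regions = pd.DataFrame({
    "city": ["Delhi", "Mumbai"],
    "region": ["North", "West"],
})

# Selection: a list of column names returns a smaller DataFrame.
selected = people[["name", "age"]]

# Filtering: a boolean Series acts as a row mask.
over_28 = people[people["age"] > 28]

# Merging: join two DataFrames on a shared column.
merged = people.merge(regions, on="city")

# Aggregation: group rows by city, then average the ages.
mean_age = people.groupby("city")["age"].mean()

print(mean_age)
```

Each result is itself a Series or DataFrame, which is why the same indexing and alignment rules keep applying.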

***

## Summary

* `Series` = 1D array with labels
* `DataFrame` = 2D table with rows + columns
* Both carry an index and are the heart of Pandas

In [1]:
import pandas as pd

In [2]:
s1 = pd.Series([71, 84, 56, 23, 56, 98, 56])

In [3]:
type(s1)

pandas.core.series.Series

In [4]:
print(s1)

0    71
1    84
2    56
3    23
4    56
5    98
6    56
dtype: int64


In [5]:
s2 = pd.Series([71, 84, 56, 23, 56, 98, 56], index=["Harry", "Shubh", "Rohan", "Aakash", "Kirti", "John", "Rehan"])

In [6]:
s2

Harry     71
Shubh     84
Rohan     56
Aakash    23
Kirti     56
John      98
Rehan     56
dtype: int64

In [7]:
s2["Harry"]

np.int64(71)

In [8]:
s2["Rohan"]

np.int64(56)

In [9]:
data = {
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["Delhi", "Mumbai", "Bangalore"]
}

In [10]:
df = pd.DataFrame(data)

In [11]:
df

      name  age       city
0    Alice   25      Delhi
1      Bob   30     Mumbai
2  Charlie   35  Bangalore


In [12]:
df.index

RangeIndex(start=0, stop=3, step=1)

In [13]:
df.columns

Index(['name', 'age', 'city'], dtype='object')

In [14]:
df.describe()

        age
count   3.0
mean   30.0
std     5.0
min    25.0
25%    27.5
50%    30.0
75%    32.5
max    35.0

In [15]:
df.head(1)

    name  age   city
0  Alice   25  Delhi


In [18]:
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
s

a    10
b    20
c    30
dtype: int64

In [20]:
df.index         # Row labels
df.columns       # Column labels

Index(['name', 'age', 'city'], dtype='object')

In [22]:
df.index = ["a", "b", "c"]
df.columns = ["Name", "Age", "City"]

In [23]:
df

      Name  Age       City
a    Alice   25      Delhi
b      Bob   30     Mumbai
c  Charlie   35  Bangalore
