# 📘 Pandas - Intro to Data Structures


This notebook introduces the core **Pandas data structures** through short explanations and runnable examples.
We’ll explore:
- **Series**
- **DataFrame**
- Indexing, access, and basic operations


## 🟢 Series


A **Series** is a one-dimensional labeled array that can hold any data type (integers, strings, floats, Python objects, etc.). Each value is paired with an index label.  
It’s similar to a single column in an Excel sheet or a database table.


In [None]:

import pandas as pd
import numpy as np

# Create a simple Series; np.nan marks a missing value,
# so pandas stores the whole Series as float64
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print("Series:")
print(s)


### Custom Index for Series

In [None]:

s_custom = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print("Custom Index Series:")
print(s_custom)
print("\nAccess element 'b':", s_custom['b'])
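With a custom index, it helps to be explicit about whether a lookup is by label or by position. The sketch below reuses `s_custom` and shows `.loc` (labels) next to `.iloc` (positions); the variable names other than `s_custom` are illustrative only.

```python
import pandas as pd

s_custom = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

by_label = s_custom.loc['b']         # label-based lookup
by_position = s_custom.iloc[1]       # position-based lookup (0-indexed)

label_slice = s_custom.loc['a':'b']  # label slices include both endpoints
position_slice = s_custom.iloc[0:1]  # position slices exclude the stop

print(by_label, by_position)
print(label_slice.tolist(), position_slice.tolist())
```

Note the difference in slicing: the label slice returns two elements, the position slice only one.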


### Series from Dictionary

In [None]:

data_dict = {'x': 100, 'y': 200, 'z': 300}
s_from_dict = pd.Series(data_dict)
s_from_dict
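When a dictionary is passed together with an explicit `index`, pandas picks values by key in the order given, and keys that are missing from the dictionary become `NaN`. A minimal sketch, where the label `'w'` and the name `s_subset` are made up for illustration:

```python
import pandas as pd

data_dict = {'x': 100, 'y': 200, 'z': 300}

# 'z' and 'x' are looked up by key; 'w' is not in the dict, so it becomes NaN
s_subset = pd.Series(data_dict, index=['z', 'x', 'w'])
print(s_subset)
```

Because `NaN` is a float, the resulting Series is stored as `float64` even though the dictionary values are integers.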


### Vectorized Operations

In [None]:

s_ops = pd.Series([1, 2, 3, 4])
print("Original:", s_ops.values)
# Arithmetic applies element-wise, with no explicit loop
print("Add 10:", (s_ops + 10).values)
print("Squared:", (s_ops ** 2).values)
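Operations between two Series align on index labels rather than on position. The sketch below uses two small, made-up Series (`s1`, `s2`) with partially overlapping labels to show what happens where a label exists on only one side.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Values are matched by label; labels present on only one side give NaN
aligned_sum = s1 + s2
print(aligned_sum)

# .add() with fill_value treats a missing side as 0 instead
filled_sum = s1.add(s2, fill_value=0)
print(filled_sum)
```

The result index is the union of both indexes, so no rows are silently dropped.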


## 🔵 DataFrame


A **DataFrame** is a two-dimensional labeled data structure whose columns can each hold a different type, similar to a spreadsheet or SQL table.


In [None]:

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
}
df = pd.DataFrame(data)
df
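To see the "different types per column" point concretely, one option is to inspect the shape and per-column dtypes right after construction. This sketch rebuilds the same `df` so it runs on its own:

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
}
df = pd.DataFrame(data)

print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # one dtype per column
print(df.columns.tolist())
```

The exact dtype reported for the text columns can vary between pandas versions, while `Age` is stored as an integer column.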


### DataFrame from NumPy Array

In [None]:

# Seeded generator so the random values are the same on every run
arr = np.random.default_rng(0).standard_normal((4, 3))
df_arr = pd.DataFrame(arr, columns=['A', 'B', 'C'])
df_arr


### Accessing Data in DataFrame

In [None]:

print("First 2 rows:")
print(df.head(2))
print("\nAccess 'Name' column:")
print(df['Name'])
print("\nAccess row by label (loc):")
print(df.loc[1])   # 1 is a label here; the default index labels are 0..3
print("\nAccess row by position (iloc):")
print(df.iloc[2])  # third row, counting from 0
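`loc` and `iloc` also accept a column selector, and `loc` accepts a boolean mask for filtering rows. The following is a small sketch along those lines; it rebuilds `df` to be self-contained, and the result names (`over_23`, `cell`, and so on) are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
})

cell = df.loc[1, 'City']                        # single value: row label 1, column 'City'
first_two = df.iloc[0:2, 0:2]                   # first two rows and first two columns
over_23 = df[df['Age'] > 23]                    # rows where the condition holds
names_over_23 = df.loc[df['Age'] > 23, 'Name']  # mask plus column selection

print(cell)
print(first_two)
print(names_over_23.tolist())
```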


### Adding and Deleting Columns

In [None]:

df['Salary'] = [50000, 60000, 45000, 70000]
print("After adding column:")
print(df)

df = df.drop(columns=['Age'])
print("\nAfter deleting 'Age' column:")
print(df)
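`drop` returns a new DataFrame rather than modifying the original, which is why its result has to be assigned. The sketch below illustrates this on a small stand-in frame (`df_demo`, a hypothetical name) and shows `assign` as a non-mutating way to add a column.

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [24, 27]
})

# drop() leaves df_demo untouched and returns a new object
df_dropped = df_demo.drop(columns=['Age'])

# assign() likewise returns a copy with the extra column
df_with_flag = df_demo.assign(IsAdult=df_demo['Age'] >= 18)

print(df_demo.columns.tolist())
print(df_dropped.columns.tolist())
print(df_with_flag.columns.tolist())
```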


### Basic Operations on DataFrame

In [None]:

print("Summary statistics:")
print(df.describe())  # numeric columns only by default (here: 'Salary')

print("\nTranspose:")
print(df.T)


### Handling Missing Data

In [None]:

df_nan = df.copy()
df_nan.loc[2, 'Salary'] = np.nan
print("With missing data:")
print(df_nan)

print("\nAfter filling NaN with mean:")
df_nan['Salary'] = df_nan['Salary'].fillna(df_nan['Salary'].mean())
print(df_nan)
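Before filling, it is often useful to check how many values are missing, and dropping the incomplete rows is an alternative to filling them. A minimal sketch on a standalone frame with the same salaries as above:

```python
import numpy as np
import pandas as pd

df_nan = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [50000, 60000, np.nan, 70000]
})

# Count missing values per column
missing_per_column = df_nan.isna().sum()
print(missing_per_column)

# mean() skips NaN by default, so it averages the three known salaries
mean_salary = df_nan['Salary'].mean()
print(mean_salary)

# Alternative to filling: drop rows where 'Salary' is missing
df_dropped = df_nan.dropna(subset=['Salary'])
print(df_dropped)
```

Whether to fill or drop depends on the analysis; filling with the mean keeps every row but reduces the variance of the column.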



## ✅ Summary
- **Series** → 1D labeled array  
- **DataFrame** → 2D labeled table  
- Support for vectorized operations, alignment by labels, and missing data handling  
