# Pandas Introduction

Pandas is a Python library for data manipulation and analysis. Its two core data structures, the one-dimensional Series and the two-dimensional DataFrame, are designed for labeled, structured data.

In [3]:
import pandas as pd
import numpy as np

print("Pandas imported successfully!")
print(f"Pandas version: {pd.__version__}")

Pandas imported successfully!
Pandas version: 2.3.0


## Pandas Series

A Series is a one-dimensional labeled array that can hold any data type.

In [4]:
# Creating a Series
s = pd.Series([1, 3, 5, 6, 8])
print("Series:")
print(s)
print(f"Type: {type(s)}")

# Series with custom index
s_custom = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print("\nSeries with custom index:")
print(s_custom)

# Accessing elements by index label
# (with the default index the labels are 0..4, so s[0] is the first element)
print(f"s[0]: {s[0]}")
print(f"s_custom['b']: {s_custom['b']}")

# Basic operations
print(f"Mean: {s.mean()}")
print(f"Sum: {s.sum()}")
print(f"Max: {s.max()}")

Series:
0    1
1    3
2    5
3    6
4    8
dtype: int64
Type: <class 'pandas.core.series.Series'>

Series with custom index:
a    10
b    20
c    30
dtype: int64
s[0]: 1
s_custom['b']: 20
Mean: 4.6
Sum: 23
Max: 8
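
Because a Series is labeled, square-bracket access looks up a label, which only coincides with the position when the index is the default 0..n-1. The sketch below illustrates the difference using the explicit `.loc` (label) and `.iloc` (position) accessors. The names `s_int` and `mixed` are hypothetical and are not part of the cell above; `s_custom` is rebuilt so the block runs on its own.

```python
import pandas as pd

# Same values as s_custom above, rebuilt so this block is self-contained.
s_custom = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s_custom.loc["b"]     # label-based lookup
by_position = s_custom.iloc[1]   # position-based lookup (second element)
print(by_label, by_position)

# Hypothetical Series whose integer labels do not match the positions.
s_int = pd.Series([10, 20, 30], index=[2, 1, 0])
print(s_int.loc[0])    # label 0 is the last element
print(s_int.iloc[0])   # position 0 is the first element

# "Any data type" includes mixed values, stored with the generic object dtype.
mixed = pd.Series([1, "two", 3.0])
print(mixed.dtype)
```

Using `.loc` and `.iloc` states the intent explicitly, which helps once an index is no longer the default range.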


## Pandas DataFrame

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

In [5]:
# Creating a DataFrame from dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NYC', 'LA', 'Chicago']
}
df = pd.DataFrame(data)
print("DataFrame:")
print(df)
print(f"Type: {type(df)}")

# Basic info
print(f"\nShape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print(f"Index: {list(df.index)}")

# Accessing columns
print(f"\nNames: {df['Name'].tolist()}")
print(f"Ages: {df['Age'].tolist()}")

# Accessing rows: iloc selects by integer position, loc selects by index label
print(f"\nFirst row:\n{df.iloc[0]}")
print(f"Row with index 1:\n{df.loc[1]}")

# Summary statistics
print(f"\nAge statistics:\n{df['Age'].describe()}")

DataFrame:
      Name  Age     City
0    Alice   25      NYC
1      Bob   30       LA
2  Charlie   35  Chicago
Type: <class 'pandas.core.frame.DataFrame'>

Shape: (3, 3)
Columns: ['Name', 'Age', 'City']
Index: [0, 1, 2]

Names: ['Alice', 'Bob', 'Charlie']
Ages: [25, 30, 35]

First row:
Name    Alice
Age        25
City      NYC
Name: 0, dtype: object
Row with index 1:
Name    Bob
Age      30
City     LA
Name: 1, dtype: object

Age statistics:
count     3.0
mean     30.0
std       5.0
min      25.0
25%      27.5
50%      30.0
75%      32.5
max      35.0
Name: Age, dtype: float64
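
With the default index, `df.iloc[1]` and `df.loc[1]` return the same row, so the cell above does not show how they differ. The following minimal sketch rebuilds the same DataFrame, inspects the per-column types, and then uses `Name` as the index to separate label-based from position-based access. The variable names `by_name`, `bob_age`, and `first_city` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Same data as the DataFrame above, rebuilt so this block is self-contained.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NYC", "LA", "Chicago"],
})

# Each column carries its own dtype: Age is an integer column,
# while Name and City hold strings.
print(df.dtypes)

# Use the Name column as the row labels (returns a new DataFrame).
by_name = df.set_index("Name")

print(by_name.loc["Bob"])   # row selected by label
print(by_name.iloc[0])      # row selected by position (Alice's row)

# Both accessors also accept a (row, column) pair for a single value.
bob_age = by_name.loc["Bob", "Age"]
first_city = by_name.iloc[0, 1]   # first row, second remaining column (City)
print(bob_age, first_city)
```

After `set_index("Name")`, a call such as `by_name.loc[1]` would raise a `KeyError`, since 1 is no longer a row label, whereas `by_name.iloc[1]` still returns the second row.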
