
In [1]:
import numpy as np
import pandas as pd

## **Series**

A **Series** is one of the two primary data structures in pandas (the other being DataFrame). It's a **one-dimensional labeled array** capable of holding data of any type (integers, strings, floating point numbers, Python objects, etc.). Think of it as a single column of data with an associated index.

- One-dimensional: Contains a single column of data
- Homogeneous data type: All elements share a single dtype; mixed values are allowed, but pandas then stores them under the generic `object` dtype
- Labeled index: Each element has an associated label (index)
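- Generally fixed size: Length-changing methods such as `drop()` and `dropna()` return a new Series instead of resizing the original (assigning to a new label is the exception and does enlarge it)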
- Values are mutable: You can modify the values within the Series
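
The properties listed above can be checked directly. The following is a minimal sketch; the variable names (`mixed`, `s`, `shorter`) are illustrative and not part of the original notebook.

```python
import pandas as pd

# Mixed element types are accepted, but the Series falls back to the object dtype
mixed = pd.Series([1, "two", 3.0])
print(mixed.dtype)

# Values are mutable: assigning to an existing label changes the Series in place
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
s["b"] = 99
print(s["b"])

# drop() returns a new, shorter Series; the original keeps its length
shorter = s.drop("a")
print(len(s), len(shorter))
```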

In [4]:
import pandas as pd
import numpy as np

# 1. From a Python list
s1 = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s1, "\n")

# 2. With custom index
s2 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s2, "\n")

# 3. From a dictionary
data_dict = {'apple': 45, 'banana': 30, 'orange': 55}
s3 = pd.Series(data_dict)
print(s3, "\n")

# 4. From a scalar value (creates Series with repeated value)
s4 = pd.Series(5, index=['x', 'y', 'z'])
print(s4, "\n")

# 5. From NumPy array
arr = np.array([1, 2, 3, 4])
s5 = pd.Series(arr)
print(s5)

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64 

a    10
b    20
c    30
d    40
dtype: int64 

apple     45
banana    30
orange    55
dtype: int64 

x    5
y    5
z    5
dtype: int64 

0    1
1    2
2    3
3    4
dtype: int64


**Series Attributes**

In [6]:
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Access the underlying data as a NumPy array
print(s.values, "\n")

# Access the index
print(s.index, "\n")

# Get the data type
print(s.dtype, "\n")

# Get the shape
print(s.shape, "\n")

# Get the size (number of elements)
print(s.size, "\n")

# Check if Series is empty
print(s.empty)

[10 20 30 40] 

Index(['a', 'b', 'c', 'd'], dtype='object') 

int64 

(4,) 

4 

False


**Indexing**

In [10]:
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Single label
print(s['a'])        # 10

# Multiple labels
print(s[['a', 'c']])

# Using .loc for explicit label-based access
print(s.loc['b'], "\n")    # 20
print(s.loc[['b', 'd']], "\n")

# Using .iloc for position-based access
print(s.iloc[0], "\n")     # 10 (first element)
print(s.iloc[1:3], "\n")   # Elements at positions 1 and 2

# Boolean mask: True where the value is greater than 25
mask = s > 25
print(mask)

10
a    10
c    30
dtype: int64
20 

b    20
d    40
dtype: int64 

10 

b    20
c    30
dtype: int64 

a    False
b    False
c     True
d     True
dtype: bool
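
The indexing cell builds a boolean `mask` but never applies it. As a minimal sketch of the usual next step, a mask can be passed back into the Series to keep only the matching elements. The names `filtered` and `in_range` are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# A comparison yields a boolean Series with the same index as s
mask = s > 25

# Indexing with the mask keeps only the elements where it is True
filtered = s[mask]
print(filtered)

# Conditions combine with & (and) and | (or); each one needs its own parentheses
in_range = s[(s > 15) & (s < 40)]
print(in_range)
```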


**Series Operations**

In [None]:
s1 = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
s2 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Element-wise operations
print(s1 + s2)
print(s2 / s1)
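
Since that cell has no recorded output, here is a hedged sketch of what the element-wise operations produce, plus a scalar operand for comparison. The names `total`, `ratio`, and `doubled` are illustrative.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
s2 = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Values are paired by label, so 'a' is added to 'a', 'b' to 'b', and so on
total = s1 + s2
print(total)

# The / operator performs true division, so the result is float64
ratio = s2 / s1
print(ratio)

# A scalar operand is applied to every element
doubled = s1 * 2
print(doubled)
```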

**Handling Missing Values**

In [None]:
s = pd.Series([1, np.nan, 3, np.nan, 5])

# Check for missing values
print(s.isnull())

# Drop missing values
print(s.dropna())

# Fill missing values
print(s.fillna(0))
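
Two common variations on the missing-value calls, shown as a small sketch: counting the missing entries and filling them with the mean of the remaining values instead of a constant. Whether the mean is a sensible fill value depends on the data; it is used here only for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 3, np.nan, 5])

# isnull() gives booleans; summing them counts the missing entries
n_missing = s.isnull().sum()
print(n_missing)

# mean() skips NaN by default, so this is the mean of 1, 3 and 5
filled = s.fillna(s.mean())
print(filled)

# fillna() and dropna() return new Series; s itself still contains the NaNs
print(s.isnull().sum())
```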

**Statistical Methods**

In [None]:
s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(s.sum())      # 55
print(s.mean())     # 5.5
print(s.median())   # 5.5
print(s.std())      # 3.0276503540974917
print(s.min())      # 1
print(s.max())      # 10
print(s.count())    # 10 (counts non-null values)
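
The note that `count()` only counts non-null values applies to the other reductions too: by default they skip NaN. A minimal sketch, assuming a small Series with one missing entry (the data is made up for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4])

n_valid = s.count()                  # NaN is not counted
total = s.sum()                      # NaN is skipped
mean_skip = s.mean()                 # computed over the three valid values
mean_strict = s.mean(skipna=False)   # a single NaN makes the result NaN

print(n_valid, total, mean_skip, mean_strict)

# describe() bundles several of these statistics into one Series
summary = s.describe()
print(summary)
```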

**Sorting**

In [None]:
s = pd.Series([3, 1, 4, 1, 5], index=['e', 'a', 'c', 'b', 'd'])

# Sort by values
print(s.sort_values())

# Sort by index
print(s.sort_index())
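
Both sorting methods return a new Series and accept `ascending=False` for descending order. A brief sketch using the same data (the result names are illustrative):

```python
import pandas as pd

s = pd.Series([3, 1, 4, 1, 5], index=["e", "a", "c", "b", "d"])

# Largest values first
by_value_desc = s.sort_values(ascending=False)
print(by_value_desc)

# Labels in reverse alphabetical order
by_index_desc = s.sort_index(ascending=False)
print(by_index_desc)

# The original Series keeps its order
print(s.index.tolist())
```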

**String Operations**

In [None]:
s = pd.Series(['apple', 'banana', 'cherry'])

print(s.str.upper())
print(s.str.len())
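
The `.str` accessor also offers pattern tests that return booleans, which pair naturally with boolean indexing. A minimal sketch; the substring `"an"` and the prefix `"b"` are arbitrary choices for illustration.

```python
import pandas as pd

s = pd.Series(["apple", "banana", "cherry"])

upper = s.str.upper()     # element-wise upper-casing
lengths = s.str.len()     # number of characters in each string

# contains() returns a boolean Series
has_an = s.str.contains("an")
print(has_an)

# A boolean Series from .str can be used as a filter
starts_with_b = s[s.str.startswith("b")]
print(starts_with_b)
```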

**Alignment Feature**

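One of pandas' most powerful features is automatic alignment on index labels: arithmetic between two Series pairs values by label rather than by position, and any label present in only one operand yields `NaN` in the result.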

In [11]:
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([4, 5, 6], index=['b', 'c', 'd'])

# When performing operations, pandas aligns by index
result = s1 + s2
print(result, "\n")

# To handle missing values in alignment
result_filled = s1.add(s2, fill_value=0)
print(result_filled)
# a    1.0  (1+0)
# b    6.0  (2+4)
# c    8.0  (3+5)
# d    6.0  (0+6)

a    NaN
b    6.0
c    8.0
d    NaN
dtype: float64 

a    1.0
b    6.0
c    8.0
d    6.0
dtype: float64
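
Alignment depends on labels only, so the order in which the labels appear does not matter. A small sketch with two Series that hold the same labels in a different order (the data is made up for illustration):

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([30, 10, 20], index=["c", "a", "b"])

# Values are matched by label: a -> 1 + 10, b -> 2 + 20, c -> 3 + 30
result = s1 + s2
print(result)

# Adding the raw arrays instead pairs values by position and gives a different answer
positional = s1.to_numpy() + s2.to_numpy()
print(positional)
```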
