## `Series` – One-dimensional labeled array

In [1]:
import numpy as np
import pandas as pd

**What is it?**

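A `Series` is a one-dimensional array of values in which each value is paired with a label. It is comparable to a single column in an Excel sheet or a single column of a `DataFrame`.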

**Components:**

* **Values**: The actual data.
* **Index**: Labels for each value (like row numbers, but can be custom).

**Basic syntax:**

```python
pd.Series(data, index=None, dtype=None, name=None)
```

* `data`: Array-like (list, NumPy array), dict, or scalar.
* `index`: Optional. Custom labels for each element.
* `dtype`: Optional. Specify data type (e.g., int, float, str).
* `name`: Optional. Name of the Series.

### Pandas Series Creation

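Is a Series internally a NumPy array? For the common numeric dtypes, yes: the values are stored in a NumPy `ndarray`, with the index kept alongside it. Some dtypes (for example categorical or nullable types) are stored in a pandas extension array instead.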
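
The relationship to NumPy can be checked directly. The sketch below assumes a plain integer Series; `to_numpy()` returns the values as an `ndarray`, and NumPy functions can be applied to the Series itself.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40])

# to_numpy() exposes the values as a NumPy array
values = s.to_numpy()
print(type(values))
print(values)

# NumPy functions accept a Series and return a Series with the same index
roots = np.sqrt(s)
print(roots)
```

Note that the result of `np.sqrt(s)` keeps the index labels, which a bare NumPy array would not have.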

#### From a List

In [2]:
s = pd.Series([10, 20, 30, 40])
print(s)

0    10
1    20
2    30
3    40
dtype: int64


In [3]:
print(type(s))
print(s.index)
print(s.dtype)
print(s.name)
print(s.shape)
print(s.ndim)

<class 'pandas.core.series.Series'>
RangeIndex(start=0, stop=4, step=1)
int64
None
(4,)
1


#### From a List with a Custom Index

In [4]:
s1 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s1)

a    10
b    20
c    30
d    40
dtype: int64


In [5]:
print(type(s1))
print(s1.index)
print(s1.dtype)
print(s1.name)

<class 'pandas.core.series.Series'>
Index(['a', 'b', 'c', 'd'], dtype='object')
int64
None


In [6]:
s2 = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'], dtype=np.int8, name='Days')
print(s2)

a    10
b    20
c    30
d    40
Name: Days, dtype: int8


In [7]:
print(type(s2))
print(s2.index)
print(s2.dtype)
print(s2.name)

<class 'pandas.core.series.Series'>
Index(['a', 'b', 'c', 'd'], dtype='object')
int8
Days


#### From a Dictionary (keys become index)

In [8]:
xyz = pd.Series({'x': 100, 'y': 200, 'z': 300}, name="XYZ", dtype=np.int16)
print(xyz)

x    100
y    200
z    300
Name: XYZ, dtype: int16
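
When a dict is combined with an explicit `index`, pandas looks up each requested label in the dict. The sketch below uses made-up keys (`prices` and the label `'q'` are illustrative only) to show that the index controls the order and that labels missing from the dict become `NaN`.

```python
import pandas as pd

prices = {'x': 100, 'y': 200, 'z': 300}

# 'q' is not a key in the dict, so its value is NaN;
# 'y' is not requested, so it is left out
picked = pd.Series(prices, index=['z', 'x', 'q'])
print(picked)
```

Because `NaN` is a float, the resulting Series has a floating-point dtype even though the dict holds integers.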


#### From a Scalar (repeated value)

In [9]:
abc = pd.Series(5, index=['a', 'b', 'c'])
print(abc)

a    5
b    5
c    5
dtype: int64


#### With Specified Data Type

In [10]:
pd.Series([10, 20, 30], dtype=np.float16)

0    10.0
1    20.0
2    30.0
dtype: float16

#### With a Name

In [11]:
pd.Series([100, 200], name='Scores')

0    100
1    200
Name: Scores, dtype: int64

### Operations on a pandas Series

### 1. Creation

In [12]:
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'], name="Numbers", dtype=np.int8)
print(s)

a    10
b    20
c    30
d    40
Name: Numbers, dtype: int8


### 2. Accessing Elements

#### By label

In [13]:
s['b']

20

In [14]:
s.loc['b']

20

#### By position

In [15]:
s[1]  # positional fallback on a label index; deprecated since pandas 2.1, prefer s.iloc[1]

20

In [16]:
s.iloc[1]

20

#### Multiple Elements

In [17]:
s[['b', 'a']]

b    20
a    10
Name: Numbers, dtype: int8

Multiple elements can also be selected with `loc[]` (a list of labels) and `iloc[]` (a list of integer positions).
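
A minimal sketch of list-based selection with both accessors, rebuilding the same Series as above so the block runs on its own:

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'],
              name="Numbers", dtype=np.int8)

# loc takes labels, iloc takes integer positions
by_label = s.loc[['b', 'a']]
by_position = s.iloc[[1, 0]]

print(by_label)
print(by_position)
```

Both calls return a new Series in the order the labels or positions were listed.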

### 3. Vectorized Operations (Element-wise)

In [18]:
print(s)

a    10
b    20
c    30
d    40
Name: Numbers, dtype: int8


In [19]:
print(s + 10)
print(s * 2)

print(s > 20)

a    20
b    30
c    40
d    50
Name: Numbers, dtype: int8
a    20
b    40
c    60
d    80
Name: Numbers, dtype: int8
a    False
b    False
c     True
d     True
Name: Numbers, dtype: bool
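
Arithmetic between two Series is also element-wise, but elements are matched by index label rather than by position. The sketch below uses two small illustrative Series (`a` and `b`) with partly overlapping labels.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
b = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Labels present in only one operand produce NaN
total = a + b
print(total)

# add() with fill_value treats a missing label as 0 instead
filled = a.add(b, fill_value=0)
print(filled)
```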


### 4. Slicing & Filtering

#### Slice by position

In [20]:
s[1:3]

b    20
c    30
Name: Numbers, dtype: int8

In [21]:
s.iloc[1:3]

b    20
c    30
Name: Numbers, dtype: int8

#### Slice by label

In [22]:
s.loc['a':'c']  # label slices include the end label

a    10
b    20
c    30
Name: Numbers, dtype: int8

#### Filter by condition

In [23]:
s > 20

a    False
b    False
c     True
d     True
Name: Numbers, dtype: bool

In [24]:
s[s > 20]

c    30
d    40
Name: Numbers, dtype: int8
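
Conditions can be combined with `&` (and), `|` (or), and `~` (not). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A short sketch on the same values:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

middle = s[(s > 10) & (s < 40)]        # both conditions true
edges = s[(s <= 10) | (s >= 40)]       # either condition true
not_middle = s[~((s > 10) & (s < 40))] # negated mask

print(middle)
print(edges)
```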

#### Renaming index

In [25]:
s.index

Index(['a', 'b', 'c', 'd'], dtype='object')

In [26]:
s.index = ['w', 'x', 'y', 'z']

In [27]:
s.index

Index(['w', 'x', 'y', 'z'], dtype='object')
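
Assigning to `s.index` changes the Series in place and requires one label per element. As an alternative sketch, `rename()` with a dict relabels only the listed entries and returns a new Series, leaving the original untouched.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

# Only 'a' and 'b' are relabeled; 'c' and 'd' stay as they are
renamed = s.rename({'a': 'w', 'b': 'x'})

print(renamed.index)
print(s.index)  # original labels are unchanged
```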

### 5. Mathematical and Statistical Operations

In [114]:
s = pd.Series([10, 20, 30, 40, 50])  # fresh data; the output below reflects these five values
print(f"Sum: {s.sum()}")
print(f"Mean: {s.mean()}")
print(f"Min: {s.min()}")
print(f"describe: \n{s.describe()}")

Sum: 150
Mean: 30.0
Min: 10
describe: 
count     5.000000
mean     30.000000
std      15.811388
min      10.000000
25%      20.000000
50%      30.000000
75%      40.000000
max      50.000000
dtype: float64
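
A few more reductions follow the same pattern. This sketch assumes the same five values with illustrative letter labels, so that `idxmax()` has a meaningful label to return.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

median = s.median()           # middle value
top_label = s.idxmax()        # label of the largest value
q1 = s.quantile(0.25)         # first quartile
spread = s.agg(['min', 'max'])  # several reductions at once

print(median, top_label, q1)
print(spread)
```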


### 6. Handling Missing Data
- isnull()
- notnull()
- dropna()
- fillna()
- ffill() - forward fill
- bfill() - backward fill

In [29]:
s = pd.Series([10, None, 30, None, 50])
print(s)

0    10.0
1     NaN
2    30.0
3     NaN
4    50.0
dtype: float64


In [30]:
s.isnull() # Detect missing values

0    False
1     True
2    False
3     True
4    False
dtype: bool

In [31]:
s.notnull() # Detect non-missing values

0     True
1    False
2     True
3    False
4     True
dtype: bool

In [32]:
s.dropna()

0    10.0
2    30.0
4    50.0
dtype: float64

In [33]:
s

0    10.0
1     NaN
2    30.0
3     NaN
4    50.0
dtype: float64

In [34]:
s.dropna(inplace=True) # Drop missing values

In [35]:
s

0    10.0
2    30.0
4    50.0
dtype: float64

In [36]:
s1 = pd.Series([10, None, 30, None, 50])
print(s1)

0    10.0
1     NaN
2    30.0
3     NaN
4    50.0
dtype: float64


In [37]:
s1.fillna(100) # Fill with 100

0     10.0
1    100.0
2     30.0
3    100.0
4     50.0
dtype: float64

In [38]:
s1

0    10.0
1     NaN
2    30.0
3     NaN
4    50.0
dtype: float64

In [39]:
s1.fillna(100, inplace=True)

In [40]:
s1

0     10.0
1    100.0
2     30.0
3    100.0
4     50.0
dtype: float64

In [116]:
s1 = pd.Series([10, None, 30, None, 50])
print(s1)

0    10.0
1     NaN
2    30.0
3     NaN
4    50.0
dtype: float64


In [44]:
s1.fillna(method='ffill') # Forward fill; the method argument is deprecated since pandas 2.1, prefer s1.ffill()

0    10.0
1    10.0
2    30.0
3    30.0
4    50.0
dtype: float64

In [47]:
s1.ffill()

0    10.0
1    10.0
2    30.0
3    30.0
4    50.0
dtype: float64

In [118]:
s1.bfill()

0    10.0
1    30.0
2    30.0
3    50.0
4    50.0
dtype: float64
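
Besides a constant or a neighbouring value, gaps can be filled with a statistic of the data or by interpolation. The sketch below applies both to the same example values; which strategy is appropriate depends on the data.

```python
import pandas as pd

s = pd.Series([10, None, 30, None, 50])

# mean() skips NaN, so this fills the gaps with (10 + 30 + 50) / 3
mean_filled = s.fillna(s.mean())

# interpolate() defaults to linear interpolation between known values
interpolated = s.interpolate()

print(mean_filled)
print(interpolated)
```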

### 7. String Operations
- Applicable only if the Series contains strings.
- Useful in text preprocessing, NLP, formatting


- .str.lower()
- .str.len()
- .str.contains()
- .str.replace()

In [63]:
names = pd.Series(['alice', 'Bob', 'CHARLIE'])
names

0      alice
1        Bob
2    CHARLIE
dtype: object

In [67]:
names = pd.Series(['alice', 'Bob', 'CHARLIE'], dtype='unicode_')
names

0      alice
1        Bob
2    CHARLIE
dtype: object

In [66]:
names1 = pd.Series(['alice', 'Bob', 'CHARLIE'], dtype='string_')
names1

0      b'alice'
1        b'Bob'
2    b'CHARLIE'
dtype: bytes56

In [69]:
names.str.lower()

0      alice
1        bob
2    charlie
dtype: object

In [73]:
try:
    print(names1.str.lower())
except Exception as e:
    print(e)

Cannot use .str.lower with values of inferred dtype 'bytes'.


In [74]:
names.str.len()

0    5
1    3
2    7
dtype: int64

In [75]:
names.str.contains('a')

0     True
1    False
2    False
dtype: bool

In [76]:
names.str.replace('a', '@')

0      @lice
1        Bob
2    CHARLIE
dtype: object
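
The string methods are case-sensitive by default, which is why `contains('a')` misses `'CHARLIE'` above. A hedged sketch of two common adjustments, using a variant of the data with one missing entry: `case=False` ignores case, and `na=False` treats missing values as non-matches.

```python
import pandas as pd

names = pd.Series(['alice', None, 'CHARLIE'])

# Case-insensitive match; the missing entry counts as False
has_a = names.str.contains('a', case=False, na=False)

# Normalise capitalisation; the missing entry stays missing
titled = names.str.title()

print(has_a)
print(titled)
```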

### 8. Value Counts and Uniqueness
- value_counts()
- unique()
- nunique()

In [79]:
s = pd.Series(['A', 'B', 'A', 'C', 'B', 'A'])

s1 = s.value_counts()
s1

A    3
B    2
C    1
Name: count, dtype: int64

In [None]:
# In the result of value_counts(), the unique values (A, B, C) become the index
# and their counts (3, 2, 1) become the values of the new Series

In [83]:
print(type(s1))
print(s1.dtype)
print(s1.index)
print(s1.values)

<class 'pandas.core.series.Series'>
int64
Index(['A', 'B', 'C'], dtype='object')
[3 2 1]


In [86]:
print(s.unique())
print(s.nunique())

['A' 'B' 'C']
3
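
`value_counts()` can also report proportions instead of raw counts. A short sketch on the same letters, using the `normalize=True` option:

```python
import pandas as pd

s = pd.Series(['A', 'B', 'A', 'C', 'B', 'A'])

# Each count divided by the total number of values
shares = s.value_counts(normalize=True)
print(shares)
```

The proportions sum to 1, and the most frequent value still comes first.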


### 9. Sorting Series
- sort_values() - By values
- sort_index()  - By index

In [88]:
print(s)

0    A
1    B
2    A
3    C
4    B
5    A
dtype: object


In [92]:
print(s.sort_values())
print(s.sort_values(ascending=False))
print(s.sort_index())
print(s.sort_index(ascending=False))

0    A
2    A
5    A
1    B
4    B
3    C
dtype: object
3    C
1    B
4    B
0    A
2    A
5    A
dtype: object
0    A
1    B
2    A
3    C
4    B
5    A
dtype: object
5    A
4    B
3    C
2    A
1    B
0    A
dtype: object
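
Sorting returns a new Series and leaves the original order untouched. The sketch below uses made-up numeric scores (the names are illustrative) and also shows `nlargest()`, a shortcut for taking the top values.

```python
import pandas as pd

scores = pd.Series([72, 95, 88], index=['ann', 'bob', 'cy'])

ranked = scores.sort_values(ascending=False)  # highest first
top_two = scores.nlargest(2)                  # two largest values

print(ranked)
print(top_two)
print(scores)  # still in the original order
```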


### 10. Mapping and Applying Functions
- map()
- apply()

In [94]:
s = pd.Series([1, 2, 3])
s.map(lambda x: x**2)

0    1
1    4
2    9
dtype: int64

In [95]:
s.apply(lambda x: x + 10)

0    11
1    12
2    13
dtype: int64
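
`map()` also accepts a dict, which makes it handy for recoding values. In this sketch the lookup table is illustrative; values with no matching key become `NaN`.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# 3 has no entry in the dict, so it maps to NaN
words = s.map({1: 'one', 2: 'two'})
print(words)
```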

### 11. Combining and Appending Series
- concat()

`Series.append()`, the older way to do this, was deprecated in pandas 1.4 and removed in pandas 2.0.

In [101]:
s1 = pd.Series([10, 20])
s2 = pd.Series([30, 40])

print(pd.concat([s1, s2]))
print(pd.concat([s2, s1]))

0    10
1    20
0    30
1    40
dtype: int64
0    30
1    40
0    10
1    20
dtype: int64
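
As the output shows, `concat()` keeps each input's original labels, so the combined index contains `0` and `1` twice. If the old labels carry no meaning, a fresh index can be requested with `ignore_index=True`, sketched here on the same two Series:

```python
import pandas as pd

s1 = pd.Series([10, 20])
s2 = pd.Series([30, 40])

# Discard the input labels and number the result 0..n-1
combined = pd.concat([s1, s2], ignore_index=True)
print(combined)
```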


### 12. 🧰 Other Useful Operations

| Operation         | Syntax                       | Description            |
| ----------------- | ---------------------------- | ---------------------- |
| Rename Series     | `s.rename("New Name")`       | Assign a name          |
| Convert to list   | `s.tolist()`                 | Convert to Python list |
| Check dtype       | `s.dtype`                    | Data type              |
| Convert dtype     | `s.astype(float)`            | Change type            |
| Clip values       | `s.clip(lower=20, upper=40)` | Truncate values        |
| Between condition | `s[s.between(20, 30)]`       | Range filtering        |
| Rank              | `s.rank()`                   | Rank values            |
| Cumulative sum    | `s.cumsum()`                 | Running total          |

In [106]:
s = pd.Series([10, 20, 30, 40, 50])
s.clip(lower=20, upper=40)

0    20
1    20
2    30
3    40
4    40
dtype: int64

In [107]:
s.between(20, 40)

0    False
1     True
2     True
3     True
4    False
dtype: bool

In [108]:
s[s.between(20, 40)]

1    20
2    30
3    40
dtype: int64

In [109]:
s.rank()

0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
dtype: float64

In [110]:
s.cumsum()

0     10
1     30
2     60
3    100
4    150
dtype: int64
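
The remaining rows of the table (`rename`, `astype`, `tolist`) can be sketched the same way. The name `"Totals"` is only an example; each call returns a new object rather than changing `s`.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

named = s.rename("Totals")      # new Series with a name
as_float = s.astype(float)      # new Series with float values
as_list = as_float.tolist()     # plain Python list

print(named.name)
print(as_float.dtype)
print(as_list)
```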

### Summary Table

| Category        | Examples                                 |
| --------------- | ---------------------------------------- |
| Access          | `s[0]`, `s['a']`, `s[1:3]`               |
| Filter          | `s[s > 10]`, `s[s.isnull()]`             |
| Math/Stats      | `mean()`, `std()`, `sum()`, `describe()` |
| Missing data    | `isnull()`, `dropna()`, `fillna()`       |
| Strings         | `str.lower()`, `str.contains()`          |
| Unique values   | `unique()`, `value_counts()`             |
| Sort and rank   | `sort_values()`, `rank()`                |
| Apply functions | `map()`, `apply()`                       |
| Combine         | `s1 + s2`, `concat()`                    |

