#### Pandas Tutorial - Part 42

This notebook covers several Series methods, some of which also work on DataFrames:
- Arithmetic operations with `add()` and related methods
- Label manipulation with `add_prefix()` and `add_suffix()`
- Boolean operations with `bool()` and boolean indexing
- Categorical data with the `cat` accessor
- Value clipping with `clip()`

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

##### Arithmetic Operations with Series

Pandas Series support element-wise arithmetic, with operands aligned on their index labels before the operation is applied. Each operator also has a method form; `add()` is the method form of `+` and adds a `fill_value` parameter for handling missing values.

### The `add()` Method

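The `add()` method performs element-wise addition between two Series objects or between a Series and a scalar value. Its `fill_value` parameter substitutes a value for any missing entry (a NaN, or a label present in only one of the two Series) before adding. If an entry is missing on both sides, the result is still NaN.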

In [None]:
# Create two Series with some missing values
a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])

print("Series a:")
print(a)
print("\nSeries b:")
print(b)

In [None]:
# Basic addition: NaN propagates, and labels found in only one Series also give NaN
print("a + b:")
print(a + b)

In [None]:
# Using add() with fill_value
print("a.add(b, fill_value=0):")
print(a.add(b, fill_value=0))

In [None]:
# Adding a scalar value
print("a.add(10):")
print(a.add(10))
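
The effect of `fill_value` on each label can be checked with a small sketch. The store and fruit names below are made up for illustration; only `add()` and `fill_value` come from the cells above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: item counts from two stores with partly different labels
store_1 = pd.Series([5, 3, np.nan], index=["apples", "pears", "plums"])
store_2 = pd.Series([2, 4, np.nan], index=["apples", "kiwis", "plums"])

# Plain addition: any label that is missing or NaN on either side gives NaN
plain = store_1 + store_2

# fill_value=0 treats a value missing on ONE side as 0 before adding
filled = store_1.add(store_2, fill_value=0)

print(plain)
print(filled)
```

Here "pears" and "kiwis" each exist in only one Series, so `fill_value=0` keeps their counts, while "plums" is NaN in both and stays NaN.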

### Other Arithmetic Operations

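Like `add()`, the methods `sub()`, `mul()`, and `div()` perform subtraction, multiplication, and division and accept the same `fill_value` parameter. A neutral fill value is usually the sensible choice: 0 for addition and subtraction, 1 for multiplication and division.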

In [None]:
# Subtraction
print("a.sub(b, fill_value=0):")
print(a.sub(b, fill_value=0))

# Multiplication
print("\na.mul(b, fill_value=1):")
print(a.mul(b, fill_value=1))

# Division
print("\na.div(b, fill_value=1):")
print(a.div(b, fill_value=1))
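
As a rough sketch of how these methods relate to the operators, the example below compares `mul()` with `*` and then applies `fill_value=1`. The Series `x` and `y` are illustrative values, not data from this notebook.

```python
import numpy as np
import pandas as pd

# Illustrative values with one NaN on each side
x = pd.Series([2.0, 4.0, np.nan], index=["a", "b", "c"])
y = pd.Series([10.0, np.nan, 5.0], index=["a", "b", "c"])

# Without fill_value, the method and the operator give the same result
same_as_operator = x.mul(y).equals(x * y)

# fill_value=1 replaces a value missing on one side with 1
product = x.mul(y, fill_value=1)
quotient = x.div(y, fill_value=1)

print(same_as_operator)
print(product)
print(quotient)
```

Note that `fill_value` applies to whichever operand is missing, so in `quotient` a missing numerator (label "c") is treated as 1 just like a missing denominator (label "b").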

##### Label Manipulation

Pandas provides methods to rewrite the labels of a Series (its index) and of a DataFrame (its column names).

### The `add_prefix()` Method

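The `add_prefix()` method returns a new object with a prefix prepended to each index label of a Series or each column name of a DataFrame. Labels that are not strings, such as the default integer index, are converted to strings first.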

In [None]:
# Create a simple Series
s = pd.Series([1, 2, 3, 4])
print("Original Series:")
print(s)

In [None]:
# Add prefix to Series labels
s_prefixed = s.add_prefix('item_')
print("Series with prefixed labels:")
print(s_prefixed)

In [None]:
# Create a simple DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})
print("Original DataFrame:")
print(df)

In [None]:
# Add prefix to DataFrame column names
df_prefixed = df.add_prefix('col_')
print("DataFrame with prefixed column names:")
print(df_prefixed)

### The `add_suffix()` Method

The `add_suffix()` method returns a new object with a suffix appended to each index label of a Series or each column name of a DataFrame.

In [None]:
# Add suffix to Series labels
s_suffixed = s.add_suffix('_value')
print("Series with suffixed labels:")
print(s_suffixed)

In [None]:
# Add suffix to DataFrame column names
df_suffixed = df.add_suffix('_col')
print("DataFrame with suffixed column names:")
print(df_suffixed)
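
One situation where these methods are handy is combining frames that share column names. The sketch below uses two made-up score tables (`scores_a`, `scores_b`) and also shows that `rename()` with a function gives the same result as `add_suffix()`.

```python
import pandas as pd

# Hypothetical score tables with identical column names
scores_a = pd.DataFrame({"math": [90, 80], "art": [70, 60]})
scores_b = pd.DataFrame({"math": [85, 75], "art": [65, 95]})

# Suffix each frame's columns so they stay distinguishable side by side
combined = pd.concat(
    [scores_a.add_suffix("_a"), scores_b.add_suffix("_b")], axis=1
)
columns = list(combined.columns)

# The same relabeling written with rename() and a function
renamed = scores_a.rename(columns=lambda name: f"{name}_a")

print(columns)
print(renamed.equals(scores_a.add_suffix("_a")))
```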

##### Boolean Operations

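A pandas Series has no implicit truth value: using one directly in a boolean context such as `if s:` raises a `ValueError`. A truth value has to be requested explicitly, and conditions are normally applied element-wise instead.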

### The `bool()` Method

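The `bool()` method converts a single-element boolean Series to a Python boolean. It raises a `ValueError` if the Series has more than one element or if the element is not boolean. Note that `bool()` is deprecated since pandas 2.1, where the documentation recommends `Series.item()` instead, so the cells below may emit a `FutureWarning` or fail on newer versions.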

In [None]:
# Create a single-element boolean Series
s_true = pd.Series([True])
s_false = pd.Series([False])

print(f"s_true.bool(): {s_true.bool()}")
print(f"s_false.bool(): {s_false.bool()}")

In [None]:
# Try bool() on a multi-element Series
try:
    s_multi = pd.Series([True, False])
    s_multi.bool()
except ValueError as e:
    print(f"Error: {e}")

In [None]:
# Try bool() on a non-boolean Series
try:
    s_nonbool = pd.Series([1])
    s_nonbool.bool()
except ValueError as e:
    print(f"Error: {e}")
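
For pandas versions where `bool()` is deprecated or unavailable, the following sketch shows alternatives that rely only on `item()`, `any()`, and `all()`. The variable names are illustrative.

```python
import pandas as pd

# item() extracts the only element of a one-element Series
single = pd.Series([True])
flag = single.item()

# A multi-element Series has no single truth value
multi = pd.Series([True, False])
message = ""
try:
    bool(multi)
except ValueError as err:
    message = str(err)

# Reduce explicitly instead: is any element True? are all elements True?
any_true = bool(multi.any())
all_true = bool(multi.all())

print(flag, any_true, all_true)
print(message)
```

Choosing between `any()` and `all()` makes the intended meaning explicit, which is why pandas refuses to guess.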

### Boolean Indexing

Boolean indexing filters data with a condition. A comparison such as `s > 2` produces a boolean Series (a mask), and indexing with that mask keeps only the elements where it is `True`.

In [None]:
# Create a Series
s = pd.Series([1, 2, 3, 4, 5])

# Filter values greater than 2
mask = s > 2
print("Boolean mask:")
print(mask)

print("\nFiltered Series:")
print(s[mask])

In [None]:
# Multiple conditions: combine masks with & and wrap each comparison in parentheses
mask = (s > 2) & (s < 5)
print("Values between 2 and 5 (exclusive):")
print(s[mask])
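
Masks can also be combined with `|` (or), negated with `~` (not), or built with `isin()`. A minimal sketch on the same kind of Series:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# OR: values below 2 or above 4
outside = s[(s < 2) | (s > 4)]

# NOT: values that are not even
not_even = s[~(s % 2 == 0)]

# Membership test against a list of allowed values
chosen = s[s.isin([2, 4])]

print(outside.tolist())
print(not_even.tolist())
print(chosen.tolist())
```

The parentheses matter because `&`, `|`, and `~` bind more tightly than comparison operators in Python.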

##### Categorical Data with the `cat` Accessor

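The `cat` accessor is available on Series with the `category` dtype. It exposes the categories and methods for renaming, reordering, adding, and removing them. Each of the methods used below returns a new Series and leaves the original unchanged.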

In [None]:
# Create a categorical Series
s = pd.Series(['a', 'b', 'c', 'a', 'b', 'c'], dtype='category')
print("Categorical Series:")
print(s)
print("\nData type:", s.dtype)

In [None]:
# Get categories
print("Categories:")
print(s.cat.categories)

In [None]:
# Rename categories
s_renamed = s.cat.rename_categories(['A', 'B', 'C'])
print("Series with renamed categories:")
print(s_renamed)

In [None]:
# Reorder categories (the new list must contain exactly the existing categories)
s_reordered = s.cat.reorder_categories(['c', 'b', 'a'])
print("Series with reordered categories:")
print(s_reordered)

In [None]:
# Add categories
s_added = s.cat.add_categories(['d', 'e'])
print("Series with added categories:")
print(s_added)
print("\nNew categories:")
print(s_added.cat.categories)

In [None]:
# Remove categories
s_removed = s_added.cat.remove_categories(['d'])
print("Series with removed categories:")
print(s_removed)
print("\nRemaining categories:")
print(s_removed.cat.categories)
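
The cell above removes a category that no value uses. As a sketch of what happens otherwise, the example below removes a category that is in use and then drops unused ones with `remove_unused_categories()`. The small Series is made up for illustration.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"], dtype="category")

# Removing a category that is in use turns the affected values into NaN
without_c = s.cat.remove_categories(["c"])

# Unused categories can be dropped in one step
unused_dropped = s.cat.add_categories(["d"]).cat.remove_unused_categories()

print(without_c)
print(unused_dropped.cat.categories)
```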

In [None]:
# Set categories (existing values absent from the new list would become NaN)
s_set = s.cat.set_categories(['a', 'b', 'c', 'd', 'e'])
print("Series with set categories:")
print(s_set)
print("\nSet categories:")
print(s_set.cat.categories)

In [None]:
# Make categories ordered
s_ordered = s.cat.as_ordered()
print("Series with ordered categories:")
print(s_ordered)
print("\nIs ordered:", s_ordered.cat.ordered)

In [None]:
# Make categories unordered
s_unordered = s_ordered.cat.as_unordered()
print("Series with unordered categories:")
print(s_unordered)
print("\nIs ordered:", s_unordered.cat.ordered)
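
Ordering matters because it enables comparisons, `min()`/`max()`, and sorting by category order rather than alphabetically. The sketch below assumes a hypothetical set of size labels and builds the ordered dtype with `pd.CategoricalDtype`.

```python
import pandas as pd

# Hypothetical size labels with an explicit order: small < medium < large
size_dtype = pd.CategoricalDtype(["small", "medium", "large"], ordered=True)
sizes = pd.Series(["medium", "small", "large", "small"]).astype(size_dtype)

# Ordered categories support min/max and comparisons against a category
smallest = sizes.min()
bigger_than_small = (sizes > "small").tolist()

# Sorting follows the category order, not alphabetical order
sorted_sizes = sizes.sort_values().tolist()

print(smallest)
print(bigger_than_small)
print(sorted_sizes)
```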

##### Value Clipping with `clip()`

The `clip()` method limits values to a range: values below the lower threshold are set to the lower threshold, and values above the upper threshold are set to the upper threshold. Values inside the range are left unchanged.

In [None]:
# Create a DataFrame
data = {'col_0': [9, -3, 0, -1, 5], 'col_1': [-2, -7, 6, 8, -5]}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

In [None]:
# Clip values between -4 and 6
df_clipped = df.clip(-4, 6)
print("DataFrame with values clipped between -4 and 6:")
print(df_clipped)

In [None]:
# Clip using different thresholds per row
t = pd.Series([2, -4, -1, 6, 3])
print("Threshold Series:")
print(t)

df_row_clipped = df.clip(t, t + 4, axis=0)
print("\nDataFrame with row-specific clipping:")
print(df_row_clipped)
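
To see what row-specific clipping computes, the sketch below repeats the example and compares it with an equivalent NumPy expression in which the bounds are broadcast across columns. The names `lower`, `upper`, and `expected` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col_0": [9, -3, 0, -1, 5], "col_1": [-2, -7, 6, 8, -5]})
lower = pd.Series([2, -4, -1, 6, 3])
upper = lower + 4

# Row i is clipped to the interval [lower[i], upper[i]]
clipped = df.clip(lower, upper, axis=0)

# Same computation in NumPy: reshape the bounds to column vectors
expected = np.clip(
    df.to_numpy(), lower.to_numpy()[:, None], upper.to_numpy()[:, None]
)
matches = bool((clipped.to_numpy() == expected).all())

print(clipped)
print(matches)
```

For instance, row 0 is limited to the range 2 to 6, so 9 becomes 6 and -2 becomes 2.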

In [None]:
# Clip a Series
s = pd.Series([1, 10, -5, 3, -10, 8])
print("Original Series:")
print(s)

s_clipped = s.clip(-3, 7)
print("\nSeries with values clipped between -3 and 7:")
print(s_clipped)
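
Either bound can be omitted. A short sketch using the `lower` and `upper` keyword arguments on the same values:

```python
import pandas as pd

s = pd.Series([1, 10, -5, 3, -10, 8])

# Only a lower bound: negative values are raised to 0
non_negative = s.clip(lower=0)

# Only an upper bound: values above 5 are lowered to 5
capped = s.clip(upper=5)

print(non_negative.tolist())
print(capped.tolist())
```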

In [None]:
# Clip in place (modifies s_inplace directly and returns None)
s_inplace = s.copy()
s_inplace.clip(-3, 7, inplace=True)
print("Series after in-place clipping:")
print(s_inplace)

##### Conclusion

In this notebook, we've explored various Series methods in pandas:

1. Arithmetic operations with `add()` and similar methods, including special handling for missing values.
2. Label manipulation with `add_prefix()` and `add_suffix()` for both Series and DataFrames.
3. Boolean operations, including the `bool()` method for single-element Series and boolean indexing for filtering data.
4. Categorical data operations using the `cat` accessor, including managing categories and their order.
5. Value clipping with `clip()` to trim values at specified thresholds, with options for different thresholds per row and in-place modification.

Together, these methods cover common data-preparation steps in pandas: combining data with missing values, relabeling, filtering, working with categories, and bounding values.