# Pandas Tutorial - Part 47

This notebook covers three Series methods, each of which is also available on DataFrames:
- Standard error of the mean with `sem()`
- Setting axis labels with `set_axis()`
- Shifting data with `shift()`

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

## Standard Error of the Mean

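The `sem()` method returns the standard error of the mean of a Series: the standard deviation divided by the square root of the number of non-missing observations. By default it skips missing values (`skipna=True`) and uses `ddof=1`.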

In [None]:
# Create a Series with sample data
s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print("Sample data:")
print(s)

In [None]:
# Calculate the mean
mean = s.mean()
print(f"Mean: {mean}")

In [None]:
# Calculate the standard deviation
std = s.std()
print(f"Standard deviation: {std}")

In [None]:
# Calculate the standard error of the mean
sem = s.sem()
print(f"Standard error of the mean: {sem}")

In [None]:
# Verify the calculation: SEM = std / sqrt(n)
manual_sem = s.std() / np.sqrt(len(s))
print(f"Manually calculated SEM: {manual_sem}")

In [None]:
# Create a Series with missing values
s_with_nan = pd.Series([1, 2, 3, np.nan, 5, 6, np.nan, 8, 9, 10])
print("Series with missing values:")
print(s_with_nan)

In [None]:
# Calculate SEM with skipna=True (default)
sem_skipna = s_with_nan.sem()
print(f"SEM with skipna=True: {sem_skipna}")

In [None]:
# Calculate SEM with skipna=False (any NaN in the data makes the result NaN)
sem_no_skipna = s_with_nan.sem(skipna=False)
print(f"SEM with skipna=False: {sem_no_skipna}")
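As a quick check of what `skipna` does, here is a minimal sketch. It assumes the same sample data as above; the variable names `sem_default` and `sem_dropped` are illustrative. With the default `skipna=True`, the result should match `sem()` on the Series with its NaNs dropped, while `skipna=False` lets a single NaN propagate to the result.

```python
import numpy as np
import pandas as pd

s_with_nan = pd.Series([1, 2, 3, np.nan, 5, 6, np.nan, 8, 9, 10])

# Default behaviour: NaNs are ignored, so n is the count of non-missing values
sem_default = s_with_nan.sem()
sem_dropped = s_with_nan.dropna().sem()
print(np.isclose(sem_default, sem_dropped))

# skipna=False: a missing value propagates to the result
sem_strict = s_with_nan.sem(skipna=False)
print(np.isnan(sem_strict))
```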

In [None]:
# Calculate SEM with different ddof values
sem_ddof0 = s.sem(ddof=0)
sem_ddof1 = s.sem(ddof=1)  # default
sem_ddof2 = s.sem(ddof=2)

print(f"SEM with ddof=0: {sem_ddof0}")
print(f"SEM with ddof=1 (default): {sem_ddof1}")
print(f"SEM with ddof=2: {sem_ddof2}")
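The `ddof` argument only changes the standard deviation that goes into the formula; the divisor stays at `sqrt(n)`. The sketch below checks this for the three values used above. It is an illustration under that reading of the formula, and the names `n`, `manual` and `checks` are not part of pandas.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
n = s.count()  # number of non-missing observations

checks = {}
for ddof in (0, 1, 2):
    # SEM = std(ddof) / sqrt(n)
    manual = s.std(ddof=ddof) / np.sqrt(n)
    checks[ddof] = bool(np.isclose(s.sem(ddof=ddof), manual))

print(checks)
```

A larger `ddof` shrinks the denominator of the variance estimate, so the SEM grows as `ddof` increases.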

## Setting Axis Labels

The `set_axis()` method returns a new object with the given labels assigned to the chosen axis; the original object is left unchanged.

In [None]:
# Create a Series
s = pd.Series([1, 2, 3])
print("Original Series:")
print(s)

In [None]:
# Set new axis labels
s_new_labels = s.set_axis(['a', 'b', 'c'], axis=0)
print("Series with new labels:")
print(s_new_labels)

In [None]:
# Set new axis labels in-place
# (the `inplace` argument of set_axis() was removed in pandas 2.0,
# so assign to .index directly instead)
s_inplace = s.copy()
s_inplace.index = ['x', 'y', 'z']
print("Series after in-place label change:")
print(s_inplace)

In [None]:
# Create a DataFrame
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print("Original DataFrame:")
print(df)

In [None]:
# Change row labels
df_row_labels = df.set_axis(['a', 'b', 'c'], axis='index')
print("DataFrame with new row labels:")
print(df_row_labels)

In [None]:
# Change column labels
df_col_labels = df.set_axis(['I', 'II'], axis='columns')
print("DataFrame with new column labels:")
print(df_col_labels)

In [None]:
# Change column labels in-place
# (the `inplace` argument of set_axis() was removed in pandas 2.0,
# so assign to .columns directly instead)
df_inplace = df.copy()
df_inplace.columns = ['X', 'Y']
print("DataFrame after in-place column label change:")
print(df_inplace)
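Two properties of `set_axis()` are worth checking directly: it leaves the original object untouched, and the number of new labels has to match the length of the axis. The following is a small sketch; the names `renamed` and `mismatch_raised` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# set_axis() returns a new DataFrame; df keeps its own labels
renamed = df.set_axis(["X", "Y"], axis="columns")
print(list(df.columns))
print(list(renamed.columns))

# Passing the wrong number of labels raises a ValueError
mismatch_raised = False
try:
    df.set_axis(["only_one"], axis="columns")
except ValueError as exc:
    mismatch_raised = True
    print(f"ValueError: {exc}")
```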

## Shifting Data

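The `shift()` method moves the data by a given number of periods while leaving the index unchanged; the vacated positions are filled with NaN or with `fill_value`. When `freq` is passed, the index labels are shifted instead and the data stays attached to its labels.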

In [None]:
# Create a Series
s = pd.Series([1, 2, 3, 4, 5])
print("Original Series:")
print(s)

In [None]:
# Shift data by 1 period (default)
s_shift1 = s.shift()
print("Series shifted by 1 period:")
print(s_shift1)

In [None]:
# Shift data by 2 periods
s_shift2 = s.shift(periods=2)
print("Series shifted by 2 periods:")
print(s_shift2)

In [None]:
# Shift data backward by 1 period
s_shift_neg1 = s.shift(periods=-1)
print("Series shifted backward by 1 period:")
print(s_shift_neg1)

In [None]:
# Shift with a custom fill value
s_shift_fill = s.shift(periods=2, fill_value=0)
print("Series shifted by 2 periods with fill_value=0:")
print(s_shift_fill)
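One side effect of shifting an integer Series is easy to miss: the NaN values introduced by a plain `shift()` force a conversion to a float dtype, whereas an integer `fill_value` should let the result keep the original dtype. The sketch below illustrates this with the same data; `shifted` and `shifted_filled` are illustrative names.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

shifted = s.shift(2)                       # vacated slots become NaN
shifted_filled = s.shift(2, fill_value=0)  # vacated slots become 0

print(shifted.dtype)
print(shifted_filled.dtype)

# In both cases the index itself is unchanged
print(shifted.index.equals(s.index))
```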

In [None]:
# Create a Series with datetime index
date_s = pd.Series([1, 2, 3, 4], index=pd.date_range('2023-01-01', periods=4))
print("Series with datetime index:")
print(date_s)

In [None]:
# Shift with frequency
date_s_freq = date_s.shift(periods=1, freq='D')
print("Series shifted by 1 day:")
print(date_s_freq)

In [None]:
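# Shift with different frequencies
print("Series shifted by 2 days:")
print(date_s.shift(periods=2, freq='D'))

# 'W' is anchored to Sundays, so each date moves to the following Sunday
# rather than exactly 7 days ahead
print("\nSeries shifted by 1 weekly offset ('W'):")
print(date_s.shift(periods=1, freq='W'))

# A month-end offset moves each date to the next month end.
# The string alias is 'ME' in pandas 2.2+ ('M' is deprecated there),
# so the offset object is used here to work across versions.
print("\nSeries shifted by 1 month-end offset:")
print(date_s.shift(periods=1, freq=pd.offsets.MonthEnd()))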
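Anchored offsets such as the weekly and month-end aliases roll each label forward to the next anchor point instead of adding a fixed span of time. The sketch below contrasts the anchored weekly offset with two alternatives that may be closer to what "one week later" or "one month later" usually means: a fixed `'7D'` offset and `pd.DateOffset(months=1)`. The variable names are illustrative.

```python
import pandas as pd

date_s = pd.Series([1, 2, 3, 4], index=pd.date_range("2023-01-01", periods=4))

# Anchored weekly offset: every label rolls forward to a Sunday
weekly_anchored = date_s.shift(periods=1, freq="W")
print(weekly_anchored.index)

# Fixed-length alternative: each label moves exactly 7 days ahead
weekly_fixed = date_s.shift(periods=1, freq="7D")
print(weekly_fixed.index)

# Calendar-aware alternative: same day of the following month
monthly = date_s.shift(periods=1, freq=pd.DateOffset(months=1))
print(monthly.index)
```

Note that the anchored shift can leave several rows sharing one label, which matters if the result is later used for alignment or lookups.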

In [None]:
# Create a DataFrame
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
print("Original DataFrame:")
print(df)

In [None]:
# Shift all columns
df_shift = df.shift()
print("DataFrame with all columns shifted:")
print(df_shift)

In [None]:
# Shift only one column
df_col_shift = df.copy()
df_col_shift['A'] = df_col_shift['A'].shift()
print("DataFrame with only column A shifted:")
print(df_col_shift)

## Applications of Shifting Data

Shifting data is particularly useful for time series analysis and calculating differences or percentage changes.

In [None]:
# Create a Series with stock prices
stock_prices = pd.Series([100, 102, 104, 103, 105, 107, 108], 
                         index=pd.date_range('2023-01-01', periods=7))
print("Stock prices:")
print(stock_prices)

In [None]:
# Calculate daily price difference
daily_diff = stock_prices - stock_prices.shift(1)
print("Daily price difference:")
print(daily_diff)

In [None]:
# Calculate daily percentage change (returned as a fraction, e.g. 0.02 for 2%)
daily_pct_change = stock_prices.pct_change()
print("Daily percentage change:")
print(daily_pct_change)
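Both calculations above can be written in terms of `shift()`, which makes the connection to this section explicit. The sketch below compares the hand-written versions with the built-in `diff()` and `pct_change()` on the same prices; `prev`, `manual_diff` and `manual_pct` are illustrative names.

```python
import numpy as np
import pandas as pd

stock_prices = pd.Series([100, 102, 104, 103, 105, 107, 108],
                         index=pd.date_range("2023-01-01", periods=7))

prev = stock_prices.shift(1)          # previous day's price
manual_diff = stock_prices - prev     # same idea as stock_prices.diff()
manual_pct = stock_prices / prev - 1  # same idea as stock_prices.pct_change()

print(np.allclose(manual_diff, stock_prices.diff(), equal_nan=True))
print(np.allclose(manual_pct, stock_prices.pct_change(), equal_nan=True))
```

The first entry is NaN in every case, because the first day has no previous price to compare against.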

In [None]:
# Calculate moving average
def moving_average(data, window):
    return data.rolling(window=window).mean()

ma3 = moving_average(stock_prices, 3)
print("3-day moving average:")
print(ma3)
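A 3-day rolling mean has no value for the first two days, because the window is not yet full. If partial windows are acceptable, `min_periods` relaxes that requirement. This is a brief sketch of the difference, with `ma3_partial` as an illustrative name.

```python
import pandas as pd

stock_prices = pd.Series([100, 102, 104, 103, 105, 107, 108],
                         index=pd.date_range("2023-01-01", periods=7))

# Full windows only: the first two entries are NaN
ma3 = stock_prices.rolling(window=3).mean()
print(ma3.isna().sum())

# Allow partial windows: averages are taken over whatever data is available
ma3_partial = stock_prices.rolling(window=3, min_periods=1).mean()
print(ma3_partial.head(3))
```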

In [None]:
# Plot stock prices and moving average
plt.figure(figsize=(10, 6))
plt.plot(stock_prices.index, stock_prices, label='Stock Price')
plt.plot(ma3.index, ma3, label='3-day MA')
plt.title('Stock Price and Moving Average')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid(True)
plt.show()

## Conclusion

In this notebook, we've explored various Series methods in pandas:

1. Standard error of the mean with `sem()`, with `skipna` for handling missing values and `ddof` for the delta degrees of freedom.
2. Setting axis labels with `set_axis()`, which returns a copy of a Series or DataFrame with new labels on the chosen axis.
3. Shifting data with `shift()`, which moves values by a number of periods (optionally filling the gaps with `fill_value`) or, when `freq` is given, moves the index labels instead.

We also looked at practical applications of shifting data, such as calculating differences and percentage changes, and computed a moving average with `rolling()`; all of these are common steps in time series analysis.

These methods are essential tools for data manipulation and analysis in pandas, allowing for flexible and powerful operations on your data.