## Pandas Tutorial 21: Pandas Shifting and Lagging

Shifting and lagging move the values of a time series forward or backward in time. The `shift()` method works on both DataFrames and Series and can shift either the values or the dates in the index, which makes it useful for calculating differences, building lagged features for forecasting, and comparing data across time periods.

#### Topics covered:
* **Introduction**
* **Using the `shift()` function**

This tutorial will show you how to use the `shift()` method for adjusting time series data and applying lagging for time-based analysis.

In [2]:
import pandas as pd

In [3]:
df = pd.read_csv("fb.csv", parse_dates=['Date'], index_col='Date')
df


Date,Price
2017-08-15,171.0
2017-08-16,170.0
2017-08-17,166.91
2017-08-18,167.41
2017-08-21,167.78
2017-08-22,169.64
2017-08-23,168.71
2017-08-24,167.74
2017-08-25,166.32
2017-08-28,167.24
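
If `fb.csv` is not at hand, the same DataFrame can be rebuilt by hand. The following is a minimal sketch that copies the ten dates and prices from the output above; it assumes nothing about the CSV beyond what that output shows.

```python
import pandas as pd

# Prices and dates copied from the table above (ten trading days).
prices = [171.0, 170.0, 166.91, 167.41, 167.78,
          169.64, 168.71, 167.74, 166.32, 167.24]
dates = pd.DatetimeIndex([
    "2017-08-15", "2017-08-16", "2017-08-17", "2017-08-18", "2017-08-21",
    "2017-08-22", "2017-08-23", "2017-08-24", "2017-08-25", "2017-08-28",
], name="Date")

# One 'Price' column indexed by date, matching the read_csv result.
df = pd.DataFrame({"Price": prices}, index=dates)
print(df)
```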


<img src="shift_image.png" alt="Shift Image" width="800"/>

### Shifting Data in a DataFrame
The `shift()` method is used to shift the values in a DataFrame by a specified number of periods. In this example, we shift the data down by one row.

**Key Features:**
* `shift(1)`: Moves the data down by 1 row, introducing `NaN` values at the top.
* **Common Use**: Useful for comparing values with previous periods or creating lagged features in time series analysis.

In [4]:
# Shifts all values in the DataFrame down by 1 row
df.shift(1)

Date,Price
2017-08-15,
2017-08-16,171.0
2017-08-17,170.0
2017-08-18,166.91
2017-08-21,167.41
2017-08-22,167.78
2017-08-23,169.64
2017-08-24,168.71
2017-08-25,167.74
2017-08-28,166.32


In [5]:
# Shifts all values in the DataFrame up by 1 row
df.shift(-1)

Date,Price
2017-08-15,170.0
2017-08-16,166.91
2017-08-17,167.41
2017-08-18,167.78
2017-08-21,169.64
2017-08-22,168.71
2017-08-23,167.74
2017-08-24,166.32
2017-08-25,167.24
2017-08-28,
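
The two calls above generalize to any number of periods: a positive value lags the data, a negative value leads it, and the number of missing values introduced equals the size of the shift. The sketch below uses a small made-up Series (not the stock data) and also shows the `fill_value` argument of `shift()`, which replaces the introduced gaps with a chosen value instead of `NaN`.

```python
import pandas as pd

# Hypothetical toy data, used only to make the row movement easy to see.
s = pd.Series([10, 20, 30, 40], name="value")

lag2 = s.shift(2)                  # two rows down: NaN, NaN, 10, 20
lead1 = s.shift(-1)                # one row up: 20, 30, 40, NaN
filled = s.shift(1, fill_value=0)  # one row down, gap filled with 0

print(lag2, lead1, filled, sep="\n\n")
```

Note that the index never changes in these calls; only the values move relative to it.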


### Creating a "Previous Day Price" Column
In this example, we create a new column called `'Prev Day Price'`, which contains the stock price from the previous day. We use the `shift(1)` function to shift the `'Price'` column down by one row.

**Key Features:**
* `shift(1)`: Shifts the `'Price'` column down by one row, effectively moving each value to the next row to represent the previous day's price.
* **Creates a lagged feature**: Useful for comparing the current day's price with the previous day's price.

This approach is commonly used in financial analysis to calculate day-to-day changes in stock prices.

In [6]:
df['Prev Day Price'] = df['Price'].shift(1)
df

Date,Price,Prev Day Price
2017-08-15,171.0,
2017-08-16,170.0,171.0
2017-08-17,166.91,170.0
2017-08-18,167.41,166.91
2017-08-21,167.78,167.41
2017-08-22,169.64,167.78
2017-08-23,168.71,169.64
2017-08-24,167.74,168.71
2017-08-25,166.32,167.74
2017-08-28,167.24,166.32


In [7]:
# Calculates the price change and creates a new column
df['Price Change'] = df['Price'] - df['Prev Day Price']
df

Date,Price,Prev Day Price,Price Change
2017-08-15,171.0,,
2017-08-16,170.0,171.0,-1.0
2017-08-17,166.91,170.0,-3.09
2017-08-18,167.41,166.91,0.5
2017-08-21,167.78,167.41,0.37
2017-08-22,169.64,167.78,1.86
2017-08-23,168.71,169.64,-0.93
2017-08-24,167.74,168.71,-0.97
2017-08-25,166.32,167.74,-1.42
2017-08-28,167.24,166.32,0.92
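
Subtracting a shifted column from the original is common enough that pandas provides `diff()` for it. As a small sketch on the first four prices from the table, the manual `shift(1)` subtraction and `diff()` should agree:

```python
import pandas as pd

# First four prices from the table above.
price = pd.Series([171.0, 170.0, 166.91, 167.41], name="Price")

manual = price - price.shift(1)  # same pattern as the 'Price Change' column
builtin = price.diff()           # built-in first difference (periods=1)

# Raises an AssertionError if the two results differ.
pd.testing.assert_series_equal(manual, builtin)
print(builtin)
```

The explicit `shift()` version remains useful when the previous value is needed as its own column, as with `'Prev Day Price'`.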


### Calculating 5-day Return
In this example, we create a new column called `'5 day return'`, which holds the percentage return over the previous 5 rows. Because each row is one trading day, `shift(5)` looks back 5 trading days, not 5 calendar days. The return is the current price minus the price 5 rows earlier, divided by the price 5 rows earlier, and multiplied by 100. The first 5 rows have no earlier price to compare against, so their return is `NaN`.

**Key Features:**
* `df['Price'].shift(5)`: Shifts the price column down by 5 rows, allowing comparison with the price from 5 trading days ago.
* **Return Calculation:** `(Current Price - Price 5 Days Ago) / Price 5 Days Ago * 100` gives the percentage return over 5 days.
* **Track performance:** Useful for financial analysis, helping to understand short-term trends and stock performance.

In [8]:
# Calculates the 5-day return percentage
df['5 day return'] = (df['Price'] - df['Price'].shift(5))*100/df['Price'].shift(5)
df

Date,Price,Prev Day Price,Price Change,5 day return
2017-08-15,171.0,,,
2017-08-16,170.0,171.0,-1.0,
2017-08-17,166.91,170.0,-3.09,
2017-08-18,167.41,166.91,0.5,
2017-08-21,167.78,167.41,0.37,
2017-08-22,169.64,167.78,1.86,-0.795322
2017-08-23,168.71,169.64,-0.93,-0.758824
2017-08-24,167.74,168.71,-0.97,0.497274
2017-08-25,166.32,167.74,-1.42,-0.651096
2017-08-28,167.24,166.32,0.92,-0.32185
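
The same return can also be obtained with `pct_change()`, which computes the fractional change against the value a given number of rows earlier. The sketch below uses the first seven prices from the table and compares the two approaches; the results may differ in the last floating-point digits, so the comparison uses a tolerance.

```python
import pandas as pd

# First seven prices from the table above.
price = pd.Series([171.0, 170.0, 166.91, 167.41, 167.78, 169.64, 168.71],
                  name="Price")

# Manual formula, as in the cell above.
manual = (price - price.shift(5)) * 100 / price.shift(5)

# pct_change returns a fraction, so multiply by 100 for a percentage.
builtin = price.pct_change(periods=5) * 100

# Compares with a small relative tolerance; raises if they disagree.
pd.testing.assert_series_equal(manual, builtin)
print(builtin.round(6))
```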


In [9]:
df = df[['Price']]
df

Date,Price
2017-08-15,171.0
2017-08-16,170.0
2017-08-17,166.91
2017-08-18,167.41
2017-08-21,167.78
2017-08-22,169.64
2017-08-23,168.71
2017-08-24,167.74
2017-08-25,166.32
2017-08-28,167.24


### Shifting the DataFrame Index Using `DateOffset`

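In this example, we first set the index of the DataFrame to a date range of 10 business days starting on August 15, 2017. Then we shift every date in the index forward by one day using `pd.DateOffset(1)`. The offset is one calendar day, so Fridays land on Saturdays, and the resulting index no longer carries the business-day frequency (`freq=None`).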

In [17]:
# Displays the index; this cell was run last (In [17]), so its output already shows the shifted dates
df.index

DatetimeIndex(['2017-08-16', '2017-08-17', '2017-08-18', '2017-08-19',
               '2017-08-22', '2017-08-23', '2017-08-24', '2017-08-25',
               '2017-08-26', '2017-08-29'],
              dtype='datetime64[ns]', freq=None)

In [11]:
# Creates a DateTimeIndex with 10 business days
df.index = pd.date_range(start='2017-08-15', periods=10, freq='B')
df

,Price
2017-08-15,171.0
2017-08-16,170.0
2017-08-17,166.91
2017-08-18,167.41
2017-08-21,167.78
2017-08-22,169.64
2017-08-23,168.71
2017-08-24,167.74
2017-08-25,166.32
2017-08-28,167.24


In [12]:
# Displays the current index
df.index

DatetimeIndex(['2017-08-15', '2017-08-16', '2017-08-17', '2017-08-18',
               '2017-08-21', '2017-08-22', '2017-08-23', '2017-08-24',
               '2017-08-25', '2017-08-28'],
              dtype='datetime64[ns]', freq='B')

In [15]:
# Shifts every date in the index forward by 1 calendar day (values stay in place)
df.index = df.index + pd.DateOffset(1)

In [16]:
df

,Price
2017-08-16,171.0
2017-08-17,170.0
2017-08-18,166.91
2017-08-19,167.41
2017-08-22,167.78
2017-08-23,169.64
2017-08-24,168.71
2017-08-25,167.74
2017-08-26,166.32
2017-08-29,167.24
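
The `shift()` method can move the index as well: when a `freq` argument is passed, the dates are shifted and the values stay attached to their rows, so no `NaN` is introduced. The following is a minimal sketch on the first four rows of the data. It contrasts a calendar-day shift with a business-day shift; the business-day variant is an alternative that the cells above do not use.

```python
import pandas as pd

# First four business days and prices from the data above.
idx = pd.date_range(start="2017-08-15", periods=4, freq="B")
df = pd.DataFrame({"Price": [171.0, 170.0, 166.91, 167.41]}, index=idx)

# Calendar-day shift: Friday 2017-08-18 becomes Saturday 2017-08-19.
by_day = df.shift(1, freq="D")

# Business-day shift: Friday 2017-08-18 becomes Monday 2017-08-21.
by_bday = df.shift(1, freq="B")

print(by_day)
print(by_bday)
```

Which one fits depends on the data: for stock prices, a business-day shift keeps the dates on weekdays (though it does not account for market holidays).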
