## Applying custom functions to analyze time series data

In [1]:
import numpy as np
import pandas as pd
from IPython.display import display
from openbb import obb

In [2]:
obb.user.preferences.output_type = "dataframe"

Fetches historical price data for the equity "AAPL" using the "yfinance" provider and stores it in 'df'

In [3]:
df = obb.equity.price.historical("AAPL", provider="yfinance")

In [4]:
display(df)

date,open,high,low,close,volume,split_ratio,dividend
2023-06-15,183.960007,186.520004,183.779999,186.009995,65433200,0.0,0.0
2023-06-16,186.729996,186.990005,184.270004,184.919998,101235600,0.0,0.0
2023-06-20,184.410004,186.100006,184.410004,185.009995,49799100,0.0,0.0
2023-06-21,184.899994,185.410004,182.589996,183.960007,49515700,0.0,0.0
2023-06-22,183.740005,187.050003,183.669998,187.000000,51245300,0.0,0.0
...,...,...,...,...,...,...,...
2024-06-10,196.899994,197.300003,192.149994,193.119995,97262100,0.0,0.0
2024-06-11,193.649994,207.160004,193.630005,207.149994,172373300,0.0,0.0
2024-06-12,207.369995,220.199997,206.899994,213.070007,198134300,0.0,0.0
2024-06-13,214.740005,216.750000,211.600006,214.240005,97862700,0.0,0.0


Applies a lambda function to calculate the difference between the high and low prices for each row

In [5]:
df.apply(lambda x: x["high"] - x["low"], axis=1)

date
2023-06-15     2.740005
2023-06-16     2.720001
2023-06-20     1.690002
2023-06-21     2.820007
2023-06-22     3.380005
                ...    
2024-06-10     5.150009
2024-06-11    13.529999
2024-06-12    13.300003
2024-06-13     5.149994
2024-06-14     3.869995
Length: 252, dtype: float64

Defines a function 'fcn' that takes a single row and returns the difference between its high and low prices

In [6]:
def fcn(row):
    return row["high"] - row["low"]

Applies the 'fcn' function to each row of 'df'

In [7]:
df.apply(fcn, axis=1)

date
2023-06-15     2.740005
2023-06-16     2.720001
2023-06-20     1.690002
2023-06-21     2.820007
2023-06-22     3.380005
                ...    
2024-06-10     5.150009
2024-06-11    13.529999
2024-06-12    13.300003
2024-06-13     5.149994
2024-06-14     3.869995
Length: 252, dtype: float64
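
Row-wise `apply` is flexible, but for simple arithmetic like this, plain column subtraction usually gives the same result and tends to be faster on large frames. The sketch below checks that on a small hand-built frame; `sample` is a hypothetical stand-in for 'df' that reuses the first three high and low values shown above.

```python
import pandas as pd

# Hypothetical stand-in for 'df': first three rows of the output shown above
sample = pd.DataFrame(
    {
        "high": [186.520004, 186.990005, 186.100006],
        "low": [183.779999, 184.270004, 184.410004],
    },
    index=pd.to_datetime(["2023-06-15", "2023-06-16", "2023-06-20"]),
)

# Row-wise version, as in the cells above
row_wise = sample.apply(lambda x: x["high"] - x["low"], axis=1)

# Vectorized version: subtract whole columns at once
vectorized = sample["high"] - sample["low"]

# Largest absolute difference between the two approaches
max_diff = (row_wise - vectorized).abs().max()
```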

Adds a new column 'valid' to 'df' that checks if the 'close' price is between 'low' and 'high' prices for each row

In [8]:
df["valid"] = df.apply(lambda x: x["low"] <= x["close"] <= x["high"], axis=1)

In [9]:
display(df)

date,open,high,low,close,volume,split_ratio,dividend,valid
2023-06-15,183.960007,186.520004,183.779999,186.009995,65433200,0.0,0.0,True
2023-06-16,186.729996,186.990005,184.270004,184.919998,101235600,0.0,0.0,True
2023-06-20,184.410004,186.100006,184.410004,185.009995,49799100,0.0,0.0,True
2023-06-21,184.899994,185.410004,182.589996,183.960007,49515700,0.0,0.0,True
2023-06-22,183.740005,187.050003,183.669998,187.000000,51245300,0.0,0.0,True
...,...,...,...,...,...,...,...,...
2024-06-10,196.899994,197.300003,192.149994,193.119995,97262100,0.0,0.0,True
2024-06-11,193.649994,207.160004,193.630005,207.149994,172373300,0.0,0.0,True
2024-06-12,207.369995,220.199997,206.899994,213.070007,198134300,0.0,0.0,True
2024-06-13,214.740005,216.750000,211.600006,214.240005,97862700,0.0,0.0,True


Filters 'df' to create 'ddf', which contains only the rows where 'valid' is False. An empty result means every close falls within that day's low-high range

In [10]:
ddf = df[~df["valid"]]

In [11]:
display(ddf)

date,open,high,low,close,volume,split_ratio,dividend,valid
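
Because the real data produces no invalid rows, it can be hard to tell whether the check works at all. The sketch below runs the same idea on a hypothetical frame called `sample`, in which the last close is a made-up value placed above the high on purpose. It uses `Series.between`, which is inclusive on both ends by default, as one possible vectorized alternative to the row-wise lambda.

```python
import pandas as pd

# Hypothetical data: the third close (187.50) is invented and sits above its high
sample = pd.DataFrame(
    {
        "high": [186.52, 186.99, 186.10],
        "low": [183.78, 184.27, 184.41],
        "close": [186.01, 184.92, 187.50],
    }
)

# True where low <= close <= high
sample["valid"] = sample["close"].between(sample["low"], sample["high"])

# Keep only the rows that fail the check
invalid_rows = sample[~sample["valid"]]
```

Here `invalid_rows` holds just the row with the invented close, which suggests the filter behaves as intended when bad data is present.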


Defines a function 'calculate_range' that calculates the range between 'high' and 'low' prices and returns it if it exceeds a threshold, otherwise returns NaN

In [12]:
def calculate_range(row, high_col, low_col, threshold):
    # Use a name that does not shadow the built-in range()
    price_range = row[high_col] - row[low_col]
    return price_range if price_range > threshold else np.nan

Sets the threshold to 1.5, so only daily ranges wider than 1.5 are kept

In [13]:
threshold = 1.5

Applies the 'calculate_range' function to each row of 'df' to calculate the range and adds it as a new column 'range'

In [14]:
df["range"] = df.apply(calculate_range, args=("high", "low", threshold), axis=1)
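
The extra arguments can also be passed by name, since `DataFrame.apply` forwards additional keyword arguments to the function. The sketch below compares both call styles on a hypothetical two-row frame named `sample`; the function is restated so the snippet runs on its own, and the threshold of 2.0 is chosen only so that one row falls below it.

```python
import numpy as np
import pandas as pd


def calculate_range(row, high_col, low_col, threshold):
    # Return the high-low range if it exceeds the threshold, otherwise NaN
    price_range = row[high_col] - row[low_col]
    return price_range if price_range > threshold else np.nan


# Hypothetical data for illustration only
sample = pd.DataFrame({"high": [186.52, 186.10], "low": [183.78, 184.41]})

# Positional extras via args, as in the cell above
by_args = sample.apply(calculate_range, args=("high", "low", 2.0), axis=1)

# The same call with keyword arguments forwarded by apply
by_kwargs = sample.apply(
    calculate_range, axis=1, high_col="high", low_col="low", threshold=2.0
)
```

Keyword arguments make the call easier to read when a function takes several parameters of the same type.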

In [15]:
display(df)

date,open,high,low,close,volume,split_ratio,dividend,valid,range
2023-06-15,183.960007,186.520004,183.779999,186.009995,65433200,0.0,0.0,True,2.740005
2023-06-16,186.729996,186.990005,184.270004,184.919998,101235600,0.0,0.0,True,2.720001
2023-06-20,184.410004,186.100006,184.410004,185.009995,49799100,0.0,0.0,True,1.690002
2023-06-21,184.899994,185.410004,182.589996,183.960007,49515700,0.0,0.0,True,2.820007
2023-06-22,183.740005,187.050003,183.669998,187.000000,51245300,0.0,0.0,True,3.380005
...,...,...,...,...,...,...,...,...,...
2024-06-10,196.899994,197.300003,192.149994,193.119995,97262100,0.0,0.0,True,5.150009
2024-06-11,193.649994,207.160004,193.630005,207.149994,172373300,0.0,0.0,True,13.529999
2024-06-12,207.369995,220.199997,206.899994,213.070007,198134300,0.0,0.0,True,13.300003
2024-06-13,214.740005,216.750000,211.600006,214.240005,97862700,0.0,0.0,True,5.149994
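
Every range in the rows shown exceeds 1.5, so the NaN branch is not visible in this output. As a minimal sketch of the same thresholding without a row-wise function, `Series.where` keeps values where a condition holds and fills NaN elsewhere. The frame `sample` is hypothetical, and its last row is a made-up narrow-range day added to trigger the NaN case.

```python
import pandas as pd

# Hypothetical data: the last row is invented so that its range (1.0) is below 1.5
sample = pd.DataFrame(
    {
        "high": [186.52, 186.10, 185.00],
        "low": [183.78, 184.41, 184.00],
    }
)

threshold = 1.5

# Compute the range for all rows at once
price_range = sample["high"] - sample["low"]

# Keep the range where it exceeds the threshold; other rows become NaN
sample["range"] = price_range.where(price_range > threshold)
```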


**Jason Strimpel** is the founder of <a href='https://pyquantnews.com/'>PyQuant News</a> and co-founder of <a href='https://www.tradeblotter.io/'>Trade Blotter</a>. His career in algorithmic trading spans 20+ years. He previously traded for a Chicago-based hedge fund, was a risk manager at JPMorgan, and managed production risk technology for an energy derivatives trading firm in London. In Singapore, he served as APAC CIO for an agricultural trading firm and built the data science team for a global metals trading firm. Jason holds degrees in Finance and Economics and a Master's in Quantitative Finance from the Illinois Institute of Technology. His career spans America, Europe, and Asia. He shares his expertise through the <a href='https://pyquantnews.com/subscribe-to-the-pyquant-newsletter/'>PyQuant Newsletter</a> and social media, and has taught algorithmic trading with Python to more than 1,000 students in his popular course **<a href='https://gettingstartedwithpythonforquantfinance.com/'>Getting Started With Python for Quant Finance</a>**. All code is for educational purposes only. Nothing provided here is financial advice. Use at your own risk.