## Introduction:

The Naive Bayes Classifier can be applied to the financial data regarding a publicly traded company. The idea is to create a model that can assist in deciding whether to purchase a stock before earnings are released before market open the following day. What this classifier helps us determine is the probability that Stock A will increase by y%, given that a certain amount of events x have already occured. For example, assume that Stock A releases their earnings the morning after market open and at 3:55 PM an investor needs to make a decision on whether to buy or not buy. The investor could use this model to see what
scenario is most probable. The question that the model will answer is:

"What is the probability that Chegg's stock price will increase by at least 10%, at least once, before 12 pm of the following trading day, given that events x1 through x6 occur:

- x1 = Earnings report is released after the market closes
        -BMO
        -AMC
- x1 = Adjusted Stock Price is larger than when the previous earnings report was released 
- x2 = Current earnings per share (EPS) is greater than the EPS from the previous earnings report
- x3 = EPS Surprise percentage is positive
- x6 = Intraday return of the stock, on the day earnings are released, is greater than or equal to the median intraday return of the last 5 trading days
- x5 = Intraday return of the SPY etf is  





## Stating The Hypothesis

Null Hypothesis:

The percentage return of Stock, one full trading day after earnings are released, will be at least 5%, given that events x1 through x6 occured

Alternative Hypothesis:

The percentage return of Stock, one full trading day after earnings are released, will NOT be at least 5%, given that events x1 through x6 occured


## Explaining Events

The first event is necessary for this analysis to occur because the analysis is assuming that a earnings report has been released after market open.

The first event could be telling us that the market thinks that the company is going to perform well in the upcoming earnings report, and it reflects that through the price. The market capitalization was used because it takes into account stock-splits. Moreover, the first event could also signal that the market is overvaluing the future performance of the company, potentially making it a relatively bad time to buy the stock. 

The second event informs the investor that that growth is currently in place or that the company is becoming more efficient by decreasing their cost of goods/expenses. Both could lead to an increase interest in the company and drive the stock price up based on demand.   

The third event gives us a better understanding of how incorrect the market is at estimating the performance of a company. The higher the percentage error, also known as the surprise percentage, the better, because it signals under-valuation. During these periods, the investor has the opportunity to purchase a stock for the low and then sell at a 'fair' price.

The fourth event tells investors that the company has a relatively bright financial future. It is common to see the stock price of a company increase even after a poor earnings report because the guidance given by the company leaders was positive. At the end of the day, investing is always forward looking and people are willing to pay a premium for that.

The fifth event informs the investor on how the overall market is performing in the span of one week. The S&P 500 is index that tracks slightly over 500 publicly traded companies in the United States. This index is used as a benchmark to gage how the overall market is performing. If by 3:00 PM, the return on the S&P 500 was higher than that of last week, it could potentially signal that the market has a positive outlook or that a recorrection is taking place.

The sixth event shows that the market has a positive sentiment regarding earnings. It is common to see a stock's intraday return fluctulate throughout the week and it is possible that one of those days, the market overvalued the stock and increased the average intraday return. Thus, the median was used to  better reflect the distribution of the most recent intraday returns. 

## Gathering Financial Data

The majority of the financial data will be pulled from alphavantage's financial APIs. The first point of interest is to know how the alphavantage stores information regarding what publicly traded companies are releasing earnings in a given time period.


If model is used the day of then the api will not get in fresh data unfortunatetly


#######
Explain the thought process of how to go about getting the data to prove your hypothesis. Explain that you will be using the naive bayes that deals with binary values for the 'independent' variables. Explain the Yahoo Finance API and how to pull data from it. Explain the prunning process and how the model can be easily tested. Explain why its called Naive Bayes (Assumption that the x variables are independent of each other is naive to think because that rarely happpens in real life. The events listed above could potentially be corrrelated to one another -  this will be analyed in the descriptive section using scatter plots. Ideally, the benchmark probability should be higher than a coin toss, meaning that the model should be able to predict this null hypothesis at least 51% of the time. Knowing this probability is useful because a simulation can be dervived from it (Monte Carlo)

###  1.0. Logic of finhubEarningsCalendar Function

finhubEarningsCalendar will be used to pull most recent market data to make a decision on whether to buy or not to buy a certain stock one trading day after earnings have been reported. 

In [1]:
import finnhub
finnhub_client = finnhub.Client(api_key="c5d5iiqad3i9ue38pn9g")

finub_earn_calendar = finnhub_client.earnings_calendar(_from="2021-10-15", to="2021-10-15", symbol="", international=False)
finub_earn_calendar 

{'earningsCalendar': [{'date': '2021-10-15',
   'epsActual': 0.97,
   'epsEstimate': 0.5165443000000001,
   'hour': 'bmo',
   'quarter': 3,
   'revenueActual': 1037281000,
   'revenueEstimate': 1044079749,
   'symbol': 'PLD',
   'year': 2021},
  {'date': '2021-10-15',
   'epsActual': 0.68,
   'epsEstimate': None,
   'hour': 'bmo',
   'quarter': 4,
   'revenueActual': 126062000,
   'revenueEstimate': None,
   'symbol': 'CLPS',
   'year': 2021},
  {'date': '2021-10-15',
   'epsActual': None,
   'epsEstimate': None,
   'hour': 'bmo',
   'quarter': 3,
   'revenueActual': None,
   'revenueEstimate': None,
   'symbol': 'HIVE',
   'year': 2021},
  {'date': '2021-10-15',
   'epsActual': None,
   'epsEstimate': None,
   'hour': 'amc',
   'quarter': 3,
   'revenueActual': None,
   'revenueEstimate': None,
   'symbol': 'ICCM',
   'year': 2021},
  {'date': '2021-10-15',
   'epsActual': None,
   'epsEstimate': None,
   'hour': 'bmo',
   'quarter': 2,
   'revenueActual': None,
   'revenueEstimate': 

In [2]:
finub_earn_calendar = finub_earn_calendar['earningsCalendar']
finub_earn_calendar

[{'date': '2021-10-15',
  'epsActual': 0.97,
  'epsEstimate': 0.5165443000000001,
  'hour': 'bmo',
  'quarter': 3,
  'revenueActual': 1037281000,
  'revenueEstimate': 1044079749,
  'symbol': 'PLD',
  'year': 2021},
 {'date': '2021-10-15',
  'epsActual': 0.68,
  'epsEstimate': None,
  'hour': 'bmo',
  'quarter': 4,
  'revenueActual': 126062000,
  'revenueEstimate': None,
  'symbol': 'CLPS',
  'year': 2021},
 {'date': '2021-10-15',
  'epsActual': None,
  'epsEstimate': None,
  'hour': 'bmo',
  'quarter': 3,
  'revenueActual': None,
  'revenueEstimate': None,
  'symbol': 'HIVE',
  'year': 2021},
 {'date': '2021-10-15',
  'epsActual': None,
  'epsEstimate': None,
  'hour': 'amc',
  'quarter': 3,
  'revenueActual': None,
  'revenueEstimate': None,
  'symbol': 'ICCM',
  'year': 2021},
 {'date': '2021-10-15',
  'epsActual': None,
  'epsEstimate': None,
  'hour': 'bmo',
  'quarter': 2,
  'revenueActual': None,
  'revenueEstimate': None,
  'symbol': 'BPTS',
  'year': 2021},
 {'date': '2021-10-1

In [3]:
import pandas as pd
earningsCalendar = pd.DataFrame(data=finub_earn_calendar)
earningsCalendar


Unnamed: 0,date,epsActual,epsEstimate,hour,quarter,revenueActual,revenueEstimate,symbol,year
0,2021-10-15,0.97,0.516544,bmo,3,1037281000.0,1044080000.0,PLD,2021
1,2021-10-15,0.68,,bmo,4,126062000.0,,CLPS,2021
2,2021-10-15,,,bmo,3,,,HIVE,2021
3,2021-10-15,,,amc,3,,,ICCM,2021
4,2021-10-15,,,bmo,2,,,BPTS,2021
5,2021-10-15,0.03,,bmo,2,7926000.0,,RMCF,2022
6,2021-10-15,0.85,0.811363,bmo,3,344287000.0,332512000.0,SXT,2021
7,2021-10-15,3.3,3.236545,bmo,3,5197000000.0,5106017000.0,PNC,2021
8,2021-10-15,14.93,10.480816,bmo,3,13608000000.0,11853610000.0,GS,2021
9,2021-10-15,-0.07,-0.1224,amc,3,45680.0,507499.0,QBIO,2021


In [6]:
# column_names = ['date', 'epsActual','epsEstimate','hour','symbol']
earningsCalendar = earningsCalendar.loc[:,['date', 'epsActual','epsEstimate','hour','symbol']]
earningsCalendar

Unnamed: 0,date,epsActual,epsEstimate,hour,symbol
0,2021-10-15,0.97,0.516544,bmo,PLD
1,2021-10-15,0.68,,bmo,CLPS
2,2021-10-15,,,bmo,HIVE
3,2021-10-15,,,amc,ICCM
4,2021-10-15,,,bmo,BPTS
5,2021-10-15,0.03,,bmo,RMCF
6,2021-10-15,0.85,0.811363,bmo,SXT
7,2021-10-15,3.3,3.236545,bmo,PNC
8,2021-10-15,14.93,10.480816,bmo,GS
9,2021-10-15,-0.07,-0.1224,amc,QBIO


In [7]:
#change the column names:
earningsCalendar.rename(columns = {'epsEstimate':'estimatedEPS'},inplace=True)
earningsCalendar.rename(columns = {'date':'reportedDate'},inplace=True)
earningsCalendar.rename(columns = {'epsActual':'reportedEPS'},inplace=True)

#change the positions of the columns:
#list of the column names



column_names = ["symbol", "reportedDate", "hour","estimatedEPS","reportedEPS"]

#reoder the dataframe
earningsCalendar = earningsCalendar.reindex(columns=column_names)
earningsCalendar


# 		reportedEPS	estimatedEPS	surprise	surprisePercentage

Unnamed: 0,symbol,reportedDate,hour,estimatedEPS,reportedEPS
0,PLD,2021-10-15,bmo,0.516544,0.97
1,CLPS,2021-10-15,bmo,,0.68
2,HIVE,2021-10-15,bmo,,
3,ICCM,2021-10-15,amc,,
4,BPTS,2021-10-15,bmo,,
5,RMCF,2021-10-15,bmo,,0.03
6,SXT,2021-10-15,bmo,0.811363,0.85
7,PNC,2021-10-15,bmo,3.236545,3.3
8,GS,2021-10-15,bmo,10.480816,14.93
9,QBIO,2021-10-15,amc,-0.1224,-0.07


In [8]:
#filter out the nones, of which can be found in the following columns:
earningsCalendar =  earningsCalendar.loc[(earningsCalendar['estimatedEPS'].notnull())]
earningsCalendar =  earningsCalendar.loc[(earningsCalendar['reportedEPS'].notnull())]

earningsCalendar

Unnamed: 0,symbol,reportedDate,hour,estimatedEPS,reportedEPS
0,PLD,2021-10-15,bmo,0.516544,0.97
6,SXT,2021-10-15,bmo,0.811363,0.85
7,PNC,2021-10-15,bmo,3.236545,3.3
8,GS,2021-10-15,bmo,10.480816,14.93
9,QBIO,2021-10-15,amc,-0.1224,-0.07
10,KARO,2021-10-15,bmo,4.3656,3.85
11,SCHW,2021-10-15,bmo,0.828169,0.84
12,UNTY,2021-10-15,bmo,0.82467,0.9
13,JBHT,2021-10-15,bmo,1.809857,1.88
14,LOOP,2021-10-15,bmo,-0.2856,-0.19


In [9]:
## add surprise and surprise percentage column

#add the surprise amount column
earningsCalendar['surprise'] = earningsCalendar['reportedEPS'] - earningsCalendar['estimatedEPS'] 

#add the surprise perecentage column

earningsCalendar['surprisePercentage'] = round(((earningsCalendar['reportedEPS'] /earningsCalendar['estimatedEPS'])-1)*100,4)

earningsCalendar

Unnamed: 0,symbol,reportedDate,hour,estimatedEPS,reportedEPS,surprise,surprisePercentage
0,PLD,2021-10-15,bmo,0.516544,0.97,0.453456,87.7864
6,SXT,2021-10-15,bmo,0.811363,0.85,0.038637,4.7619
7,PNC,2021-10-15,bmo,3.236545,3.3,0.063455,1.9606
8,GS,2021-10-15,bmo,10.480816,14.93,4.449184,42.4507
9,QBIO,2021-10-15,amc,-0.1224,-0.07,0.0524,-42.8105
10,KARO,2021-10-15,bmo,4.3656,3.85,-0.5156,-11.8105
11,SCHW,2021-10-15,bmo,0.828169,0.84,0.011831,1.4286
12,UNTY,2021-10-15,bmo,0.82467,0.9,0.07533,9.1346
13,JBHT,2021-10-15,bmo,1.809857,1.88,0.070143,3.8756
14,LOOP,2021-10-15,bmo,-0.2856,-0.19,0.0956,-33.4734


###  1.1. Creating finhubEarningsCalendar Function

This function can only return historical earnings data.

In [2]:
#this function pulls earnings data from finhub and then cleans it so that it can be used in the analysis
#this function only works for past data and does not give upcoming earnings, that is why the alphaVantage api was used in conjuction
#earnings calendar should be different than the alpha because it is more up to date and gives BMO o AMC datapoints
#inputs include:
#date range
#symbol - if no symbol is given, api will return all the stocks that are releasing earnings on that date

def finhubEarningsCalendar(finhub_api,from_date,to_date,symbol):
    
    
    #gathering the earnings calendar data:
    
    #import the library
    import finnhub
      
    #create an instance of the finhub library that contains API key
    finnhub_client = finnhub.Client(api_key=finhub_api)
    
    #get the stock earnings data that from the specified date range
    #this returns a nested dictionary that needs to be pulled apart 
    finub_earn_calendar = finnhub_client.earnings_calendar(_from=from_date, to=to_date, symbol=symbol, international=False)
    
    #the earningsCalendar key contains the values neede for the analysis
    finub_earn_calendar = finub_earn_calendar['earningsCalendar']
    
    #import pandas library
    import pandas as pd
    
    #convert the dictionary into a readable dataframe
    earningsCalendar = pd.DataFrame(data=finub_earn_calendar)
    
    
    #editing the dataframe so that it can be compatible with the alphaVantage dataframes and to use in the model:
    
    #filter out the following columns: quarter, revenueActual, revenueEstimate, and year:
    earningsCalendar.loc[:,['date', 'epsActual','epsEstimate','hour','symbol']]
    
    #change the column names:
    earningsCalendar.rename(columns = {'epsEstimate':'estimatedEPS'},inplace=True)
    earningsCalendar.rename(columns = {'date':'reportedDate'},inplace=True)
    earningsCalendar.rename(columns = {'epsActual':'reportedEPS'},inplace=True)
    
    #change the positions of the columns:
    #list of the column names
    column_names = ["symbol", "reportedDate", "hour","reportedEPS","estimatedEPS"]
    
    #reoder the dataframe
    earningsCalendar = earningsCalendar.reindex(columns=column_names)  
    
    #filter out the nones, of which can be found in the following columns:
    earningsCalendar =  earningsCalendar.loc[(earningsCalendar['estimatedEPS'].notnull())]
    earningsCalendar =  earningsCalendar.loc[(earningsCalendar['reportedEPS'].notnull())]
    
    
    #add the surprise amount column
    earningsCalendar['surprise'] = earningsCalendar['reportedEPS'] - earningsCalendar['estimatedEPS'] 

    #add the surprise perecentage column
    earningsCalendar['surprisePercentage'] = round(((earningsCalendar['reportedEPS'] /earningsCalendar['estimatedEPS'])-1)*100,4)
    
    
    #return the dataframe
    return earningsCalendar

###  1.2. Testing finhubEarningsCalendar Function (All Earnings)


In [37]:
#historical earnings only
from_date = '2021-10-18'
to_date = '2021-10-18'
symbol = ''
finhub_api = 'c5d5iiqad3i9ue38pn9g'
print(symbol)
finhubTest1 = finhubEarningsCalendar(finhub_api,from_date,to_date,symbol)
finhubTest1




Unnamed: 0,symbol,reportedDate,hour,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,ACI,2021-10-18,bmo,0.64,0.264953,0.375047,141.552
2,STT,2021-10-18,bmo,2.0,1.934908,0.065092,3.3641
10,GNTY,2021-10-18,bmo,0.76,0.7854,-0.0254,-3.234


###  1.3. Testing finhubEarningsCalendar Function (Individual Earnings)


In [12]:
#specific stock
from_date = '2021-10-18'
to_date = '2021-10-18'
symbol = ''

finhubTest2 = finhubEarningsCalendar(from_date,to_date,symbol)
finhubTest2

Unnamed: 0,symbol,reportedDate,hour,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,GS,2021-10-15,bmo,14.93,10.480816,4.449184,42.4507


### 2.0. Logic of alphaDataRetriever Function

alphaVantage has data regarding historical earnings and daily adjusted price history for a specific stock. This function is broken down into 3: earnings calendar, historical earnings (specific company), and daily adjusted prices (specific company). The earnings calendar provides earnings dates for the next three months, along with the market's estimated Earnings Per Share (EPS) for each stock. Moreover, historical earnings returns data regarding past estimates and reported values for a specific stock. Finally, the daily adjusted prices section returns the most up to date stock prices that take into account any stock splits. 

In [6]:

#import necessary libraries
import pandas as pd
import csv
import requests

#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '1'
symbol = 'N/A'
date_of_interest = '2021-10-18'
# if data_of_interest == '1':
        
CSV_URL = 'https://www.alphavantage.co/query?function=EARNINGS_CALENDAR&horizon=3month&apikey=' + api_key

with requests.Session() as s:

    download = s.get(CSV_URL)
    decoded_content = download.content.decode('utf-8')
    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    earnings_calendar = pd.DataFrame(data =cr)

#edit the dataframe:
#make the first row the column names
earnings_calendar = earnings_calendar.rename(columns=earnings_calendar.iloc[0])

#delete the first row since it has been used already to name the columns
earnings_calendar = earnings_calendar.iloc[1:,:]

#filter by the date of interest and loc the dataframe
ec = earnings_calendar.loc[earnings_calendar.reportDate == date_of_interest]

#not all stocks have an EPS estimate, these should be excluded from the dataframe
#dataframe = dataframe[dataframe.column then the condition]
ec = ec[ec.estimate != '']

#not all stocks are traded in USD, but this project will only focus on the USD currency
ec = ec[ec.currency == 'USD']

#remove the irrelevant columns
ec = ec.drop(['fiscalDateEnding','currency'],1)

#display dataframe
display(ec)

Unnamed: 0,symbol,name,reportDate,estimate


In [14]:
#import necessary libraries
import pandas as pd
import csv
import requests

#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '2'
symbol = 'GS'


#format of how the url should be structured before using the api
earnings_history_url = 'https://www.alphavantage.co/query?function=EARNINGS&symbol=' + symbol +'&apikey=' + api_key

#makes the request using the formatted url
r = requests.get(earnings_history_url)
data = r.json()

#the data variable stores the data in json structure and the value of key = quarterlyEarnings is the point of interest
quarterly_earnings = data['quarterlyEarnings']

#convert this json into a dataframe
quarterly_earnings = pd.DataFrame(quarterly_earnings)

quarterly_earnings = quarterly_earnings.drop('fiscalDateEnding',1)
print('This is the historical earnings table')
print()
display(quarterly_earnings)

This is the historical earnings table



Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-15,14.93,10.011,4.919,49.136
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


In [16]:
#getting the time series data
#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '3'
symbol = 'GS'

#this url contains the format needed to make the api call for time series data
daily_prices_url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=' + symbol + '&outputsize=full' +'&apikey=' + api_key

#makes the api request/call
daily_price_request = requests.get(daily_prices_url)

#stores the values into a dictionary
daily_prices = daily_price_request.json()

#make a copy
daily_prices_copy = daily_prices.copy()

print(daily_prices_copy)

{'Meta Data': {'1. Information': 'Daily Time Series with Splits and Dividend Events', '2. Symbol': 'GS', '3. Last Refreshed': '2021-10-15', '4. Output Size': 'Full size', '5. Time Zone': 'US/Eastern'}, 'Time Series (Daily)': {'2021-10-15': {'1. open': '402.31', '2. high': '407.272', '3. low': '396.34', '4. close': '406.07', '5. adjusted close': '406.07', '6. volume': '6451387', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-14': {'1. open': '392.0', '2. high': '393.66', '3. low': '382.2', '4. close': '391.2', '5. adjusted close': '391.2', '6. volume': '3380905', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-13': {'1. open': '388.0', '2. high': '388.3809', '3. low': '378.625', '4. close': '386.31', '5. adjusted close': '386.31', '6. volume': '2392607', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-12': {'1. open': '386.22', '2. high': '388.59', '3. low': '381.02', '4. close': '386.53', '5. adjusted close': '

In [18]:
#stores the relevant information into a dictionary
dictionary_of_dp = daily_prices_copy['Time Series (Daily)']

#relevant data
print(dictionary_of_dp)

{'2021-10-15': {'1. open': '402.31', '2. high': '407.272', '3. low': '396.34', '4. close': '406.07', '5. adjusted close': '406.07', '6. volume': '6451387', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-14': {'1. open': '392.0', '2. high': '393.66', '3. low': '382.2', '4. close': '391.2', '5. adjusted close': '391.2', '6. volume': '3380905', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-13': {'1. open': '388.0', '2. high': '388.3809', '3. low': '378.625', '4. close': '386.31', '5. adjusted close': '386.31', '6. volume': '2392607', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-12': {'1. open': '386.22', '2. high': '388.59', '3. low': '381.02', '4. close': '386.53', '5. adjusted close': '386.53', '6. volume': '2088229', '7. dividend amount': '0.0000', '8. split coefficient': '1.0'}, '2021-10-11': {'1. open': '392.9', '2. high': '396.95', '3. low': '385.02', '4. close': '385.24', '5. adjusted close': '385.24',

In [19]:
#convert this json into a dataframe, where the keys are the rows and the values are the columns 
historical_prices = pd.DataFrame.from_dict(dictionary_of_dp, orient = 'index')
print('This is the data that is provided by the API')
print()
display(historical_prices)

This is the data that is provided by the API



Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient
2021-10-15,402.31,407.272,396.34,406.07,406.07,6451387,0.0000,1.0
2021-10-14,392.0,393.66,382.2,391.2,391.2,3380905,0.0000,1.0
2021-10-13,388.0,388.3809,378.625,386.31,386.31,2392607,0.0000,1.0
2021-10-12,386.22,388.59,381.02,386.53,386.53,2088229,0.0000,1.0
2021-10-11,392.9,396.95,385.02,385.24,385.24,1787360,0.0000,1.0
...,...,...,...,...,...,...,...,...
1999-11-05,72.0,74.31,71.25,73.69,57.0430566638,1304900,0.0000,1.0
1999-11-04,69.69,70.75,69.44,70.13,54.2872786515,570400,0.0000,1.0
1999-11-03,69.81,69.88,68.31,68.75,53.2190276243,924000,0.0000,1.0
1999-11-02,69.19,71.75,69.13,69.75,53.9931225716,543000,0.0000,1.0


In [20]:
#create the columns that will be used to derive the values of the intraday event

#turn column values into numeric values
historical_prices = historical_prices.apply(pd.to_numeric)

#perform return calculations for each specific date
historical_prices['Intraday_Returns'] = round((historical_prices['5. adjusted close']/historical_prices['1. open']-1)*100,4)

#find the previous 5 day median intraday return and create the column 
#the value will take into account the previous 5 trading days
historical_prices['Weekly_Median_Return'] = historical_prices['Intraday_Returns'].rolling(window=5).median().shift(-5)

#the indexes are the dates thus make that a column labeled reportedDate because that will be the key column for the inner join
historical_prices['trading_date'] = historical_prices.index

#reset the index since its a column now
historical_prices.reset_index(drop=True, inplace=True)

print('This is the dataframe that has the Intraday_Returns & Weekly_Median_Return columns that will be used to calculate intraday events for stock A and stock SPY')
print()

#remember that the model could break if there is not enough historical data
display(historical_prices)


This is the dataframe that has the Intraday_Returns & Weekly_Median_Return columns that will be used to calculate intraday events for stock A and stock SPY



Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,402.31,407.2720,396.340,406.07,406.070000,6451387,0.0,1.0,0.9346,-0.2041,2021-10-15
1,392.00,393.6600,382.200,391.20,391.200000,3380905,0.0,1.0,-0.2041,-0.4356,2021-10-14
2,388.00,388.3809,378.625,386.31,386.310000,2392607,0.0,1.0,-0.4356,0.0803,2021-10-13
3,386.22,388.5900,381.020,386.53,386.530000,2088229,0.0,1.0,0.0803,0.5066,2021-10-12
4,392.90,396.9500,385.020,385.24,385.240000,1787360,0.0,1.0,-1.9496,0.5066,2021-10-11
...,...,...,...,...,...,...,...,...,...,...,...
5521,72.00,74.3100,71.250,73.69,57.043057,1304900,0.0,1.0,-20.7735,,1999-11-05
5522,69.69,70.7500,69.440,70.13,54.287279,570400,0.0,1.0,-22.1018,,1999-11-04
5523,69.81,69.8800,68.310,68.75,53.219028,924000,0.0,1.0,-23.7659,,1999-11-03
5524,69.19,71.7500,69.130,69.75,53.993123,543000,0.0,1.0,-21.9640,,1999-11-02


### 2.1. Creating alphaDataRetriever Function

In [9]:
#create a function that:
#individual earnings table for a specific stock
#daily adjusted prices for a specific stock:
#it also calculates the Intraday_Return and Weekly_Median_Return to be used later in the analysis section

#inputs include:
#type of data to retrieve where:
#1 = historical earnings
#2 = daily ajusted prices
#api key for alphaVantage


def alphaDataRetriever(api_key, data_of_interest, symbol,date_of_interest):
    #import necessary libraries
    import pandas as pd
    import csv
    import requests
    
    
    #remember that alphavantage only takes 5 calls per mintute and 500 throughout the entire day
    #for the earnings calendar call, symbol is not important

    if data_of_interest == '1':

        CSV_URL = 'https://www.alphavantage.co/query?function=EARNINGS_CALENDAR&horizon=3month&apikey=' + api_key

        with requests.Session() as s:

            download = s.get(CSV_URL)
            decoded_content = download.content.decode('utf-8')
            cr = csv.reader(decoded_content.splitlines(), delimiter=',')
            earnings_calendar = pd.DataFrame(data =cr)

        #edit the dataframe:
        #make the first row the column names
        earnings_calendar = earnings_calendar.rename(columns=earnings_calendar.iloc[0])

        #delete the first row since it has been used already to name the columns
        earnings_calendar = earnings_calendar.iloc[1:,:]

        #filter by the date of interest and loc the dataframe
        ec = earnings_calendar.loc[earnings_calendar.reportDate == date_of_interest]

        #not all stocks have an EPS estimate, these should be excluded from the dataframe
        #dataframe = dataframe[dataframe.column then the condition]
        ec = ec[ec.estimate != '']

        #not all stocks are traded in USD, but this project will only focus on the USD currency
        ec = ec[ec.currency == 'USD']

        #remove the irrelevant columns
        ec = ec.drop(['fiscalDateEnding','currency'],1)

        #display dataframe
        return ec


    
    elif data_of_interest == '2':

        #format of how the url should be structured before using the api
        earnings_history_url = 'https://www.alphavantage.co/query?function=EARNINGS&symbol=' + symbol +'&apikey=' + api_key

        #makes the request using the formatted url
        r = requests.get(earnings_history_url)
        data = r.json()

        #the data variable stores the data in json structure and the value of key = quarterlyEarnings is the point of interest
        quarterly_earnings = data['quarterlyEarnings']

        #convert this json into a dataframe
        quarterly_earnings = pd.DataFrame(quarterly_earnings)
        
        quarterly_earnings = quarterly_earnings.drop('fiscalDateEnding',1)
        
        return quarterly_earnings
    
    #this api call will return the most recent data avaialble
    
    elif data_of_interest == '3':
        #getting the time series data
        #this url contains the format needed to make the api call for time series data
        daily_prices_url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=' + symbol + '&outputsize=full' +'&apikey=' + api_key

        #makes the api request/call
        daily_price_request = requests.get(daily_prices_url)

        #stores the values into a dictionary
        daily_prices = daily_price_request.json()
        
        #stores the relevant information into a dictionary
        dictionary_of_dp = daily_prices['Time Series (Daily)']

        #convert this json into a dataframe, where the keys are the rows and the values are the columns 
        historical_prices = pd.DataFrame.from_dict(dictionary_of_dp, orient = 'index')
        
        #turn column values into numeric values
        historical_prices = historical_prices.apply(pd.to_numeric)
    
        #perform return calculations for each specific date
        historical_prices['Intraday_Returns'] = round((historical_prices['5. adjusted close']/historical_prices['1. open']-1)*100,4)

        #find the previous 5 day median intraday return and create the column 
        #the value will take into account the previous 5 trading days
        historical_prices['Weekly_Median_Return'] = historical_prices['Intraday_Returns'].rolling(window=5).median().shift(-5)

        #the indexes are the dates thus make that a column labeled reportedDate because that will be the key column for the inner join
        historical_prices['trading_date'] = historical_prices.index
        
        #reset the index since its a column now
        historical_prices.reset_index(drop=True, inplace=True)
        
        #remember that the model could break if there is not enough historical data
        return historical_prices
    
    #tell the analyst that their value is not valid
    else:
        print('Wrong Input for the alphaDataRetriever - Try again.')

### 2.2. Testing alphaDataRetriever Function (Upcoming Earnings)

In [119]:
#inputs 
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '1'
symbol = 'GS'

#future dates
date_of_interest = '2021-11-24'

alphaDataRetriever_test_1 = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
alphaDataRetriever_test_1

Unnamed: 0,symbol,name,reportDate,estimate
1288,DE,Deere & Company,2021-11-24,4.03
2572,LAIX,LAIX Inc,2021-11-24,-0.44


In [87]:
date_test = alphaDataRetriever_test_1.iloc[0,2]
date_test

'2021-10-18'

### 2.3. Testing alphaDataRetriever Function (Earnings History)

In [23]:
#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '2'
symbol = 'GS'
date_of_interest = 'n/a'

alphaDataRetriever_test_2 = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
alphaDataRetriever_test_2

Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-15,14.93,10.011,4.919,49.136
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


### 2.4. Testing alphaDataRetriever Function (Daily Adjusted Prices)

In [24]:
#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '3'
date_of_interest='n/a'
symbol = 'GS'

alphaDataRetriever_test_3 = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
alphaDataRetriever_test_3

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,402.31,407.2720,396.340,406.07,406.070000,6451387,0.0,1.0,0.9346,-0.2041,2021-10-15
1,392.00,393.6600,382.200,391.20,391.200000,3380905,0.0,1.0,-0.2041,-0.4356,2021-10-14
2,388.00,388.3809,378.625,386.31,386.310000,2392607,0.0,1.0,-0.4356,0.0803,2021-10-13
3,386.22,388.5900,381.020,386.53,386.530000,2088229,0.0,1.0,0.0803,0.5066,2021-10-12
4,392.90,396.9500,385.020,385.24,385.240000,1787360,0.0,1.0,-1.9496,0.5066,2021-10-11
...,...,...,...,...,...,...,...,...,...,...,...
5521,72.00,74.3100,71.250,73.69,57.043057,1304900,0.0,1.0,-20.7735,,1999-11-05
5522,69.69,70.7500,69.440,70.13,54.287279,570400,0.0,1.0,-22.1018,,1999-11-04
5523,69.81,69.8800,68.310,68.75,53.219028,924000,0.0,1.0,-23.7659,,1999-11-03
5524,69.19,71.7500,69.130,69.75,53.993123,543000,0.0,1.0,-21.9640,,1999-11-02


In [106]:
#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '3'
date_of_interest = 'n/a'
symbol = 'SPY'

alphaDataRetriever_test_SPY = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
alphaDataRetriever_test_SPY

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,443.970000,447.550000,443.270000,447.190000,447.190000,61472157,0.0,1.0,0.7253,0.1081,2021-10-18
1,444.750000,446.260000,444.090000,445.870000,445.870000,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
2,439.080000,442.660000,438.580000,442.500000,442.500000,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
3,434.710000,436.050000,431.540000,435.180000,435.180000,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
4,435.670000,436.100000,432.780000,433.620000,433.620000,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
...,...,...,...,...,...,...,...,...,...,...,...
5522,138.625000,139.109299,136.781204,137.875000,91.859026,7431500,0.0,1.0,-33.7356,,1999-11-05
5523,136.750000,137.359299,135.765594,136.531204,90.963724,7907500,0.0,1.0,-33.4817,,1999-11-04
5524,136.000000,136.375000,135.125000,135.500000,90.276685,7222300,0.0,1.0,-33.6201,,1999-11-03
5525,135.968704,137.250000,134.593704,134.593704,89.672867,6516900,0.0,1.0,-34.0489,,1999-11-02


### 3.0. Logic of historicalEarningsTransformer Function

This function is used to clean the data so that it can be inputted into the naive bayes model.

In [29]:
#assuming that the date provided by the analyst is after market close on the reported date:

#first thing to do is make copies so that original dataframes don't get messed up

#import necessary libraries
import pandas as pd

#don't alter the dataframes that were used as inputs, instead, make a copy of each 
#this will be used to get the class variable of whether the stock price increased by 5% the day after earnings are released
#timing of the earnings release is not relevant for analysis

#this variable holds a stock's individual earnings history
earnings_hist = alphaDataRetriever_test_2.copy(deep=True)

#display the copy
display(earnings_hist)

Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-15,14.93,10.011,4.919,49.136
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


In [30]:
#for training purposes,only keep the rows that are relevant.
#the api feeds in the most current earnings and as a result, the historical prices will not line up
#let those rows be used for the testing portion, while the remaining events get analyzed
#no need for rows that don't contain information such as none. If none is found in the estimate, then the remaining 2 columns will be none

#if analyst chooses the reporting date of today, after market close: 

#keep the first cell in the first row because it contains the date of the most recent/upcoming earnings event
#this date will be used to gather several variables such as intraday event, SPY event, Adjusted Price event
report_date_of_interest = earnings_hist.iloc[0,0]

print(report_date_of_interest)
    

2021-10-15


In [31]:
#filter out the nones, of which can be found in the following columns:
earnings_hist =  earnings_hist.loc[(earnings_hist['reportedEPS']!= 'None')]
earnings_hist =  earnings_hist.loc[(earnings_hist['estimatedEPS']!= 'None')]
earnings_hist =  earnings_hist.loc[(earnings_hist['surprise']!= 'None')]

#delete this on 10/18
earnings_hist = earnings_hist.loc[1:,:]

earnings_hist

Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
5,2020-07-15,6.26,3.9085,2.3515,60.1637
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


In [32]:
#editing dataframes via renaming columns and removing unnessary columns:


#before joining the 3 dataframes, there needs to be a common column name
#change the name of the date columns to column_of_interest
earnings_hist.rename(columns = {'reportedDate':'column_of_interest'}, inplace = True)

#see the change of names
display(earnings_hist)



Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
5,2020-07-15,6.26,3.9085,2.3515,60.1637
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


In [33]:
#this is where the finhub dataset will be used:
#the earnings data will be inputted into the earnings hist dataframe

#the function must have a parameter that takes in the finhub earnings function
#the function must have an if or depending on the date

from_date = '2021-10-15'
to_date = '2021-10-15'
symbol ='GS'
finhub = finhubEarningsCalendar(from_date,to_date,symbol)

#get only the stock that the analyst wants
finhub = finhub.loc[finhub['symbol'] == symbol] 

#rename so that merge is smooth
finhub.rename(columns = {'reportedDate':'column_of_interest','epsActual':'reportedEPS','estimate':'estimatedEPS'},inplace=True)

#recreate the column_of_interest	reportedEPS	estimatedEPS	surprise	surprisePercentage columnd of finhub 
#then add that row to the earnings hist and then merge the dataframes all together to get all 5 x variables

finhub = finhub.drop('hour',1)
finhub = finhub.drop('symbol',1)

finhub


Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-15,14.93,10.480816,4.449184,42.4507


In [34]:
#concat the dataframes so that the last 2 x variables can be calculated
earnings_hist = pd.concat([finhub,earnings_hist])
earnings_hist

Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-15,14.93,10.4808,4.44918,42.4507
1,2021-07-13,15.02,10.1816,4.8384,47.521
2,2021-04-14,18.6,10.1861,8.4139,82.6018
3,2021-01-19,12.5471,7.3621,5.185,70.4283
4,2020-10-14,9.68,5.5047,4.1753,75.8497
...,...,...,...,...,...
85,2000-06-20,1.48,1.4,0.08,5.7143
86,2000-03-21,1.76,1.48,0.28,18.9189
87,1999-12-21,1.48,1.22,0.26,21.3115
88,1999-09-21,1.32,1.1,0.22,20


In [35]:
#adding the 2 variables events:


#editing the merged dataframe:


#convert the following columns to numeric
earnings_hist["reportedEPS"] = pd.to_numeric(earnings_hist["reportedEPS"])
earnings_hist["estimatedEPS"] = pd.to_numeric(earnings_hist["estimatedEPS"])
earnings_hist["surprisePercentage"] = pd.to_numeric(earnings_hist["surprisePercentage"])


#add the column that determines if current earnings were greater than last time
earnings_hist['earnings_event'] = (earnings_hist['reportedEPS'] > earnings_hist['reportedEPS'].shift(-1)).astype(int)

#add the column that determines if the surprise variable is positive or not
earnings_hist['surprise_event'] = (earnings_hist['surprisePercentage'] > 0).astype(int)

display(earnings_hist)

Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage,earnings_event,surprise_event
0,2021-10-15,14.9300,10.480816,4.44918,42.4507,0,1
1,2021-07-13,15.0200,10.181600,4.8384,47.5210,0,1
2,2021-04-14,18.6000,10.186100,8.4139,82.6018,1,1
3,2021-01-19,12.5471,7.362100,5.185,70.4283,1,1
4,2020-10-14,9.6800,5.504700,4.1753,75.8497,1,1
...,...,...,...,...,...,...,...
85,2000-06-20,1.4800,1.400000,0.08,5.7143,0,1
86,2000-03-21,1.7600,1.480000,0.28,18.9189,1,1
87,1999-12-21,1.4800,1.220000,0.26,21.3115,1,1
88,1999-09-21,1.3200,1.100000,0.22,20.0000,1,1


In [120]:
most_recent = earnings_hist.iloc[0,0]
most_recent

'2021-10-15'

In [135]:
# import the datetime module
import datetime
  

format_value = '%Y-%m-%d'
  
# convert from string format to datetime format
datetime = datetime.datetime.strptime(most_recent, format_value)
  
# get the date from the datetime using date() 
# function
print(datetime.date())


2021-10-15


### 3.1. Creating the historicalEarningsTransformer Function


In [13]:
#the model needs to understand what date the user is inputting
def historicalEarningsTransformer(date_type,historical_earnings,finhub_function):
    
    #assuming that the date provided by the analyst is after market close on the reported date:

    #import necessary libraries
    import pandas as pd

    #don't alter the dataframes that were used as inputs, instead, make a copy of each:
    
    #this variable holds a stock's individual earnings history
    earnings_hist = historical_earnings.copy(deep=True)
    

    #for training purposes,only keep the rows that are relevant.
    #the api feeds in the most current earnings and as a result, the historical prices will not line up
    #let those rows be used for the testing portion, while the remaining events get analyzed
    #no need for rows that don't contain information such as none. If none is found in the estimate, then the remaining 2 columns will be none
   
    
    #if analyst chooses the reporting date of today, after market close: 
    
    if date_type == '1':
        
        #keep the first cell in the first row because it contains the date of the most recent/upcoming earnings event
        #this date will be used to gather several variables such as intraday event, SPY event, Adjusted Price event
        report_date_of_interest = earnings_hist.iloc[0,0]

        
        #create common column name so that the dataframes can be merged properly 
        #common name is equal to 'column_of_interest'
        earnings_hist.rename(columns = {'reportedDate':'column_of_interest'}, inplace = True)

        #filter out the nones, of which can be found in the following columns:

        #the first row will always be none because it is a future date or the current report date
        #this api does not update in real time
        earnings_hist =  earnings_hist.loc[(earnings_hist['reportedEPS']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['estimatedEPS']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['surprise']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['surprisePercentage']!= 'None')]
        
        
        #drop the columns that don't align with the earnings_hist dataframe. This is needed to merge properly
        finhub_function = finhub_function.drop('hour',1)
        finhub_function = finhub_function.drop('symbol',1)
        
        #create common column name
        finhub_function.rename(columns = {'reportedDate':'column_of_interest'}, inplace = True)
        
        
        #concat the dataframes so that the surprise_event and earnings_event x variables can be calculated 
        earnings_hist = pd.concat([finhub_function,earnings_hist])
        
        #convert the following columns to numeric in order to do math operations
        earnings_hist["reportedEPS"] = pd.to_numeric(earnings_hist["reportedEPS"])
        earnings_hist["estimatedEPS"] = pd.to_numeric(earnings_hist["estimatedEPS"])
        earnings_hist["surprisePercentage"] = pd.to_numeric(earnings_hist["surprisePercentage"])


        #add the column that determines if current earnings were greater than last time
        earnings_hist['earnings_event'] = (earnings_hist['reportedEPS'] > earnings_hist['reportedEPS'].shift(-1)).astype(int)

        #add the column that determines if the surprise variable is positive or not
        earnings_hist['surprise_event'] = (earnings_hist['surprisePercentage'] > 0).astype(int)
        
        return earnings_hist

     
    elif date_type == '2':

        #create common column name so that the dataframes can be merged properly 
        #common name is equal to 'column_of_interest'
        earnings_hist.rename(columns = {'reportedDate':'column_of_interest'}, inplace = True)

        #filter out the nones, of which can be found in the following columns:

        #the first row will always be none because it is a future date or the current report date
        #this api does not update in real time
        earnings_hist =  earnings_hist.loc[(earnings_hist['reportedEPS']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['estimatedEPS']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['surprise']!= 'None')]
        earnings_hist =  earnings_hist.loc[(earnings_hist['surprisePercentage']!= 'None')]
        
        #convert the following columns to numeric in order to do math operations
        earnings_hist["reportedEPS"] = pd.to_numeric(earnings_hist["reportedEPS"])
        earnings_hist["estimatedEPS"] = pd.to_numeric(earnings_hist["estimatedEPS"])
        earnings_hist["surprisePercentage"] = pd.to_numeric(earnings_hist["surprisePercentage"])


        #add the column that determines if current earnings were greater than last time
        earnings_hist['earnings_event'] = (earnings_hist['reportedEPS'] > earnings_hist['reportedEPS'].shift(-1)).astype(int)

        #add the column that determines if the surprise variable is positive or not
        earnings_hist['surprise_event'] = (earnings_hist['surprisePercentage'] > 0).astype(int)
        
        return earnings_hist


    else:
        print('Invalid input. Please Try Again')



### 3.2. Testing the historicalEarningsTransformer Function (Day of Earnings Report)

In [128]:
#inputs for the historical earnings transformer function

#first, get the historical earnings for a specific stock
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '2'
symbol = 'ACI'
date_of_interest = 'n/a'

present_day_test = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
present_day_test

Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-18,,0.45,,
1,2021-07-29,0.78,0.5892,0.1908,32.3829
2,2021-04-26,-0.37,0.4848,-0.8548,-176.3201
3,2021-01-12,0.2,0.2783,-0.0783,-28.1351
4,2020-10-20,0.49,0.2217,0.2683,121.0194
5,2020-07-27,1.0,1.1927,-0.1927,-16.1566
6,2020-04-30,0.24,0.0,0.24,


In [129]:
from_date = '2021-10-18'
to_date = '2021-10-18'
symbol = 'ACI'

finhub_present_day = finhubEarningsCalendar(from_date,to_date,symbol)
finhub_present_day

TypeError: finhubEarningsCalendar() missing 1 required positional argument: 'symbol'

In [40]:
#let's assume that it is the day of earnings reporting:
#inputs
date_type = '1'
historicalEarningsTransformer_test1 = historicalEarningsTransformer(date_type,present_day_test,finhub_present_day)
historicalEarningsTransformer_test1

Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage,earnings_event,surprise_event
0,2021-10-15,14.9300,10.480816,4.44918,42.4507,0,1
1,2021-07-13,15.0200,10.181600,4.8384,47.5210,0,1
2,2021-04-14,18.6000,10.186100,8.4139,82.6018,1,1
3,2021-01-19,12.5471,7.362100,5.185,70.4283,1,1
4,2020-10-14,9.6800,5.504700,4.1753,75.8497,1,1
...,...,...,...,...,...,...,...
85,2000-06-20,1.4800,1.400000,0.08,5.7143,0,1
86,2000-03-21,1.7600,1.480000,0.28,18.9189,1,1
87,1999-12-21,1.4800,1.220000,0.26,21.3115,1,1
88,1999-09-21,1.3200,1.100000,0.22,20.0000,1,1


### 3.3. Testing the historicalEarningsTransformer Function (Future Date)


In [41]:
#inputs for the historical earnings transformer function

#first, get the historical earnings for a specific stock
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '2'
symbol = 'ACI'
date_of_interest = 'n/a'

future_day_test = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
future_day_test

Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-07-29,0.78,0.5892,0.1908,32.3829
1,2021-04-26,-0.37,0.4848,-0.8548,-176.3201
2,2021-01-12,0.2,0.2783,-0.0783,-28.1351
3,2020-10-20,0.49,0.2217,0.2683,121.0194
4,2020-07-27,1.0,1.1927,-0.1927,-16.1566
5,2020-04-30,0.24,0.0,0.24,


In [72]:
#inputs
date_type = '2'
symbol = 'ACI'
finhub123 = ''

historicalEarningsTransformer_ACI = historicalEarningsTransformer(date_type,future_day_test,finhub123)
historicalEarningsTransformer_ACI

Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage,earnings_event,surprise_event
0,2021-07-29,0.78,0.5892,0.1908,32.3829,1,1
1,2021-04-26,-0.37,0.4848,-0.8548,-176.3201,0,0
2,2021-01-12,0.2,0.2783,-0.0783,-28.1351,0,0
3,2020-10-20,0.49,0.2217,0.2683,121.0194,0,1
4,2020-07-27,1.0,1.1927,-0.1927,-16.1566,0,0


### 4.0. Logic of dailyAdustedPriceTransformer

In [76]:
#this section should probably be another function:
#inputs
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '3'
date_of_interest='n/a'
symbol = 'ACI'

alphaDataRetriever_ACI = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
alphaDataRetriever_ACI

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,28.480,29.30,28.3000,28.56,28.560000,2270996,0.0,1.0,0.2809,0.8569,2021-10-15
1,27.780,29.01,27.7000,28.35,28.350000,1440144,0.0,1.0,2.0518,-0.4957,2021-10-14
2,27.250,28.10,27.2110,27.64,27.640000,998240,0.0,1.0,1.4312,-1.6352,2021-10-13
3,27.235,27.65,27.0846,27.10,27.100000,967460,0.0,1.0,-0.4957,-1.6352,2021-10-12
4,26.840,27.73,26.7000,27.07,27.070000,1268147,0.0,1.0,0.8569,-1.6352,2021-10-11
...,...,...,...,...,...,...,...,...,...,...,...
325,15.820,15.89,15.6000,15.81,15.462668,7190929,0.0,1.0,-2.2587,,2020-07-02
326,15.790,15.89,15.5500,15.76,15.413767,3400498,0.0,1.0,-2.3827,,2020-07-01
327,15.570,15.90,15.4500,15.77,15.423547,5030640,0.0,1.0,-0.9406,,2020-06-30
328,15.890,16.01,15.5000,15.57,15.227941,7374005,0.0,1.0,-4.1665,,2020-06-29


In [59]:
#editing dataframes via renaming columns and removing unnessary columns:

#before joining the 3 dataframes, there needs to be a common column name
#change the name of the date columns to column_of_interest

#change the name
alphaDataRetriever_ACI.rename(columns = {'trading_date':'column_of_interest'}, inplace = True)


#historical_prices dataframe editting:
#in historical_prices, change the column name of '5. adjusted close' to 'stock_adjusted_close'
alphaDataRetriever_ACI.rename(columns = {'5. adjusted close':'stock_adjusted_close'}, inplace = True)

#in historical_prices, change the column name of 'Intraday_Returns' to 'stock_intraday_returns'
alphaDataRetriever_ACI.rename(columns = {'Intraday_Returns':'stock_intraday_returns'}, inplace = True)

#in historical_prices, change the column name of 'Weekly_Median_Return' to 'stock_weekly_median_return'
alphaDataRetriever_ACI.rename(columns = {'Weekly_Median_Return':'stock_weekly_median_return'}, inplace = True)


#display the change
display(alphaDataRetriever_ACI)



Unnamed: 0,1. open,2. high,3. low,4. close,stock_adjusted_close,6. volume,7. dividend amount,8. split coefficient,stock_intraday_returns,stock_weekly_median_return,column_of_interest
0,28.480,29.30,28.3000,28.56,28.560000,2270996,0.0,1.0,0.2809,0.8569,2021-10-15
1,27.780,29.01,27.7000,28.35,28.350000,1440144,0.0,1.0,2.0518,-0.4957,2021-10-14
2,27.250,28.10,27.2110,27.64,27.640000,998240,0.0,1.0,1.4312,-1.6352,2021-10-13
3,27.235,27.65,27.0846,27.10,27.100000,967460,0.0,1.0,-0.4957,-1.6352,2021-10-12
4,26.840,27.73,26.7000,27.07,27.070000,1268147,0.0,1.0,0.8569,-1.6352,2021-10-11
...,...,...,...,...,...,...,...,...,...,...,...
325,15.820,15.89,15.6000,15.81,15.462668,7190929,0.0,1.0,-2.2587,,2020-07-02
326,15.790,15.89,15.5500,15.76,15.413767,3400498,0.0,1.0,-2.3827,,2020-07-01
327,15.570,15.90,15.4500,15.77,15.423547,5030640,0.0,1.0,-0.9406,,2020-06-30
328,15.890,16.01,15.5000,15.57,15.227941,7374005,0.0,1.0,-4.1665,,2020-06-29


In [62]:
# #since the analyst choose the report date of today, after market close, it is necessary to add the current date into the list of dates:


#change the name
alphaDataRetriever_ACI.rename(columns = {'trading_date':'column_of_interest'}, inplace = True)

reported_dates_list = alphaDataRetriever_ACI['column_of_interest'].tolist()

# #this line filters out all the dates that are not a previous reporting date
alphaDataRetriever_ACI = alphaDataRetriever_ACI.loc[alphaDataRetriever_ACI['column_of_interest'].isin(reported_dates_list)]

# #add the column that determines if the intraday return of the earnings release date is larger than that of the median
alphaDataRetriever_ACI['intraday_event'] = (alphaDataRetriever_ACI['stock_intraday_returns'] > alphaDataRetriever_ACI['stock_weekly_median_return']).astype(int)

#add the column that determines if the adjusted share price is larger than that of last earnings call
alphaDataRetriever_ACI['adjusted_price_event'] = (alphaDataRetriever_ACI['stock_adjusted_close'] > alphaDataRetriever_ACI['stock_adjusted_close'].shift(-1)).astype(int)


#display the dataframe with the 2 new variables that were derived from the data
display(alphaDataRetriever_ACI)

Unnamed: 0,1. open,2. high,3. low,4. close,stock_adjusted_close,6. volume,7. dividend amount,8. split coefficient,stock_intraday_returns,stock_weekly_median_return,column_of_interest,intraday_event,adjusted_price_event
0,28.480,29.30,28.3000,28.56,28.560000,2270996,0.0,1.0,0.2809,0.8569,2021-10-15,0,1
1,27.780,29.01,27.7000,28.35,28.350000,1440144,0.0,1.0,2.0518,-0.4957,2021-10-14,1,1
2,27.250,28.10,27.2110,27.64,27.640000,998240,0.0,1.0,1.4312,-1.6352,2021-10-13,1,1
3,27.235,27.65,27.0846,27.10,27.100000,967460,0.0,1.0,-0.4957,-1.6352,2021-10-12,1,1
4,26.840,27.73,26.7000,27.07,27.070000,1268147,0.0,1.0,0.8569,-1.6352,2021-10-11,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
325,15.820,15.89,15.6000,15.81,15.462668,7190929,0.0,1.0,-2.2587,,2020-07-02,0,1
326,15.790,15.89,15.5500,15.76,15.413767,3400498,0.0,1.0,-2.3827,,2020-07-01,0,0
327,15.570,15.90,15.4500,15.77,15.423547,5030640,0.0,1.0,-0.9406,,2020-06-30,0,1
328,15.890,16.01,15.5000,15.57,15.227941,7374005,0.0,1.0,-4.1665,,2020-06-29,0,1


In [63]:
#spy dataframe editting:

#change name
alphaDataRetriever_test_SPY.rename(columns = {'trading_date':'column_of_interest'}, inplace = True)
#in spy, change the column name '5. adjusted close' to 'stock_adjusted_close'
alphaDataRetriever_test_SPY.rename(columns = {'5. adjusted close':'spy_adjusted_close'}, inplace = True)

#in spy, change the column name 'Intraday_Returns' to 'stock_adjusted_close'
alphaDataRetriever_test_SPY.rename(columns = {'Intraday_Returns':'spy_intraday_returns'}, inplace = True)

#in historical_prices, change the column name of 'Weekly_Median_Return' to 'spy_weekly_median_return'
alphaDataRetriever_test_SPY.rename(columns = {'Weekly_Median_Return':'spy_weekly_median_return'}, inplace = True)


#display change
display(alphaDataRetriever_test_SPY)

Unnamed: 0,1. open,2. high,3. low,4. close,spy_adjusted_close,6. volume,7. dividend amount,8. split coefficient,spy_intraday_returns,spy_weekly_median_return,column_of_interest
0,444.750000,446.260000,444.090000,445.870000,445.870000,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
1,439.080000,442.660000,438.580000,442.500000,442.500000,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
2,434.710000,436.050000,431.540000,435.180000,435.180000,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
3,435.670000,436.100000,432.780000,433.620000,433.620000,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
4,437.160000,440.260000,434.620000,434.690000,434.690000,65233285,0.0,1.0,-0.5650,0.0616,2021-10-11
...,...,...,...,...,...,...,...,...,...,...,...
5521,138.625000,139.109299,136.781204,137.875000,91.859026,7431500,0.0,1.0,-33.7356,,1999-11-05
5522,136.750000,137.359299,135.765594,136.531204,90.963724,7907500,0.0,1.0,-33.4817,,1999-11-04
5523,136.000000,136.375000,135.125000,135.500000,90.276685,7222300,0.0,1.0,-33.6201,,1999-11-03
5524,135.968704,137.250000,134.593704,134.593704,89.672867,6516900,0.0,1.0,-34.0489,,1999-11-02


In [66]:
# alphaDataRetriever_test_SPY
#add the column that determines if the intraday return of the earnings release date is larger than that of the median
alphaDataRetriever_test_SPY['spy_event'] = (alphaDataRetriever_test_SPY['spy_intraday_returns'] > alphaDataRetriever_test_SPY['spy_weekly_median_return']).astype(int)

#display the new dataframe with the additional variable that was created (total variables as of now is equal to 3)
display(alphaDataRetriever_test_SPY)


Unnamed: 0,1. open,2. high,3. low,4. close,spy_adjusted_close,6. volume,7. dividend amount,8. split coefficient,spy_intraday_returns,spy_weekly_median_return,column_of_interest,spy_event
0,444.750000,446.260000,444.090000,445.870000,445.870000,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15,1
1,439.080000,442.660000,438.580000,442.500000,442.500000,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14,1
2,434.710000,436.050000,431.540000,435.180000,435.180000,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13,1
3,435.670000,436.100000,432.780000,433.620000,433.620000,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12,0
4,437.160000,440.260000,434.620000,434.690000,434.690000,65233285,0.0,1.0,-0.5650,0.0616,2021-10-11,0
...,...,...,...,...,...,...,...,...,...,...,...,...
5521,138.625000,139.109299,136.781204,137.875000,91.859026,7431500,0.0,1.0,-33.7356,,1999-11-05,0
5522,136.750000,137.359299,135.765594,136.531204,90.963724,7907500,0.0,1.0,-33.4817,,1999-11-04,0
5523,136.000000,136.375000,135.125000,135.500000,90.276685,7222300,0.0,1.0,-33.6201,,1999-11-03,0
5524,135.968704,137.250000,134.593704,134.593704,89.672867,6516900,0.0,1.0,-34.0489,,1999-11-02,0


In [69]:
#now put the values of spy dataframe into the  hist prices

alphaDataRetriever_ACI['spy_event'] = alphaDataRetriever_test_SPY['spy_event']
alphaDataRetriever_ACI

Unnamed: 0,1. open,2. high,3. low,4. close,stock_adjusted_close,6. volume,7. dividend amount,8. split coefficient,stock_intraday_returns,stock_weekly_median_return,column_of_interest,intraday_event,adjusted_price_event,spy_event
0,28.480,29.30,28.3000,28.56,28.560000,2270996,0.0,1.0,0.2809,0.8569,2021-10-15,0,1,1
1,27.780,29.01,27.7000,28.35,28.350000,1440144,0.0,1.0,2.0518,-0.4957,2021-10-14,1,1,1
2,27.250,28.10,27.2110,27.64,27.640000,998240,0.0,1.0,1.4312,-1.6352,2021-10-13,1,1,1
3,27.235,27.65,27.0846,27.10,27.100000,967460,0.0,1.0,-0.4957,-1.6352,2021-10-12,1,1,0
4,26.840,27.73,26.7000,27.07,27.070000,1268147,0.0,1.0,0.8569,-1.6352,2021-10-11,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
325,15.820,15.89,15.6000,15.81,15.462668,7190929,0.0,1.0,-2.2587,,2020-07-02,0,1,0
326,15.790,15.89,15.5500,15.76,15.413767,3400498,0.0,1.0,-2.3827,,2020-07-01,0,0,0
327,15.570,15.90,15.4500,15.77,15.423547,5030640,0.0,1.0,-0.9406,,2020-06-30,0,1,1
328,15.890,16.01,15.5000,15.57,15.227941,7374005,0.0,1.0,-4.1665,,2020-06-29,0,1,1


In [70]:

#keep the relevant columns in the historical_prices df only
columns_to_keep = ['stock_adjusted_close','stock_intraday_returns','stock_weekly_median_return','column_of_interest', 'intraday_event','adjusted_price_event','spy_event']
alphaDataRetriever_ACI = alphaDataRetriever_ACI[columns_to_keep]

#display the dataframe that will be used to merge the remaining 2 dataframes
display(alphaDataRetriever_ACI)

Unnamed: 0,stock_adjusted_close,stock_intraday_returns,stock_weekly_median_return,column_of_interest,intraday_event,adjusted_price_event,spy_event
0,28.560000,0.2809,0.8569,2021-10-15,0,1,1
1,28.350000,2.0518,-0.4957,2021-10-14,1,1,1
2,27.640000,1.4312,-1.6352,2021-10-13,1,1,1
3,27.100000,-0.4957,-1.6352,2021-10-12,1,1,0
4,27.070000,0.8569,-1.6352,2021-10-11,1,0,0
...,...,...,...,...,...,...,...
325,15.462668,-2.2587,,2020-07-02,0,1,0
326,15.413767,-2.3827,,2020-07-01,0,0,0
327,15.423547,-0.9406,,2020-06-30,0,1,1
328,15.227941,-4.1665,,2020-06-29,0,1,1


### 4.1. Creating the dailyAdjustedPriceTransformer Function

In [14]:
#this function takes in the daily adjusted dataframe, individual earnings dataframe, and the spy dataframe 
#it calculates 2 x variables and returns a new merged dataframe that will be merged one final time in the next function

def dailyAdustedPriceTransformer(hp,individual_earnings,spy):
    
    #editing dataframes via renaming columns and removing unnessary columns:

    #before joining the 3 dataframes, there needs to be a common column name
    #change the name of the date columns to column_of_interest
    
    historical_prices = hp.copy(deep=True)
    
    #change the name
    historical_prices.rename(columns = {'trading_date':'column_of_interest'}, inplace = True)


    #historical_prices dataframe editting:
    #in historical_prices, change the column name of '5. adjusted close' to 'stock_adjusted_close'
    historical_prices.rename(columns = {'5. adjusted close':'stock_adjusted_close'}, inplace = True)

    #in historical_prices, change the column name of 'Intraday_Returns' to 'stock_intraday_returns'
    historical_prices.rename(columns = {'Intraday_Returns':'stock_intraday_returns'}, inplace = True)

    #in historical_prices, change the column name of 'Weekly_Median_Return' to 'stock_weekly_median_return'
    historical_prices.rename(columns = {'Weekly_Median_Return':'stock_weekly_median_return'}, inplace = True)
    
    #code below is used to calculate the intraday_event and adjusted_price_event variables:
    
    
    #use the 'column_of_interest' of the individual earnings dataframe 
    reported_dates_list = individual_earnings['column_of_interest'].tolist()
    
    #this line filters out all the dates that are not a previous reporting date
    historical_prices = historical_prices.loc[historical_prices['column_of_interest'].isin(reported_dates_list)]

    #add the column that determines if the intraday return of the earnings release date is larger than that of the median
    historical_prices['intraday_event'] = (historical_prices['stock_intraday_returns'] > historical_prices['stock_weekly_median_return']).astype(int)

    #add the column that determines if the adjusted share price is larger than that of last earnings call
    historical_prices['adjusted_price_event'] = (historical_prices['stock_adjusted_close'] > historical_prices['stock_adjusted_close'].shift(-1)).astype(int)


    #spy dataframe editting:

    #change name
    spy.rename(columns = {'trading_date':'column_of_interest'}, inplace = True)
    #in spy, change the column name '5. adjusted close' to 'stock_adjusted_close'
    spy.rename(columns = {'5. adjusted close':'spy_adjusted_close'}, inplace = True)

    #in spy, change the column name 'Intraday_Returns' to 'stock_adjusted_close'
    spy.rename(columns = {'Intraday_Returns':'spy_intraday_returns'}, inplace = True)

    #in historical_prices, change the column name of 'Weekly_Median_Return' to 'spy_weekly_median_return'
    spy.rename(columns = {'Weekly_Median_Return':'spy_weekly_median_return'}, inplace = True)
    
    #add the column that determines if the intraday return of the earnings release date is larger than that of the median
    spy['spy_event'] = (spy['spy_intraday_returns'] > spy['spy_weekly_median_return']).astype(int)
    
    #add the spy intraday so that the analyst can see how the variables were created
    historical_prices['spy_intraday_returns'] = spy['spy_intraday_returns']
    
    #add the spy median so that the analyst can see how the variables were created
    historical_prices['spy_weekly_median_return'] = spy['spy_weekly_median_return']
        
    #add the spy median so that the analyst can see how the variables were created
    historical_prices['spy_adjusted_close'] = spy['spy_adjusted_close']
        
    #now put the values of spy dataframe into the  hist prices
    historical_prices['spy_event'] = spy['spy_event']
    
    
    #display the new dataframe with the 3 variables
    return(historical_prices)


### 4.2. Testing the dailyAdustedPriceTransformer Function

In [73]:
dailyAdustedPriceTransformer_ACI = dailyAdustedPriceTransformer(alphaDataRetriever_ACI,historicalEarningsTransformer_ACI,alphaDataRetriever_test_SPY)
dailyAdustedPriceTransformer_ACI

Unnamed: 0,stock_adjusted_close,stock_intraday_returns,stock_weekly_median_return,column_of_interest,intraday_event,adjusted_price_event,spy_event,spy_intraday_returns,spy_weekly_median_return,spy_adjusted_close
55,21.61,3.3971,0.1973,2021-07-29,1,1,1,-0.1333,-0.1574,439.228907
121,18.130423,-1.5721,-1.2023,2021-04-26,0,1,1,-0.6115,-0.9025,414.887444
192,16.837102,-4.2803,-1.5951,2021-01-12,0,1,0,-1.0082,-0.1869,375.070144
249,14.88563,-2.7083,-3.2821,2020-10-20,1,0,1,-1.4218,-2.0006,338.57681
309,14.905191,-7.0749,-2.1362,2020-07-27,0,0,1,-1.3111,-1.818,317.413197


### 5.0. Logic of finalTransformation Function

In [74]:
#merge cells
inner_merged_master = historicalEarningsTransformer_ACI.merge(dailyAdustedPriceTransformer_ACI,on='column_of_interest')
inner_merged_master


Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage,earnings_event,surprise_event,stock_adjusted_close,stock_intraday_returns,stock_weekly_median_return,intraday_event,adjusted_price_event,spy_event,spy_intraday_returns,spy_weekly_median_return,spy_adjusted_close
0,2021-07-29,0.78,0.5892,0.1908,32.3829,1,1,21.61,3.3971,0.1973,1,1,1,-0.1333,-0.1574,439.228907
1,2021-04-26,-0.37,0.4848,-0.8548,-176.3201,0,0,18.130423,-1.5721,-1.2023,0,1,1,-0.6115,-0.9025,414.887444
2,2021-01-12,0.2,0.2783,-0.0783,-28.1351,0,0,16.837102,-4.2803,-1.5951,0,1,0,-1.0082,-0.1869,375.070144
3,2020-10-20,0.49,0.2217,0.2683,121.0194,0,1,14.88563,-2.7083,-3.2821,1,0,1,-1.4218,-2.0006,338.57681
4,2020-07-27,1.0,1.1927,-0.1927,-16.1566,0,0,14.905191,-7.0749,-2.1362,0,0,1,-1.3111,-1.818,317.413197


In [75]:
#only keep the columns that are relevant to the analysis

final_columns_to_keep = ['column_of_interest','reportedEPS','estimatedEPS','surprise','surprisePercentage','stock_adjusted_close','stock_intraday_returns','stock_weekly_median_return',
                        'spy_adjusted_close','spy_intraday_returns','spy_weekly_median_return','earnings_event','surprise_event','adjusted_price_event','intraday_event','spy_event']

inner_merged_master = inner_merged_master[final_columns_to_keep]

inner_merged_master


Unnamed: 0,column_of_interest,reportedEPS,estimatedEPS,surprise,surprisePercentage,stock_adjusted_close,stock_intraday_returns,stock_weekly_median_return,spy_adjusted_close,spy_intraday_returns,spy_weekly_median_return,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event
0,2021-07-29,0.78,0.5892,0.1908,32.3829,21.61,3.3971,0.1973,439.228907,-0.1333,-0.1574,1,1,1,1,1
1,2021-04-26,-0.37,0.4848,-0.8548,-176.3201,18.130423,-1.5721,-1.2023,414.887444,-0.6115,-0.9025,0,0,1,0,1
2,2021-01-12,0.2,0.2783,-0.0783,-28.1351,16.837102,-4.2803,-1.5951,375.070144,-1.0082,-0.1869,0,0,1,0,0
3,2020-10-20,0.49,0.2217,0.2683,121.0194,14.88563,-2.7083,-3.2821,338.57681,-1.4218,-2.0006,0,1,0,1,1
4,2020-07-27,1.0,1.1927,-0.1927,-16.1566,14.905191,-7.0749,-2.1362,317.413197,-1.3111,-1.818,0,0,0,0,1


In [80]:
##############################################################################################################################################################################    
#add the class variable:
#this variable holds all the daily adjusted values for a particular stock, not just values that pertain to date of interest
master_daily_adjusted = alphaDataRetriever_ACI.copy(deep=True)

master_daily_adjusted.rename(columns ={'trading_date':'column_of_interest'}, inplace=True)

#create list to reference
reported_dates_list = inner_merged_master['column_of_interest'].tolist()

#master_hist is the dataframe that only includes the dates that match up with the dates that earnings are reported
master_hist = master_daily_adjusted.loc[master_daily_adjusted['column_of_interest'].isin(reported_dates_list)]

display(master_hist)

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,column_of_interest
55,20.9,22.26,20.9,21.61,21.61,2262885,0.0,1.0,3.3971,0.1973,2021-07-29
121,18.42,18.75,17.73,18.22,18.130423,5413028,0.0,1.0,-1.5721,-1.2023,2021-04-26
192,17.59,18.25,16.88,17.1,16.837102,7387987,0.0,1.0,-4.2803,-1.5951,2021-01-12
249,15.3,15.45,14.63,15.22,14.88563,16484746,0.0,1.0,-2.7083,-3.2821,2020-10-20
309,16.04,16.12,15.04,15.24,14.905191,9327449,0.0,1.0,-7.0749,-2.1362,2020-07-27


In [321]:
# #if the current date is the reported earnings date then remove the first row of the master_hist because it will appear there
# master_hist = master_hist.iloc[1:,:]

# display(master_hist)


Unnamed: 0,1. open,2. high,3. low,4. close,stock_adjusted_close,6. volume,7. dividend amount,8. split coefficient,stock_intraday_returns,stock_weekly_median_return,column_of_interest
67,381.06,385.3300,372.0100,375.98,374.170272,6466774,0.0,1.0,-1.8080,-0.4353,2021-07-13
129,328.55,344.4694,326.1500,335.35,332.618230,9876753,0.0,1.0,1.2382,-1.1627,2021-04-14
188,305.00,306.6000,293.8031,294.20,290.702028,6727794,0.0,1.0,-4.6879,0.0525,2021-01-19
253,213.88,214.4500,210.2700,211.23,207.600372,7557023,0.0,1.0,-2.9361,-1.8629,2020-10-14
317,224.37,225.2400,214.6900,216.90,211.880171,12482019,0.0,1.0,-5.5666,-1.1223,2020-07-15
...,...,...,...,...,...,...,...,...,...,...,...
5238,87.25,95.0000,86.5000,89.38,69.538659,6007800,0.0,1.0,-20.2995,-24.6918,2000-12-19
5302,118.60,121.4000,115.3000,119.90,93.172427,3284700,0.0,1.0,-21.4398,-22.8789,2000-09-19
5365,91.50,93.4400,87.3800,89.00,69.080578,2362600,0.0,1.0,-24.5021,-21.1421,2000-06-20
5428,111.00,119.8000,106.5000,118.30,91.703301,2692800,0.0,1.0,-17.3844,-22.9301,2000-03-21


In [82]:
#notice how the dates in the column_of_interest are one day after the reported earnings date

#store the index values from the list. This will tell us the index where the earnings reported date is located at
row_index_list = master_hist.index.tolist()


#alter the values of the elements in the list so that element = element -1
#this will give us the value of the index pertaining to returns one trading day following reported release
#the way the dataframe is filtered is by most recent date to the most future date

for i in range( len(row_index_list)):
    row_index_list[i] = row_index_list[i] - 1


#lock the dataframe so that only the row indexes from the list come up
master_daily_adjusted = master_daily_adjusted.loc[row_index_list,:]

#rename the column so that the analyst knows what this value is. It's the price of the stock one full trading day after earnings have been released
master_daily_adjusted.rename(columns = {'stock_adjusted_close':'after_report_adjusted_price'},inplace=True)

display(master_daily_adjusted)


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,column_of_interest
54,21.82,22.29,21.54,21.6,21.6,1008414,0.0,1.0,-1.0082,0.8399,2021-07-30
120,18.39,18.5394,18.09,18.11,18.020964,2458069,0.0,1.0,-2.0067,-1.5721,2021-04-27
191,16.89,17.15,16.79,17.12,16.856794,2670307,0.0,1.0,-0.1966,-2.7667,2021-01-13
248,15.28,15.44,14.96,14.98,14.650902,4048843,0.0,1.0,-4.1171,-3.2821,2020-10-21
308,15.3,15.325,15.03,15.1,14.768266,2746222,0.0,1.0,-3.4754,-2.743,2020-07-28


In [83]:

#up to this point, the after_report_adjusted_price column exists
#store the values of the after_report_adjusted_price in a list to 
reported_adjusted_close_prices = master_daily_adjusted['after_report_adjusted_price'].tolist()

#since the first row was deleted, a value needs to be inserted to keep the lengths of the arrays the same
#put the value zero in the first position of the list
reported_adjusted_close_prices.insert(0,0)


#add the values from the list to a new column in the inner master df
inner_merged_master['after_report_adjusted_price'] = reported_adjusted_close_prices

#calculate the adjusted price in a new column
inner_merged_master['return_pct_difference'] = round((inner_merged_master['after_report_adjusted_price']/inner_merged_master['stock_adjusted_close']-1)*100,2)

#calculate the class variable
inner_merged_master['class_variable'] = (inner_merged_master['return_pct_difference'] > 5).astype(int)

display(inner_merged_master)


KeyError: 'after_report_adjusted_price'

### 5.1. Creating the finalTransformation Function

In [15]:
#merges the dataframes and returns the clean variables

def finalTransformation(individual_hist_earn,both_dailyAdjustedPrices,daily_prices,output_type,date_type):
    
    #merge cells
    inner_merged_master = individual_hist_earn.merge(both_dailyAdjustedPrices,on='column_of_interest')
    
    
    #add the class variable:

    #this variable holds all the daily adjusted values for a particular stock, not just values that pertain to date of interest
    master_daily_adjusted = daily_prices.copy(deep=True)
        
    #change the name
    master_daily_adjusted.rename(columns={'trading_date':'column_of_interest'},inplace=True)

    #create list to reference
    reported_dates_list = inner_merged_master['column_of_interest'].tolist()
    
    #master_hist is the dataframe that only includes the dates that match up with the dates that earnings are reported
    master_hist = master_daily_adjusted.loc[master_daily_adjusted['column_of_interest'].isin(reported_dates_list)]
    
    
    #if the current date is the reported earnings date then remove the first row of the master_hist because it will appear there
    if date_type == '1':
        #remove the first row
        master_hist = master_hist.iloc[1:,:]
    
    else:
        master_hist = master_hist
        
    
    
    #notice how the dates in the column_of_interest are one day after the reported earnings date
    #store the index values from the list. This will tell us the index where the earnings reported date is located at
    row_index_list = master_hist.index.tolist()


    #alter the values of the elements in the list so that element = element -1
    #this will give us the value of the index pertaining to returns one trading day following reported release
    #the way the dataframe is filtered is by most recent date to the most future date

    for i in range( len(row_index_list)):
        row_index_list[i] = row_index_list[i] - 1


    #lock the dataframe so that only the row indexes from the list come up
    master_daily_adjusted = master_daily_adjusted.loc[row_index_list,:]
    
    #display the master_daily_adjusted 
#     print('This dataframe contains the price information associated with the date that is one trading day after the reported earnings date:\n')
#     display(master_daily_adjusted.head(15))

    #rename the column so that the analyst knows what this value is. It's the price of the stock one full trading day after earnings have been released
    master_daily_adjusted.rename(columns = {'5. adjusted close':'after_report_adjusted_price'},inplace=True)

    #up to this point, the after_report_adjusted_price column exists
    #store the values of the after_report_adjusted_price in a list to 
    reported_adjusted_close_prices = master_daily_adjusted['after_report_adjusted_price'].tolist()
    
    if date_type == '1':
        
        #since the first row was deleted, a value needs to be inserted to keep the lengths of the arrays the same
        #put the value zero in the first position of the list
        reported_adjusted_close_prices.insert(0,0)

    else:
        reported_adjusted_close_prices = reported_adjusted_close_prices



    #add the values from the list to a new column in the inner master df
    inner_merged_master['after_report_adjusted_price'] = reported_adjusted_close_prices

    #calculate the adjusted price in a new column
    inner_merged_master['return_pct_difference'] = round((inner_merged_master['after_report_adjusted_price']/inner_merged_master['stock_adjusted_close']-1)*100,2)

    #calculate the class variable
    inner_merged_master['class_variable'] = ((inner_merged_master['return_pct_difference'] >= 1) & (inner_merged_master['return_pct_difference'] <= 5) ).astype(int)
    
    #display inner_merged
#     print('This dataframe contains all the variables associated with building the model:\n')
#     display(inner_merged_master)

    #keep relevant columns:
    
    #relevant columns
    model_variables = ['earnings_event','surprise_event','adjusted_price_event','intraday_event','spy_event','class_variable']

    #filter the dataframe
    training_data = inner_merged_master[model_variables]        

    
    #this is the logical display of how the variables were created
    if output_type  == '1':
         
        #display the following dataframe
        #relevant columns
        relevantColumns = ['column_of_interest','reportedEPS','estimatedEPS','surprise','surprisePercentage','stock_adjusted_close','stock_intraday_returns','stock_weekly_median_return','spy_adjusted_close','spy_intraday_returns','spy_weekly_median_return','after_report_adjusted_price','return_pct_difference','earnings_event','surprise_event','adjusted_price_event','intraday_event','spy_event','class_variable']
        
        #filter the dataframe
        logical_dataframe = inner_merged_master[relevantColumns]
        
        #display
        print('This is the dataframe with the columns that were used to construct the variables of the model:\n')
        display(logical_dataframe)

            
    #this returns the clean variables, ready to be analyzed
    elif output_type  == '2':
        print('Clean Variables:\n')
        
        

    else:
        print('Invalid Input. Please try again.')
   

    return training_data

### 5.2. Testing the finalTransformation Function (Logic Display)

In [87]:
output1 ='1'
finTransformationTest = finalTransformation(historicalEarningsTransformer_test1,df2,alphaDataRetriever_test_2,output1)
finTransformationTest

NameError: name 'df2' is not defined

### 5.3. Testing the finalTransformation Function (Clean Variable Display)

In [348]:
output2 ='2'
finTransformationTest2 = finalTransformation(historicalEarningsTransformer_test1,df2,alphaDataRetriever_test_2,output2)
finTransformationTest2.head(15)

Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,1,1,1,0
1,0,1,1,0,0,0
2,1,1,1,1,0,0
3,1,1,1,0,1,0
4,1,1,0,0,0,0
5,1,1,1,0,0,0
6,0,0,0,1,0,0
7,0,0,1,1,1,0
8,0,0,0,1,1,0
9,1,1,1,0,0,0


In [385]:
finTransformationTest2.iloc[0:1,0:5]

Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event
0,0,1,1,1,1


In [358]:
finTransformationTest2.iloc[1:5,0:5]

Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event
1,0,1,1,0,0
2,1,1,1,1,0
3,1,1,1,0,1
4,1,1,0,0,0


In [359]:
finTransformationTest2.iloc[1:5,5]

1    0
2    0
3    0
4    0
Name: class_variable, dtype: int32

## Performing Naive Bayes Analysis

In [16]:
#create the function to peform the analysis
#most import input of this function are the clean variables created in the variableTransformer function that holds the clean variables
#this holds the dataframe that will be broken down into train and test data

def naiveBayesPredictor(cleanVariables,dateType):
    
    #import the tools
    from sklearn.naive_bayes import BernoulliNB

    #start by breaking down the data into their respective sections
    #at this point, the variableTransformer made sure there was enough data to conduct a proper analysis
    
    #break the data into x and y variables
    #for training purposes the test data will consist of full year prediction
    #the dataframe has the most recent quarters at the beginning, thus use the first 4 rows as the testing data
    
    #if the current date is the earnings report date
    if dateType == '1':
            
        #the train variables will be the actual results of one year prior
        x_train_variables = cleanVariables.iloc[5:,0:5]

        y_train_variables = cleanVariables.iloc[5:,5]

        #the first row contains the data for the prediction that is going to be used for the buy or not buy signal
        x_prediction_set = cleanVariables.iloc[0:1,0:5]
        
        #test the model on a full year worth of data (most recent year)
        x_test_variables = cleanVariables.iloc[1:5,0:5]
        y_test_variables = cleanVariables.iloc[1:5,5]
    
    #if the current date is not the same as earnings report date  
    elif dateType == '2':
        
        #the train variables will be the actual results of one year prior
        x_train_variables = cleanVariables.iloc[4:,0:5]
        y_train_variables = cleanVariables.iloc[4:,5]

        #test the model on a full year worth of data (most recent year)
        x_test_variables = cleanVariables.iloc[0:4,0:5]
        y_test_variables = cleanVariables.iloc[0:4,5]
    
    #create an instance of the model
    naive_bayes_model = BernoulliNB(binarize=0.0)
    
    #fit the training variables to the model model
    naive_bayes_model.fit(x_train_variables, y_train_variables)
    
    #make predictions with the model
    predictions = naive_bayes_model.predict(x_test_variables)
    
    #import the metric tools
    from sklearn.metrics import multilabel_confusion_matrix
    from sklearn.metrics import classification_report
    
    #create an instance where the confusion matrix will live
    multi_cm = multilabel_confusion_matrix(y_test_variables, predictions)

    #first row
    tp = multi_cm[0][0][0]
    fp = multi_cm[0][0][1]

    #second row
    fn = multi_cm[0][1][0]
    tn = multi_cm[0][1][1]

    #get the relevant metrics from the classification report
    cr = classification_report(y_test_variables, predictions,output_dict=True)
    precision = cr['weighted avg']['precision'] 
    recall = cr['weighted avg']['recall'] 
    f1_score = cr['weighted avg']['f1-score'] 

    #score the model
    model_score = round(naive_bayes_model.score(x_test_variables,y_test_variables)*100,2)
        
    #both models need to be evaluated for their performance:
    
    #create the point at which one could deem this model accurate
    bench_mark_precision_WA = 0.75

    #if the models precisionWA is greater than or equal to the bench_mark_precision_WA then recommend the analyst use the model
    if precision >= bench_mark_precision_WA:
        model_recommendation = 'Valid'

    #if precision wa is less than the benchmark, the model should NOT be used to make any buying decisions
    else:
        model_recommendation = 'Invalid'
    
    
    #buy or not buy signal:
    if dateType == '1':
    
        #give the analyst a prediction
        signal = naive_bayes_model.predict(x_prediction_set)
        
        print()
        print('This is the buy signal:')
        
        #create a prediction variable
        predicition_value = signal[0] 
        
        if predicition_value == 0:
            buy_signal = 'Do NOT buy'

        
        #if the prediciton value is not equal to zero then it is equal to one, thus signaling a buying opportunity
        else:
            #assign the buy value to the variable
            buy_signal = 'Buy'

        
        #add the buy signal to the dictionary
        confusion_matrix_values = {'TP':[tp],'FP':[fp],'FN':[fn],'TN':[tn],'Precision WA':[precision],'Recall WA':[recall],'F1-Score WA':[f1_score],'Model Score %':[model_score],'Model Recommendation':[model_recommendation],'Buy Signal':[buy_signal]}
        
        return (confusion_matrix_values)

    #if datetype is not equal to 1 it is equal to 2 which means that the results will be the model accuracy and model recommendations
    else:
        #create the dictionary that will store the relevant values (buy signal is excluded because there isnt one to give)
        confusion_matrix_values = {'TP':[tp],'FP':[fp],'FN':[fn],'TN':[tn],'Precision WA':[precision],'Recall WA':[recall],'F1-Score WA':[f1_score],'Model Score %':[model_score],'Model Recommendation':[model_recommendation]}

#         print('This is the confusion matrix:')
#         print(confusion_matrix_values)

        #return confusion_matrix_values
        return (confusion_matrix_values)




### Testing naiveBayesPredictor Function

#### 1. Table Output

In [244]:
model_output = naiveBayesPredictor(finTransformationTest2,'1')
model_output

NameError: name 'finTransformationTest2' is not defined

In [174]:
#revised main function
def main():
    
    #will help display the dataframes
    from IPython.display import display
    import pandas as pd
       
    #to ensure the model is only ran once at time, a 60 second timer will be placed in the main function
    import time
    
    #model trigger (model turns off if equal to false)
    modelValidation = True
    
    #add a print statement at the bottom to signal that the model has ended
    while  modelValidation == True:
                
        #there needs to be a tracker for how many calls have been used and when 
        total_api_calls = 0

        from datetime import datetime

        #ask for the API key
        api_q = input('What is your alphavantage API Key?:\n\n')
        api_q = 'TQ6TR98HVUPE3L9Z'
        print()

        #ask the data type
        data_type_q = input('What kind of data are you looking for?\n\nType the number in the input box:\n\n1. Upcoming Earnings Calendar (3 month horizon)\n2. Historical Earnings (for a specific company)\n3. Daily Adjusted Price History (for a specific company)\n4. Naive Bayes Prediction Model\n\n')

    
        if data_type_q == '1':

            print()
            date_of_interest = input('Pick a date that is within the next three months (yyyy-mm-dd):\n')

            #make the symbol equal to n/a because the function requires the parameter to be defined before performing what it needs to do
            symbol = 'N/A'

            #this function pulls the upcoming earnings calendar
            alphaEarningsCalendar  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
            
            #api is called in the alphaEarningsCalendar 
            total_api_calls = total_api_calls + 1 

            print('Total API Calls: %i'%(total_api_calls))


            #if the dataframe does not contain any rows, that means its empty and no company is reporting earnings
            if alphaEarningsCalendar.shape[0] == 0:
                print('No publicly traded company is reporting their earnings on %s'%(date_of_interest))
                print('Please wait 60 seconds before running the model.')
#                 time.sleep(60)
                modelValidation = False

            else:

                #for some reason, if you put return dataframe, the function will not spit out the dataframe - use display instead
                #display
                display(alphaEarningsCalendar)

                print()
                
                
                #print disclaimers
                print('This is not investment advice.')
                print('Please wait 60 seconds before running the model again.')
                
                #sleep the model
#                 time.sleep(60)
                modelValidation = False
         
        #input 2 refers to the historical earnings table for a specific stock; it requires an api key and ticker symbol
        elif data_type_q == '2':
            print()
            symbol_q = input('Enter the ticker symbol:\n\n')
            print()
            
            #date_of_interest is not relevant when just pulling data
            date_of_interest = 'N/A'
      
            #this runs the historical earnings report for the symbol specified
            alphaHistoricalEarnings  = alphaDataRetriever(api_q,data_type_q,symbol_q,date_of_interest)
            
            #api is called in alphaHistoricalEarnings
            total_api_calls = total_api_calls + 1 
            
            #print disclaimers
            print('This is not investment advice.')
            print('Please wait 60 seconds before running the model again.')
            
            print('Total API Calls: %i'%(total_api_calls))

            #display first then sleep so that the analyst can look at something in the meantime
            display(alphaHistoricalEarnings)
            
            #sleep the model
#             time.sleep(60)
            
            modelValidation = False
    
        #input 3 refers to the daily adjusted price table for a specific stock; it requires an api key and ticker symbol
        elif data_type_q == '3':
            
            #ask the analyst what ticker they want
            print()
            symbol_q = input('Enter the ticker symbol:\n\n')
            print()
            
            #date_of_interest is not relevant when just pulling data because this function pulls most recent data
            date_of_interest = 'N/A'
            
            #runs the function that pulls daily adjusted prices for a specific stock
            alphaDailyAdjusted  = alphaDataRetriever(api_q,data_type_q,symbol_q,date_of_interest)
            
            #api is called
            total_api_calls = total_api_calls + 1 

            #disclaimer
            print('This is not investment advice.')
            print('Please wait 60 seconds before running the model again.')
            print()
            
            #print the total calls, this should be 1
            print('Total API Calls: %i'%(total_api_calls))
            print()
            
            #display first then sleep so that the analyst can look at something in the meantime
            display(alphaDailyAdjusted)
        
#             #sleep the model
#             time.sleep(60)
            
            modelValidation = False

################################################################################################################################################################################################################################################        
        #naive bayes prediction option
        elif data_type_q == '4':
            
            import math         
            print()
            
            #ask the analyst what data the model should do analysis on
            analysis_type = input('What type of analysis are you looking to do?:\n\n1.Individual Company\n\n2.Multi-Company\n\n')

            print()
            
            
            #the model must know whe 
            date_of_interest = input('Enter the date of interest:\n\n')
            print()
            
            #individual stock analysis
            if analysis_type == '1':

                #ask what stock they want analyzed
                ticker = input('Enter the ticker symbol:\n\n')
                print()

                #the historical earnings should be pulled first to see if the analysis can be done
                #switch the value of data_type_q to 2 so that the historical earnings table can be pulled via alphaDataRetriever
                data_type_q = '2'
                
                try:
                    
                    #run the alphaDataretriever to get the historical earnings for each symbol in the list
                    alphaHistoricalEarnings  = alphaDataRetriever(api_q,data_type_q,ticker,date_of_interest)
                    print()
#                     print('This dataframe contains %s\'s historical earnings report. The row pertains to either the most recent earnings date (if the date is in the past/the same as today) or the most upcoming earnings report date (if the date is in the future)'%(ticker))
#                     display(alphaHistoricalEarnings)
            
                except:
                    print('Function Broke. Check your inputs.')
                    #turn the model off
                    modelValidation = False
                #count the call
                total_api_calls = total_api_calls + 1


                #check to see if the company chosen has enough data to run a proper analysis on
                #the model should have at least 3 years worth of quarterly reports so that 2 years (8 quarters) can be used to train the model and 1 year (4 quarters) can be used to test
                if alphaHistoricalEarnings.shape[0] >= 12:
                    print(ticker,'has reported earnings',alphaHistoricalEarnings.shape[0],'times, as of the current date.')

                    #change the value of data_type_q to 3 to call the adjusted price function
                    data_type_q = '3'

                    #run the function that gets the daily adjusted prices for the ticker symbol in the loop
                    alphaDailyAdjusted  = alphaDataRetriever(api_q,data_type_q,ticker,date_of_interest)
                    
#                     print('This dataframe contains %s\'s 5,000 most recent daily adjusted prices:\n '%(ticker))
#                     display(alphaDailyAdjusted.head(15))

                    #count the call
                    total_api_calls = total_api_calls + 1

                    #change value of symbol so that the function can pull daily prices for SPY
                    symbol = 'SPY'

                    #the symbol here SPY will be constant and won't change
                    alphaSPY  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
                    print()
#                     print('This dataframe contains SPY\'s most 5,000 most recent daily adjusted prices:\n')
#                     print(alphaSPY.head(15))

                    #this is the second call, but the SPY wont be called again for the remaining of the analysis
                    total_api_calls = total_api_calls + 1 
                
                
                    #the most recent date will be located in this cell
                    most_recent_date = alphaHistoricalEarnings.iloc[0,0]
                    
#                     #print the most recent earnings date (aka the first cell value of first row)
#                     print('%s\'s most recent/upcoming reporting date is:\n'%(ticker))
#                     print(most_recent_date)
#                     print()
                    #alphaHistoricalEarnings will return an earnings history report that includes the most upcoming earnings date
                    #if the date in the most upcoming earnings date happens to be the same as today and the time is after 4:01 (market close) then do the following:
                    
                    
                    import datetime 
                    
                    #current time instance
                    current_time = datetime.datetime.now() 
                    
                    #calculate this so that the model knows if the most recent/upcoming earnings is today
                    current_date = str(current_time.year) + '-' + str(current_time.month) + '-' + str(current_time.day)

                    #calculate this so that the model knows if the market has closed (after 4 PM)
                    current_hour = current_time.hour
                    
                    #if the most recent date in the historical earnings dataframe is equal to today's date and the market has closed then change the date_type and call the finhub function that has the most updated earnings information :
                    if most_recent_date == current_date and current_hour >= 17:
                        print('test')
                        
                        date_type = '1'
                        
                        #ask the analyst for thier finhub api key
                        print()
                        
                        finhubapi = input('Enter your finhub API-Key:\n\n') 
                        print()
                                                
                        #call the finhub function that will be used to create buying signals
                        finhub_function = finhubEarningsCalendar(finhubapi, date_of_interest,date_of_interest,ticker)
                        
                        #display
                        print()
                        print('This is %s\'s most updated earnings report:\n'%(ticker))
                        display(finhub_function)
                            
                        
                    
                    #if it is not the current date AND the market has not closed, then switch the date_type and assign the finhub function to 'N/A' because it's not needed
                    else:
                        date_type = '2'
                        finhub_function = 'N/A'    
    
    
                    #call the historicalEarningsTransformer to calculate the first 2 variables
                    historical_earnings_main = historicalEarningsTransformer(date_type,alphaHistoricalEarnings,finhub_function)
                    
                    #display
                    print()
#                     print('This is %s\'s most updated earnings table that contains the earnings_event and surprise_event variables that will be used in the analysis:\n'%(ticker))
#                     display(historical_earnings_main.head(15)) 

                    #this runs the first part of the model where the variables are transformed only if the api call count is within the 5 call limit
                    dailyAdustedPriceTransformer_main = dailyAdustedPriceTransformer(alphaDailyAdjusted,historical_earnings_main,alphaSPY)
                            
                    #display
                    print()
#                     print('This is %s\'s most updated dataframe that contains the three price-oriented variables (intraday_event, adjusted_price_event, spy_event) that will be used in the analysis:\n'%(ticker))
#                     display(dailyAdustedPriceTransformer_main.head(15)) 
                    
                    print()
                    #call the finalTransformation function that will get all the x and y variables ready
                    output_type = input('What type of output would you like?:\n\nLogic will display how the model variables were created, while Clean Variables will not:\n\n 1.Logic\n\n 2.Clean Variables\n\n')
                    print()
                    
                    #if the user picks logic
                    if output_type == '1':
                        #print the displays down to the order:
                        
                        #first run the historical earnings report for the stock 
                        print('1. Retrieve the historical earnings report for the stock:\n')
                        print('This dataframe contains %s\'s historical earnings report. The first row pertains to either the most recent earnings date (if the date is in the past/the same as today) or the upcoming earnings report date (if the date is in the future)'%(ticker))
                        display(alphaHistoricalEarnings.head(15))
                        print()
                        
                        #secondly
                        print('2. Insert the historical earnings report into the historicalEarningsTransformer function:\n')
                        print('This is %s\'s most updated earnings table that contains the earnings_event and surprise_event variables that will be used in the analysis:\n'%(ticker))
                        display(historical_earnings_main.head(15))
                        print()
                        
                        #thirdly
                        print('3. Retrieve the last 5,000 daily adjusted prices for %s in order to create the intraday_event and adjusted_price_event variables:\n'%(ticker))
                        display(alphaDailyAdjusted.head(15))
                        print()
                        
                        #fourth
                        print('4. Retrieve the last 5,000 daily adjusted prices for SPY in order to create the spy_event variable:\n')
                        display(alphaSPY.head(15))
                        print()
                        
                        #fifth
                        print('5. Connect both daily adjusted price dataframes (Stock Prices and SPY Prices) with the historical earnings report in order to calculate the x variables intraday_event, adjusted_price_event, spy_event:\n')
                        display(dailyAdustedPriceTransformer_main.head(15))
                        print()
                        
                        #sixth
                        print('6. Run the final tranformer function to generate the class variable:\n')
                        print()
  
                        
                        
                    
                    #parameters include : individual_hist_earn,both_dailyAdjustedPrices,daily_prices,output_type
                    finTransformationTransformer_main = finalTransformation(historical_earnings_main,dailyAdustedPriceTransformer_main,alphaDailyAdjusted,output_type,date_type)
                    print()
                    print('This is the dataframe that contains all the clean variables necessary for analysis:')
                    display(finTransformationTransformer_main.head(15))
                    
                    #run the model by calling the naivebayesPredictor function (returns a dictionary):
                    model_output = naiveBayesPredictor(finTransformationTransformer_main,date_type)

                    #convert the dictionary into a readable dataframe:
                    individual_analysis_results = pd.DataFrame.from_dict(model_output)
                    
                    #display the pretty dataframe
                    print()
                    print('Result dataframe:\n')
                    display(individual_analysis_results)
                    
                    #say bye
                    print('Thank you for using the model.')
                    print('Please wait 60 seconds before running the model again.')

                    #sleep the model
                    #time.sleep(60)
                
                    #turn model off
                    modelValidation = False
                
                #if the rows are less than 12 and not equal to 0 then:
                else:
                    
                    #print diagnosis
                    print('Company does not have enough reported earnings history to conduct a proper naive bayes prediction analysis.')
                    print('Please wait 60 seconds before running the model again.')

                    #sleep the model
#                     time.sleep(60)
                    
                    #turn the model off since the naive bayes cannot be done
                    modelValidation = False

#code works up to this point. Below is the multi-company analysis                    
############################################################################################################################
      
            #if the analysis type is not 1 then it is equal to 2 which refers to the multi-company analysis
            else:
                
                #big assumption: date_of_interest can only be the current date or plus three months because a past date will break the model
                #if the date of interest is equal to today and the time is after market close, then the model will assume that it's go time
                #if the date of interest is equal to today but the time is before market close, then the model will just give back model results
                #if the date of interest is not equal to today, then it has to be 
                
                #call the finalTransformation function that will get all the x and y variables ready
                output_type = input('What type of output would you like?:\n\nLogic will display how the model variables were created, while Clean Variables will not:\n\n 1.Logic\n\n 2.Clean Variables\n\n')
                print()
                
                
                
                import datetime                 
                
                # using now() to get current time 
                current_time = datetime.datetime.now() 
#                 print(current_time)

                #calculate this so that the model knows if the most recent/upcoming earnings is today
                current_date = str(current_time.year) + '-' + str(current_time.month) + '-' + str(current_time.day)
#                 print(current_date)

                #calculate this so that the model knows if the market has closed (after 4 PM)
                current_hour = current_time.hour
#                 print(current_hour)
################################################################################################################################################################################################################################################################                
                #if the date of interest is the same as the current date and the market is closed, then run the finhub function
                #finhub is the most updated API and it will have the actual companies that reported earnings
                #the earnings calendar gives estimates of when the stock is going to report earnings and when they are wrong, finhub is used fill in the gap as a discrepancy check
                if date_of_interest == current_date and current_hour >= 18:
                                       
                    print('test')
                    #change the data_type because the model is in its prime time mode, ready to make a prediciton for the next trading date
                    date_type = '1'

                    #ask the analyst for thier finhub api key
                    print()
                    finhubapi = input('Enter your finhub API-Key:\n\n') 

                    #empty space as the value for ticker will return all the stocks that reported earnings on that date
                    ticker = ''

                    #call the finhub function that will be used to create buying signals
                    finhub_function = finhubEarningsCalendar(finhubapi, date_of_interest,date_of_interest,ticker)

                    #debug
                    print('Below is the finhub function')
                    display(finhub_function)

                    #loop through the symbol column so that the model knows what stocks to analyze and give a buy signal to
                    #create the list of symbols that will be used in the for loop
                    list_of_ticker_symbols =  finhub_function.symbol.tolist()
                        

#                         #get the length of the list to determine how many stocks are in your dataset
#                         stock_count = len(list_of_ticker_symbols)    
                        
#                         #print stock count
#                         print('Stock Count:\n',stock_count)

#                         #for a naive bayes analyis, each stock will need its earnings history and daily adjusted prices 
#                         stock_api_calls = stock_count * 3

#                         #5 api calls per minute
#                         api_calls_per_minute = 5

#                         #how many sets of 5 api calls/min need to be done in order to do an analysis on all of the stocks
#                         #this will round up to the nearest integer
#                         #this essentially how long (in minutes) the model will take to run an analysis on all of the stocks on the earnings calendar for the specified date
#                         api_sets  = math.ceil(stock_api_calls / api_calls_per_minute)
                        
################################################################################################################################################################################################################################################################
                #if it is not the current date AND the market has not closed, then switch the date_type and assign the finhub function to 'N/A' because it's not needed                
                #model results, NO buy signal
                #different thought process here
                else:
                    
                    #not go time
                    date_type = '2'
                    finhub_function = 'N/A'    
                    
                    #code for the earnings calendar
                    data_type_q = '1'
                    
                    #change symbol value
                    symbol = ''
                    
                    #run the alphaDataRetriever to get the earnings calendar 
                    print(symbol)
                    print(date_of_interest)
                    print()
                    alphaEarningsCalendar  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
                    print('This should be earnings calendar')
                    display(alphaEarningsCalendar)
                    
                    print()
                    print('Model is sleeping so that it can reset')
                    #sleep model so that the count resets
                    time.sleep(60)
                    
                    #reference the list from the alpha earnings calendar
                    list_of_ticker_symbols = alphaEarningsCalendar.symbol.tolist()
                    print(list_of_ticker_symbols)
################################################################################################################################################################################################################################################################                      
                #Calculate the api information:        
                        
                #get the length of the list to determine how many stocks are in your dataset
                stock_count = len(list_of_ticker_symbols)    

                #print stock count
                print()
                print('Stock Count:',stock_count)
                print()

                #for a naive bayes analyis, each stock will need its earnings history and daily adjusted prices and the SPY call
                stock_api_calls = stock_count * 3

                #5 api calls per minute
                api_calls_per_minute = 5

                #how many sets of 5 api calls/min need to be done in order to do an analysis on all of the stocks
                #this will round up to the nearest integer
                #this essentially how long (in minutes) the model will take to run an analysis on all of the stocks on the earnings calendar for the specified date
                api_sets  = math.ceil(stock_api_calls / api_calls_per_minute)        
                        

                #12 min should be the longest an analyst waits for an analysis, this equal to 30 stocks that have 3 dataframes each 
                if stock_count > 30:
                    print('There are %i companies reporting earings on the specified date. Analysis will only be provided for the first 30 companies in the list. This should take about 12 minutes. Please wait patiently.'%(stock_count))

                    #only get the first 30 symbols
                    list_of_ticker_symbols = list_of_ticker_symbols[0:10]

                    #15 stocks * 3 dataframes = 45
                    stock_api_calls = 15
                    
                    #api_sets are set to 12 mintutes
                    api_sets = 3
                
                #if there is 30 or less stocks in the list, keep the symbol list the same
                else:
                    #keep the same
                    list_of_ticker_symbols = list_of_ticker_symbols
                    stock_api_calls = stock_api_calls
                                    
                    #tell the analyst how long the analysis is going to take
                    print('The model will take about %i minutes to run. Please wait patiently.'%(api_sets))



                print()
                #list of symbols stay the same
                list_of_ticker_symbols  = list_of_ticker_symbols

                #stock_api_calls stay the same
                stock_api_calls = stock_api_calls 
                print()
                print('There are a total of %i stock calls'%(stock_api_calls))
################################################################################################################################################################################################################
                #for loop begins here:
                #this list will contain the confusion matrix for each stock in the for loop
                #list of dictionaries
                model_result_list = []

                #for each ticker in the list of symbols, run each ticker through the functions
                for ticker in list_of_ticker_symbols:

                    print()
                    print('This is the stock_api_call at each loop:')
                    print()
                    print(stock_api_calls)
                    print()
                    print('This is the ticker:')
                    print()
                    print(ticker)
                    print()


                    #change value of data_type_q so that the function can pull daily prices for SPY
                    data_type_q = '3'
                    print(date_of_interest)
                    #the symbol here SPY will be constant and won't change
                    alphaSPY  = alphaDataRetriever(api_q,data_type_q,'SPY',date_of_interest)
                    print()
                    print('This is the alphaSPY')
                    display(alphaSPY.head(15))

                    #this is the second call
                    total_api_calls = total_api_calls + 1 

#                     #subtract the call from the stock count
#                     stock_api_calls = stock_api_calls - 1


                    if total_api_calls == api_calls_per_minute:
                        total_api_calls = 0
                        print('Model has been put to sleep after the alpha spy')
                        time.sleep(60)

                    else:
                        total_api_calls = total_api_calls


                    #assign symbol the same value as ticker so that the tickers in the column are inputted into the historical earnings function
                    symbol = ticker    

                    #switch the value of data_type_q to 2 so that the historical earnings table can be pulled via alphaDataRetriever
                    data_type_q = '2'

                    #run the alphaDataretriever to get the historical earnings for each symbol in the list

                    try:
                        print(symbol)
                        print()
                        alphaHistoricalEarnings  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
                        print()
                        print('This is the historical earnings dataframe')
                        display(alphaHistoricalEarnings)
                    except:
                        print()
                        print(ticker)
                        print('An error has occured in the historical earnings function. Check inputs.')
                        #skip to the next ticker and run analysis on it
                        print('This will fuck up the rest of the functions')
#                         total_api_calls = total_api_calls -1
                        
                        continue
        
                    
                    print('This should not print if the continue function works')
                    #the historical earnings are called 
                    total_api_calls = total_api_calls + 1

#                     #subtract the call from the stock count
#                     stock_api_calls = stock_api_calls - 1


                    #if the total api calls are 5 then a full set has been completed and it can be subtracted from the total sets needed to gather the information
                    if total_api_calls == api_calls_per_minute:

                        #reset the number of total api calls
                        total_api_calls = 0
                        print()
                        print('Model has been put to sleep after historical earnings call')
                        #sleep the model because an additional api call will break the model
                        time.sleep(60)

                    #if the total api calls isn't equal to 5 then keep the value the same
                    else:

                        total_api_calls = total_api_calls 
################################################################################################################################################################################################################
                    #this section analyzes the stock only if there is enough training data available
    
                    #still within the for loop
                    #the model should have at least 3 years worth of quarterly reports so that 2 years (8 quarters) can be used to train the model and 1 year (4 quarters) can be used to test
                    #if the model has 12 or more instances, then conduct the analysis and provide a buying signal
                    if alphaHistoricalEarnings.shape[0] >= 12:
                        print(symbol,'has reported earnings',alphaHistoricalEarnings.shape[0],'times, as of the current date.')
                        print()

                        #change the value of data_type_q to 3 to call the adjusted price function
                        data_type_q = '3'

                        #run the function that gets the daily adjusted prices for the ticker symbol in the loop
                        try:
                            print('This is the daily adjusted dataframe')
                            print(symbol)
                            print(date_of_interest)
                            alphaDailyAdjusted  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
#                             print('This is the daily adjusted price for 
                            
#                             print()
                            
                        except:
#                             total_api_calls = total_api_calls -1
                            print('symbol:',symbol)
                            print('date_of_interest',date_of_interest)
                            print()

                    
                    
                    
                        
#                         print('An error has occured in the alphaDailyAdjusted function. Check inputs.')
                        #skip to the next ticker and run analysis on it
#                         print('skip to the next ticker and run analysis on it')
#                         continue


                        #count the call
                        total_api_calls = total_api_calls + 1

                        #subtract the set from the stock count
#                         stock_api_calls = stock_api_calls - 1


                        #if the total api calls are 5 then a full set has been completed and it can be subtracted from the total sets needed to gather the information
                        if total_api_calls == api_calls_per_minute:
                            #reset the number of total api calls
                            total_api_calls = 0
                            print()
                            print('model has been put to sleep after the daily adjusted call')
                            #sleep the model because an additional api call will break the model
                            time.sleep(60)

                        #keep as is if the total api calls isnt equal to 5 
                        else:

                            total_api_calls = total_api_calls


                        #make sure to call the finhub_function for each stock

              

                        try:
                            #call the finhub function that will be used to create buying signals
                            finhub_function = finhubEarningsCalendar(finhubapi, date_of_interest,date_of_interest,ticker)
                        

                            #call the historicalEarningsTransformer to calculate the first 2 variables
                            historical_earnings_main = historicalEarningsTransformer(date_type,alphaHistoricalEarnings,finhub_function)

                            #display
                            print()
#                             print('Below is the historical main function that returns the two x variables')
#                             display(historical_earnings_main.head(15)) 
                        except:
                            print()
                            print('An error has occured in the historicalEarningsTransformer function. Check inputs.')
                            print(finhubapi)
                            display(alphaHistoricalEarnings.head(5))
#                             print(finhub_function)
                            
                            #skip to the next ticker and run analysis on it
                            print('skip to the next ticker and run analysis on it')
                            
                            continue



                        try:
                            #this runs the first part of the model where the variables are transformed only if the api call count is within the 5 call limit
                            dailyAdustedPriceTransformer_main = dailyAdustedPriceTransformer(alphaDailyAdjusted,historical_earnings_main,alphaSPY)

                            #display
                            print()
#                             print('This is the dataframe with the remaining 3 variables that pertain to the earnings data')
#                             display(dailyAdustedPriceTransformer_main.head(15)) 
                        except:
                            print()
                            #skip to the next ticker and run analysis on it
                            print('skip to the next ticker and run analysis on it')
                            continue

                            print('An error has occured in the dailyAdustedPriceTransformer function. Check inputs.')


                        #call the finalTransformation function that will get all the x and y variables ready
#                             output_type = input('What type of output would you like?:\n\nLogic will display how the model variables were created, while Clean Variables will not:\n\n 1.Logic\n\n 2.Clean Variables\n\n')

#                                         output_type = '2'

                        #if the user chooses output type logic, then print the following statements before running final transformer
                        if output_type == '1':

                            #print the following statements:
                            #first run the historical earnings report for the stock 
                            print('1. Retrieve the historical earnings report for the stock:\n')
                            print('This dataframe contains %s\'s historical earnings report. The first row pertains to either the most recent earnings date (if the date is in the past/the same as today) or the upcoming earnings report date (if the date is in the future)'%(ticker))
                            print()
                            try:
                                display(alphaHistoricalEarnings.head(15))
                                print()
                            except:
                                print('Does not exist.')
                            #secondly
                            print('2. Insert the historical earnings report into the historicalEarningsTransformer function:\n')
                            print('This is %s\'s most updated earnings table that contains the earnings_event and surprise_event variables that will be used in the analysis:\n'%(ticker))

                            try:
                                display(historical_earnings_main.head(15))
                                print()
                            except:
                                print('Does not exist.')

                            #thirdly
                            print('3. Retrieve the last 5,000 daily adjusted prices for %s in order to create the intraday_event and adjusted_price_event variables:\n'%(ticker))

                            try:
                                display(alphaDailyAdjusted.head(15))
                                print()
                            except:
                                print('Does not exist.')

                            #fourth

                            try:
                                print('4. Retrieve the last 5,000 daily adjusted prices for SPY in order to create the spy_event variable:\n')
                                display(alphaSPY.head(15))
                                print()
                            except:
                                print('Does not exist.')

                            #fifth
                            try:
                                print('5. Connect both daily adjusted price dataframes (Stock Prices and SPY Prices) with the historical earnings report in order to calculate the x variables intraday_event, adjusted_price_event, spy_event:\n')
                                display(dailyAdustedPriceTransformer_main.head(15))
                                print()
                            except:
                                print('Does not exist.')

                            #sixth
                            print('6. Run the final tranformer function to generate the class variable:\n')
                            print()

                        else:
                            print('Run the analysis with just clean variables')
                        
                        
                        
                        #parameters include : individual_hist_earn,both_dailyAdjustedPrices,daily_prices,output_type
                        finTransformationTransformer_main = finalTransformation(historical_earnings_main,dailyAdustedPriceTransformer_main,alphaDailyAdjusted,output_type,date_type)
                        print()
                        print('This is the dataframe that contains all the clean variables necessary for analysis:')
                        display(finTransformationTransformer_main.head(15))

                        try:
                            display(finTransformationTransformer_main.head(15))
                        except:
                            print('No variables to display.')
                            print()

                        try:
                            #run the model by calling the naivebayesPredictor function (returns a dictionary):
                            model_output = naiveBayesPredictor(finTransformationTransformer_main,date_type)
                        except:
                            print('No predicitive output to display.')
                            print()

                        #add the symbol to the dictionary values 
                        model_output['symbol'] = symbol 

                        model_result_list.append(model_output)

                        #convert the dictionary into a readable dataframe:
                        individual_analysis_results = pd.DataFrame.from_dict(model_output)

                        #display the pretty dataframe
                        display(individual_analysis_results)

################################################################################################################################################################################################################
                    #if the stock does not have enough data then skip and go on to the next one
                    else:
                        print('This stock does not have enough data to conduct a proper naive bayes analysis.')

                        #if there are not enough data points for a proper analysis, that means that this stock will not pull the remaining 2 dataframes
                        #subtract the api call count by 2
#                         stock_api_calls = stock_api_calls - 2
#                         print(stock_api_calls)

                                    
                                
                                


                #at this point, the for loop has ended and the results of the multi-company analysis shoudl be displayed
                #say bye
#                 print()
#                 print(model_result_list)
                multi_analysis_results = pd.DataFrame.from_dict(model_result_list)
                display(multi_analysis_results)
                print('Thank you for using the model.')
                print('Please wait 60 seconds before running the model again.')

                #sleep the model
                time.sleep(60)

                #turn model off
                modelValidation = False     

                        
#############################################################################################################################################################################################################                        
        #this else was put here to incase the analyst put invalid option to the first question of data of interest
        else:
            
            print('Invalid input. Please run the model again.')
            #turn the model off since the user put an invalid option
            modelValidation = False
        
        print()

In [185]:
if __name__=="__main__":
    main()        
    
# TQ6TR98HVUPE3L9Z

# c5d5iiqad3i9ue38pn9g


What is your alphavantage API Key?:

TQ6TR98HVUPE3L9Z

What kind of data are you looking for?

Type the number in the input box:

1. Upcoming Earnings Calendar (3 month horizon)
2. Historical Earnings (for a specific company)
3. Daily Adjusted Price History (for a specific company)
4. Naive Bayes Prediction Model

4

What type of analysis are you looking to do?:

1.Individual Company

2.Multi-Company

2

Enter the date of interest:

2021-10-20

What type of output would you like?:

Logic will display how the model variables were created, while Clean Variables will not:

 1.Logic

 2.Clean Variables

2

test

Enter your finhub API-Key:

c5d5iiqad3i9ue38pn9g
Below is the finhub function


Unnamed: 0,symbol,reportedDate,hour,reportedEPS,estimatedEPS,surprise,surprisePercentage
1,LVS,2021-10-20,amc,-0.45,-0.210837,-0.239163,113.4345
2,LAD,2021-10-20,bmo,11.21,9.328087,1.881913,20.1747
3,IBM,2021-10-20,amc,2.52,2.529818,-0.009818,-0.3881
4,HLX,2021-10-20,amc,-0.13,-0.105798,-0.024203,22.8762
5,GGG,2021-10-20,amc,0.57,0.642724,-0.072724,-11.3149
...,...,...,...,...,...,...,...
72,LSTR,2021-10-20,amc,2.58,2.539800,0.040200,1.5828
73,BPOP,2021-10-20,bmo,3.09,2.286075,0.803925,35.1662
74,RUSHA,2021-10-20,amc,1.20,0.991118,0.208882,21.0755
75,HCCI,2021-10-20,amc,0.79,0.628320,0.161680,25.7321



Stock Count: 70

There are 70 companies reporting earings on the specified date. Analysis will only be provided for the first 30 companies in the list. This should take about 12 minutes. Please wait patiently.


There are a total of 15 stock calls

This is the stock_api_call at each loop:

15

This is the ticker:

LVS

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


LVS


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,-0.2,,
1,2021-07-21,-0.26,-0.1742,-0.0858,-49.2537
2,2021-04-21,-0.25,-0.2545,0.0045,1.7682
3,2021-01-27,-0.37,-0.3458,-0.0242,-6.9983
4,2020-10-21,-0.67,-0.7194,0.0494,6.8668
...,...,...,...,...,...
63,2006-02-14,0.33,0.27,0.06,22.2222
64,2005-11-02,0.28,0.27,0.01,3.7037
65,2005-08-03,0.27,0.26,0.01,3.8462
66,2005-05-03,0.29,0.25,0.04,16


This should not print if the continue function works
LVS has reported earnings 68 times, as of the current date.

This is the daily adjusted dataframe
LVS
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,0,0,0,0
1,0,0,0,1,1,0
2,1,1,1,1,1,0
3,1,0,1,0,0,0
4,1,1,0,0,1,0
5,0,0,1,0,1,0
6,0,1,0,1,1,0
7,1,1,1,0,0,1
8,1,1,0,1,1,1
9,0,0,0,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,0,0,0,0
1,0,0,0,1,1,0
2,1,1,1,1,1,0
3,1,0,1,0,0,0
4,1,1,0,0,1,0
5,0,0,1,0,1,0
6,0,1,0,1,1,0
7,1,1,1,0,0,1
8,1,1,0,1,1,1
9,0,0,0,1,1,0



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,0,1,3,1.0,0.75,0.857143,75.0,Valid,Do NOT buy,LVS



This is the stock_api_call at each loop:

15

This is the ticker:

LAD

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


LAD


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,9.28,,
1,2021-07-21,11.12,6.1597,4.9603,80.5283
2,2021-04-21,5.89,4.763,1.127,23.6616
3,2021-02-03,5.46,5.0684,0.3916,7.7263
4,2020-10-21,6.89,6.2206,0.6694,10.761
...,...,...,...,...,...
95,1998-02-19,0.25,0.23,0.02,8.6957
96,1997-10-29,0.22,0.21,0.01,4.7619
97,1997-07-30,0.19,0.18,0.01,5.5556
98,1997-04-30,0.16,0.14,0.02,14.2857


This should not print if the continue function works

Model has been put to sleep after historical earnings call
LAD has reported earnings 100 times, as of the current date.

This is the daily adjusted dataframe
LAD
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,0,0,0,0
1,1,1,0,0,1,1
2,1,1,1,1,1,0
3,0,1,1,1,0,0
4,1,1,1,1,1,0
5,1,1,1,1,1,1
6,0,1,0,1,1,1
7,0,0,0,0,1,1
8,1,1,1,1,1,0
9,1,1,1,1,1,1


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,0,0,0,0
1,1,1,0,0,1,1
2,1,1,1,1,1,0
3,0,1,1,1,0,0
4,1,1,1,1,1,0
5,1,1,1,1,1,1
6,0,1,0,1,1,1
7,0,0,0,0,1,1
8,1,1,1,1,1,0
9,1,1,1,1,1,1



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,1,0,3,0.5625,0.75,0.642857,75.0,Invalid,Do NOT buy,LAD



This is the stock_api_call at each loop:

15

This is the ticker:

IBM

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


IBM


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,2.5,,
1,2021-07-19,2.33,2.2884,0.0416,1.8179
2,2021-04-19,1.77,1.6524,0.1176,7.1169
3,2021-01-21,2.07,1.8753,0.1947,10.3823
4,2020-10-19,2.58,2.579,0.001,0.0388
...,...,...,...,...,...
98,1997-04-23,0.59,0.58,0.01,1.7241
99,1997-01-21,0.98,0.99,-0.01,-1.0101
100,1996-10-21,0.61,0.61,0,0
101,1996-07-25,0.63,0.61,0.02,3.2787


This should not print if the continue function works
IBM has reported earnings 103 times, as of the current date.

This is the daily adjusted dataframe
IBM
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,0,1,0,0,0
1,1,1,1,1,0,1
2,0,1,1,0,0,1
3,0,1,1,1,0,0
4,1,1,1,0,0,0
5,1,1,1,0,1,0
6,0,1,0,1,0,0
7,1,1,0,1,0,1
8,0,1,1,0,0,0
9,1,1,0,0,0,1


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,0,1,0,0,0
1,1,1,1,1,0,1
2,0,1,1,0,0,1
3,0,1,1,1,0,0
4,1,1,1,0,0,0
5,1,1,1,0,1,0
6,0,1,0,1,0,0
7,1,1,0,1,0,1
8,0,1,1,0,0,0
9,1,1,0,0,0,1



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,2,0,2,0.25,0.5,0.333333,50.0,Invalid,Do NOT buy,IBM



This is the stock_api_call at each loop:

15

This is the ticker:

HLX

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


Model has been put to sleep after the alpha spy
HLX


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,-0.1,,
1,2021-07-26,-0.0876,-0.08,-0.0076,-9.5
2,2021-04-26,-0.02,-0.065,0.045,69.2308
3,2021-02-22,0.0279,-0.075,0.1029,137.2
4,2020-10-21,0.1102,0.0075,0.1027,1369.3333
...,...,...,...,...,...
92,1998-11-04,0.13,0.13,0,0
93,1998-08-10,0.1,0.09,0.01,11.1111
94,1998-05-04,0.09,0.04,0.05,125
95,1998-02-19,0.07,0.06,0.01,16.6667


This should not print if the continue function works
HLX has reported earnings 97 times, as of the current date.

This is the daily adjusted dataframe
HLX
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,0,1,0,0
1,0,0,1,1,0,0
2,0,1,0,1,1,0
3,0,1,1,1,0,0
4,1,1,0,0,1,1
5,1,1,1,1,1,0
6,0,0,0,0,1,0
7,0,1,0,0,0,0
8,1,1,0,1,1,0
9,1,0,0,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,0,1,0,0
1,0,0,1,1,0,0
2,0,1,0,1,1,0
3,0,1,1,1,0,0
4,1,1,0,0,1,1
5,1,1,1,1,1,0
6,0,0,0,0,1,0
7,0,1,0,0,0,0
8,1,1,0,1,1,0
9,1,0,0,1,1,0



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,1,0,3,0.5625,0.75,0.642857,75.0,Invalid,Do NOT buy,HLX



This is the stock_api_call at each loop:

15

This is the ticker:

GGG

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


GGG


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,0.63,,
1,2021-07-21,0.62,0.61,0.01,1.6393
2,2021-04-21,0.58,0.502,0.078,15.5378
3,2021-01-25,0.61,0.512,0.098,19.1406
4,2020-10-21,0.59,0.421,0.169,40.1425
...,...,...,...,...,...
98,1997-04-14,0.02,0.03,-0.01,-33.3333
99,1997-01-20,0.04,0.03,0.01,33.3333
100,1996-10-10,0.04,0.03,0.01,33.3333
101,1996-07-16,0.04,0.03,0.01,33.3333


This should not print if the continue function works
GGG has reported earnings 103 times, as of the current date.

This is the daily adjusted dataframe
GGG
2021-10-20

model has been put to sleep after the daily adjusted call


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,0,0,1,0,0
1,1,1,1,1,1,0
2,0,1,1,1,1,0
3,1,1,1,0,1,0
4,1,1,1,0,1,0
5,0,1,1,1,1,1
6,0,0,0,1,1,0
7,1,1,1,1,1,0
8,0,0,0,0,1,0
9,1,0,0,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,0,0,1,0,0
1,1,1,1,1,1,0
2,0,1,1,1,1,0
3,1,1,1,0,1,0
4,1,1,1,0,1,0
5,0,1,1,1,1,1
6,0,0,0,1,1,0
7,1,1,1,1,1,0
8,0,0,0,0,1,0
9,1,0,0,1,1,0



This is the buy signal:


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,0,0,4,1.0,1.0,1.0,100.0,Valid,Do NOT buy,GGG



This is the stock_api_call at each loop:

15

This is the ticker:

FR

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


FR


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,0.28,,
1,2021-07-21,0.48,0.4708,0.0092,1.9541
2,2021-04-21,0.46,0.4564,0.0036,0.7888
3,2021-02-10,0.44,0.4418,-0.0018,-0.4074
4,2020-10-21,0.49,0.45,0.04,8.8889
...,...,...,...,...,...
90,1999-03-31,0.4579,,,
91,1998-12-31,0.0255,,,
92,1998-09-30,0.4131,,,
93,1998-06-30,0.3208,,,


This should not print if the continue function works
FR has reported earnings 95 times, as of the current date.

This is the daily adjusted dataframe
FR
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,1,1,0,0
1,1,1,1,0,1,0
2,1,1,1,0,1,0
3,0,0,1,0,0,0
4,1,1,1,1,1,1
5,1,1,1,1,1,1
6,0,1,0,0,1,1
7,1,1,1,1,1,1
8,1,1,1,0,1,1
9,1,1,1,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,1,1,0,0
1,1,1,1,0,1,0
2,1,1,1,0,1,0
3,0,0,1,0,0,0
4,1,1,1,1,1,1
5,1,1,1,1,1,1
6,0,1,0,0,1,1
7,1,1,1,1,1,1
8,1,1,1,0,1,1
9,1,1,1,1,1,0



This is the buy signal:


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,1,0,2,1,0.833333,0.5,0.5,50.0,Valid,Do NOT buy,FR



This is the stock_api_call at each loop:

15

This is the ticker:

NEE

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


NEE


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,0.71,,
1,2021-07-23,0.71,0.691,0.019,2.7496
2,2021-04-21,0.67,0.6126,0.0574,9.3699
3,2021-01-26,0.4,0.384,0.016,4.1667
4,2020-10-21,0.665,0.646,0.019,2.9412
...,...,...,...,...,...
98,1997-04-17,0.29,0.29,0,0
99,1997-01-16,0.25,0.24,0.01,4.1667
100,1996-10-15,0.72,0.71,0.01,1.4085
101,1996-07-18,0.43,0.42,0.01,2.381


This should not print if the continue function works

Model has been put to sleep after historical earnings call
NEE has reported earnings 103 times, as of the current date.

This is the daily adjusted dataframe
NEE
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,1,1,0,0
1,1,1,0,1,1,0
2,1,1,0,0,1,0
3,0,1,1,0,0,0
4,1,1,1,0,1,1
5,1,1,1,0,1,0
6,1,1,0,1,1,0
7,0,0,1,1,0,1
8,1,1,1,1,0,0
9,1,1,1,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,1,1,1,1,0,0
1,1,1,0,1,1,0
2,1,1,0,0,1,0
3,0,1,1,0,0,0
4,1,1,1,0,1,1
5,1,1,1,0,1,0
6,1,1,0,1,1,0
7,0,0,1,1,0,1
8,1,1,1,1,0,0
9,1,1,1,1,1,0



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,1,0,3,0.5625,0.75,0.642857,75.0,Invalid,Do NOT buy,NEE



This is the stock_api_call at each loop:

15

This is the ticker:

FHN

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


FHN


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,0.41,,
1,2021-07-16,0.58,0.4372,0.1428,32.6624
2,2021-04-21,0.51,0.3705,0.1395,37.6518
3,2021-01-22,0.46,0.3352,0.1248,37.2315
4,2020-10-23,0.35,0.2294,0.1206,52.5719
...,...,...,...,...,...
98,1997-04-14,0.25,0.25,0,0
99,1997-01-21,0.33,0.34,-0.01,-2.9412
100,1996-10-17,0.3,0.31,-0.01,-3.2258
101,1996-07-15,0.26,0.27,-0.01,-3.7037


This should not print if the continue function works
FHN has reported earnings 103 times, as of the current date.

This is the daily adjusted dataframe
FHN
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,1,1,0,0
1,1,1,0,0,0,0
2,1,1,1,1,1,0
3,1,1,1,1,1,0
4,1,1,1,0,1,0
5,1,0,1,0,0,0
6,0,0,0,1,0,1
7,1,1,1,1,0,0
8,1,1,0,1,0,0
9,1,1,1,1,0,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,1,1,0,0
1,1,1,0,0,0,0
2,1,1,1,1,1,0
3,1,1,1,1,1,0
4,1,1,1,0,1,0
5,1,0,1,0,0,0
6,0,0,0,1,0,1
7,1,1,1,1,0,0
8,1,1,0,1,0,0
9,1,1,1,1,0,0



This is the buy signal:


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,0,0,4,1.0,1.0,1.0,100.0,Valid,Do NOT buy,FHN



This is the stock_api_call at each loop:

15

This is the ticker:

EFX

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


Model has been put to sleep after the alpha spy
EFX


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,1.72,,
1,2021-07-21,1.98,1.7152,0.2648,15.4384
2,2021-04-21,1.97,1.5727,0.3973,25.2623
3,2021-02-10,2,1.8295,0.1705,9.3195
4,2020-10-21,1.87,1.5985,0.2715,16.9847
...,...,...,...,...,...
98,1997-04-15,0.31,0.3,0.01,3.3333
99,1997-01-21,0.36,0.36,0,0
100,1996-10-16,0.32,0.31,0.01,3.2258
101,1996-07-17,0.28,0.28,0,0


This should not print if the continue function works
EFX has reported earnings 103 times, as of the current date.

This is the daily adjusted dataframe
EFX
2021-10-20


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,1,0,0,0
1,1,1,1,0,1,0
2,0,1,1,0,1,0
3,1,1,1,0,0,0
4,1,1,0,1,1,0
5,1,1,1,1,1,0
6,0,1,0,1,0,1
7,1,1,1,1,1,0
8,1,1,0,1,1,0
9,1,1,1,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,1,1,0,0,0
1,1,1,1,0,1,0
2,0,1,1,0,1,0
3,1,1,1,0,0,0
4,1,1,0,1,1,0
5,1,1,1,1,1,0
6,0,1,0,1,0,1
7,1,1,1,1,1,0
8,1,1,0,1,1,0
9,1,1,1,1,1,0



This is the buy signal:


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,0,0,4,1.0,1.0,1.0,100.0,Valid,Do NOT buy,EFX



This is the stock_api_call at each loop:

15

This is the ticker:

DFS

2021-10-20

This is the alphaSPY


Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,451.13,452.732,451.01,452.41,452.41,49571569,0.0,1.0,0.2837,0.3831,2021-10-20
1,448.92,450.71,448.27,450.64,450.64,46996827,0.0,1.0,0.3831,0.2518,2021-10-19
2,443.97,447.55,443.27,447.19,447.19,62213228,0.0,1.0,0.7253,0.1081,2021-10-18
3,444.75,446.26,444.09,445.87,445.87,66260210,0.0,1.0,0.2518,-0.3686,2021-10-15
4,439.08,442.66,438.58,442.5,442.5,70236825,0.0,1.0,0.7789,-0.3686,2021-10-14
5,434.71,436.05,431.54,435.18,435.18,72973979,0.0,1.0,0.1081,-0.3686,2021-10-13
6,435.67,436.1,432.78,433.62,433.62,71181163,0.0,1.0,-0.4705,0.0616,2021-10-12
7,437.16,440.26,434.62,434.69,434.69,65233285,0.0,1.0,-0.565,0.0616,2021-10-11
8,439.48,439.89,437.19,437.86,437.86,74557404,0.0,1.0,-0.3686,0.6647,2021-10-08
9,438.39,441.68,438.2,438.66,438.66,72437499,0.0,1.0,0.0616,0.6647,2021-10-07


DFS


This is the historical earnings dataframe


Unnamed: 0,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,2021-10-20,,3.53,,
1,2021-07-21,5.55,3.8432,1.7068,44.4109
2,2021-04-21,5.04,2.7744,2.2656,81.6609
3,2021-01-20,2.9423,2.3566,0.5857,24.8536
4,2020-10-21,2.45,1.5423,0.9077,58.8537
5,2020-07-22,-1.2,-0.0252,-1.1748,-4661.9048
6,2020-04-22,-0.25,1.7628,-2.0128,-114.182
7,2020-01-23,2.25,2.2356,0.0144,0.6441
8,2019-10-22,2.36,2.2736,0.0864,3.8001
9,2019-07-23,2.32,2.1073,0.2127,10.0935


This should not print if the continue function works
DFS has reported earnings 58 times, as of the current date.

This is the daily adjusted dataframe
DFS
2021-10-20

model has been put to sleep after the daily adjusted call


Run the analysis with just clean variables
Clean Variables:


This is the dataframe that contains all the clean variables necessary for analysis:


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,0,1,1,0,0
1,1,1,1,1,1,0
2,1,1,0,1,1,1
3,1,1,1,0,1,0
4,1,1,1,0,1,0
5,0,0,1,1,1,1
6,0,0,0,0,1,0
7,0,1,1,1,1,0
8,1,1,0,1,0,0
9,1,1,1,1,1,0


Unnamed: 0,earnings_event,surprise_event,adjusted_price_event,intraday_event,spy_event,class_variable
0,0,0,1,1,0,0
1,1,1,1,1,1,0
2,1,1,0,1,1,1
3,1,1,1,0,1,0
4,1,1,1,0,1,0
5,0,0,1,1,1,1
6,0,0,0,0,1,0
7,0,1,1,1,1,0
8,1,1,0,1,0,0
9,1,1,1,1,1,0



This is the buy signal:


  _warn_prf(average, modifier, msg_start, len(result))


Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,0,1,0,3,0.5625,0.75,0.642857,75.0,Invalid,Do NOT buy,DFS



[{'TP': [0], 'FP': [0], 'FN': [1], 'TN': [3], 'Precision WA': [1.0], 'Recall WA': [0.75], 'F1-Score WA': [0.8571428571428571], 'Model Score %': [75.0], 'Model Recommendation': ['Valid'], 'Buy Signal': ['Do NOT buy'], 'symbol': 'LVS'}, {'TP': [0], 'FP': [1], 'FN': [0], 'TN': [3], 'Precision WA': [0.5625], 'Recall WA': [0.75], 'F1-Score WA': [0.6428571428571428], 'Model Score %': [75.0], 'Model Recommendation': ['Invalid'], 'Buy Signal': ['Do NOT buy'], 'symbol': 'LAD'}, {'TP': [0], 'FP': [2], 'FN': [0], 'TN': [2], 'Precision WA': [0.25], 'Recall WA': [0.5], 'F1-Score WA': [0.3333333333333333], 'Model Score %': [50.0], 'Model Recommendation': ['Invalid'], 'Buy Signal': ['Do NOT buy'], 'symbol': 'IBM'}, {'TP': [0], 'FP': [1], 'FN': [0], 'TN': [3], 'Precision WA': [0.5625], 'Recall WA': [0.75], 'F1-Score WA': [0.6428571428571428], 'Model Score %': [75.0], 'Model Recommendation': ['Invalid'], 'Buy Signal': ['Do NOT buy'], 'symbol': 'HLX'}, {'TP': [0], 'FP': [0], 'FN': [0], 'TN': [4], 'Prec

Unnamed: 0,TP,FP,FN,TN,Precision WA,Recall WA,F1-Score WA,Model Score %,Model Recommendation,Buy Signal,symbol
0,[0],[0],[1],[3],[1.0],[0.75],[0.8571428571428571],[75.0],[Valid],[Do NOT buy],LVS
1,[0],[1],[0],[3],[0.5625],[0.75],[0.6428571428571428],[75.0],[Invalid],[Do NOT buy],LAD
2,[0],[2],[0],[2],[0.25],[0.5],[0.3333333333333333],[50.0],[Invalid],[Do NOT buy],IBM
3,[0],[1],[0],[3],[0.5625],[0.75],[0.6428571428571428],[75.0],[Invalid],[Do NOT buy],HLX
4,[0],[0],[0],[4],[1.0],[1.0],[1.0],[100.0],[Valid],[Do NOT buy],GGG
5,[1],[0],[2],[1],[0.8333333333333334],[0.5],[0.5],[50.0],[Valid],[Do NOT buy],FR
6,[0],[1],[0],[3],[0.5625],[0.75],[0.6428571428571428],[75.0],[Invalid],[Do NOT buy],NEE
7,[0],[0],[0],[4],[1.0],[1.0],[1.0],[100.0],[Valid],[Do NOT buy],FHN
8,[0],[0],[0],[4],[1.0],[1.0],[1.0],[100.0],[Valid],[Do NOT buy],EFX
9,[0],[1],[0],[3],[0.5625],[0.75],[0.6428571428571428],[75.0],[Invalid],[Do NOT buy],DFS


Thank you for using the model.
Please wait 60 seconds before running the model again.



In [98]:

stock_api_calls = 15

#this list will contain the confusion matrix for each stock in the for loop
#list of dictionaries
#     model_result_list = []
list_of_ticker_symbols = ['a','v','m','gv','f']


#for each ticker in the list of symbols, run each ticker through the functions
for ticker in list_of_ticker_symbols:
    if ticker == 'm':
        continue
    else:
        print('Stock Symbol:\n')
        print(ticker)
        print()

# print(stock_api_calls)    

Stock Symbol:

a

Stock Symbol:

v

Stock Symbol:

gv

Stock Symbol:

f



In [87]:
stock_api_calls = 10

#this list will contain the confusion matrix for each stock in the for loop
#list of dictionaries
#     model_result_list = []
list_of_ticker_symbols = ['a','v','m','gv','f']


#for each ticker in the list of symbols, run each ticker through the functions
for ticker in list_of_ticker_symbols:
    #while there is still stock information to be gathered:
    while stock_api_calls > 0:
        stock_api_calls = stock_api_calls - 3
        print(ticker)

print(stock_api_calls)    

a
a
a
a
-2


In [None]:


#     print()
#     print('This is the stock_api_call at each loop:')
#     print()
#     print(stock_api_calls)
#     print()
#     print('This is the ticker:')
#     print()
#     print(ticker)
#     print()


#     #change value of data_type_q so that the function can pull daily prices for SPY
#     data_type_q = '3'

#     #the symbol here SPY will be constant and won't change
#     alphaSPY  = alphaDataRetriever(api_q,data_type_q,'SPY',date_of_interest)

#     #this is the second call
#     total_api_calls = total_api_calls + 1 

#     #subtract the call from the stock count
#     stock_api_calls = stock_api_calls - 1


#     if total_api_calls == api_calls_per_minute:
#         total_api_calls = 0
#         time.sleep(60)

#     else:
#         total_api_calls = total_api_calls


#     #assign symbol the same value as ticker so that the tickers in the column are inputted into the historical earnings function
#     symbol = ticker    

#     #switch the value of data_type_q to 2 so that the historical earnings table can be pulled via alphaDataRetriever
#     data_type_q = '2'

#     #run the alphaDataretriever to get the historical earnings for each symbol in the list

#     try:
#         alphaHistoricalEarnings  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
#         print()
# #                         print('This is the historical earnings dataframe')
# #                         display(alphaHistoricalEarnings)
#     except:
#         print()
#         print(ticker)
#         print('An error has occured in the historical earnings function. Check inputs.')



#     #the historical earnings are called 
#     total_api_calls = total_api_calls + 1

#     #subtract the call from the stock count
#     stock_api_calls = stock_api_calls - 1


#     #if the total api calls are 5 then a full set has been completed and it can be subtracted from the total sets needed to gather the information
#     if total_api_calls == api_calls_per_minute:

#         #reset the number of total api calls
#         total_api_calls = 0

#         #sleep the model because an additional api call will break the model
#         time.sleep(60)

#     #if the total api calls isn't equal to 5 then keep the value the same
#     else:

#         total_api_calls = total_api_calls 


#     #the model should have at least 3 years worth of quarterly reports so that 2 years (8 quarters) can be used to train the model and 1 year (4 quarters) can be used to test
#     #if the model has 12 or more instances, then conduct the analysis and provide a buying signal
#     if alphaHistoricalEarnings.shape[0] >= 12:
#         print(symbol,'has reported earnings',alphaHistoricalEarnings.shape[0],'times, as of the current date.')
#         print()

#         #change the value of data_type_q to 3 to call the adjusted price function
#         data_type_q = '3'

#         #run the function that gets the daily adjusted prices for the ticker symbol in the loop
#         try:
#             alphaDailyAdjusted  = alphaDataRetriever(api_q,data_type_q,symbol,date_of_interest)
#             print()
#             alphaDailyAdjustedError = False
# #                             print('This is the daily ajusted individual')
# #                             display(alphaDailyAdjusted)
#         except:
#             alphaDailyAdjustedError = True
#             print('symbol:',symbol)
#             print()

#             print('date_of_interest',date_of_interest)
#             print('An error has occured in the alphaDailyAdjusted function. Check inputs.')

#         if alphaDailyAdjustedError:
#             continue

#         #count the call
#         total_api_calls = total_api_calls + 1

#         #subtract the set from the stock count
#         stock_api_calls = stock_api_calls - 1


#         #if the total api calls are 5 then a full set has been completed and it can be subtracted from the total sets needed to gather the information
#         if total_api_calls == api_calls_per_minute:
#             #reset the number of total api calls
#             total_api_calls = 0

#             #sleep the model because an additional api call will break the model
#             time.sleep(60)

#         #keep as is if the total api calls isnt equal to 5 
#         else:

#             total_api_calls = total_api_calls


#         #make sure to call the finhub_function for each stock

#         #debug
#         #finhub function is not needed because no real time data needs to be pulled
#         #call the finhub function that will be used to create buying signals
#         finhub_function = 'N/A'

#         try:

#             #call the historicalEarningsTransformer to calculate the first 2 variables
#             historical_earnings_main = historicalEarningsTransformer(date_type,alphaHistoricalEarnings,finhub_function)

#             #display
#             print()
# #                             print('Below is the historical main function that returns the two x variables')
# #                             display(historical_earnings_main.head(15)) 
#         except:
#             print()
#             print('An error has occured in the historicalEarningsTransformer function. Check inputs.')


#         try:
#             #this runs the first part of the model where the variables are transformed only if the api call count is within the 5 call limit
#             dailyAdustedPriceTransformer_main = dailyAdustedPriceTransformer(alphaDailyAdjusted,historical_earnings_main,alphaSPY)

#             #display
#             print()
# #                             print('This is the dataframe with the remaining 3 variables that pertain to the earnings data')
# #                             display(dailyAdustedPriceTransformer_main.head(15)) 
#         except:
#             print()
#             print('An error has occured in the dailyAdustedPriceTransformer function. Check inputs.')


#         #call the finalTransformation function that will get all the x and y variables ready
# #                             output_type = input('What type of output would you like?:\n\nLogic will display how the model variables were created, while Clean Variables will not:\n\n 1.Logic\n\n 2.Clean Variables\n\n')

# #                                         output_type = '2'

#         #if the user chooses output type logic, then print the following statements before running final transformer
#         if output_type == '1':

#             #print the following statements:
#             #first run the historical earnings report for the stock 
#             print('1. Retrieve the historical earnings report for the stock:\n')
#             print('This dataframe contains %s\'s historical earnings report. The first row pertains to either the most recent earnings date (if the date is in the past/the same as today) or the upcoming earnings report date (if the date is in the future)'%(ticker))
#             print()
#             try:
#                 display(alphaHistoricalEarnings.head(15))
#                 print()
#             except:
#                 print('Does not exist.')
#             #secondly
#             print('2. Insert the historical earnings report into the historicalEarningsTransformer function:\n')
#             print('This is %s\'s most updated earnings table that contains the earnings_event and surprise_event variables that will be used in the analysis:\n'%(ticker))

#             try:
#                 display(historical_earnings_main.head(15))
#                 print()
#             except:
#                 print('Does not exist.')

#             #thirdly
#             print('3. Retrieve the last 5,000 daily adjusted prices for %s in order to create the intraday_event and adjusted_price_event variables:\n'%(ticker))

#             try:
#                 display(alphaDailyAdjusted.head(15))
#                 print()
#             except:
#                 print('Does not exist.')

#             #fourth

#             try:
#                 print('4. Retrieve the last 5,000 daily adjusted prices for SPY in order to create the spy_event variable:\n')
#                 display(alphaSPY.head(15))
#                 print()
#             except:
#                 print('Does not exist.')

#             #fifth
#             try:
#                 print('5. Connect both daily adjusted price dataframes (Stock Prices and SPY Prices) with the historical earnings report in order to calculate the x variables intraday_event, adjusted_price_event, spy_event:\n')
#                 display(dailyAdustedPriceTransformer_main.head(15))
#                 print()
#             except:
#                 print('Does not exist.')

#             #sixth
#             print('6. Run the final tranformer function to generate the class variable:\n')
#             print()





#         #parameters include : individual_hist_earn,both_dailyAdjustedPrices,daily_prices,output_type
#         finTransformationTransformer_main = finalTransformation(historical_earnings_main,dailyAdustedPriceTransformer_main,alphaDailyAdjusted,output_type,date_type)
#         print()
#         print('This is the dataframe that contains all the clean variables necessary for analysis:')

#         try:
#             display(finTransformationTransformer_main.head(15))
#         except:
#             print('No variables to display.')
#             print()

#         try:
#             #run the model by calling the naivebayesPredictor function (returns a dictionary):
#             model_output = naiveBayesPredictor(finTransformationTransformer_main,date_type)
#         except:
#             print('No predicitive output to display.')
#             print()

#         #add the symbol to the dictionary values 
#         model_output['symbol'] = symbol 

#         model_result_list.append(model_output)

#         #convert the dictionary into a readable dataframe:
#         individual_analysis_results = pd.DataFrame.from_dict(model_output)

#         #display the pretty dataframe
#         display(individual_analysis_results)



In [24]:
import random
#Generate 5 random numbers between 10 and 30
randomlist = random.sample(range(10, 30), 10)
print(randomlist)

randomlist = randomlist[0:10]
randomlist

[28, 17, 20, 29, 15, 26, 19, 27, 23, 10]


[28, 17, 20, 29, 15, 26, 19, 27, 23, 10]

In [120]:
#inputs for the historical earnings transformer function

#first, get the historical earnings for a specific stock
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '3' 
symbol = 'ANTM'
date_of_interest = '2021-10-20'

present_day_test = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
present_day_test

Unnamed: 0,1. open,2. high,3. low,4. close,5. adjusted close,6. volume,7. dividend amount,8. split coefficient,Intraday_Returns,Weekly_Median_Return,trading_date
0,400.00,425.44,400.00,424.05,424.050000,2320068,0.0,1.0,6.0125,0.1376,2021-10-20
1,390.91,397.29,390.72,393.75,393.750000,1319681,0.0,1.0,0.7265,-0.5747,2021-10-19
2,391.64,392.85,387.30,388.21,388.210000,928129,0.0,1.0,-0.8758,-0.5747,2021-10-18
3,392.51,394.49,389.30,393.05,393.050000,1145502,0.0,1.0,0.1376,-0.5747,2021-10-15
4,387.71,399.16,386.85,390.71,390.710000,1753894,0.0,1.0,0.7738,-0.5747,2021-10-14
...,...,...,...,...,...,...,...,...,...,...,...
5024,43.50,43.50,42.75,42.89,18.184801,1381900,0.0,1.0,-58.1959,,2001-11-05
5025,42.05,43.10,42.02,42.90,18.189041,2037500,0.0,1.0,-56.7443,,2001-11-02
5026,42.25,42.60,41.80,42.17,17.879531,1950900,0.0,1.0,-57.6816,,2001-11-01
5027,41.10,42.50,41.01,41.88,17.756574,5552200,0.0,1.0,-56.7967,,2001-10-31


In [168]:
#inputs for the historical earnings transformer function

#first, get the historical earnings for a specific stock
api_key = 'TQ6TR98HVUPE3L9Z'
data_of_interest = '1'
symbol = ''
date_of_interest = '2021-10-20'

present_day_test = alphaDataRetriever(api_key,data_of_interest,symbol,date_of_interest)
# present_day_test.iloc[0:10,0]
present_day_test

Unnamed: 0,symbol,name,reportDate,estimate


In [None]:
data_type_q = '1'

#change symbol value
symbol = ''

#run the alphaDataRetriever to get the earnings calendar 
alphaEarningsCalendar  = alphaDataRetriever(api_key,data_type_q,symbol,date_of_interest)
print('This should be earnings calendar')
display(alphaEarningsCalendar)