### This workbook shows how to download historical earnings and analyst consensus earnings estimates for any S&P500 constituent

Data Source: Alpha Vantage API for quarterly analyst consensus earnings estimates.<br> Alternative Sources: Zacks Estimates via Nasdaq Data Link or yFinance for quarterly and 1 to 2 years out earnings estimates.

__Note:__ Earnings estimates are not static. Consensus numbers are an average of the estimates produced by the analysts following a security. EPS estimates may also be affected by corporate actions (dividends, share buybacks, stock splits) going forward. As such, any figures presented below are only a snapshot in time.

On average, analyst earnings expectations at the beginning of the quarter are around **30% higher** than at the reporting date, because analysts bring them down as the release approaches. Operating forward earnings, or pro forma expected earnings, represent what reported earnings would be if everything went perfectly right for the company. That is not always the case, hence the discrepancy between operating forward earnings and reported/trailing GAAP earnings. For example, analysts on average estimate elevated corporate profit margins going forward, in an environment that does not resemble the period from 2020 to early 2022 (tight inventory due to shutdowns and supply-side constraints, comparatively lower wage, benefit and interest costs, higher levels of fiscal spending).

However, under normal conditions, company analysts tend to correctly estimate **trends** in earnings growth, notwithstanding any systematic market dislocations (recessions, pandemics, drastic changes in monetary or fiscal policy, etc.). Much of what makes equity prices drift higher or lower in the short term after earnings also causes the shares to continue moving in the same direction over the next 90 days, until the next earnings release. It is important to monitor the path of earnings forecasts (revisions up vs. revisions down), changes in company forward guidance, positive/negative earnings release surprises and, more generally, where the asset is in the **Earnings Expectations Life Cycle**.
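
One way to monitor the path of forecasts is to store dated snapshots of the consensus estimate and compare them. The sketch below is only an illustration: the snapshot dates and EPS values are hypothetical, and `revision_direction` is a helper introduced here, not part of this workbook.

```python
import pandas as pd

# Hypothetical consensus EPS snapshots for one ticker and one fiscal quarter.
# Dates and values are made up purely to show the calculation.
snapshots = pd.Series(
    [3.90, 3.85, 3.70, 3.72],
    index=pd.to_datetime(["2023-01-02", "2023-01-16", "2023-01-30", "2023-02-13"]),
    name="estimate",
)

# Change between consecutive snapshots: positive = revision up, negative = revision down
revisions = snapshots.diff().dropna()

def revision_direction(change, tolerance=1e-9):
    """Label a single revision as 'up', 'down' or 'flat'."""
    if change > tolerance:
        return "up"
    if change < -tolerance:
        return "down"
    return "flat"

directions = revisions.apply(revision_direction)

# Net revision from the first to the last snapshot, in percent
net_revision_pct = (snapshots.iloc[-1] / snapshots.iloc[0] - 1) * 100

print(directions.tolist())
print(round(net_revision_pct, 2))
```

In practice the snapshots would come from re-running the download on different dates and keeping each file, since a single download only captures one point in time.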

   

 

#### Import Libraries

In [1]:
# Provides ways to work with large multidimensional arrays
import numpy as np
# Allows for further data manipulation and analysis
import pandas as pd
# Download data via alpha vantage API
import alpha_vantage

import matplotlib.pyplot as plt # Plotting
import matplotlib.dates as mdates #styling dates
%matplotlib inline

import datetime as dt # For defining dates
import mplfinance as mpf # Matplotlib finance
import time

#getting data from directory
import os
from os import listdir
from os.path import isfile, join

import requests
from io import BytesIO

In [2]:
# API key from alpha vantage
key = open(r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Alpha_vantage_api_key.txt").read().strip() # strip trailing newline/whitespace
# Destination folder for downloads
folder = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/S&P500/")
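
Because `DataFrame.to_csv` does not create missing directories, it can help to make sure the destination folder exists before downloading. A minimal sketch using `pathlib`, where `ensure_folder` is a hypothetical helper and a temporary directory stands in for the real path:

```python
import tempfile
from pathlib import Path

def ensure_folder(path):
    """Create the folder (and any parents) if needed and return it as a Path."""
    folder_path = Path(path)
    folder_path.mkdir(parents=True, exist_ok=True)
    return folder_path

# A temporary directory stands in for the real "Earnings Tracker" location
base = Path(tempfile.mkdtemp())
energy_folder = ensure_folder(base / "Earnings Tracker" / "Energy")

# Build a file path for one ticker without manual string concatenation
csv_path = energy_folder / "EOG.csv"
print(energy_folder.is_dir(), csv_path.name)
```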

#### Return DataFrame from CSV 

In [3]:
def get_df_from_csv(folder, ticker):
    try:  
        df = pd.read_csv(folder + ticker + ".csv") 
    except FileNotFoundError:
        print("File doesn't exist.")
    else:
        return df

#### Get Quarterly Earnings Estimates - Download and Save Data to CSV

In [4]:
info_not_downloaded = []

In [5]:
# get quarterly earnings estimates via the EARNINGS_CALENDAR function
# without a ticker, the function below downloads all the available earnings calendar data
horizon=["3month","6month","12month"]
base_url = 'https://www.alphavantage.co/query?'
def get_earnings_estimates(folder, api_key, horizon, ticker): 
    try:
        print("Get data for:", ticker)
        # use the api_key argument rather than the global key
        if ticker is not None:
            url = f'{base_url}function=EARNINGS_CALENDAR&symbol={ticker}&horizon={horizon}&apikey={api_key}'
        else:
            url = f"{base_url}function=EARNINGS_CALENDAR&horizon={horizon}&apikey={api_key}"
        response = requests.get(url)
            
        df = pd.read_csv(BytesIO(response.content)) # data kept as bytes in an in-memory buffer
        # with no ticker, save under a generic name (file name chosen here for illustration)
        file_name = ticker if ticker is not None else "earnings_calendar"
        df.to_csv(folder + file_name + ".csv") # save data from df to csv
        
    except Exception as ex:
        info_not_downloaded.append(ticker) #append list not downloaded
        print("Couldn't get data for:", ticker)  
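
A rejected request (for example an invalid key or a rate limit) may still come back as a normal response whose body is a short JSON message rather than CSV. This is an assumption about the API's behaviour and is worth checking against its documentation. In that case `pd.read_csv` can succeed and a meaningless file gets saved without any error. The sketch below checks the parsed columns before saving; `parse_calendar_csv` and `EXPECTED_COLUMNS` are hypothetical names, and both sample payloads are hand-written.

```python
from io import BytesIO
import pandas as pd

# Columns seen in the saved earnings calendar files in this workbook
EXPECTED_COLUMNS = {"symbol", "name", "reportDate", "fiscalDateEnding", "estimate", "currency"}

def parse_calendar_csv(content):
    """Parse response bytes as CSV and fail loudly if the expected columns are absent."""
    try:
        df = pd.read_csv(BytesIO(content))
    except (pd.errors.ParserError, pd.errors.EmptyDataError) as err:
        raise ValueError("response body is not valid CSV") from err
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"unexpected response, missing columns: {sorted(missing)}")
    return df

# Hand-written payloads: one valid CSV body and one JSON-style message
good = (b"symbol,name,reportDate,fiscalDateEnding,estimate,currency\n"
        b"TRV,Travelers Companies Inc,2023-01-24,2022-12-31,3.44,USD\n")
bad = b'{"Note": "illustrative message instead of data"}'

print(parse_calendar_csv(good).shape)
try:
    parse_calendar_csv(bad)
except ValueError as err:
    print("rejected:", err)
```

Raising inside the `try` block of the download function would route such responses into the list of tickers that were not downloaded.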

In [6]:
# get quarterly earnings estimates for any individual security. Downloaded on 23 Jan 2023. 
get_earnings_estimates(folder, key, horizon[2], 'TRV')

Get data for: TRV


In [7]:
TRV = get_df_from_csv(folder, 'TRV')
TRV

Unnamed: 0.1,Unnamed: 0,symbol,name,reportDate,fiscalDateEnding,estimate,currency
0,0,TRV,Travelers Companies Inc,2023-01-24,2022-12-31,3.44,USD
1,1,TRV,Travelers Companies Inc,2023-04-17,2023-03-31,3.83,USD
2,2,TRV,Travelers Companies Inc,2023-07-19,2023-06-30,,USD
3,3,TRV,Travelers Companies Inc,2023-10-17,2023-09-30,,USD
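
The `Unnamed: 0` column in the table above is most likely the DataFrame index that `to_csv` wrote to the file. If it is not wanted, one option is to read it back as the index with `index_col=0` (or to save with `index=False`). A minimal sketch, using an in-memory buffer in place of a file on disk:

```python
from io import StringIO
import pandas as pd

# Two TRV rows from the table above, reduced to three columns
df = pd.DataFrame({
    "symbol": ["TRV", "TRV"],
    "fiscalDateEnding": ["2022-12-31", "2023-03-31"],
    "estimate": [3.44, 3.83],
})

# Default to_csv writes the index as an unnamed first column
buffer = StringIO()
df.to_csv(buffer)

buffer.seek(0)
with_extra = pd.read_csv(buffer)             # index comes back as "Unnamed: 0"
buffer.seek(0)
as_index = pd.read_csv(buffer, index_col=0)  # index restored, no extra column

print(list(with_extra.columns))
print(list(as_index.columns))
```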


#### Get Earnings Estimates for 11 S&P500 Sectors 

Data downloaded on 23 Jan 2023. 

In [8]:
# read list of all S&P500 constituent securities
sec_df = pd.read_csv(r"C:/Users/User/Documents/PM Tools/Earnings Tracker/S&P500 constituents.csv")
sec_df

Unnamed: 0,Symbol,Name,Sector
0,MMM,3M,Industrials
1,AOS,A. O. Smith,Industrials
2,ABT,Abbott Laboratories,Health Care
3,ABBV,AbbVie,Health Care
4,ABMD,Abiomed,Health Care
...,...,...,...
497,YUM,Yum! Brands,Consumer Discretionary
498,ZBRA,Zebra Technologies,Information Technology
499,ZBH,Zimmer Biomet,Health Care
500,ZION,Zions Bancorp,Financials


In [9]:
# create DataFrames for all 11 S&P500 sectors
industrials_df = sec_df.loc[sec_df['Sector'] == 'Industrials']
health_df = sec_df.loc[sec_df['Sector'] == "Health Care"]
it_df = sec_df.loc[sec_df['Sector'] == "Information Technology"]
communications_df = sec_df.loc[sec_df['Sector'] == "Communication Services"]
staples_df = sec_df.loc[sec_df['Sector'] == "Consumer Staples"]
discretionary_df = sec_df.loc[sec_df['Sector'] == "Consumer Discretionary"]
utilities_df = sec_df.loc[sec_df['Sector'] == "Utilities"]
financials_df = sec_df.loc[sec_df['Sector'] == "Financials"]
materials_df = sec_df.loc[sec_df['Sector'] == "Materials"]
real_estate_df = sec_df.loc[sec_df['Sector'] == "Real Estate"]
energy_df = sec_df.loc[sec_df['Sector'] == "Energy"]
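
The eleven filters above can also be expressed as a single `groupby`, which yields one DataFrame per sector without naming each sector by hand. A minimal sketch on a small stand-in for `sec_df`; `sector_dfs` is a name introduced here for illustration:

```python
import pandas as pd

# Small stand-in for sec_df, using a few rows shown in the constituents table
sec_df = pd.DataFrame({
    "Symbol": ["MMM", "AOS", "ABT", "ABBV", "ZION"],
    "Name": ["3M", "A. O. Smith", "Abbott Laboratories", "AbbVie", "Zions Bancorp"],
    "Sector": ["Industrials", "Industrials", "Health Care", "Health Care", "Financials"],
})

# One DataFrame per sector, keyed by sector name
sector_dfs = {sector: group for sector, group in sec_df.groupby("Sector")}

print(sorted(sector_dfs))
print(sector_dfs["Health Care"]["Symbol"].tolist())
```

A dictionary like this also makes it possible to loop over all sectors later instead of repeating one call per sector.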

In [10]:
energy_df # S&P500 energy shares

Unnamed: 0,Symbol,Name,Sector
43,APA,APA Corporation,Energy
57,BKR,Baker Hughes,Energy
103,CVX,Chevron Corporation,Energy
123,COP,ConocoPhillips,Energy
130,CTRA,Coterra,Energy
142,DVN,Devon Energy,Energy
144,FANG,Diamondback Energy,Energy
171,EOG,EOG Resources,Energy
185,XOM,ExxonMobil,Energy
219,HAL,Halliburton,Energy


In [11]:
data_not_downloaded = [] # list of tickers where data didn't download

In [12]:
# get quarterly earnings estimates for selected S&P500 sector components
horizon=["3month","6month","12month"]
base_url = 'https://www.alphavantage.co/query?'
def get_sector_earnings_estimates(shares_df, folder, api_key, horizon): 
    # iterate through sector dataframe rows, where row["Symbol"] is the required ticker
    for index, row in shares_df.iterrows():
        ticker = row["Symbol"]
        try:
            print("Get data for:", ticker)
            # use the api_key argument rather than the global key
            url = f'{base_url}function=EARNINGS_CALENDAR&symbol={ticker}&horizon={horizon}&apikey={api_key}'
            response = requests.get(url)
                       
            df = pd.read_csv(BytesIO(response.content)) # data kept as bytes in an in-memory buffer
            df.to_csv(folder + ticker + ".csv") # save ticker data from df to csv
        
        except Exception as ex:
            data_not_downloaded.append(ticker) # record the ticker that failed
            print("Couldn't get data for:", ticker)
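
Looping over a whole sector sends many requests in quick succession, which may run into the API's rate limits. The exact limits depend on the plan, so any pause length is an assumption. The sketch below shows a throttled loop; `download_with_pause` and `fake_fetch` are hypothetical, and the fake fetch function replaces the real HTTP call so the example runs offline.

```python
import time

def download_with_pause(symbols, fetch, pause_seconds=0.0):
    """Call fetch(symbol) for each symbol, pausing between calls.

    Returns (downloaded, failed) lists of symbols.
    """
    downloaded, failed = [], []
    for i, symbol in enumerate(symbols):
        if i > 0 and pause_seconds > 0:
            time.sleep(pause_seconds)  # space out consecutive requests
        try:
            fetch(symbol)
            downloaded.append(symbol)
        except Exception:
            failed.append(symbol)      # record the symbol that actually failed
    return downloaded, failed

def fake_fetch(symbol):
    """Stand-in for the real download; fails for one symbol to show the bookkeeping."""
    if symbol == "BAD":
        raise ValueError("no data")

downloaded, failed = download_with_pause(["APA", "BAD", "CVX"], fake_fetch)
print(downloaded, failed)
```

Returning the failed symbols makes it straightforward to retry only those tickers later.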

In [13]:
folder_energy = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Energy/")

In [14]:
get_sector_earnings_estimates(energy_df, folder_energy, key, horizon[2])

Get data for: APA
Get data for: BKR
Get data for: CVX
Get data for: COP
Get data for: CTRA
Get data for: DVN
Get data for: FANG
Get data for: EOG
Get data for: XOM
Get data for: HAL
Get data for: HES
Get data for: KMI
Get data for: MRO
Get data for: MPC
Get data for: OXY
Get data for: OKE
Get data for: PSX
Get data for: PXD
Get data for: SLB
Get data for: VLO
Get data for: WMB


In [79]:
len(data_not_downloaded)


0

In [15]:
EOG = get_df_from_csv(folder_energy, 'EOG')
EOG

Unnamed: 0.1,Unnamed: 0,symbol,name,reportDate,fiscalDateEnding,estimate,currency
0,0,EOG,EOG Resources Inc,2023-02-23,2022-12-31,3.47,USD
1,1,EOG,EOG Resources Inc,2023-05-03,2023-03-31,3.55,USD
2,2,EOG,EOG Resources Inc,2023-08-02,2023-06-30,,USD
3,3,EOG,EOG Resources Inc,2023-11-01,2023-09-30,,USD


In [16]:
folder_industrials = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Industrials/")

In [17]:
get_sector_earnings_estimates(industrials_df, folder_industrials, key, horizon[2])

Get data for: MMM
Get data for: AOS
Get data for: ALK
Get data for: ALLE
Get data for: AAL
Get data for: AME
Get data for: BA
Get data for: CHRW
Get data for: CARR
Get data for: CAT
Get data for: CTAS
Get data for: CPRT
Get data for: CSX
Get data for: CMI
Get data for: DE
Get data for: DAL
Get data for: DOV
Get data for: ETN
Get data for: EMR
Get data for: EFX
Get data for: EXPD
Get data for: FAST
Get data for: FDX
Get data for: FTV
Get data for: FBIN
Get data for: GNRC
Get data for: GD
Get data for: GE
Get data for: HON
Get data for: HWM
Get data for: HII
Get data for: IEX
Get data for: INFO
Get data for: ITW
Get data for: IR
Get data for: JBHT
Get data for: J
Get data for: JCI
Get data for: KSU
Get data for: LHX
Get data for: LDOS
Get data for: LMT
Get data for: MAS
Get data for: NLSN
Get data for: NSC
Get data for: NOC
Get data for: ODFL
Get data for: OTIS
Get data for: PCAR
Get data for: PH
Get data for: PNR
Get data for: PWR
Get data for: RTX
Get data for: RSG
Get data for: RHI
...

In [18]:
folder_health = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Health/")

In [19]:
get_sector_earnings_estimates(health_df, folder_health, key, horizon[2])

Get data for: ABT
Get data for: ABBV
Get data for: ABMD
Get data for: A
Get data for: ALGN
Get data for: ABC
Get data for: AMGN
Get data for: BAX
Get data for: BDX
Get data for: BIO
Get data for: TECH
Get data for: BIIB
Get data for: BSX
Get data for: BMY
Get data for: CAH
Get data for: CTLT
Get data for: CNC
Get data for: CERN
Get data for: CRL
Get data for: CI
Get data for: CVS
Get data for: DHR
Get data for: DVA
Get data for: XRAY
Get data for: DXCM
Get data for: EW
Get data for: LLY
Get data for: GILD
Get data for: HCA
Get data for: HSIC
Get data for: HOLX
Get data for: HUM
Get data for: IDXX
Get data for: ILMN
Get data for: INCY
Get data for: ISRG
Get data for: IQV
Get data for: JNJ
Get data for: LH
Get data for: MCK
Get data for: MDT
Get data for: MRK
Get data for: MTD
Get data for: MRNA
Get data for: OGN
Get data for: PKI
Get data for: PFE
Get data for: DGX
Get data for: REGN
Get data for: RMD
Get data for: STE
Get data for: SYK
Get data for: TFX
Get data for: COO
...

In [20]:
folder_it = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/IT/")

In [21]:
get_sector_earnings_estimates(it_df, folder_it, key, horizon[2])

Get data for: ACN
Get data for: ADBE
Get data for: AMD
Get data for: AKAM
Get data for: APH
Get data for: ADI
Get data for: ANSS
Get data for: AAPL
Get data for: AMAT
Get data for: ANET
Get data for: ADSK
Get data for: ADP
Get data for: AVGO
Get data for: BR
Get data for: CDNS
Get data for: CDW
Get data for: CDAY
Get data for: CSCO
Get data for: CTXS
Get data for: CTSH
Get data for: GLW
Get data for: DXC
Get data for: ENPH
Get data for: FFIV
Get data for: FIS
Get data for: FISV
Get data for: FLT
Get data for: FTNT
Get data for: IT
Get data for: GPN
Get data for: HPE
Get data for: HPQ
Get data for: IBM
Get data for: INTC
Get data for: INTU
Get data for: IPGP
Get data for: JKHY
Get data for: JNPR
Get data for: KEYS
Get data for: KLAC
Get data for: LRCX
Get data for: MA
Get data for: MCHP
Get data for: MU
Get data for: MSFT
Get data for: MPWR
Get data for: MSI
Get data for: NTAP
Get data for: NLOK
Get data for: NVDA
Get data for: NXPI
Get data for: ORCL
Get data for: PAYX
...

In [22]:
folder_communications = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Communications/")

In [23]:
get_sector_earnings_estimates(communications_df, folder_communications, key, horizon[2])

Get data for: ATVI
Get data for: GOOGL
Get data for: GOOG
Get data for: T
Get data for: CHTR
Get data for: CMCSA
Get data for: DISCA
Get data for: DISCK
Get data for: DISH
Get data for: EA
Get data for: META
Get data for: FOXA
Get data for: FOX
Get data for: IPG
Get data for: LYV
Get data for: LUMN
Get data for: MTCH
Get data for: NFLX
Get data for: NWSA
Get data for: NWS
Get data for: OMC
Get data for: TMUS
Get data for: TTWO
Get data for: DIS
Get data for: VZ
Get data for: VIAC


In [25]:
folder_staples = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Staples/")

In [26]:
get_sector_earnings_estimates(staples_df, folder_staples, key, horizon[2])

Get data for: ADM
Get data for: MO
Get data for: BF-B
Get data for: CPB
Get data for: CHD
Get data for: CLX
Get data for: KO
Get data for: CL
Get data for: CAG
Get data for: STZ
Get data for: COST
Get data for: EL
Get data for: GIS
Get data for: HRL
Get data for: SJM
Get data for: K
Get data for: KMB
Get data for: KHC
Get data for: KR
Get data for: LW
Get data for: MKC
Get data for: TAP
Get data for: MDLZ
Get data for: MNST
Get data for: PEP
Get data for: PM
Get data for: PG
Get data for: SYY
Get data for: HSY
Get data for: TSN
Get data for: WBA
Get data for: WMT


In [27]:
folder_discretionary = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Discretionary/")

In [28]:
get_sector_earnings_estimates(discretionary_df, folder_discretionary, key, horizon[2])

Get data for: AAP
Get data for: AMZN
Get data for: APTV
Get data for: AZO
Get data for: BBWI
Get data for: BBY
Get data for: BKNG
Get data for: BWA
Get data for: CZR
Get data for: KMX
Get data for: CCL
Get data for: CMG
Get data for: DHI
Get data for: DRI
Get data for: DG
Get data for: DLTR
Get data for: DPZ
Get data for: EBAY
Get data for: ETSY
Get data for: EXPE
Get data for: F
Get data for: GPS
Get data for: GRMN
Get data for: GM
Get data for: GPC
Get data for: HBI
Get data for: HAS
Get data for: HLT
Get data for: HD
Get data for: LVS
Get data for: LEG
Get data for: LEN
Get data for: LKQ
Get data for: LOW
Get data for: MAR
Get data for: MCD
Get data for: MGM
Get data for: MHK
Get data for: NWL
Get data for: NKE
Get data for: NCLH
Get data for: NVR
Get data for: ORLY
Get data for: PENN
Get data for: POOL
Get data for: PHM
Get data for: PVH
Get data for: RL
Get data for: ROST
Get data for: RCL
Get data for: SBUX
Get data for: TPR
Get data for: TGT
Get data for: TSLA
Get data for: TJX


In [29]:
folder_utilities = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Utilities/")

In [30]:
get_sector_earnings_estimates(utilities_df, folder_utilities, key, horizon[2])

Get data for: AES
Get data for: LNT
Get data for: AEE
Get data for: AEP
Get data for: AWK
Get data for: ATO
Get data for: CNP
Get data for: CMS
Get data for: ED
Get data for: D
Get data for: DTE
Get data for: DUK
Get data for: EIX
Get data for: ETR
Get data for: EVRG
Get data for: ES
Get data for: EXC
Get data for: FE
Get data for: NEE
Get data for: NI
Get data for: NRG
Get data for: PNW
Get data for: PPL
Get data for: PEG
Get data for: SRE
Get data for: SO
Get data for: WEC
Get data for: XEL


In [32]:
folder_financials = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Financials/")

In [33]:
get_sector_earnings_estimates(financials_df, folder_financials, key, horizon[2])

Get data for: AFL
Get data for: ALL
Get data for: AXP
Get data for: AIG
Get data for: AMP
Get data for: AON
Get data for: AJG
Get data for: AIZ
Get data for: BAC
Get data for: BRK.B
Get data for: BLK
Get data for: BK
Get data for: BRO
Get data for: COF
Get data for: CBOE
Get data for: SCHW
Get data for: CB
Get data for: CINF
Get data for: C
Get data for: CFG
Get data for: CME
Get data for: CMA
Get data for: DFS
Get data for: RE
Get data for: FITB
Get data for: FRC
Get data for: BEN
Get data for: GL
Get data for: GS
Get data for: HBAN
Get data for: ICE
Get data for: IVZ
Get data for: JPM
Get data for: KEY
Get data for: LNC
Get data for: L
Get data for: MTB
Get data for: MKTX
Get data for: MMC
Get data for: MET
Get data for: MCO
Get data for: MS
Get data for: MSCI
Get data for: NDAQ
Get data for: NTRS
Get data for: PBCT
Get data for: PNC
Get data for: PFG
Get data for: PGR
Get data for: PRU
Get data for: RJF
Get data for: RF
Get data for: SPGI
Get data for: STT
Get data for: SIVB
...

In [34]:
folder_materials = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Materials/")

In [35]:
get_sector_earnings_estimates(materials_df, folder_materials, key, horizon[2])

Get data for: APD
Get data for: ALB
Get data for: AMCR
Get data for: AVY
Get data for: BALL
Get data for: CE
Get data for: CF
Get data for: CTVA
Get data for: DOW
Get data for: DD
Get data for: EMN
Get data for: ECL
Get data for: FMC
Get data for: FCX
Get data for: IFF
Get data for: IP
Get data for: LIN
Get data for: LYB
Get data for: MLM
Get data for: NEM
Get data for: NUE
Get data for: PKG
Get data for: PPG
Get data for: SEE
Get data for: SHW
Get data for: MOS
Get data for: VMC
Get data for: WRK


In [36]:
folder_real_estate = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Real_Estate/")

In [37]:
get_sector_earnings_estimates(real_estate_df, folder_real_estate, key, horizon[2])

Get data for: ARE
Get data for: AMT
Get data for: AVB
Get data for: BXP
Get data for: CBRE
Get data for: CCI
Get data for: DLR
Get data for: EQIX
Get data for: EQR
Get data for: ESS
Get data for: EXR
Get data for: FRT
Get data for: PEAK
Get data for: HST
Get data for: IRM
Get data for: KIM
Get data for: MAA
Get data for: PLD
Get data for: PSA
Get data for: O
Get data for: REG
Get data for: SBAC
Get data for: SPG
Get data for: UDR
Get data for: VTR
Get data for: VNO
Get data for: WELL
Get data for: WY


#### Merge data and get earnings estimates for a specific sector or industry 





In [38]:
# get list of energy sector tickers 
PATH = folder_energy
files = [x for x in listdir(PATH) if isfile(join(PATH, x))]
tickers = [os.path.splitext(x)[0] for x in files]
tickers

['APA',
 'BKR',
 'COP',
 'CTRA',
 'CVX',
 'DVN',
 'EOG',
 'FANG',
 'HAL',
 'HES',
 'KMI',
 'MPC',
 'MRO',
 'OKE',
 'OXY',
 'PSX',
 'PXD',
 'SLB',
 'VLO',
 'WMB',
 'XOM']
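
The list comprehension above picks up every file in the folder, so a stray non-CSV file would be treated as a ticker. A minimal alternative sketch with `pathlib` that keeps only `.csv` files; a temporary folder with empty files stands in for `folder_energy`:

```python
import tempfile
from pathlib import Path

# Temporary folder with a few empty files stands in for folder_energy
folder = Path(tempfile.mkdtemp())
for name in ["XOM.csv", "APA.csv", "notes.txt"]:
    (folder / name).touch()

# Keep only CSV files and use the file stem as the ticker
tickers = sorted(p.stem for p in folder.glob("*.csv"))
print(tickers)
```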

In [39]:
# adjustment to get dataframe from csv with a single input (ticker)
def get_df_from_file(ticker):
    try:
        df = pd.read_csv(PATH + ticker + '.csv')
    except FileNotFoundError:
        pass
        # print("File Doesn't Exist")
    else:
        return df

In [40]:
# merge DataFrames to get sector earnings estimates for Q1 2023 
def merge_df_by_column_name(column_name, *tickers):
    mult_df = pd.DataFrame()
    
    for x in tickers:
        df = get_df_from_file(x)
        if df is None: # skip tickers whose CSV file is missing
            continue
        # use mask to select required data
        mask = df['fiscalDateEnding'] == '2023-03-31'
        mult_df[x] = df.loc[mask, column_name]
        
    return mult_df
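
Assigning each selected row as a new column relies on pandas index alignment. It works when the 2023-03-31 row sits at the same index position in every file, and otherwise produces `NaN` for that ticker. The sketch below collects one scalar per ticker into a Series instead; `estimate_for_quarter` is a hypothetical helper, and the two frames reuse TRV and EOG estimates shown earlier, with the EOG rows shifted on purpose to show a differing row position.

```python
import pandas as pd

def estimate_for_quarter(df, fiscal_date, column_name="estimate"):
    """Return the first value of column_name for the given fiscalDateEnding, or NaN."""
    values = df.loc[df["fiscalDateEnding"] == fiscal_date, column_name]
    return values.iloc[0] if not values.empty else float("nan")

frames = {
    "TRV": pd.DataFrame({"fiscalDateEnding": ["2022-12-31", "2023-03-31"],
                         "estimate": [3.44, 3.83]}),
    # Same quarter, but at a different row position than in the TRV frame
    "EOG": pd.DataFrame({"fiscalDateEnding": ["2023-03-31", "2023-06-30"],
                         "estimate": [3.55, None]}),
}

# One value per ticker, independent of where the row sits in each file
estimates = pd.Series({t: estimate_for_quarter(df, "2023-03-31") for t, df in frames.items()})
print(estimates)
```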

In [41]:
# merge across "estimate" column and select fiscal date ending with 31 March 2023
mult_df = merge_df_by_column_name('estimate', *tickers)
mult_df

Unnamed: 0,APA,BKR,COP,CTRA,CVX,DVN,EOG,FANG,HAL,HES,...,MPC,MRO,OKE,OXY,PSX,PXD,SLB,VLO,WMB,XOM
1,1.77,0.2971,3.15,1.13,4.02,2.18,3.55,6.06,0.6412,2.19,...,3.27,1.02,1.12,1.86,2.99,6.44,,4.19,0.4848,2.85


In [42]:
# S&P500 energy sector earnings estimate for Q1 2023.
energy_estimates = mult_df.sum(axis=1) # sum estimates for all sector securities
print("$", energy_estimates[1])

$ 49.2131
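
`DataFrame.sum` skips missing values by default, so a ticker with no estimate (SLB in the table above) contributes nothing to the total. A minimal sketch of how to report how many tickers actually contributed; the frame is a three-column stand-in for `mult_df` using values from the table above:

```python
import numpy as np
import pandas as pd

# One-row frame shaped like mult_df, with a missing estimate for SLB
mult_df = pd.DataFrame({"EOG": [3.55], "SLB": [np.nan], "XOM": [2.85]}, index=[1])

total = mult_df.sum(axis=1)                 # NaN values are skipped by default
contributing = mult_df.notna().sum(axis=1)  # number of tickers with an estimate
missing = mult_df.columns[mult_df.isna().any()].tolist()

print(round(total.loc[1], 2), int(contributing.loc[1]), missing)
```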


#### Get Historical Earnings

The function below downloads historical reported earnings data (quarterly or annual).

In [27]:
earnings_not_downloaded = [] # list of data not downloaded
folder_1 = (r"C:/Users/User/Documents/PM Tools/Earnings Tracker/Reported Earnings/")
base_url = 'https://www.alphavantage.co/query?'
def get_historical_earnings(folder, api_key, ticker): 
    try:
        print("Get data for:", ticker)
        # EARNINGS is queried per symbol, so the ticker is always passed;
        # use the api_key argument rather than the global key
        url = f'{base_url}function=EARNINGS&symbol={ticker}&apikey={api_key}'
        response = requests.get(url)
            
        df = pd.DataFrame(response.json()['quarterlyEarnings']) # "quarterlyEarnings" or "annualEarnings"
        df.to_csv(folder + ticker + ".csv") # save ticker data from df to csv
        
    except Exception as ex:
        earnings_not_downloaded.append(ticker) #append list not downloaded
        print("Couldn't get data for:", ticker)  
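
The JSON payload may deliver numeric fields as strings (an assumption to verify on a real response), in which case converting types before analysis is useful. The sketch below uses a hand-written payload shaped like the columns of the saved file. The first two entries reuse CVX figures shown in this workbook, and the third entry is hypothetical, added only to show how a missing value is handled.

```python
import pandas as pd

payload = {
    "symbol": "CVX",
    "quarterlyEarnings": [
        {"fiscalDateEnding": "2022-09-30", "reportedDate": "2022-10-28",
         "reportedEPS": "5.56", "estimatedEPS": "4.81"},
        {"fiscalDateEnding": "2022-06-30", "reportedDate": "2022-07-29",
         "reportedEPS": "5.82", "estimatedEPS": "5.10"},
        # Hypothetical entry with a missing estimate encoded as text
        {"fiscalDateEnding": "1990-03-31", "reportedDate": "1990-04-20",
         "reportedEPS": "0.40", "estimatedEPS": "None"},
    ],
}

df = pd.DataFrame(payload["quarterlyEarnings"])

# Convert text to proper dtypes; unparseable entries become NaN
for col in ["reportedEPS", "estimatedEPS"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")
for col in ["fiscalDateEnding", "reportedDate"]:
    df[col] = pd.to_datetime(df[col])

print(df.dtypes)
```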

In [23]:
get_historical_earnings(folder_1, key, 'CVX')

Get data for: CVX


In [26]:
CVX = get_df_from_csv(folder_1, 'CVX')
CVX

Unnamed: 0.1,Unnamed: 0,fiscalDateEnding,reportedDate,reportedEPS,estimatedEPS,surprise,surprisePercentage
0,0,2022-09-30,2022-10-28,5.56,4.81,0.75,15.5925
1,1,2022-06-30,2022-07-29,5.82,5.10,0.72,14.1176
2,2,2022-03-31,2022-04-29,3.36,3.27,0.09,2.7523
3,3,2021-12-31,2022-01-28,2.56,3.12,-0.56,-17.9487
4,4,2021-09-30,2021-10-29,2.96,2.21,0.75,33.9367
...,...,...,...,...,...,...,...
102,102,1997-03-31,1997-04-23,0.62,0.55,0.07,12.7273
103,103,1996-12-31,1997-01-24,0.53,0.49,0.04,8.1633
104,104,1996-09-30,1996-10-21,0.50,0.48,0.02,4.1667
105,105,1996-06-30,1996-07-22,0.54,0.50,0.04,8.0000
