# **Module 1 Homework**
In this homework, we're going to download finance data from various sources and perform simple calculations and analysis.



### **Question 1:** [Macro] Average growth of GDP in 2023
What is the average growth (in %) of GDP in 2023?

Download the time series Real Gross Domestic Product (GDPC1) from FRED (https://fred.stlouisfed.org/series/GDPC1). Calculate the year-over-year (YoY) growth rate (that is, divide the current value by the value from 4 quarters earlier and subtract 1). Find the average YoY growth in 2023 (the average of the 4 quarterly YoY numbers). Round to 1 digit after the decimal point: e.g. if you get 5.66% growth, you should answer 5.7.

#### *Answer*:

To calculate the average growth of GDP in 2023, I'll follow these steps:

1. Download the Real Gross Domestic Product (GDPC1) timeseries data from FRED.
2. Calculate the year-over-year (YoY) growth rate for each quarter in 2023.
3. Find the average YoY growth rate for 2023.

In [116]:
import pandas_datareader as pdr

In [132]:
# Download GDPC1 from FRED
gdpc = pdr.DataReader("GDPC1", "fred", start="2022-01-01")

In [138]:
gdpc['gdpc_us_yoy'] = (gdpc["GDPC1"]/gdpc["GDPC1"].shift(4)-1) * 100
gdpc.tail()

DATE,GDPC1,gdpc_us_yoy
2022-10-01,21989.981,
2023-01-01,22112.329,1.717927
2023-04-01,22225.35,2.382468
2023-07-01,22490.692,2.926887
2023-10-01,22679.255,3.134491


In [139]:
gdpc_2023 = gdpc.loc["2023"]

In [140]:
# Average YoY growth rate for 2023
average_growth_2023 = gdpc_2023['gdpc_us_yoy'].mean()

In [141]:
print(f"The average growth of GDP in 2023 was {round(average_growth_2023,1)}%")

The average growth of GDP in 2023 was 2.5%
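
The YoY step can also be written with `pct_change`, which avoids the manual `shift`. This is a minimal sketch on made-up quarterly levels (not FRED data); the variable names are illustrative only.

```python
import pandas as pd

# Made-up quarterly levels covering two full years (NOT actual GDPC1 values).
quarters = pd.date_range("2022-01-01", periods=8, freq="QS")
levels = pd.Series(
    [100.0, 101.0, 102.0, 103.0, 102.0, 103.02, 105.06, 106.09],
    index=quarters,
)

# pct_change(4) compares each quarter with the same quarter one year earlier.
yoy_pct = levels.pct_change(periods=4) * 100

# Average of the four YoY readings that fall in 2023.
avg_2023 = yoy_pct.loc["2023"].mean()
print(avg_2023)
```

The first four values of `yoy_pct` are `NaN` because there is no quarter a year earlier to compare against, which is why the download has to start at least one year before the period of interest.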


### **Question 2**. [Macro] Inverse "Treasury Yield"
Find the min value of (dgs10-dgs2) since the year 2000 (2000-01-01) and write it down as an answer, rounded to 1 digit after the decimal point.

Download DGS2 and DGS10 interest rates series (https://fred.stlouisfed.org/series/DGS2, https://fred.stlouisfed.org/series/DGS10). Join them together to one dataframe on date (you might need to read about pandas.DataFrame.join()), calculate the difference dgs10-dgs2 daily.

(Additional: think about what the "inverted yield curve" means for the market and investors. Do you see the same thing in your country/market of interest? Do you think it can be a good predictive feature for the models?)

#### *Answer*

In [10]:
# Download DGS2 and DGS10 interest rates series
dgs2 = pdr.DataReader("DGS2", "fred", start="2000-01-01")
dgs10 = pdr.DataReader("DGS10", "fred", start="2000-01-01")

In [11]:
# Join the two dataframes on date
interest_rates = dgs10.join(dgs2, lsuffix="_10", rsuffix="_2")

In [12]:
# Calculate the difference (dgs10 - dgs2) daily
interest_rates['difference'] = interest_rates['DGS10'] - interest_rates['DGS2']

In [13]:
# Find the min value of the difference since year 2000
min_value = interest_rates['difference'].min()

In [14]:
# Round to 1 decimal place
min_value_rounded = round(min_value, 1)

print(f"The minimum value of (DGS10 - DGS2) since year 2000 is {min_value_rounded}%.")

The minimum value of (DGS10 - DGS2) since year 2000 is -1.1%.
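
Beyond the minimum itself, it can be useful to know when it occurred and how many days the curve was inverted. The sketch below uses made-up yields (not FRED data) and assumes that days with missing values in either series should be dropped before taking the difference.

```python
import pandas as pd

# Made-up daily yields; the second day is missing in both series (e.g. a holiday).
dates = pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05"])
dgs10 = pd.DataFrame({"DGS10": [4.0, None, 3.5, 4.2]}, index=dates)
dgs2 = pd.DataFrame({"DGS2": [3.8, None, 4.5, 4.1]}, index=dates)

# Join on the date index and keep only days where both yields are present.
rates = dgs10.join(dgs2).dropna()
rates["spread"] = rates["DGS10"] - rates["DGS2"]

min_spread = rates["spread"].min()                # most negative spread
min_date = rates["spread"].idxmin()               # date on which it occurred
inverted_days = int((rates["spread"] < 0).sum())  # days with 10Y below 2Y
print(min_spread, min_date.date(), inverted_days)
```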


### **Question 3.** [Index] Which Index is better recently?
Compare S&P 500 and IPC Mexico indexes by the 5 year growth and write down the largest value as an answer (%)

Download on Yahoo Finance two daily index prices for S&P 500 (^GSPC, https://finance.yahoo.com/quote/%5EGSPC/) and IPC Mexico (^MXX, https://finance.yahoo.com/quote/%5EMXX/). Compare 5Y growth for both (between 2019-04-09 and 2024-04-09). Select the higher growing index and write down the growth in % (closest integer %). E.g. if ratio end/start was 2.0925 (or growth of 109.25%), you need to write down 109 as your answer.

(Additional: think of other indexes and try to download their stats and compare the growth. Create 10Y and 20Y growth stats. What is the average yearly growth rate (CAGR) for each of the indexes you select?)

#### *Answer*

In [15]:
!pip install -qU yfinance

In [16]:
import yfinance as yf

# Define the start and end dates for the comparison
# (the question specifies 2019-04-09 to 2024-04-09; this run uses a window one week later)
start_date = "2019-04-16"
end_date = "2024-04-16"

In [23]:
# Download daily index prices for S&P 500 and IPC Mexico indexes
sp500_data = yf.download("^GSPC", start=start_date, end=end_date)
ipc_data = yf.download("^MXX", start=start_date, end=end_date)

[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed


In [24]:
# Calculate the 5-year growth for each index (first vs. last row, for every OHLCV column)
sp500_start_price = sp500_data.iloc[0]
sp500_end_price = sp500_data.iloc[-1]
sp500_growth = ((sp500_end_price / sp500_start_price) - 1) * 100

ipc_start_price = ipc_data.iloc[0]
ipc_end_price = ipc_data.iloc[-1]
ipc_growth = ((ipc_end_price / ipc_start_price) - 1) * 100


In [142]:
# Determine which index had the higher growth (compared on the "High" column)
if sp500_growth["High"] > ipc_growth["High"]:
    higher_growth_index = "S&P 500"
    growth_percent = round(sp500_growth["High"])
else:
    higher_growth_index = "IPC Mexico"
    growth_percent = round(ipc_growth["High"])

print(f"The index with the higher growth in the last 5 years is {higher_growth_index} with growth of {growth_percent}%.")


The index with the higher growth in the last 5 years is S&P 500 with growth of 77%.
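
The growth formula from the question (end/start ratio minus 1) can be isolated in a small helper and applied to a single price series. This is a sketch under the assumption that the closing price is the relevant column; the helper name `total_growth_pct` and the prices are made up to reproduce the 2.0925 ratio from the question's example.

```python
import pandas as pd


def total_growth_pct(close: pd.Series) -> float:
    """Percentage change from the first to the last non-missing observation."""
    close = close.dropna()
    return (close.iloc[-1] / close.iloc[0] - 1) * 100


# Made-up closing prices with an end/start ratio of 2.0925.
close = pd.Series(
    [2000.0, 2500.0, 4185.0],
    index=pd.to_datetime(["2019-04-09", "2021-04-09", "2024-04-09"]),
)

growth = total_growth_pct(close)
print(round(growth))
```

Passing one column per index keeps the comparison explicit about which price is being measured.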


Let me explore **S&P 500** and **NASDAQ** Composite:

In [27]:
import pandas as pd


# Define the indexes
indexes = ["^GSPC", "^IXIC"]

# Download historical data for the indexes
index_data = yf.download(indexes, start=start_date, end=end_date)
index_data.tail()

[*********************100%%**********************]  2 of 2 completed


Price,Adj Close,Adj Close,Close,Close,High,High,Low,Low,Open,Open,Volume,Volume
Ticker,^GSPC,^IXIC,^GSPC,^IXIC,^GSPC,^IXIC,^GSPC,^IXIC,^GSPC,^IXIC,^GSPC,^IXIC
Date,,,,,,,,,,,,
2024-04-09,5209.910156,16306.639648,5209.910156,16306.639648,5224.810059,16348.179688,5160.779785,16141.150391,5217.029785,16328.759766,3400680000,4869190000
2024-04-10,5160.640137,16170.360352,5160.640137,16170.360352,5178.430176,16200.099609,5138.700195,16092.019531,5167.879883,16104.009766,3845930000,5308250000
2024-04-11,5199.060059,16442.199219,5199.060059,16442.199219,5211.779785,16464.599609,5138.77002,16154.650391,5172.950195,16236.200195,3509380000,4714750000
2024-04-12,5123.410156,16175.089844,5123.410156,16175.089844,5175.029785,16341.459961,5107.939941,16125.330078,5171.509766,16293.030273,3963220000,4552740000
2024-04-15,5061.819824,15885.019531,5061.819824,15885.019531,5168.430176,16295.269531,5052.470215,15863.879883,5149.669922,16276.469727,3950210000,4910550000


In [28]:
# Calculate yearly returns for each index
yearly_returns = index_data['Adj Close'].resample('Y').ffill().pct_change()

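# Annualize the growth with 10- and 20-year exponents.
# Caveat: index_data spans only start_date to end_date (about 5 years), so these
# exponents do not match the price window; a true 10Y or 20Y CAGR needs 10 or 20
# years of prices.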
years = [10, 20]
cagr_results = {}

for index in indexes:
    cagr_values = []
    for year in years:
        # Calculate CAGR
        start_value = index_data.loc[index_data.index[0], ('Adj Close', index)]
        end_value = index_data.loc[index_data.index[-1], ('Adj Close', index)]
        cagr = ((end_value / start_value) ** (1 / year) - 1) * 100
        cagr_values.append(cagr)
    cagr_results[index] = cagr_values

# Create a DataFrame to display CAGR results
cagr_df = pd.DataFrame(cagr_results, index=years)

# Print the CAGR DataFrame
print("CAGR (%) for S&P 500 and NASDAQ Composite over the past 10 and 20 years:")
print(cagr_df)


CAGR (%) for S&P 500 and NASDAQ Composite over the past 10 and 20 years:
       ^GSPC     ^IXIC
10  5.702502  7.099768
20  2.811722  3.489018
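
A CAGR is only meaningful when the exponent matches the actual time span of the prices. One way to guard against a mismatch is to derive the number of years from the index itself. The sketch below uses made-up prices rather than Yahoo Finance data; the helper name `cagr_pct` and the 365.25-day year are assumptions.

```python
import pandas as pd


def cagr_pct(prices: pd.Series) -> float:
    """Compound annual growth rate in %, with the horizon taken from the index."""
    prices = prices.dropna()
    # Elapsed time between the first and last observation, in years.
    years = (prices.index[-1] - prices.index[0]).days / 365.25
    return ((prices.iloc[-1] / prices.iloc[0]) ** (1 / years) - 1) * 100


# Made-up series that doubles over roughly ten years.
prices = pd.Series(
    [100.0, 200.0],
    index=pd.to_datetime(["2014-04-15", "2024-04-15"]),
)

ten_year_cagr = cagr_pct(prices)
print(ten_year_cagr)
```

For a 10Y or 20Y figure, the price series passed in would need to start 10 or 20 years before its last date.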


### **Question 4.** [Stocks OHLCV] 52-weeks range ratio (2023) for the selected stocks
Find the largest range ratio [=(max-min)/max] of Adj.Close prices in 2023

Download the 2023 daily OHLCV data on Yahoo Finance for top5 stocks on earnings (https://companiesmarketcap.com/most-profitable-companies/): 2222.SR,BRK-B, AAPL, MSFT, GOOG, JPM.

Here is the example data you should see in Pandas for "2222.SR": https://finance.yahoo.com/quote/2222.SR/history

Calculate the maximum minus minimum "Adj.Close" price for each stock and divide it by the maximum "Adj.Close" value. Round the result to two decimal places (e.g. 0.1575 will be 0.16).

(Additional: why might this be important for your research?)

#### *Answer*

In [36]:
# Define the list of selected stocks
stocks = ['2222.SR', 'BRK-B', 'AAPL', 'MSFT', 'GOOG', 'JPM']

# Download the daily OHLCV data for each stock for the year 2023
start_date = '2023-01-01'
end_date = '2023-12-31'

In [41]:
# Create an empty DataFrame to store the data for all stocks
all_data = pd.DataFrame()

for stock in stocks:
    # Download data for each stock
    data = yf.download(stock, start=start_date, end=end_date)
    # Add a new column 'Stocks'
    data['Stocks'] = stock
    # Concatenate data to the all_data DataFrame
    all_data = pd.concat([all_data, data])

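# Calculate the range ratio (max - min) / max of Adj Close.
# Note: this uses the pooled column for all tickers, so every row gets the same value;
# the per-stock ratio the question asks for would need a groupby on 'Stocks'.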
all_data['Range Ratio'] = (all_data['Adj Close'].max() - all_data['Adj Close'].min()) / all_data['Adj Close'].max()

[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed


In [42]:
all_data.tail()

Date,Open,High,Low,Close,Adj Close,Volume,Stocks,Range Ratio
2023-12-22,167.5,168.229996,167.229996,167.399994,165.40921,6574900,JPM,0.928676
2023-12-26,167.460007,168.770004,167.179993,168.389999,166.387451,4683600,JPM,0.928676
2023-12-27,167.839996,169.470001,167.580002,169.399994,167.385437,6428600,JPM,0.928676
2023-12-28,169.350006,170.660004,169.0,170.300003,168.27475,6320100,JPM,0.928676
2023-12-29,170.0,170.690002,169.630005,170.100006,168.077133,6431800,JPM,0.928676


In [44]:
# Find the row with the largest range ratio among the selected stocks
largest_range_row = all_data.loc[all_data['Range Ratio'].idxmax()]


# Extract the stock symbol and largest range ratio
largest_range_stock = largest_range_row['Stocks']
largest_range_ratio = largest_range_row['Range Ratio']
largest_range_ratio_rounded = round(largest_range_ratio, 2)

# Print the result
print(f"The largest 52-weeks range ratio in 2023 among the selected stocks is {largest_range_ratio_rounded} belonging to {largest_range_stock}.")

The largest 52-weeks range ratio in 2023 among the selected stocks is 0.93 belonging to 2222.SR.
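
The question asks for the range ratio of each stock separately. As a hedged illustration on made-up prices (the tickers `AAA` and `BBB` are hypothetical), a per-ticker version could group by the `Stocks` column before taking the maximum and minimum:

```python
import pandas as pd

# Made-up adjusted closes for two hypothetical tickers.
prices = pd.DataFrame({
    "Stocks": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Adj Close": [80.0, 100.0, 90.0, 10.0, 40.0, 25.0],
})

# Max and min are taken within each ticker, not across the pooled column.
grouped = prices.groupby("Stocks")["Adj Close"]
range_ratio = (grouped.max() - grouped.min()) / grouped.max()

top_stock = range_ratio.idxmax()
top_ratio = round(range_ratio.max(), 2)
print(range_ratio)
print(top_stock, top_ratio)
```

Grouping first matters because the tickers trade at very different price levels, so a pooled maximum and minimum mostly reflect the gap between the most and least expensive stock.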


### **Question 5.** [Stocks] Dividend Yield
Find the largest dividend yield for the same set of stocks

Use the same list of companies (2222.SR,BRK-B, AAPL, MSFT, GOOG, JPM) and download all dividends paid in 2023. You can use get_actions() method or .dividends field in yfinance library (https://github.com/ranaroussi/yfinance?tab=readme-ov-file#quick-start)

Sum up all dividends paid in 2023 per company and divide each value by the closing price (Adj.Close) at the last trading day of the year.

Find the maximum value in % and round to 1 digit after the decimal point. (E.g., if you obtained $1.25 dividends paid and the end year stock price is $100, the dividend yield is 1.25% -- and your answer should be equal to 1.3)

#### *Answer*

In [60]:
# Define the list of stocks
stocks = ["2222.SR", "BRK-B", "AAPL", "MSFT", "GOOG", "JPM"]

# Download dividend data for each stock
dividends = {}
for stock in stocks:
    ticker = yf.Ticker(stock)
    # Get recent price history (history() defaults to the last month, not the whole year)
    stock_prices = ticker.history()
    div = ticker.dividends
    dividends[stock] = div

In [65]:
# Get the latest closing price (stock_prices holds only the last ticker from the loop, JPM)
closing_price = stock_prices['Close'].iloc[-1]
print(closing_price)

180.8000030517578


In [63]:
# Calculate dividend yield for each stock
dividend_yield = {}
for stock, div in dividends.items():
    try:
        # Calculate dividend yield
        total_dividends = div.sum()
        dividend_yield[stock] = (total_dividends / closing_price) * 100
    except KeyError:
        # Handle KeyError for missing data
        print(f"Data not available for {stock}")

In [64]:
# Find the maximum dividend yield
max_dividend_yield = max(dividend_yield.values())

# Round to 1 decimal place
max_dividend_yield_rounded = round(max_dividend_yield, 1)

max_dividend_yield_rounded

0.6
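
The question defines the dividend yield per company as the dividends paid within 2023 divided by that company's last close of 2023. A minimal sketch of that definition on made-up data follows; the helper name `dividend_yield_pct` is hypothetical, and real yfinance series may carry a timezone-aware index, which this example does not model.

```python
import pandas as pd


def dividend_yield_pct(dividends: pd.Series, close: pd.Series, year: str) -> float:
    """Dividends paid in `year` divided by the last close of `year`, in %."""
    paid = dividends.loc[year].sum()        # only payments dated within the year
    last_close = close.loc[year].iloc[-1]   # last trading day of that year
    return paid / last_close * 100


# Made-up dividend payments, including some outside 2023 that must be ignored.
dividends = pd.Series(
    [0.5, 0.25, 0.25, 0.25, 0.5, 0.6],
    index=pd.to_datetime([
        "2022-11-15", "2023-03-15", "2023-06-15",
        "2023-09-15", "2023-12-15", "2024-03-15",
    ]),
)

# Made-up closing prices around the turn of the year.
close = pd.Series(
    [99.0, 100.0, 101.0],
    index=pd.to_datetime(["2023-12-28", "2023-12-29", "2024-01-02"]),
)

yield_2023 = dividend_yield_pct(dividends, close, "2023")
print(yield_2023)
```

Calling the helper once per ticker, each with its own dividend and price series, keeps one company's price from being used as the denominator for another.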

### **Question 6**. [Exploratory] Investigate new metrics
Free text answer

Download and explore a few additional metrics or time series that might be valuable for your project and write down why (briefly).



#### *Answer*

Here are a few examples of these metrics and their potential value:

**Volatility (Volatility Index - VIX)**: Volatility measures the variation in asset prices over time. A high reading of a volatility index such as the VIX indicates market uncertainty and potential risk. Monitoring the VIX can help in assessing market sentiment and adjusting investment strategies accordingly.

**Leverage Ratios (Debt-to-Equity, Debt-to-Assets)**: These ratios evaluate a company's financial leverage and risk exposure. High leverage ratios may indicate higher financial risk, while lower ratios suggest better financial stability. Analyzing these ratios helps in understanding a company's capital structure and financial health.

**Price-to-Earnings Ratio (P/E Ratio)**: The P/E ratio compares a company's current stock price to its earnings per share (EPS). A high P/E ratio may indicate that a company's stock is overvalued, while a low P/E ratio may suggest undervaluation. Monitoring P/E ratios helps in identifying potential investment opportunities or overvalued stocks.

**Cash Flow Metrics (Operating Cash Flow, Free Cash Flow)**: Cash flow metrics assess a company's ability to generate cash from its core operations and available cash after capital expenditures. Positive cash flow indicates financial health and sustainability, while negative cash flow may raise concerns. Analyzing cash flow metrics provides insights into a company's liquidity and financial performance.

**Technical Indicators (Moving Averages, Relative Strength Index - RSI)**: Technical indicators use historical price and volume data to analyze market trends and momentum. Moving averages help in identifying price trends, while RSI indicates overbought or oversold conditions. Incorporating technical indicators into analysis complements fundamental analysis and aids in making informed trading decisions.

To try 2 of these metrics out:

In [67]:
# Download Volatility Index (VIX) data
vix_data = yf.download("^VIX", start="2023-01-01", end="2023-12-31")

# Print the VIX data
print(vix_data.head())

[*********************100%%**********************]  1 of 1 completed

                 Open   High        Low      Close  Adj Close  Volume
Date                                                                 
2023-01-03  23.090000  23.76  22.730000  22.900000  22.900000       0
2023-01-04  22.930000  23.27  21.940001  22.010000  22.010000       0
2023-01-05  22.200001  22.92  21.969999  22.459999  22.459999       0
2023-01-06  22.690001  22.90  21.000000  21.129999  21.129999       0
2023-01-09  21.750000  21.98  21.270000  21.969999  21.969999       0





In [68]:
# Download Price-to-Earnings Ratio (P/E Ratio) data for AAPL
stock_ticker = "AAPL"
stock = yf.Ticker(stock_ticker)
pe_ratio_data = stock.info["trailingPE"]

# Print the P/E Ratio data
print(f"Price-to-Earnings Ratio (P/E Ratio) for {stock_ticker}: {pe_ratio_data}")

Price-to-Earnings Ratio (P/E Ratio) for AAPL: 26.383179
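
The moving-average and RSI indicators mentioned above can be sketched with pandas rolling windows. This is an illustration on made-up prices: the RSI here uses plain rolling means of gains and losses, which is a simplification of the smoothed averages used in the classic definition, and the function names are hypothetical.

```python
import pandas as pd


def sma(close: pd.Series, window: int) -> pd.Series:
    """Simple moving average over `window` observations."""
    return close.rolling(window).mean()


def rsi_simple(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI variant based on simple rolling means of gains and losses."""
    delta = close.diff()
    gains = delta.clip(lower=0).rolling(window).mean()
    losses = (-delta.clip(upper=0)).rolling(window).mean()
    rs = gains / losses
    return 100 - 100 / (1 + rs)


# Made-up prices that alternate up and down by the same amount.
close = pd.Series([10.0, 11.0, 10.0, 11.0, 10.0])

sma_2 = sma(close, 2)
rsi_2 = rsi_simple(close, 2)
print(sma_2.iloc[-1], rsi_2.iloc[-1])
```

With equal average gains and losses the RSI sits at its midpoint of 50; values far above or below that are what the overbought and oversold readings refer to.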


### **Question 7**. [Exploratory] Time-driven strategy description around earnings releases
Free text answer

Explore earnings dates for the whole month of April - e.g. using YahooFinance earnings calendar (https://finance.yahoo.com/calendar/earnings?from=2024-04-21&to=2024-04-27&day=2024-04-23). Compare with the previous closed earnings (e.g., recent dates with full data https://finance.yahoo.com/calendar/earnings?from=2024-04-07&to=2024-04-13&day=2024-04-08).

Describe an analytical strategy/idea (you're not required to implement it) to select a subset of companies of interest based on the future events data.

#### *Answer*

I would prefer to scrape this information using Python.

In [79]:
import requests
from bs4 import BeautifulSoup

In [114]:
def get_earnings_calendar_data(url):
    """Scrape symbol, company and earnings period from an earnings calendar page."""
    # Send a GET request to the URL
    response = requests.get(url)

    # Check if the request was successful
    if response.status_code != 200:
        print('Failed to retrieve data from the webpage.')
        return []

    # Parse the HTML content of the webpage
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the table containing earnings data
    table = soup.find('table', {'class': 'W(100%)'})
    if table is None:
        # The page layout may have changed or the request may have been blocked
        print('Earnings table not found on the webpage.')
        return []

    # Extract data from the table
    # You'll need to inspect the HTML structure to extract the specific data you want
    earnings_data = []
    rows = table.find_all('tr')
    for row in rows[1:]:  # Skip the header row
        cells = row.find_all('td')
        if len(cells) < 3:
            continue  # Skip rows that do not have the expected columns
        symbol = cells[0].text.strip()
        company_name = cells[1].text.strip()
        earning_period = cells[2].text.strip()
        earnings_data.append({'Stock': symbol, 'Company': company_name, 'Earning Period': earning_period})
    return earnings_data

In [112]:
# URLs for the earnings calendars
april_earnings_url = 'https://finance.yahoo.com/calendar/earnings?from=2024-04-01&to=2024-04-30'
previous_closed_earnings_url = 'https://finance.yahoo.com/calendar/earnings?from=2024-03-25&to=2024-03-31'

In [115]:
# Scrape earnings data for April earnings
april_earnings_data = get_earnings_calendar_data(april_earnings_url)

# Scrape earnings data for previous closed earnings
previous_closed_earnings_data = get_earnings_calendar_data(previous_closed_earnings_url)

# Print the extracted data
print('Earnings Calendar for April:')
for entry in april_earnings_data:
    print(entry)

print('\nPrevious Closed Earnings:')
for entry in previous_closed_earnings_data:
    print(entry)

Earnings Calendar for April:
{'Stock': 'EFSH', 'Company': '1847 Holdings LLC', 'Earning Period': 'Q4 2023  Earnings Release'}
{'Stock': 'INLB', 'Company': 'Item 9 Labs Corp', 'Earning Period': 'Q4 2023  Earnings Release'}
{'Stock': 'INLB', 'Company': 'Item 9 Labs Corp', 'Earning Period': 'Q1 2024  Earnings Release'}
{'Stock': 'NMTRQ', 'Company': '9 Meters Biopharma Inc', 'Earning Period': 'Q4 2023  Earnings Release'}
{'Stock': 'ATCMF', 'Company': 'Atico Mining Corp', 'Earning Period': 'Q1 2024  Earnings Release'}
{'Stock': 'ARHTF', 'Company': 'ARHT Media Inc', 'Earning Period': 'Q4 2023  Earnings Release'}
{'Stock': 'DIT', 'Company': 'AMCON Distributing Co', 'Earning Period': 'Q2 2024  Earnings Release'}
{'Stock': 'BAC', 'Company': 'Bank of America Corp', 'Earning Period': 'Q1 2024  Earnings Release'}
{'Stock': 'BAC', 'Company': 'Bank of America Corp', 'Earning Period': 'Q1 2024  Earnings Call'}
{'Stock': 'DMKPQ', 'Company': 'Adamis Pharmaceuticals Corp', 'Earning Period': 'Q4 2023 DMK

An analytical strategy around earnings releases can involve several steps to select a subset of companies based on future events data. Here's a description of such a strategy:


**Analytical Approach**:

Compare the upcoming earnings dates with the historical earnings data to identify patterns and trends.


**Selection Criteria**:


Incorporate financial ratios like Price-to-Earnings (P/E) ratio, Price-to-Sales (P/S) ratio, and Debt-to-Equity (D/E) ratio to evaluate the financial health of companies.


**Risk Management**:

Implement risk management strategies by diversifying the portfolio across sectors and market capitalizations.
Set stop-loss levels and profit targets based on historical volatility and risk tolerance.


**Monitoring and Adjustment**:

Continuously monitor news and market sentiment related to selected companies.
Adjust the portfolio based on new earnings releases, market trends, and economic indicators.
Consider incorporating machine learning models or predictive analytics to enhance decision-making and automate the selection process.
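
As a rough illustration of the selection step, the sketch below filters a calendar table by earnings date and two ratio thresholds. Everything in it is hypothetical: the tickers, the ratio values, the column names other than `Stock`, and the cut-offs of 25 for P/E and 1.0 for D/E are placeholders, not recommendations.

```python
import pandas as pd

# Hypothetical calendar enriched with made-up fundamentals.
calendar = pd.DataFrame({
    "Stock": ["AAA", "BBB", "CCC", "DDD"],
    "Earnings Date": pd.to_datetime(
        ["2024-04-22", "2024-04-23", "2024-04-25", "2024-05-10"]
    ),
    "P/E": [12.0, 45.0, 18.0, 9.0],
    "D/E": [0.4, 0.3, 2.5, 0.2],
})

# Keep companies reporting inside the week of interest.
window_start = pd.Timestamp("2024-04-21")
window_end = pd.Timestamp("2024-04-27")
in_window = calendar["Earnings Date"].between(window_start, window_end)

# Placeholder screening thresholds (assumptions for illustration only).
reasonable = (calendar["P/E"] < 25) & (calendar["D/E"] < 1.0)

shortlist = calendar.loc[in_window & reasonable, "Stock"].tolist()
print(shortlist)
```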