# Homework 1 - Stock Markets Analytics Zoomcamp

## Introduction and Data Source

### Question 1: [Macro] Average growth of GDP in 2023

**What is the average growth (in %) of GDP in 2023?**

Download the Real Gross Domestic Product time series (GDPC1) from FRED (https://fred.stlouisfed.org/series/GDPC1). Calculate the year-over-year (YoY) growth rate, that is, divide each quarter's value by the value from 4 quarters earlier and subtract 1. Find the average YoY growth in 2023 (the average of the 4 quarterly YoY numbers). Round to 1 digit after the decimal point: e.g., if you get 5.66% growth, you should answer 5.7.

In [3]:
import pandas as pd
import pandas_datareader.data as pdr
import datetime as dt

# Define start and end dates
start = dt.datetime(2022, 1, 1)
end = dt.datetime(2024, 1, 1)

# Download GDPC1 data from FRED
gdpc1 = pdr.DataReader("GDPC1", "fred", start, end)

# Calculate year-over-year growth rate
gdpc1_yoy = gdpc1.pct_change(4) * 100  # Multiply by 100 to get percentage change
gdpc1_yoy.columns = ['YoY_Growth']  # Rename column for clarity

# Extract the YoY growth rates for the four quarters of 2023
# (selecting the single column returns a Series, so .mean() below is a scalar)
gdpc1_yoy_2023 = gdpc1_yoy.loc['2023', 'YoY_Growth']

# Calculate average YoY growth in 2023
average_yoy_growth_2023 = gdpc1_yoy_2023.mean()

# Round to 1 decimal point
average_yoy_growth_2023_rounded = round(average_yoy_growth_2023, 1)

print("Average YoY Growth in 2023: {}%".format(average_yoy_growth_2023_rounded))

Average YoY Growth in 2023: 2.5%
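
As a sanity check on the `pct_change(4)` step, the sketch below applies the same YoY calculation to a small made-up quarterly series, so the result can be verified by hand. The values are illustrative assumptions, not GDPC1 data, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical quarterly levels (NOT real GDPC1 data): 2022Q1 through 2023Q4.
index = pd.date_range("2022-01-01", periods=8, freq="QS")
levels = pd.Series(
    [100.0, 101.0, 102.0, 103.0, 102.0, 103.02, 105.06, 106.09],
    index=index,
)

# YoY growth in percent: each quarter vs. the same quarter one year (4 rows) earlier.
yoy = levels.pct_change(4) * 100

# Keep only the 2023 quarters and average the four YoY figures.
yoy_2023 = yoy.loc["2023"]
avg_yoy_2023 = round(float(yoy_2023.mean()), 1)
print(avg_yoy_2023)
```

The first four rows of `yoy` are NaN because they have no value 4 quarters earlier, which is why the download window has to start at least one year before the period being averaged.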


### Question 2. [Macro] Inverse "Treasury Yield"

**Find the minimum value of (dgs10-dgs2) since the year 2000 (2000-01-01) and write it down as the answer, rounded to 1 digit after the decimal point.**

Download the DGS2 and DGS10 interest rate series (https://fred.stlouisfed.org/series/DGS2, https://fred.stlouisfed.org/series/DGS10). Join them into one dataframe on date (you might need to read about pandas.DataFrame.join()) and calculate the daily difference dgs10-dgs2.

(Additional: think about what an "inverted yield curve" means for the market and investors. Do you see the same thing in your country/market of interest? Do you think it can be a good predictive feature for the models?)

In [5]:
# Define start and end dates
start = "2000-01-01"
end = "2024-01-01"

# Download DGS2 and DGS10 data from FRED
dgs2 = pdr.DataReader("DGS2", "fred", start, end)
dgs10 = pdr.DataReader("DGS10", "fred", start, end)

# Join the two dataframes on their shared date index
# (the column names DGS2 and DGS10 do not overlap, so no suffixes are needed)
interest_rates = dgs2.join(dgs10, how="inner")

# Calculate the difference between DGS10 and DGS2 daily
interest_rates["DGS10_minus_DGS2"] = interest_rates["DGS10"] - interest_rates["DGS2"]

# Find the minimum value of (DGS10 - DGS2) since 2000
min_value_since_2000 = interest_rates["DGS10_minus_DGS2"].min()

# Round to 1 decimal point
min_value_rounded = round(min_value_since_2000, 1)

print("Minimum value of (DGS10 - DGS2) since 2000: {}%".format(min_value_rounded))

Minimum value of (DGS10 - DGS2) since 2000: -1.1%
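
For the "inverted yield curve" follow-up, one possible starting point is to flag the days on which the spread is negative. The sketch below does this on a tiny made-up table; the yields are hypothetical, not FRED data, and the `spread` and `inverted` column names are assumptions introduced for illustration.

```python
import pandas as pd

# Hypothetical yields in percent (NOT real FRED data), indexed by business day.
dates = pd.date_range("2023-07-03", periods=5, freq="B")
rates = pd.DataFrame(
    {"DGS2": [4.9, 4.8, 4.7, 4.2, 4.0], "DGS10": [3.8, 3.9, 4.0, 4.3, 4.4]},
    index=dates,
)

# Spread between the long and the short end of the curve.
rates["spread"] = rates["DGS10"] - rates["DGS2"]

# The curve is inverted on days when the 10-year yield is below the 2-year yield.
rates["inverted"] = rates["spread"] < 0

min_spread = round(float(rates["spread"].min()), 1)
min_date = rates["spread"].idxmin()
inverted_days = int(rates["inverted"].sum())
print(min_spread, min_date.date(), inverted_days)
```

A boolean column like `inverted` could then be tried as a candidate feature, which is one way to explore the question of predictive value.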



### Question 3. [Index] Which Index is better recently?

**Compare the S&P 500 and IPC Mexico indexes by their 5-year growth and write down the larger value as the answer (%)**

Download from Yahoo Finance the daily index prices for S&P 500 (^GSPC, https://finance.yahoo.com/quote/%5EGSPC/) and IPC Mexico (^MXX, https://finance.yahoo.com/quote/%5EMXX/). Compare the 5Y growth of both (between 2019-04-09 and 2024-04-09). Select the faster-growing index and write down its growth in % (closest integer %). E.g., if the end/start ratio was 2.0925 (a growth of 109.25%), you need to write down 109 as your answer.

(Additional: think of other indexes, download their stats, and compare the growth. Create 10Y and 20Y growth stats. What is the average yearly growth rate (CAGR) for each of the indexes you select?)

In [7]:
import yfinance as yf

# Define start and end dates for the 5-year period
start_date = "2019-04-09"
end_date = "2024-04-09"

# Download S&P 500 (^GSPC) and IPC Mexico (^MXX) index data from Yahoo Finance
# Note: yfinance treats `end` as exclusive, so the last row is the final session before end_date
sp500_data = yf.download("^GSPC", start=start_date, end=end_date)
ipc_data = yf.download("^MXX", start=start_date, end=end_date)

# Calculate the 5-year growth for each index
sp500_start_price = sp500_data['Adj Close'].iloc[0]
sp500_end_price = sp500_data['Adj Close'].iloc[-1]
sp500_growth = ((sp500_end_price - sp500_start_price) / sp500_start_price) * 100

ipc_start_price = ipc_data['Adj Close'].iloc[0]
ipc_end_price = ipc_data['Adj Close'].iloc[-1]
ipc_growth = ((ipc_end_price - ipc_start_price) / ipc_start_price) * 100

# Compare the growth values and select the index with the higher growth
if sp500_growth > ipc_growth:
    selected_index = "S&P 500"
    selected_growth = round(sp500_growth)  # the question asks for the closest integer %
else:
    selected_index = "IPC Mexico"
    selected_growth = round(ipc_growth)  # the question asks for the closest integer %

print("\nIndex with the highest growth: {}".format(selected_index))
print("Growth percentage: {}%".format(selected_growth))


Index with the highest growth: S&P 500
Growth percentage: 81%
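
For the CAGR follow-up, a minimal sketch is shown below. It uses the question's own illustrative end/start ratio of 2.0925 over 5 years rather than downloaded prices; the `cagr` helper is a hypothetical name introduced here, not part of any library.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# The question's illustrative ratio: end/start = 2.0925 over 5 years.
ratio = 2.0925
total_growth_pct = (ratio - 1.0) * 100
annual_pct = cagr(1.0, ratio, 5) * 100
print(round(total_growth_pct), round(annual_pct, 1))
```

Compounding the annual rate over the 5 years recovers the original ratio, which is a quick way to check the helper before applying it to 10Y or 20Y windows.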







### Question 4. [Stocks OHLCV] 52-weeks range ratio (2023) for the selected stocks

**Find the largest range ratio [=(max-min)/max] of Adj.Close prices in 2023**

Download the 2023 daily OHLCV data from Yahoo Finance for the top 6 stocks by earnings (https://companiesmarketcap.com/most-profitable-companies/): 2222.SR, BRK-B, AAPL, MSFT, GOOG, JPM.

Here is the example data you should see in Pandas for "2222.SR": https://finance.yahoo.com/quote/2222.SR/history

Calculate the maximum minus minimum "Adj.Close" price for each stock and divide it by the maximum "Adj.Close" value. Round the result to two decimal places (e.g., 0.1575 will be 0.16).

(Additional: why might this be important for your research?)

In [8]:
import yfinance as yf

# Define the list of stock symbols
stock_symbols = ['2222.SR', 'BRK-B', 'AAPL', 'MSFT', 'GOOG', 'JPM']

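# Download daily OHLCV data for each stock in 2023
# `end` is exclusive in yfinance, so '2024-01-01' keeps any session on 2023-12-31
# (2222.SR trades on the Saudi exchange, whose week runs Sunday to Thursday)
stock_data = {}
for symbol in stock_symbols:
    stock_data[symbol] = yf.download(symbol, start='2023-01-01', end='2024-01-01')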

# Calculate the maximum and minimum adjusted close prices for each stock in 2023
stock_max_min = {}
for symbol, data in stock_data.items():
    max_close = data['Adj Close'].max()
    min_close = data['Adj Close'].min()
    stock_max_min[symbol] = (max_close, min_close)

# Calculate the range ratio for each stock
range_ratios = {}
for symbol, (max_close, min_close) in stock_max_min.items():
    range_ratio = (max_close - min_close) / max_close
    range_ratios[symbol] = round(range_ratio, 2)

# Find the stock with the largest range ratio
largest_range_stock = max(range_ratios, key=range_ratios.get)
largest_range_ratio = range_ratios[largest_range_stock]

print("\nStock with the largest range ratio: {}".format(largest_range_stock))
print("\nLargest range ratio: {}".format(largest_range_ratio))


Stock with the largest range ratio: MSFT

Largest range ratio: 0.42
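
The same range ratio can also be computed for all tickers at once when the adjusted closes sit in one table with a column per ticker. The sketch below assumes such a layout and uses made-up prices and placeholder ticker names (`AAA`, `BBB`), not real quotes.

```python
import pandas as pd

# Hypothetical adjusted-close prices (NOT real quotes); one column per ticker.
prices = pd.DataFrame(
    {"AAA": [100.0, 120.0, 90.0, 110.0], "BBB": [50.0, 52.0, 48.0, 51.0]}
)

# Range ratio = (max - min) / max, computed column-wise for every ticker.
range_ratio = (prices.max() - prices.min()) / prices.max()

# Rank on the unrounded ratios, then round only the value that is reported.
top_ticker = range_ratio.idxmax()
top_ratio = round(float(range_ratio.max()), 2)
print(top_ticker, top_ratio)
```

Ranking before rounding avoids accidental ties between two stocks whose ratios differ only in the third decimal place.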







### Question 5. [Stocks] Dividend Yield

**Find the largest dividend yield for the same set of stocks**

Use the same list of companies (2222.SR, BRK-B, AAPL, MSFT, GOOG, JPM) and download all dividends paid in 2023. You can use the get_actions() method or the .dividends field in the yfinance library (https://github.com/ranaroussi/yfinance?tab=readme-ov-file#quick-start)

Sum up all dividends paid in 2023 per company and divide each value by the closing price (Adj.Close) on the last trading day of the year.

Find the maximum value in % and round to 1 digit after the decimal point. (E.g., if you obtained $1.25 in dividends paid and the year-end stock price is $100, the dividend yield is 1.25%, and your answer should be 1.3.)

In [9]:
import yfinance as yf

# Define the list of stock symbols
stock_symbols = ['2222.SR', 'BRK-B', 'AAPL', 'MSFT', 'GOOG', 'JPM']

# Initialize a dictionary to store dividend yields for each stock
dividend_yields = {}

# Loop through each stock symbol
for symbol in stock_symbols:
    # Download dividend data for the stock for the year 2023
    stock = yf.Ticker(symbol)
    dividends = stock.dividends

    # Sum up all dividends paid in 2023 per company
    total_dividends = dividends.loc['2023'].sum()

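    # Get the closing price (adjusted close) of the stock on the last trading day of the year
    # `end` is exclusive in yfinance, so this is the last session strictly before 2023-12-31;
    # only December is requested, since just the final row is used
    last_trading_day_data = yf.download(symbol, start='2023-12-01', end='2023-12-31')
    last_trading_day_close = last_trading_day_data['Adj Close'].iloc[-1]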

    # Calculate the dividend yield for the stock
    dividend_yield = (total_dividends / last_trading_day_close) * 100

    # Store the dividend yield in the dictionary
    dividend_yields[symbol] = dividend_yield

# Find the maximum dividend yield among all stocks
largest_dividend_yield = max(dividend_yields.values())

# Round the maximum dividend yield to 1 decimal place
largest_dividend_yield_rounded = round(largest_dividend_yield, 1)

print("\nLargest dividend yield: {}%".format(largest_dividend_yield_rounded))


Largest dividend yield: 2.8%
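
One detail worth checking against the question's own example: Python's built-in `round()` sends exact ties to the even digit, so a yield of exactly 1.25 becomes 1.2, whereas the example expects 1.3. The sketch below reproduces that example with a half-up rounding helper built on the standard `decimal` module; `dividend_yield_pct` and `round_half_up` are hypothetical helper names introduced here.

```python
from decimal import Decimal, ROUND_HALF_UP


def dividend_yield_pct(total_dividends: float, year_end_price: float) -> float:
    """Dividends paid over the year divided by the year-end price, in percent."""
    if year_end_price <= 0:
        raise ValueError("year_end_price must be positive")
    return total_dividends * 100 / year_end_price


def round_half_up(value: float, digits: int = 1) -> Decimal:
    """Round with ties going away from zero, unlike the built-in round()."""
    quantum = Decimal(1).scaleb(-digits)  # digits=1 gives Decimal("0.1")
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# The question's own example: $1.25 in dividends, $100 year-end price.
yield_pct = dividend_yield_pct(1.25, 100.0)
builtin_rounded = round(yield_pct, 1)       # exact ties go to the even digit
half_up_rounded = round_half_up(yield_pct)  # exact ties go up
print(yield_pct, builtin_rounded, half_up_rounded)
```

The difference only matters when the unrounded value lands exactly on a tie, so it is worth printing the unrounded yield before deciding which rounding to report.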





### Question 6. [Exploratory] Investigate new metrics

Free text answer

Download and explore a few additional metrics or time series that might be valuable for your project and write down why (briefly).


### Question 7. [Exploratory] Time-driven strategy description around earnings releases

Free text answer

Explore earnings dates for the whole month of April, e.g., using the Yahoo Finance earnings calendar (https://finance.yahoo.com/calendar/earnings?from=2024-04-21&to=2024-04-27&day=2024-04-23). Compare with the previous closed earnings (e.g., recent dates with full data: https://finance.yahoo.com/calendar/earnings?from=2024-04-07&to=2024-04-13&day=2024-04-08).

Describe an analytical strategy/idea (you're not required to implement it) to select a subset of companies of interest based on the future events data.
