## Module 1 Homework

In this homework, we're going to download financial data from various sources and run simple calculations and analysis.

### Question 1: [Macro] Average growth of GDP in 2023

**What is the average growth (in %) of GDP in 2023?**

* Download the timeseries Real Gross Domestic Product (GDPC1) from FRED (https://fred.stlouisfed.org/series/GDPC1). 
* Calculate the year-over-year (YoY) growth rate (that is, divide the current value by the value from 4 quarters earlier and subtract 1). 
* Find the average YoY growth in 2023 (the average of the 4 quarterly YoY numbers).
* Round to 1 digit after the decimal point: e.g. if you get 5.66% growth => you should answer 5.7
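
The YoY step can be sketched on a small made-up series before touching the real data. This is a minimal illustration, assuming a quarterly pandas Series; the name `levels` and its values are hypothetical and are not GDP figures.

```python
import pandas as pd

# Hypothetical quarterly levels (not real GDP figures), covering two full years.
levels = pd.Series(
    [100.0, 101.0, 102.0, 103.0, 104.0, 106.05, 108.12, 110.21],
    index=pd.date_range("2022-01-01", periods=8, freq="QS"),
    name="level",
)

# YoY growth: current quarter divided by the same quarter one year earlier, minus 1.
yoy = levels / levels.shift(4) - 1

# pct_change(periods=4) computes the same ratio in one call.
yoy_alt = levels.pct_change(periods=4)

# Average of the four 2023 YoY values, in percent, rounded to 1 decimal.
avg_2023_pct = round(yoy.loc["2023"].mean() * 100, 1)
print(avg_2023_pct)
```

The first four values of `yoy` are NaN because there is no quarter one year earlier to compare against.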


In [40]:
import numpy as np
import pandas as pd

import yfinance as yf
import pandas_datareader as pdr

import plotly.graph_objs as go
import plotly.express as px

import time
from datetime import date, datetime

In [2]:
end = date.today()
print(f'Year = {end.year}; month= {end.month}; day={end.day}')

start = date(year=end.year-70, month=end.month, day=end.day)
print(f'Period for indexes: {start} to {end} ')

Year = 2024; month= 4; day=21
Period for indexes: 1954-04-21 to 2024-04-21 


In [3]:
gdpc1 = pdr.DataReader("GDPC1", "fred", start=start)


In [4]:
gdpc1['gdpc1_us_yoy'] = gdpc1.GDPC1/gdpc1.GDPC1.shift(4)-1
gdpc1.head()

DATE,GDPC1,gdpc1_us_yoy
1954-07-01,2880.482,
1954-10-01,2936.852,
1955-01-01,3020.746,
1955-04-01,3069.91,
1955-07-01,3111.379,0.080159


In [5]:
start_2023 = date(year=2023, month=1, day=1)
end_2023 = date(year=2023, month=12, day=31)  # .loc slicing is inclusive, so stop inside 2023
gdpc1.loc[start_2023:end_2023]

DATE,GDPC1,gdpc1_us_yoy
2023-01-01,22112.329,0.017179
2023-04-01,22225.35,0.023825
2023-07-01,22490.692,0.029269
2023-10-01,22679.255,0.031345


In [6]:
print(f"{round(gdpc1.loc[start_2023:end_2023]['gdpc1_us_yoy'].mean()*100, 1)}")

2.5


### Question 2. [Macro] Inverse "Treasury Yield"

**Find the min value of (dgs10-dgs2) since the year 2000 (2000-01-01) and write it down as an answer, rounded to 1 digit after the decimal point.**


* Download the DGS2 and DGS10 interest rate series (https://fred.stlouisfed.org/series/DGS2, https://fred.stlouisfed.org/series/DGS10).
* Join them into one dataframe on date (you might need to read about pandas.DataFrame.join()).
* Calculate the daily difference dgs10-dgs2.

(Additional: think about what an "inverted yield curve" means for the market and investors. Do you see the same thing in your country/market of interest? Do you think it can be a good predictive feature for models?)
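
The join-then-subtract steps can be tried on toy data first. The sketch below assumes two single-column dataframes indexed by date, shaped like the FRED downloads; the yield values are hypothetical, and the two series deliberately cover slightly different dates.

```python
import pandas as pd

# Hypothetical yields in percent (not real FRED data); the date coverage differs on purpose.
dgs2 = pd.DataFrame(
    {"DGS2": [4.0, 4.5, 5.0]},
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05"]),
)
dgs10 = pd.DataFrame(
    {"DGS10": [4.5, 4.2, 4.0]},
    index=pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-06"]),
)

# DataFrame.join aligns rows on the index; how="inner" keeps only dates present in both.
joined = dgs2.join(dgs10, how="inner")
joined["spread"] = joined["DGS10"] - joined["DGS2"]

# Most negative spread and the date on which it occurred.
min_spread = round(joined["spread"].min(), 1)
min_date = joined["spread"].idxmin()
print(min_spread, min_date.date())
```

With the default `how="left"`, the unmatched date from `dgs2` would stay in the result with a NaN spread; `min()` skips NaN values, so the minimum is the same either way.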

In [7]:
start = date(year=2000, day=1, month=1)
dgs2 = pdr.DataReader("DGS2", "fred", start=start)
dgs10= pdr.DataReader("DGS10", "fred", start=start)

In [8]:
joined_dgs = dgs2.join(dgs10)

In [9]:
joined_dgs['delta'] = joined_dgs.DGS10-joined_dgs.DGS2

In [10]:
joined_dgs['delta'].min()

-1.0800000000000005

The minimum value is -1.1

### Question 3. [Index] Which Index is better recently?

**Compare S&P 500 and IPC Mexico indexes by the 5 year growth and write down the largest value as an answer (%)**

* Download daily prices from Yahoo Finance for two indexes: S&P 500 (^GSPC, https://finance.yahoo.com/quote/%5EGSPC/) and IPC Mexico (^MXX, https://finance.yahoo.com/quote/%5EMXX/). 
* Compare the 5Y growth of both (between 2019-04-09 and 2024-04-09). 
* Select the faster-growing index and write down its growth in % (closest integer %). 
* E.g. if ratio end/start was 2.0925 (or growth of 109.25%), you need to write down 109 as your answer.

(Additional: try downloading other indexes and comparing their growth. Create 10Y and 20Y growth stats. What is the average yearly growth rate (CAGR) for each index you select?)
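
The arithmetic behind total growth and CAGR can be sketched as follows. The helper names `total_growth_pct` and `cagr_pct` are hypothetical, and the prices are chosen only to reproduce the 2.0925 ratio from the example above, not taken from any index.

```python
def total_growth_pct(start_price: float, end_price: float) -> float:
    """Total growth over the whole period, in percent."""
    return (end_price / start_price - 1) * 100


def cagr_pct(start_price: float, end_price: float, years: float) -> float:
    """Compound annual growth rate in percent: the constant yearly rate
    that turns start_price into end_price after `years` years."""
    return ((end_price / start_price) ** (1 / years) - 1) * 100


# Hypothetical prices giving an end/start ratio of 2.0925.
start_price, end_price = 100.0, 209.25

growth = total_growth_pct(start_price, end_price)
annual = cagr_pct(start_price, end_price, years=5)
print(round(growth), round(annual, 1))
```

CAGR is lower than total growth divided by the number of years because each year's growth compounds on the previous year's level.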

In [11]:
start = date(year=2019, month=4, day=9)
end_1 = date(year=2024, month=4, day=9)  # last trading day of the comparison window
end = date(year=2024, month=4, day=10)   # yfinance treats `end` as exclusive

spc_index = yf.download(tickers="^GSPC",
                        interval="1d",
                        start=start,
                        end=end)

mxx_index = yf.download(tickers="^MXX",
                        interval="1d",
                        start=start,
                        end=end)




In [12]:
mxx_index.loc[str(end_1)]['Adj Close']/mxx_index.loc[str(start)]['Adj Close']

1.2750624912566744

In [13]:
growth_rate = spc_index.loc[str(end_1)]['Adj Close']/spc_index.loc[str(start)]['Adj Close']
growth_rate

1.8101279426847174

In [14]:
print(f'Larger growth rate is {round((growth_rate-1)*100,1)}')

Larger growth rate is 81.0


### Question 4. [Stocks OHLCV] 52-weeks range ratio (2023) for the selected stocks


**Find the largest range ratio [=(max-min)/max] of Adj.Close prices in 2023**


* Download the 2023 daily OHLCV data from Yahoo Finance for the top 6 stocks by earnings (https://companiesmarketcap.com/most-profitable-companies/): 2222.SR, BRK-B, AAPL, MSFT, GOOG, JPM.

* Here is the example data you should see in Pandas for "2222.SR": https://finance.yahoo.com/quote/2222.SR/history

* Calculate the maximum-minimum "Adj.Close" price for each stock and divide it by the maximum "Adj.Close" value. Round the result to two decimal places (e.g. 0.1575 will be 0.16)

(Additional: why might this be important for your research?)
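
The range ratio itself needs only the highest and lowest price. Below is a minimal sketch with plain Python lists; the tickers `AAA` and `BBB` and their prices are hypothetical, and `range_ratio` is an illustrative helper name.

```python
def range_ratio(prices):
    """Return (max - min) / max for a sequence of prices."""
    highest, lowest = max(prices), min(prices)
    return (highest - lowest) / highest


# Hypothetical adjusted close prices (not real quotes).
prices = {
    "AAA": [100.0, 80.0, 120.0, 90.0],
    "BBB": [50.0, 55.0, 45.0, 60.0],
}

ratios = {ticker: range_ratio(series) for ticker, series in prices.items()}

# Ticker with the widest range relative to its yearly high.
widest = max(ratios, key=ratios.get)
print(widest, round(ratios[widest], 2))
```

The ratio ignores the order of prices, so it does not say whether the low came before or after the high.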


In [15]:
start = date(year=2023, month=1, day=1)
end = date(year=2024, month=1, day=1)

stocks = ['2222.SR','BRK-B', 'AAPL', 'MSFT', 'GOOG', 'JPM']

stock_data = {stock: yf.download(tickers=stock,
                                 interval="1d",
                                 start=start,
                                 end=end)["Adj Close"]
              for stock in stocks}




In [16]:

def min_max_relation(prices: pd.Series) -> float:
    """Return the range ratio (max - min) / max of a price series."""
    return (prices.max() - prices.min()) / prices.max()

In [17]:
stock_data_min_max = {key: min_max_relation(value) for key, value in stock_data.items()}

In [18]:
stock_data_min_max

{'2222.SR': 0.21393065379760481,
 'BRK-B': 0.20775750091289963,
 'AAPL': 0.37244419224463476,
 'MSFT': 0.4242066914981641,
 'GOOG': 0.3924520921912013,
 'JPM': 0.28249927707093897}

In [20]:
stock_data_min_max = dict(sorted(stock_data_min_max.items(), key=lambda item: item[1]))

In [22]:
print(f' This is {round(max(stock_data_min_max.values()), 2)}')

 This is 0.42


Stocks with a "large" (max-min)/max ratio had their lowest price at the beginning of the year and "large" relative growth afterwards.

### Question 5. [Stocks] Dividend Yield
**Find the largest dividend yield for the same set of stocks**

Use the same list of companies (2222.SR, BRK-B, AAPL, MSFT, GOOG, JPM) and download all dividends paid in 2023.
You can use the `get_actions()` method or the `.dividends` field in the yfinance library (https://github.com/ranaroussi/yfinance?tab=readme-ov-file#quick-start)

Sum up all dividends paid in 2023 per company and divide each value by the closing price (Adj.Close) on the last trading day of the year.

Find the maximum value in % and round to 1 digit after the decimal point. (E.g., if you obtained $1.25 dividends paid and the end year stock price is $100, the dividend yield is 1.25% -- and your answer should be equal to 1.3)
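
The yield calculation and its rounding can be sketched on made-up numbers. The example assumes a dividend Series with a timezone-aware index, similar in shape to what yfinance returns; the dates, amounts, and price are hypothetical and chosen to reproduce the $1.25 / $100 case above.

```python
from decimal import Decimal, ROUND_HALF_UP

import pandas as pd

# Hypothetical dividend history with a timezone-aware index (not real payouts).
dividends = pd.Series(
    [0.25, 0.25, 0.5, 0.5, 0.75],
    index=pd.to_datetime(
        ["2022-11-10", "2023-02-10", "2023-05-12", "2023-08-11", "2024-02-09"]
    ).tz_localize("America/New_York"),
    name="Dividends",
)
year_end_price = 100.0  # hypothetical closing price on the last trading day

# Keep only payments whose local date falls in 2023, then sum them.
paid_2023 = dividends[dividends.index.year == 2023].sum()
yield_pct = float(100 * paid_2023 / year_end_price)

# Built-in round() uses round-half-to-even, so an exact 1.25 becomes 1.2.
rounded_builtin = round(yield_pct, 1)

# Decimal with ROUND_HALF_UP gives the 1.3 expected by the task's example.
rounded_half_up = Decimal(str(yield_pct)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
print(yield_pct, rounded_builtin, rounded_half_up)
```

The rounding difference only matters when the value sits exactly on a half step, but it is worth knowing when an answer is checked to one decimal place.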

In [55]:
dividends = { stock: yf.Ticker(stock).dividends for stock in stocks}
dividends

{'2222.SR': Date
 2020-03-19 00:00:00+03:00    0.060992
 2020-05-31 00:00:00+03:00    0.290744
 2020-08-12 00:00:00+03:00    0.290744
 2020-11-10 00:00:00+03:00    0.290744
 2021-03-23 00:00:00+03:00    0.290744
 2021-05-17 00:00:00+03:00    0.290744
 2021-08-16 00:00:00+03:00    0.290744
 2021-11-07 00:00:00+03:00    0.290744
 2022-05-24 00:00:00+03:00    0.290727
 2022-08-22 00:00:00+03:00    0.290727
 2022-11-09 00:00:00+03:00    0.290727
 2023-03-15 00:00:00+03:00    0.302364
 2023-05-17 00:00:00+03:00    0.302400
 2023-09-11 00:00:00+03:00    0.153000
 2023-11-15 00:00:00+03:00    0.153000
 2024-03-14 00:00:00+03:00    0.167000
 Name: Dividends, dtype: float64,
 'BRK-B': Series([], Name: Dividends, dtype: float64),
 'AAPL': Date
 1987-05-11 00:00:00-04:00    0.000536
 1987-08-10 00:00:00-04:00    0.000536
 1987-11-17 00:00:00-05:00    0.000714
 1988-02-12 00:00:00-05:00    0.000714
 1988-05-16 00:00:00-04:00    0.000714
                                ...   
 2023-02-10 00:00:00-0... [output truncated]

In [65]:
dividends['JPM']

Date
1984-03-09 00:00:00-05:00    0.196667
1984-06-11 00:00:00-04:00    0.196667
1984-09-10 00:00:00-04:00    0.196667
1984-12-10 00:00:00-05:00    0.196667
1985-03-11 00:00:00-05:00    0.206667
                               ...   
2023-04-05 00:00:00-04:00    1.000000
2023-07-05 00:00:00-04:00    1.000000
2023-10-05 00:00:00-04:00    1.050000
2024-01-04 00:00:00-05:00    1.050000
2024-04-04 00:00:00-04:00    1.150000
Name: Dividends, Length: 162, dtype: float64

In [77]:
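def filter_for_year(series: pd.Series) -> pd.Series:
    """Boolean mask for dividends dated within 2023, in exchange-local time."""
    # tz_localize(None) keeps local dates; tz_convert(None) would shift them to UTC first.
    local = series.index.to_series().dt.tz_localize(None)
    return local.between(datetime(year=2023, month=1, day=1),
                         datetime(year=2024, month=1, day=1), inclusive="left")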


In [78]:
dividends_2023 = {stock_name: series[filter_for_year(series)] for stock_name, series in dividends.items()}


In [83]:
dividends_2023_sum = {stock_name: dividend.sum() 
                      for stock_name, dividend in dividends_2023.items()}
dividends_2023_sum

{'2222.SR': 0.9107640000000001,
 'BRK-B': 0.0,
 'AAPL': 0.95,
 'MSFT': 2.79,
 'GOOG': 0.0,
 'JPM': 4.05}

In [87]:
last_adj_closed = {stock: stock_data[stock].iloc[-1] for stock in stocks}

In [91]:
ratio = {stock: round(100*dividends_2023_sum[stock]/last_adj_closed[stock], 1) for stock in stocks}

In [92]:
ratio

{'2222.SR': 2.8,
 'BRK-B': 0.0,
 'AAPL': 0.5,
 'MSFT': 0.7,
 'GOOG': 0.0,
 'JPM': 2.4}

### Question 6. [Exploratory] Investigate new metrics

**Free text answer**

Download and explore a few additional metrics or time series that might be valuable for your project and write down why (briefly).