# Resources: Course Software and Platform
We suggest installing Anaconda on your computer, since we will use this platform throughout the course. The same platform is used in the subsequent machine learning courses and is common in industry. A commonly used tool is Jupyter Notebook. You can find a lot of free code on GitHub, and many people now publish their work online. One example is
[Nobel Prize winner Thomas Sargent's website](https://lectures.quantecon.org/py/pandas.html)

## Anaconda
On the [Anaconda](https://www.anaconda.com/) platform
you can use Spyder, Jupyter Notebook, and, in the future, VS Code and other tools.
## Python
We will use Python for assignments. Here is a [Python tutorial](https://docs.python.org/3/tutorial/). Another reference is the book
[Python Crash Course](https://ehmatthes.github.io/pcc_2e/), along with the sample code on its website.
## Markdown 
Markdown is the common text formatting syntax in Jupyter Notebook. You can find more information below:
[Markdown Cheat Sheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
## LaTeX
LaTeX is used for most professional typesetting, especially for articles with many mathematical formulas. You can learn the [basics](https://www.latex-project.org/) and the [math formulas used in LaTeX](http://tug.ctan.org/info/short-math-guide/short-math-guide.pdf).

# Data
We can use Python to get data from various sources, such as Yahoo Finance, the Federal Reserve Bank of St. Louis (FRED), and Google Finance. Because of service agreements, data licenses, or traffic issues, it is common for data providers to stop offering this kind of programmatic access. You can usually still find the information on the provider's website.

There are different methods and APIs for this kind of task. Here I am using pandas-datareader.

## Useful Links
Here is a useful link:
[How to get data from Internet](https://s3.amazonaws.com/assets.datacamp.com/production/course_3882/slides/ch2.pdf).
We use pandas for data manipulation:
[Pandas Official Documents](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)

## Examples

### Example of using LaTeX for math typesetting

This is a simple equation, $x^2 + y^2 = z^2$, which you have all learned in high school.

This is the expression for the bivariate normal copula function:
\begin{equation}
C(u_{1}, u_{2}) = \Phi_{2}\left(\Phi^{-1}(u_{1}),\Phi^{-1}(u_{2}), \rho\right)
\end{equation}
where $\Phi_{2}$ is the bivariate standard normal distribution function with correlation $\rho$, and $\Phi^{-1}(\cdot)$ is the inverse of the univariate standard normal distribution function.
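
To make the copula formula concrete, here is a minimal sketch that evaluates $C(u_1, u_2)$ numerically using only the Python standard library. It relies on the identity $C(u_1,u_2)=\int_0^{u_1}\Phi\left(\frac{\Phi^{-1}(u_2)-\rho\,\Phi^{-1}(s)}{\sqrt{1-\rho^2}}\right)ds$ and a simple midpoint rule. The function name `gaussian_copula` and the `steps` parameter are illustrative choices, not part of the course material; in practice a library routine such as SciPy's `multivariate_normal.cdf` would be the more usual tool.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def gaussian_copula(u1: float, u2: float, rho: float, steps: int = 2000) -> float:
    """Approximate C(u1, u2) for 0 < u1, u2 < 1 and -1 < rho < 1.

    Illustrative sketch: integrates the conditional normal CDF over (0, u1)
    with the midpoint rule, so the result is only accurate to a few decimals.
    """
    if not (0.0 < u1 < 1.0 and 0.0 < u2 < 1.0):
        raise ValueError("u1 and u2 must lie strictly between 0 and 1")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")

    x2 = _STD_NORMAL.inv_cdf(u2)      # Phi^{-1}(u2)
    scale = sqrt(1.0 - rho * rho)     # conditional standard deviation
    width = u1 / steps                # width of each integration slice
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * width         # midpoint of slice i, always inside (0, 1)
        x1 = _STD_NORMAL.inv_cdf(s)   # Phi^{-1}(s)
        total += _STD_NORMAL.cdf((x2 - rho * x1) / scale)
    return total * width


# With rho = 0 the copula reduces to the product u1 * u2.
print(gaussian_copula(0.3, 0.7, 0.0))
print(gaussian_copula(0.5, 0.5, 0.5))
```

With $\rho = 0$ the two variables are independent, so the result should be close to $u_1 u_2$, which gives a quick sanity check on the implementation.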

### Example of using Python
Import the Python packages you need at the very top, before you use them.

In [4]:
# have packages imported for data
# from pandas_datareader import data

import matplotlib.pyplot as plt
import pandas_datareader.data as web
import pandas as pd
import numpy as np
import datetime as dt
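
If one of the imports above fails, it can help to check which packages are missing from the current environment before installing anything. The following is a small sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` list are illustrative and simply mirror the imports in the cell above.

```python
import sys
from importlib.util import find_spec

# Top-level package names taken from the import cell above.
REQUIRED = ["matplotlib", "pandas_datareader", "pandas", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("Python", sys.version.split()[0])
print("Missing packages:", missing_packages(REQUIRED) or "none")
```

Any package reported as missing can then be installed from the Anaconda prompt with `conda install` or `pip install`.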


### To get data from FRED (St. Louis Fed)
Here we retrieve the monthly CPI index from 2018-01-01 to 2024-09-01.

Source: fred (Federal Reserve Bank of St. Louis)

CPIAUCNS - CPI (all urban consumers, not seasonally adjusted)

DGS10 - 10-year constant maturity rate

DCOILWTICO - West Texas Intermediate oil price

[FRED Data Category List](https://fred.stlouisfed.org/categories)


### Example: Get data on CPI

In [5]:
sdt = dt.datetime(2018, 1, 1)
edt = dt.datetime(2024, 9, 1)
ticker = 'CPIAUCNS'  
source = 'fred'
cpi = web.DataReader(ticker, source, sdt, edt)
cpi.tail(24)

| DATE | CPIAUCNS |
|---|---|
| 2022-09-01 | 296.808 |
| 2022-10-01 | 298.012 |
| 2022-11-01 | 297.711 |
| 2022-12-01 | 296.797 |
| 2023-01-01 | 299.17 |
| 2023-02-01 | 300.84 |
| 2023-03-01 | 301.836 |
| 2023-04-01 | 303.363 |
| 2023-05-01 | 304.127 |
| 2023-06-01 | 305.109 |
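
A CPI level is usually turned into an inflation rate before it is analyzed. As a rough sketch, the snippet below computes the month-over-month percentage change with `pct_change`, using three CPIAUCNS values copied from the output above so that it runs without a network connection. The names `cpi_sample` and `MoM_pct` are made up for this illustration.

```python
import pandas as pd

# Three CPIAUCNS values copied from the output above (no download needed).
cpi_sample = pd.DataFrame(
    {"CPIAUCNS": [296.808, 298.012, 297.711]},
    index=pd.to_datetime(["2022-09-01", "2022-10-01", "2022-11-01"]),
)
cpi_sample.index.name = "DATE"

# Percentage change from the previous month; the first row has no predecessor,
# so its value is NaN.
cpi_sample["MoM_pct"] = cpi_sample["CPIAUCNS"].pct_change() * 100
print(cpi_sample.round(3))
```

On a full monthly series such as `cpi`, `pct_change(periods=12)` would give the year-over-year change instead.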


### Example: Create a yield curve using constant maturity yields
We use the following tenors: 1M, 3M, 6M, 1Y, 2Y, 3Y, 5Y, 7Y, 10Y, 20Y, 30Y

In [6]:
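# FRED series IDs for the constant maturity yields, ordered by tenor
ticker = ['DGS1MO', 'DGS3MO', 'DGS6MO','DGS1','DGS2', 'DGS3', 'DGS5','DGS7', 'DGS10','DGS20','DGS30']
sdt = dt.datetime(2000, 1, 1)
edt = dt.datetime(2024, 9, 22)
source = 'fred'
yieldcurve = pd.DataFrame(web.DataReader(ticker, source, sdt, edt))
# keep only the dates on which every tenor has a value
yieldcurve = yieldcurve.dropna()
# save a local copy so the data can be reused without downloading again
yieldcurve.to_csv('yieldcurvenona.csv')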

Inspect the last ten rows of the result:

In [7]:
yieldcurve.tail(10)

| DATE | DGS1MO | DGS3MO | DGS6MO | DGS1 | DGS2 | DGS3 | DGS5 | DGS7 | DGS10 | DGS20 | DGS30 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2024-09-09 | 5.25 | 5.11 | 4.68 | 4.12 | 3.68 | 3.54 | 3.49 | 3.58 | 3.7 | 4.08 | 4.0 |
| 2024-09-10 | 5.18 | 5.06 | 4.65 | 4.07 | 3.59 | 3.42 | 3.43 | 3.53 | 3.65 | 4.04 | 3.97 |
| 2024-09-11 | 5.21 | 5.1 | 4.72 | 4.12 | 3.62 | 3.45 | 3.45 | 3.54 | 3.65 | 4.03 | 3.96 |
| 2024-09-12 | 5.18 | 5.06 | 4.68 | 4.09 | 3.64 | 3.47 | 3.47 | 3.57 | 3.68 | 4.07 | 4.0 |
| 2024-09-13 | 5.15 | 4.97 | 4.6 | 4.0 | 3.57 | 3.42 | 3.43 | 3.53 | 3.66 | 4.05 | 3.98 |
| 2024-09-16 | 5.11 | 4.96 | 4.55 | 3.96 | 3.56 | 3.42 | 3.41 | 3.51 | 3.63 | 4.01 | 3.94 |
| 2024-09-17 | 5.05 | 4.95 | 4.55 | 3.99 | 3.59 | 3.45 | 3.44 | 3.53 | 3.65 | 4.02 | 3.96 |
| 2024-09-18 | 4.91 | 4.84 | 4.5 | 3.95 | 3.61 | 3.49 | 3.47 | 3.58 | 3.7 | 4.08 | 4.03 |
| 2024-09-19 | 4.89 | 4.8 | 4.46 | 3.93 | 3.59 | 3.47 | 3.49 | 3.6 | 3.73 | 4.11 | 4.06 |
| 2024-09-20 | 4.87 | 4.75 | 4.43 | 3.92 | 3.55 | 3.46 | 3.48 | 3.59 | 3.73 | 4.1 | 4.07 |
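
Each row of this table is one day's yield curve. The sketch below shows one possible way to turn a single row into a curve indexed by maturity in years and to compute the 10Y-2Y spread. The yields are copied from the 2024-09-20 row above; the mapping `TENOR_YEARS` follows the tenor list given earlier, and all variable names are illustrative.

```python
import pandas as pd

# FRED series ID -> maturity in years, following the tenor list above.
TENOR_YEARS = {
    "DGS1MO": 1 / 12, "DGS3MO": 0.25, "DGS6MO": 0.5, "DGS1": 1.0, "DGS2": 2.0,
    "DGS3": 3.0, "DGS5": 5.0, "DGS7": 7.0, "DGS10": 10.0, "DGS20": 20.0,
    "DGS30": 30.0,
}

# Yields in percent for 2024-09-20, copied from the last row of the output above.
last_row = pd.Series(
    [4.87, 4.75, 4.43, 3.92, 3.55, 3.46, 3.48, 3.59, 3.73, 4.10, 4.07],
    index=list(TENOR_YEARS),
    name="2024-09-20",
)

# Relabel the index by maturity so the Series reads as yield versus tenor.
curve = last_row.rename(index=TENOR_YEARS)
curve.index.name = "maturity_years"

# A common summary of the curve's slope: 10-year yield minus 2-year yield.
spread_10y_2y = curve.loc[10.0] - curve.loc[2.0]
print(curve)
print(f"10Y-2Y spread: {spread_10y_2y:.2f} percentage points")
```

With the downloaded data, `yieldcurve.iloc[-1]` could take the place of `last_row`, and `curve.plot(marker="o")` would draw the curve.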


In [8]:
# nyse = pd.read_excel('listings.xlsx', sheet_name='nyse', na_values='n/a')
# nyse = nyse.sort_values('Market Capitalization', ascending=False)
# nyse[['Stock Symbol', 'Company Name']].head(3)

### Example: Get data from Yahoo Finance and run an LSTM model for prediction

In [9]:
import requests
import os
import time

# To force New York time on Linux or macOS (time.tzset is not available on Windows):
# os.environ['TZ'] = 'America/New_York'
# time.tzset()
print('Run analysis @ {}'.format(time.ctime()))


ticker = 'SPY'
urx = "https://query1.finance.yahoo.com/v8/finance/chart/{}?region=US&lang=en-US&includePrePost=false&interval={}&period1={}&period2={}"
# period1 and period2 are Unix timestamps (seconds since 1970-01-01) covering the last 7 days
endDate = dt.datetime.today().date()
period2 = int((endDate - dt.date(1970, 1, 1)).total_seconds())
period1 = int((endDate - dt.timedelta(7) - dt.date(1970, 1, 1)).total_seconds())


url = urx.format(ticker, '1m', period1, period2)

# Yahoo tends to reject requests that lack a browser-like User-Agent with HTTP 403.
# This endpoint is unofficial, so the header below is a workaround that may stop working.
resp = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=30)
resp.raise_for_status()
jTmp = resp.json()['chart']['result'][0]
pbdatetime = [dt.datetime.fromtimestamp(int(x)) for x in jTmp['timestamp']]
df = pd.DataFrame(jTmp['indicators']['quote'][0])
df.loc[:, 'ticker'] = ticker

# use a numerical index instead of a time index so that multi-day plots have no overnight gaps
# df.set_index(pd.DatetimeIndex(pbdatetime), inplace=True)

df.dropna(inplace=True)
df = df[['open', 'high', 'low', 'close']]
title = '{} as of {}'.format(ticker, pbdatetime[-1])
fig, ax = plt.subplots(figsize=(12, 6))
df.plot(ax=ax, title=title)

# label the numeric x-axis positions with the matching timestamps
plt.locator_params(axis='x', nbins=20)  # x-axis
vn = range(len(pbdatetime))
ticks = [j for j in ax.get_xticks() if j in vn]  # keep only ticks that map to a data point
ax.set_xticks(ticks)
ax.set_xticklabels([pbdatetime[int(j)].strftime('%m/%d-%H:%M') for j in ticks])

plt.xticks(rotation=20, fontsize=10)
plt.show()

Run analysis @ Wed Sep 25 10:10:40 2024

Without the `User-Agent` header (for example, when the URL is passed directly to `pd.read_json(url)`), the request fails with:

HTTPError: HTTP Error 403: Forbidden
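
The `period1` and `period2` query parameters in the cell above are Unix timestamps, that is, seconds elapsed since 1970-01-01. The sketch below isolates that date arithmetic so it can be checked without calling Yahoo. The helper names `to_epoch_seconds` and `last_n_days_window` are hypothetical, and time zones are ignored, as in the original cell.

```python
import datetime as dt


def to_epoch_seconds(day: dt.date) -> int:
    """Seconds from 1970-01-01 to midnight of `day`, ignoring time zones."""
    return int((day - dt.date(1970, 1, 1)).total_seconds())


def last_n_days_window(end: dt.date, days: int = 7) -> tuple[int, int]:
    """Return (period1, period2) covering the `days` days before `end`."""
    return to_epoch_seconds(end - dt.timedelta(days=days)), to_epoch_seconds(end)


period1, period2 = last_n_days_window(dt.date(2024, 9, 25))
# A 7-day window always spans 7 * 86400 = 604800 seconds.
print(period1, period2, period2 - period1)
```

Keeping the window arithmetic in a small function makes it easy to change the look-back length without editing the URL template.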