# 1. Obtaining Data

## 1.1 Preparations

To obtain data with OpenBB, we need to register an account on OpenBB Hub, and it is best to generate a personal access token (PAT) as well. Suppose we save the token in the file `obbToken`:

In [8]:
from openbb import obb

# Read the token from the first line of the file, dropping the trailing newline.
with open("obbToken", "r") as f:
    token = f.readline().strip()

obb.account.login(pat=token)

obb.user.preferences.output_type = "dataframe"

## 1.2 Historical Data of Equity Prices

### 1.2.1 Yahoo! Finance

[Yahoo! Finance](https://finance.yahoo.com/) provides a wide range of financial data free of charge, including historical equity prices. The open-source Python library [`yfinance`](https://github.com/ranaroussi/yfinance) offers an easy way to obtain these data through Yahoo's publicly available APIs.

To download historical equity prices, the simplest method is the `download()` function of the `yfinance` library, which returns a DataFrame with a `DatetimeIndex` named "Date" and six columns ("Open", "High", "Low", "Close", "Adj Close", and "Volume"):

In [32]:
import yfinance as yf

spy_yfd = yf.download(
    tickers="SPY",
    start="2000-01-01",
    end="2024-08-31",
    progress=False
)

spy_yfd

Date,Open,High,Low,Close,Adj Close,Volume
2000-01-03,148.250000,148.250000,143.875000,145.437500,93.290169,8164300
2000-01-04,143.531250,144.062500,139.640625,139.750000,89.641937,8089800
2000-01-05,139.937500,141.531250,137.250000,140.000000,89.802345,12177900
2000-01-06,139.625000,141.500000,137.750000,137.750000,88.359062,6227200
2000-01-07,140.312500,145.750000,140.062500,145.750000,93.490623,8066500
...,...,...,...,...,...,...
2024-08-26,563.179993,563.909973,559.049988,560.789978,560.789978,35788600
2024-08-27,559.489990,562.059998,558.320007,561.559998,561.559998,32693900
2024-08-28,561.210022,561.650024,555.039978,558.299988,558.299988,41066000
2024-08-29,560.309998,563.679993,557.179993,558.349976,558.349976,38715200


It also supports multiple tickers, returning a DataFrame whose columns form a hierarchical index (a pandas `MultiIndex`):

In [36]:
tickerList = ["AAPL", "GOOG"]

data = yf.download(
    tickers=tickerList,
    start="2000-01-01",
    end="2024-08-31",
    progress=False
)

data

Price,Adj Close,Adj Close,Close,Close,High,High,Low,Low,Open,Open,Volume,Volume
Ticker,AAPL,GOOG,AAPL,GOOG,AAPL,GOOG,AAPL,GOOG,AAPL,GOOG,AAPL,GOOG
Date,,,,,,,,,,,,
2000-01-03 00:00:00+00:00,0.844004,,0.999442,,1.004464,,0.907924,,0.936384,,535796800,
2000-01-04 00:00:00+00:00,0.772846,,0.915179,,0.987723,,0.903460,,0.966518,,512377600,
2000-01-05 00:00:00+00:00,0.784155,,0.928571,,0.987165,,0.919643,,0.926339,,778321600,
2000-01-06 00:00:00+00:00,0.716296,,0.848214,,0.955357,,0.848214,,0.947545,,767972800,
2000-01-07 00:00:00+00:00,0.750226,,0.888393,,0.901786,,0.852679,,0.861607,,460734400,
...,...,...,...,...,...,...,...,...,...,...,...,...
2024-08-26 00:00:00+00:00,227.179993,167.929993,227.179993,167.929993,227.279999,169.380005,223.889999,166.320007,226.759995,168.154999,30602200,11990300.0
2024-08-27 00:00:00+00:00,228.029999,166.380005,228.029999,166.380005,228.850006,168.244995,224.889999,166.160004,226.000000,167.610001,35934600,13718200.0
2024-08-28 00:00:00+00:00,226.490005,164.500000,226.490005,164.500000,229.860001,167.389999,225.679993,163.279999,227.919998,166.779999,38052200,15208700.0
2024-08-29 00:00:00+00:00,229.789993,163.399994,229.789993,163.399994,232.919998,167.630005,228.880005,161.981995,230.100006,166.059998,51906300,17133800.0


To access the adjusted close price of AAPL, use `data['Adj Close']['AAPL']`.
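
The lookup above can be tried offline on a small synthetic frame. The sketch below assumes only that the columns are a two-level `MultiIndex` named `Price` and `Ticker`, as in the output shown; the frame `toy` and its values are made up for illustration and are not market data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded frame: two column levels
# (Price, Ticker) and a date index. The numbers are placeholders.
dates = pd.date_range("2024-08-26", periods=3, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["Adj Close", "Close"], ["AAPL", "GOOG"]], names=["Price", "Ticker"]
)
toy = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns
)

# Chained lookup: select the price field first, then the ticker.
chained = toy["Adj Close"]["AAPL"]

# Equivalent single lookup with a tuple key.
by_tuple = toy[("Adj Close", "AAPL")]

# Cross-section: every price field for one ticker.
aapl_all = toy.xs("AAPL", axis=1, level="Ticker")
```

The tuple form does the selection in one step, and `xs` is convenient when all fields of a single ticker are needed at once.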

To put the ticker at the top level of the column index instead, set the parameter `group_by="ticker"`; the same data is then accessed with `data['AAPL']['Adj Close']`:

In [37]:
data = yf.download(
    tickers=tickerList,
    start="2000-01-01",
    end="2024-08-31",
    group_by="ticker",
    progress=False
)

data

Ticker,AAPL,AAPL,AAPL,AAPL,AAPL,AAPL,GOOG,GOOG,GOOG,GOOG,GOOG,GOOG
Price,Open,High,Low,Close,Adj Close,Volume,Open,High,Low,Close,Adj Close,Volume
Date,,,,,,,,,,,,
2000-01-03 00:00:00+00:00,0.936384,1.004464,0.907924,0.999442,0.844004,535796800,,,,,,
2000-01-04 00:00:00+00:00,0.966518,0.987723,0.903460,0.915179,0.772846,512377600,,,,,,
2000-01-05 00:00:00+00:00,0.926339,0.987165,0.919643,0.928571,0.784155,778321600,,,,,,
2000-01-06 00:00:00+00:00,0.947545,0.955357,0.848214,0.848214,0.716296,767972800,,,,,,
2000-01-07 00:00:00+00:00,0.861607,0.901786,0.852679,0.888393,0.750226,460734400,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...
2024-08-26 00:00:00+00:00,226.759995,227.279999,223.889999,227.179993,227.179993,30602200,168.154999,169.380005,166.320007,167.929993,167.929993,11990300.0
2024-08-27 00:00:00+00:00,226.000000,228.850006,224.889999,228.029999,228.029999,35934600,167.610001,168.244995,166.160004,166.380005,166.380005,13718200.0
2024-08-28 00:00:00+00:00,227.919998,229.860001,225.679993,226.490005,226.490005,38052200,166.779999,167.389999,163.279999,164.500000,164.500000,15208700.0
2024-08-29 00:00:00+00:00,230.100006,232.919998,228.880005,229.789993,229.789993,51906300,166.059998,167.630005,161.981995,163.399994,163.399994,17133800.0
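
If a frame has already been downloaded in the price-first layout, it may be possible to switch to the ticker-first layout without downloading again. The following is a minimal sketch using pandas' `swaplevel` and `sort_index` on a synthetic frame; `by_price` and its values are placeholders, not market data.

```python
import numpy as np
import pandas as pd

# Synthetic frame in the price-first layout (Price, Ticker).
dates = pd.date_range("2024-08-26", periods=3, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["Adj Close", "Close"], ["AAPL", "GOOG"]], names=["Price", "Ticker"]
)
by_price = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4), index=dates, columns=columns
)

# Swap the two column levels, then sort so each ticker's fields are adjacent.
by_ticker = by_price.swaplevel(axis=1).sort_index(axis=1)

# The same series is now reached ticker-first.
aapl_adj = by_ticker["AAPL"]["Adj Close"]
```

Note that `sort_index` orders the price fields alphabetically within each ticker, which differs from the Open/High/Low/Close order shown in the output above.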


Alternatively, we can use the `obb.equity.price.historical()` function of the `openbb` library, with the parameter `provider="yfinance"`:

In [30]:
spy_obb = obb.equity.price.historical(
    "SPY",
    start_date="2000-01-01",
    end_date="2024-08-31",
    provider="yfinance"
)

spy_obb

date,open,high,low,close,volume,split_ratio,dividend,capital_gains
2000-01-03,148.250000,148.250000,143.875000,145.437500,8164300,0.0,0.0,0.0
2000-01-04,143.531250,144.062500,139.640625,139.750000,8089800,0.0,0.0,0.0
2000-01-05,139.937500,141.531250,137.250000,140.000000,12177900,0.0,0.0,0.0
2000-01-06,139.625000,141.500000,137.750000,137.750000,6227200,0.0,0.0,0.0
2000-01-07,140.312500,145.750000,140.062500,145.750000,8066500,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...
2024-08-26,563.179993,563.909973,559.049988,560.789978,35788600,0.0,0.0,0.0
2024-08-27,559.489990,562.059998,558.320007,561.559998,32693900,0.0,0.0,0.0
2024-08-28,561.210022,561.650024,555.039978,558.299988,41066000,0.0,0.0,0.0
2024-08-29,560.309998,563.679993,557.179993,558.349976,38715200,0.0,0.0,0.0


Note that the data downloaded with `yf.download()` has an "Adj Close" column that is absent from the data downloaded with `obb.equity.price.historical()`, while the three columns "split_ratio", "dividend", and "capital_gains" appear only in the latter. In addition, only the former uses capitalized column names.
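
To compare the two sources side by side, one option is to rename the `yfinance` columns to the lowercase style used by OpenBB. The sketch below builds a one-row frame from the 2024-08-26 SPY row shown earlier; the names `yf_like`, `normalized`, and `shared` are illustrative only.

```python
import pandas as pd

# One row copied from the yf.download() output above (SPY, 2024-08-26).
yf_like = pd.DataFrame(
    {
        "Open": [563.179993],
        "High": [563.909973],
        "Low": [559.049988],
        "Close": [560.789978],
        "Adj Close": [560.789978],
        "Volume": [35788600],
    },
    index=pd.DatetimeIndex(["2024-08-26"], name="Date"),
)

# Lowercase the names and replace spaces so they match the OpenBB style.
normalized = yf_like.rename(columns=lambda c: c.lower().replace(" ", "_"))
normalized.index.name = "date"

# Column names of the OpenBB output shown above.
obb_columns = [
    "open", "high", "low", "close", "volume",
    "split_ratio", "dividend", "capital_gains",
]

# Columns present in both sources, in yfinance order.
shared = [c for c in normalized.columns if c in obb_columns]
```

After renaming, `shared` contains the five columns the two outputs have in common, which is a convenient subset for checking that both sources agree.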

## 1.3 Fundamental Data of Corporations

Here is an example of fundamental metrics (valuation and growth ratios) for three stocks:

In [25]:
obb.equity.fundamental.metrics(
    "AAPL,MSFT,GOOG",
    provider="yfinance"
).T

,0,1,2
symbol,AAPL,MSFT,GOOG
market_cap,3387017396224.0,3043383836672.0,1944289017856.0
pe_ratio,33.907154,34.727737,22.788794
forward_pe,29.782085,26.883783,18.210104
peg_ratio,3.0,2.14,1.01
peg_ratio_ttm,2.1366,2.2084,1.0692
enterprise_to_ebitda,26.002,23.686,16.284
earnings_growth,0.111,0.097,0.314
earnings_growth_quarterly,0.079,0.097,0.286
revenue_per_share,24.957,32.986,26.353
