# API

## What is an API?

An API (Application Programming Interface) is a bridge between software applications that lets them communicate and share data. It defines the methods and protocols through which software components interact, so developers can use existing functionality without writing all the code from scratch. APIs are central to modern technology because they allow apps, services, and systems to work together.

![](./figs/API.png)

## How Does It Work?

A user or programmer who needs specific information queries the API. The query is typically a URL made up of the following parts: the query domain, the desired output type (usually JSON), and parameters that filter the response. After the request is processed, the user or programmer usually receives plain text containing the requested information.

The following `url_example` shows what an API's URL might look like. This pattern isn't universal: it depends heavily on the organization, the programming languages it uses, the type of database employed, and its internal conventions.

```python
url_template = "{query domain}/output={output_type}?start_date={start date - param}&end_date={end date - param}"
url_example = "api.example.com/output=json?start_date=2019-01&end_date=2020-12"
```
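
A URL of this kind can also be assembled programmatically. The following is a minimal sketch, assuming the hypothetical host `api.example.com` and the parameter names from the example; `build_url` is an illustrative helper, not part of any library. It relies on the standard library's `urllib.parse.urlencode`, which joins parameters with `&` and escapes special characters.

```python
from urllib.parse import urlencode


def build_url(domain: str, output_type: str, **params: str) -> str:
    """Assemble a query URL from a domain, an output type, and filter parameters.

    The layout follows the illustrative pattern in this section; real APIs
    define their own layout.
    """
    query = urlencode(params)  # joins key=value pairs with "&"
    return f"{domain}/output={output_type}?{query}"


url_example = build_url(
    "api.example.com", "json", start_date="2019-01", end_date="2020-12"
)
print(url_example)
```

Keeping the parameters in a dictionary-like form makes it easy to add or drop filters without editing the string by hand.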

After the query, we obtain an output that must be processed according to the user's or programmer's needs. In this example, we need to convert the plain JSON text into a Python dictionary. To reach the data, we first navigate through the key `output_example` and then `data`. Afterward, we can convert this data into a pandas DataFrame using `pd.DataFrame()`.

```json
{
    "output_example": {
        "id_code": "sk_12km1",
        "results": 1,
        "data": {
            "date": ["2019-01", "2019-02", "2019-03", ...],
            "col_info_1": [21, 12, 34],
            ...
        }
    }
}
```
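
The navigation described above can be sketched with the standard library alone. The response text below is an assumption: it is shaped like the example but trimmed (no `...` placeholders) so that it is valid JSON.

```python
import json

# Illustrative response text, shaped like the example but trimmed to valid JSON.
raw_text = """
{
    "output_example": {
        "id_code": "sk_12km1",
        "results": 1,
        "data": {
            "date": ["2019-01", "2019-02", "2019-03"],
            "col_info_1": [21, 12, 34]
        }
    }
}
"""

response = json.loads(raw_text)  # plain JSON text -> Python dict
data = response["output_example"]["data"]  # navigate the nested keys

# Turn the column-oriented dict into one record per date.
rows = [
    {"date": d, "col_info_1": v}
    for d, v in zip(data["date"], data["col_info_1"])
]
print(rows[0])
```

Because `data` maps column names to equal-length lists, passing it to `pd.DataFrame(data)` would produce the same table in one step.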

## How do you make a request in Python?

In [1]:
# !pip install pandas requests
import pandas as pd, requests
import warnings
warnings.filterwarnings('ignore')

For this example, we'll use the BCRP API to demonstrate how the query is made and how the information is prepared. We'll use the series code `PN00184MM` as the identifier; as the response further down shows, this code corresponds to the series 'Liquidez de las sociedades creadoras de depósito (fin de periodo)'.


![](figs/1_example.png)

In [3]:
query_domain ="https://estadisticas.bcrp.gob.pe/estadisticas/series/api"
id_query = '/PN00184MM'
output_format = "/json"
url_1 = query_domain + id_query + output_format
print(url_1)

https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN00184MM/json


To perform the request, we use the `requests` module: `requests.get` queries the API, and the response's `.json()` method parses the content into a `dict`.

In [4]:
query = requests.get(url_1).json()
query

{'config': {'title': 'Liquidez de las sociedades creadoras de depósito (fin de periodo)',
  'series': [{'name': 'Liquidez de las sociedades creadoras de depósito (fin de periodo) - Cuasidinero MN (millones S/)',
    'dec': '0'}]},
 'periods': [{'name': 'Dic.2021', 'values': ['166829.834']},
  {'name': 'Ene.2022', 'values': ['164849.446']},
  {'name': 'Feb.2022', 'values': ['164073.937']},
  {'name': 'Mar.2022', 'values': ['164526.415']},
  {'name': 'Abr.2022', 'values': ['163792.147']},
  {'name': 'May.2022', 'values': ['161961.611']},
  {'name': 'Jun.2022', 'values': ['161883.542']},
  {'name': 'Jul.2022', 'values': ['168244.07']},
  {'name': 'Ago.2022', 'values': ['168391.996']},
  {'name': 'Sep.2022', 'values': ['171581.032']},
  {'name': 'Oct.2022', 'values': ['172269.681']},
  {'name': 'Nov.2022', 'values': ['171273.957']},
  {'name': 'Dic.2022', 'values': ['176440.532985']},
  {'name': 'Ene.2023', 'values': ['174845.280621']},
  {'name': 'Feb.2023', 'values': ['173178.233796']},


The dictionary has two main keys. `config` holds the metadata: `title` (the title of the series) and `series`, which contains further details. `periods` holds the data itself: each entry has a `name`, representing the date, and `values`, a list of strings containing the value for that date.


First, we'll demonstrate how to extract the relevant information to create a dataset:

- The title, found within `config` under the key `title`.
- The period, found within `periods`: select an entry by index and read its `name` key.
- The value, also found within `periods`: select an entry by index and read the first element under its `values` key.

In [5]:
title: str = query['config']['title']
index_0 = 0  # position of the first period in the list
date_0: str = query['periods'][index_0]['name']
value_0: str = query['periods'][index_0]['values'][0]

print(
f"""Title: {title}
Values index 0:
    - Date: {date_0}
    - Value: {value_0}
"""
)

Title: Liquidez de las sociedades creadoras de depósito (fin de periodo)
Values index 0:
    - Date: Dic.2021
    - Value: 166829.834



Taking the previous information into account, we can structure the dataset into two columns, "Date" and "Value", where the "Value" column needs to be in float format. With pandas, we proceed as follows:

In [6]:
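date_n = [x['name'] for x in query['periods']]
value_n = [x['values'][0] for x in query['periods']]
data_1 = pd.DataFrame(
    {
        "Date": date_n,
        # convert the value strings to floats; non-numeric entries become NaN
        "Value": pd.to_numeric(value_n, errors="coerce")
    }
)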
print("n_rows, n_cols = ", data_1.shape)
data_1.head()

n_rows, n_cols =  (24, 2)


| | Date | Value |
|---|---|---|
| 0 | Dic.2021 | 166829.834 |
| 1 | Ene.2022 | 164849.446 |
| 2 | Feb.2022 | 164073.937 |
| 3 | Mar.2022 | 164526.415 |
| 4 | Abr.2022 | 163792.147 |
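
The period labels are still strings with Spanish month abbreviations, which sort alphabetically rather than chronologically. The sketch below shows one possible way to convert them, and the value strings, into typed Python objects. It assumes monthly labels of the form `Mon.YYYY`, as in the output above; `parse_period` and `parse_value` are illustrative helpers, not part of the BCRP API or pandas.

```python
from datetime import date
from typing import Optional

# Spanish month abbreviations as they appear in the responses shown in this section.
MONTHS = {
    "Ene": 1, "Feb": 2, "Mar": 3, "Abr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Ago": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dic": 12,
}


def parse_period(name: str) -> date:
    """Convert a label such as 'Dic.2021' into the first day of that month."""
    month_abbr, year = name.split(".")
    return date(int(year), MONTHS[month_abbr], 1)


def parse_value(text: str) -> Optional[float]:
    """Convert a value string to float; return None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


# Two entries copied from the response above.
periods = [
    {"name": "Dic.2021", "values": ["166829.834"]},
    {"name": "Ene.2022", "values": ["164849.446"]},
]
rows = [(parse_period(p["name"]), parse_value(p["values"][0])) for p in periods]
print(rows)
```

With real dates in place, the rows can be sorted, filtered by range, or resampled without relying on the original string order.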


## Banco Central de Reserva del Peru (BCRP) - API

`https://estadisticas.bcrp.gob.pe/estadisticas/series/api/[códigos de series]/[formato de salida]/[periodo inicial]/[periodo final]/[idioma]`

Source: [API](https://estadisticas.bcrp.gob.pe/estadisticas/series/ayuda/api)

Components:

- **[códigos de series]**: The series codes that uniquely identify specific economic or financial indicators in the BCRP's database, such as `PN00184MM`. Each code identifies a distinct series, for example an inflation rate, a GDP figure, or an exchange rate.

- **[formato de salida]** (html, xls, xml, json, txt, csv): The format in which the API returns the data, for example `csv` for comma-separated values, `json` for JavaScript Object Notation, or `xml` for Extensible Markup Language.

- **[periodo inicial]**: The start period for the data retrieval, that is, the date from which the data should begin (for example `2005-01`).

- **[periodo final]**: The end period for the data retrieval, that is, the date at which the data should end (for example `2015-02`).

- **[idioma]** (default = esp): The language in which the metadata or additional information accompanying the data is presented.
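
Since the components are positional path segments, their order matters. The sketch below assembles the URL from the components listed above. It assumes that an optional component can only be supplied when every component before it is present (otherwise it would land in the wrong position); `bcrp_url` is an illustrative helper, not part of an official package.

```python
from typing import Optional

BCRP_API = "https://estadisticas.bcrp.gob.pe/estadisticas/series/api"


def bcrp_url(
    series_code: str,
    output_format: str = "json",
    start: Optional[str] = None,
    end: Optional[str] = None,
    language: Optional[str] = None,
) -> str:
    """Join the URL components in the positional order shown in the template."""
    if end is not None and start is None:
        raise ValueError("an end period requires a start period")
    if language is not None and (start is None or end is None):
        raise ValueError("a language requires both start and end periods")
    parts = [BCRP_API, series_code.strip().upper(), output_format]
    # Optional components are appended only when present, preserving their order.
    parts += [p for p in (start, end, language) if p is not None]
    return "/".join(parts)


print(bcrp_url("PN00184MM"))
print(bcrp_url("RD38225BM", start="2005-01", end="2015-02"))
```

Validating the combination up front turns a silently wrong query into an explicit error.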

If we wrap the previous steps in a function that takes these components as parameters, we get a reusable helper for downloading any series.

In [6]:

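def bcrp_data(
    id_query: str,
    start_date=None,
    end_date=None,
    output_format: str = "json",
    query_domain: str = "https://estadisticas.bcrp.gob.pe/estadisticas/series/api",
):
    """Download a BCRP series and return it as a DataFrame with Date and Value columns."""
    # The URL components are positional, so an end date only makes sense after a start date.
    if end_date is not None and start_date is None:
        raise ValueError("end_date requires start_date")
    url = query_domain + "/" + id_query.strip().upper() + "/" + output_format
    if start_date is not None:
        url = url + "/" + start_date
    if end_date is not None:
        url = url + "/" + end_date

    q = requests.get(url).json()
    print(url, q['config']['title'])
    date_n = [x['name'] for x in q['periods']]
    value_n = [x['values'][0] for x in q['periods']]
    data = pd.DataFrame(
        {
            "Date": date_n,
            # convert the value strings to floats; non-numeric entries become NaN
            "Value": pd.to_numeric(value_n, errors="coerce")
        }
    )
    return data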

bcrp_data("PN00184MM").head()


https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN00184MM/json Liquidez de las sociedades creadoras de depósito (fin de periodo)


| | Date | Value |
|---|---|---|
| 0 | Dic.2021 | 166829.834 |
| 1 | Ene.2022 | 164849.446 |
| 2 | Feb.2022 | 164073.937 |
| 3 | Mar.2022 | 164526.415 |
| 4 | Abr.2022 | 163792.147 |


In [7]:
bcrp_data("   RD38225BM", start_date="2005-01", end_date="2015-02")

https://estadisticas.bcrp.gob.pe/estadisticas/series/api/RD38225BM/json/2005-01/2015-02 Exportaciones por grupo de productos de Cajamarca (Valores FOB en millones de US$)


| | Date | Value |
|---|---|---|
| 0 | Ene.2005 | 82.33294099 |
| 1 | Feb.2005 | 134.088619 |
| 2 | Mar.2005 | 121.104694 |
| 3 | Abr.2005 | 94.074486 |
| 4 | May.2005 | 105.66805 |
| ... | ... | ... |
| 117 | Oct.2014 | 146.56859541 |
| 118 | Nov.2014 | 187.02393129 |
| 119 | Dic.2014 | 187.44260031 |
| 120 | Ene.2015 | 146.13384801 |


## Third-party software - official software

Websites or databases that receive constant requests typically offer packages in various programming languages. These packages wrap the API in ready-made functions, which makes requests quicker to write and is often more computationally efficient than hand-built queries; in the case of the BCRP API, for instance, such internal functions speed up working with the API.

Additionally, many of these APIs and databases require a credential known as an API key, which lets the provider regulate query traffic and collect usage statistics. API keys are typically not bundled with the package's code: you need to consult the documentation, request your own key, and make it available as a variable in the execution environment shared by your project and the third-party software. Open-source projects rarely require API keys, yet it's advisable to authenticate the application before using it, especially when dealing with sensitive data.
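
One common way to place a key in the execution environment is an environment variable. The sketch below is a minimal illustration of that idea; the variable name `EXAMPLE_API_KEY` and the helper `load_api_key` are hypothetical, and each provider or package documents the name it actually expects.

```python
import os


def load_api_key(var_name: str = "EXAMPLE_API_KEY") -> str:
    """Read an API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# For demonstration only: in practice the variable is set outside the code,
# for example in the shell or in an untracked configuration file.
os.environ["EXAMPLE_API_KEY"] = "not-a-real-key"
api_key = load_api_key()
print("key loaded:", bool(api_key))
```

Reading the key at run time keeps it out of notebooks and version control, so the code can be shared without exposing the credential.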

## Federal Reserve Economic Data (FRED) - API

To access the FRED API, we first need an account and an API key:

- Go to https://fred.stlouisfed.org/
- Sign up.
- Open "My Account".
- Click on "API Keys".
- Fill out the form and accept the terms.
- Save the API key for future use.

Steps:

Create account

![](figs/1_fred_createaccount.png)

Create API key

![](figs/2_fred_api_key.png)

![](figs/3_fred_api_key_button.png)

![](figs/4_fred_api_key_request.png)

Save API key

![](figs/5_fred_api_key_result.png)

### Documentation [link](https://fred.stlouisfed.org/docs/api/fred/)

API

- Query domain: `https://api.stlouisfed.org`
- URL with API key: `{query_domain}/{endpoint}?{parameters}&api_key={api_key}`

Endpoints (queries)

- Series
    - fred/series - Get an economic data series.
    - fred/series/categories - Get the categories for an economic data series.
    - fred/series/observations - Get the observations or data values for an economic data series.
    - fred/series/release - Get the release for an economic data series.
    - fred/series/search - Get economic data series that match keywords.
    - fred/series/search/tags - Get the tags for a series search.
    - fred/series/search/related_tags - Get the related tags for a series search.
    - fred/series/tags - Get the tags for an economic data series.
    - fred/series/updates - Get economic data series sorted by when observations were updated on the FRED® server.
    - fred/series/vintagedates - Get the dates in history when a series' data values were revised or new data values were released.
- Sources
    - fred/sources - Get all sources of economic data.
    - fred/source - Get a source of economic data.
    - fred/source/releases - Get the releases for a source.
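
A request URL for any of the endpoints listed above can be sketched as follows. `fred_url` is an illustrative helper and `YOUR_API_KEY` is a placeholder; `series_id` and `file_type` are request parameters described in the FRED documentation, and this sketch always requests JSON.

```python
from urllib.parse import urlencode

FRED_DOMAIN = "https://api.stlouisfed.org"


def fred_url(endpoint: str, api_key: str, **params: str) -> str:
    """Build a FRED request URL for an endpoint such as 'fred/series/observations'."""
    # Endpoint-specific parameters come first, then the key and the output type.
    query = urlencode({**params, "api_key": api_key, "file_type": "json"})
    return f"{FRED_DOMAIN}/{endpoint.strip('/')}?{query}"


url = fred_url("fred/series/observations", api_key="YOUR_API_KEY", series_id="GDP")
print(url)
```

The first parameter follows a `?`, and every later one, including the API key, is joined with `&`.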



### Third-party software - [python](https://fred.stlouisfed.org/docs/api/fred/)

![](figs/1_python_fred.png)

### `pyfredapi` - [GitHub](https://github.com/gw-moore/pyfredapi)

`pyfredapi` is a Python library that makes it easy to retrieve data from the FRED API web service. `pyfredapi` covers all the FRED API endpoints and can retrieve data from FRED and ALFRED. Data can be returned as a pandas DataFrame or as JSON. Requests to the FRED API can be customized according to the parameters made available by the web service endpoints.

In [8]:
%pip install pyfredapi -q



Add credentials (API key)

In [9]:
import os
import pyfredapi as pf
# Read the key from an environment variable (name chosen here) instead of hard-coding it.
fred_api_key = os.environ["FRED_API_KEY"]


#### Series Metadata

You can query a series' information directly with `get_series_info`. The function returns a `SeriesInfo` object that contains all the metadata for the given series.

In the example below, we request information for the U.S. GDP series. From the result, we can see that the GDP series is quarterly, seasonally adjusted at an annual rate, and measured in billions of dollars.

In [10]:
gdp_info = pf.get_series_info(series_id="GDP", api_key=fred_api_key)
print(gdp_info)

id='GDP' realtime_start='2024-01-09' realtime_end='2024-01-09' title='Gross Domestic Product' observation_start='1947-01-01' observation_end='2023-07-01' frequency='Quarterly' frequency_short='Q' units='Billions of Dollars' units_short='Bil. of $' seasonal_adjustment='Seasonally Adjusted Annual Rate' seasonal_adjustment_short='SAAR' last_updated='2023-12-21 07:57:02-06' popularity=91 notes='BEA Account Code: A191RC\n\nGross domestic product (GDP), the featured measure of U.S. output, is the market value of the goods and services produced by labor and property located in the United States.For more information, see the Guide to the National Income and Product Accounts of the United States (NIPA) and the Bureau of Economic Analysis (http://www.bea.gov/national/pdf/nipaguid.pdf).'


#### Pull Data

The `get_series` function gets the latest data available for a given series. The default return format is a pandas DataFrame. The `get_series` function also accepts a `return_format` argument that can be set to `json` to return the data as a JSON-like list of dictionaries.

In [11]:
gdp_df = pf.get_series(series_id="GDP", api_key=fred_api_key)
gdp_df.tail()

| | realtime_start | realtime_end | date | value |
|---|---|---|---|---|
| 306 | 2024-01-09 | 2024-01-09 | 2022-07-01 | 25994.639 |
| 307 | 2024-01-09 | 2024-01-09 | 2022-10-01 | 26408.405 |
| 308 | 2024-01-09 | 2024-01-09 | 2023-01-01 | 26813.601 |
| 309 | 2024-01-09 | 2024-01-09 | 2023-04-01 | 27063.012 |
| 310 | 2024-01-09 | 2024-01-09 | 2023-07-01 | 27610.128 |


#### Get releases as-of date

`get_series_asof_date` returns all releases of a series made on or before a given date. This is helpful if you want to limit your analysis window to only the data known on or before a given date.

For example, suppose we want the GDP estimates available on or before 2022-09-01. We can use `get_series_asof_date` with the date `2022-09-01`. The response includes the Q2 2022 estimates released on 2022-07-28 and 2022-08-25, but not the one released on 2022-09-29, since that is after 2022-09-01.

In [12]:
gdp_090122_df = pf.get_series_asof_date("GDP", date="2022-09-01", api_key=fred_api_key)
gdp_090122_df.tail()

| | realtime_start | realtime_end | date | value |
|---|---|---|---|---|
| 3071 | 2022-04-28 | 2022-05-25 | 2022-01-01 | 24382.683 |
| 3072 | 2022-05-26 | 2022-06-28 | 2022-01-01 | 24384.289 |
| 3073 | 2022-06-29 | 2022-09-01 | 2022-01-01 | 24386.734 |
| 3074 | 2022-07-28 | 2022-08-24 | 2022-04-01 | 24851.809 |
| 3075 | 2022-08-25 | 2022-09-01 | 2022-04-01 | 24882.878 |
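
To make the as-of idea concrete, the sketch below applies it by hand to the five rows shown above, using an earlier cut-off date of 2022-08-01. It is a conceptual illustration in plain Python, not how `pyfredapi` implements the function; `known_as_of` and `latest_per_observation` are hypothetical helpers.

```python
# Vintage rows copied from the output above: (realtime_start, date, value).
vintages = [
    ("2022-04-28", "2022-01-01", 24382.683),
    ("2022-05-26", "2022-01-01", 24384.289),
    ("2022-06-29", "2022-01-01", 24386.734),
    ("2022-07-28", "2022-04-01", 24851.809),
    ("2022-08-25", "2022-04-01", 24882.878),
]


def known_as_of(rows, as_of):
    """Keep releases published on or before `as_of` (ISO dates compare as strings)."""
    return [row for row in rows if row[0] <= as_of]


def latest_per_observation(rows):
    """For each observation date, keep the most recently released value."""
    latest = {}
    for realtime_start, obs_date, value in sorted(rows):
        latest[obs_date] = value  # later releases overwrite earlier ones
    return latest


snapshot = known_as_of(vintages, "2022-08-01")
print(latest_per_observation(snapshot))
```

With the 2022-08-01 cut-off, the revision released on 2022-08-25 is excluded, so the Q2 2022 value is the first estimate from 2022-07-28.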


#### Additional Parameters

Keyword arguments such as `observation_start` and `observation_end` are passed along with the request to restrict the observations returned; the example below limits the GDP series to 2020.



In [13]:
extra_parameters = {
    "observation_start": "2020-01-01",
    "observation_end": "2020-12-31",
}

gdp_df = pf.get_series(series_id="GDP", **extra_parameters, api_key=fred_api_key)
gdp_df

| | realtime_start | realtime_end | date | value |
|---|---|---|---|---|
| 0 | 2024-01-09 | 2024-01-09 | 2020-01-01 | 21706.513 |
| 1 | 2024-01-09 | 2024-01-09 | 2020-04-01 | 19913.143 |
| 2 | 2024-01-09 | 2024-01-09 | 2020-07-01 | 21647.64 |
| 3 | 2024-01-09 | 2024-01-09 | 2020-10-01 | 22024.502 |


## Yahoo Finance

### `yahoo_fin`

`yahoo_fin` is an open-source Python library that scrapes data from Yahoo Finance.

It's completely free and easy to set up; a single line installs the library:


In [14]:
# %pip install yahoo_fin requests_html 

Methods

- `get_analysts_info`: Scrapes data from the Analysts page for the input ticker from Yahoo Finance (e.g. https://finance.yahoo.com/quote/NFLX/analysts?p=NFLX). This includes information on earnings estimates, EPS trends / revisions, etc.
- `get_balance_sheet`: Scrapes the balance sheet for the input ticker from Yahoo Finance.
- `get_cash_flow`: Scrapes the cash flow statement for the input ticker from Yahoo Finance.
- `get_company_info`: Scrapes company information for ticker from Yahoo Finance.
- `get_currencies`: Retrieves information about currencies.
- `get_data`: Retrieves historical data for a given stock or index.
- `get_day_gainers`: Scrapes the top 100 (at most) stocks with the largest gains (on the given trading day) from Yahoo Finance.
- `get_day_losers`: Scrapes the top 100 (at most) worst performing stocks (on the given trading day) from Yahoo Finance.
- `get_day_most_active`: Retrieves the most actively traded stocks in a given day.
- `get_dividends`: Retrieves dividend-related data for a stock.
- `get_earnings`: Fetches earnings-related data for a company.
- `get_earnings_for_date`: Retrieves earnings data for a specific date.
- `get_earnings_in_date_range`: Retrieves earnings data within a specific date range.
- `get_earnings_history`: Fetches historical earnings data for a company.
- `get_financials`: Retrieves general financial information for a company.
- `get_futures`: Fetches information about futures.
- `get_holders`: Retrieves information about shareholders or stockholders.
- `get_income_statement`: Fetches the income statement for a company.
- `get_live_price`: Retrieves live price data for a specific stock or index.
- `get_market_status`: Retrieves information about the current market status.
- `get_next_earnings_date`: Retrieves the next earnings date for a company.
- `get_premarket_price`: Retrieves pre-market price data for a specific stock.
- `get_postmarket_price`: Retrieves post-market price data for a specific stock.
- `get_quote_data`: Fetches detailed quote data for a specific stock.
- `get_quote_table`: Fetches a table with various quote-related information for a stock.
- `get_top_crypto`: Retrieves information about top cryptocurrencies.
- `get_splits`: Retrieves information about stock splits.
- `get_stats`: Retrieves statistics for a stock.
- `get_stats_valuation`: Retrieves valuation-related statistics for a stock.
- `get_undervalued_large_caps`: Fetches undervalued large-cap stocks.
- `tickers_dow`: Retrieves the list of tickers in the Dow Jones Industrial Average.
- `tickers_ftse100`: Retrieves the list of tickers in the FTSE 100 index.
- `tickers_ftse250`: Retrieves the list of tickers in the FTSE 250 index.
- `tickers_ibovespa`: Retrieves the list of tickers in the Ibovespa index.
- `tickers_nasdaq`: Retrieves the list of tickers in the Nasdaq index.
- `tickers_nifty50`: Retrieves the list of tickers in the Nifty 50 index.
- `tickers_niftybank`: Retrieves the list of tickers in the Nifty Bank index.
- `tickers_other`: Retrieves tickers for other indices or categories.
- `tickers_sp500`: Retrieves the list of tickers in the S&P 500 index.


#### Download Price data

See [GitHub](https://github.com/atreadw1492/yahoo_fin)

In [15]:
from yahoo_fin import stock_info as yf

One of the core functions available is called `yf.get_data`, which retrieves historical price data for an individual stock. To call this function, just pass whatever ticker you want:

In [16]:
df1 = yf.get_data("nflx") # gets Netflix's data
df2 = yf.get_data("aapl") # gets Apple's data
df3 = yf.get_data("amzn") # gets Amazon's data
pd.concat([df1, df2, df3])

| | open | high | low | close | adjclose | volume | ticker |
|---|---|---|---|---|---|---|---|
| 2002-05-23 | 1.156429 | 1.242857 | 1.145714 | 1.196429 | 1.196429 | 104790000 | NFLX |
| 2002-05-24 | 1.214286 | 1.225000 | 1.197143 | 1.210000 | 1.210000 | 11104800 | NFLX |
| 2002-05-28 | 1.213571 | 1.232143 | 1.157143 | 1.157143 | 1.157143 | 6609400 | NFLX |
| 2002-05-29 | 1.164286 | 1.164286 | 1.085714 | 1.103571 | 1.103571 | 6757800 | NFLX |
| 2002-05-30 | 1.107857 | 1.107857 | 1.071429 | 1.071429 | 1.071429 | 10154200 | NFLX |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-01-03 | 149.199997 | 151.050003 | 148.330002 | 148.470001 | 148.470001 | 49425500 | AMZN |
| 2024-01-04 | 145.589996 | 147.380005 | 144.050003 | 144.570007 | 144.570007 | 56039800 | AMZN |
| 2024-01-05 | 144.690002 | 146.589996 | 144.529999 | 145.240005 | 145.240005 | 45124800 | AMZN |
| 2024-01-08 | 146.740005 | 149.399994 | 146.149994 | 149.100006 | 149.100006 | 46757100 | AMZN |


Pull data for a specific date range:

In [17]:
yf.get_data("amzn", start_date = "01/01/2017", end_date = "01/31/2017").head()

| | open | high | low | close | adjclose | volume | ticker |
|---|---|---|---|---|---|---|---|
| 2017-01-03 | 37.896 | 37.938 | 37.384998 | 37.683498 | 37.683498 | 70422000 | AMZN |
| 2017-01-04 | 37.919498 | 37.984001 | 37.709999 | 37.859001 | 37.859001 | 50210000 | AMZN |
| 2017-01-05 | 38.077499 | 39.119999 | 38.013 | 39.022499 | 39.022499 | 116602000 | AMZN |
| 2017-01-06 | 39.118 | 39.972 | 38.924 | 39.7995 | 39.7995 | 119724000 | AMZN |
| 2017-01-09 | 39.900002 | 40.088501 | 39.588501 | 39.846001 | 39.846001 | 68922000 | AMZN |


In [18]:
yf.get_data('msft' , start_date = '01/01/1999').head()

| | open | high | low | close | adjclose | volume | ticker |
|---|---|---|---|---|---|---|---|
| 1999-01-04 | 34.902344 | 36.3125 | 34.84375 | 35.25 | 21.853722 | 69305200 | MSFT |
| 1999-01-05 | 35.46875 | 37.0 | 35.359375 | 36.625 | 22.706175 | 64281600 | MSFT |
| 1999-01-06 | 37.375 | 37.875 | 36.6875 | 37.8125 | 23.442383 | 69064800 | MSFT |
| 1999-01-07 | 37.4375 | 37.65625 | 37.0625 | 37.625 | 23.326138 | 51150400 | MSFT |
| 1999-01-08 | 38.046875 | 38.125 | 36.75 | 37.46875 | 23.229261 | 50244800 | MSFT |


#### Get Live Stock Prices

Scrape live stock prices from Yahoo Finance (real-time).

In [19]:
# get the live prices of Apple and Amazon five times in a row
for _ in range(5):
    aapl, amzn = yf.get_live_price("aapl"), yf.get_live_price("amzn")
    print({"Appl": aapl, "Amzn": amzn})
 

{'Appl': 185.13999938964844, 'Amzn': 151.3699951171875}
{'Appl': 185.13999938964844, 'Amzn': 151.3699951171875}
{'Appl': 185.13999938964844, 'Amzn': 151.3699951171875}
{'Appl': 185.13999938964844, 'Amzn': 151.3699951171875}
{'Appl': 185.13999938964844, 'Amzn': 151.3699951171875}
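
The five requests above run back to back, which is likely why every line shows the same prices. A possible refinement, sketched below, pauses between rounds. `poll_prices` and `fake_fetch` are hypothetical helpers; the stand-in fetcher lets the sketch run offline, and with `yahoo_fin` one would pass `yf.get_live_price` instead.

```python
import time
from typing import Callable, Dict, Iterable, List


def poll_prices(
    fetch: Callable[[str], float],
    tickers: Iterable[str],
    n_polls: int = 5,
    interval_s: float = 1.0,
) -> List[Dict[str, float]]:
    """Call `fetch` for every ticker `n_polls` times, pausing between rounds."""
    tickers = list(tickers)
    snapshots = []
    for i in range(n_polls):
        snapshots.append({t: fetch(t) for t in tickers})
        if i < n_polls - 1:
            time.sleep(interval_s)  # no pause after the final round
    return snapshots


# Stand-in fetcher so the sketch runs without network access.
def fake_fetch(ticker: str) -> float:
    return float(len(ticker))


print(poll_prices(fake_fetch, ["aapl", "amzn"], n_polls=2, interval_s=0))
```

Passing the fetcher as an argument keeps the polling logic independent of any particular data source and easy to test.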


Top gainers (live) -> https://finance.yahoo.com/gainers

In [20]:
yf.get_day_gainers()

| | Symbol | Name | Price (Intraday) | Change | % Change | Volume | Avg Vol (3 month) | Market Cap | PE Ratio (TTM) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | JNPR | Juniper Networks, Inc. | 36.81 | 6.59 | 21.79 | 29479000.0 | 3331000.0 | 11736000000.0 | 32.86 |
| 1 | SRPT | Sarepta Therapeutics, Inc. | 119.77 | 17.16 | 16.72 | 5064000.0 | 1686000.0 | 11204000000.0 | |
| 2 | PRCT | PROCEPT BioRobotics Corporation | 48.6 | 6.39 | 15.14 | 1888000.0 | 496109.0 | 2454000000.0 | |
| 3 | PRTA | Prothena Corporation plc | 39.62 | 4.34 | 12.3 | 3144000.0 | 863979.0 | 2126000000.0 | |
| 4 | AYI | Acuity Brands, Inc. | 227.93 | 23.41 | 11.45 | 1164000.0 | 299072.0 | 7037000000.0 | 21.18 |
| 5 | ARDX | Ardelyx, Inc. | 9.1 | 0.92 | 11.25 | 21999000.0 | 7030000.0 | 2112000000.0 | |
| 6 | VKTX | Viking Therapeutics, Inc. | 21.56 | 2.08 | 10.68 | 5770000.0 | 2495000.0 | 2157000000.0 | |
| 7 | GPCR | Structure Therapeutics Inc. | 49.14 | 4.71 | 10.6 | 1157000.0 | 1058000.0 | 2279000000.0 | |
| 8 | NBGIF | National Bank of Greece S.A. | 7.27 | 0.67 | 10.15 | 19439.0 | 4019.0 | 7217000000.0 | 4.95 |
| 9 | ZKH | ZKH Group Limited | 17.08 | 1.39 | 8.89 | 53683.0 | 91940.0 | 2744000000.0 | |


Top losers (live) -> https://finance.yahoo.com/losers/

In [21]:
yf.get_day_losers()

| | Symbol | Name | Price (Intraday) | Change | % Change | Volume | Avg Vol (3 month) | Market Cap | PE Ratio (TTM) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | GRFS | Grifols, S.A. | 8.70 | -2.43 | -21.83 | 23764000.0 | 911158.0 | 7.357000e+09 | 217.50 |
| 1 | NARI | Inari Medical, Inc. | 58.99 | -6.61 | -10.08 | 1905000.0 | 792919.0 | 3.397000e+09 | |
| 2 | HPE | Hewlett Packard Enterprise Company | 16.14 | -1.58 | -8.92 | 32861000.0 | 10785000.0 | 2.098200e+10 | 10.48 |
| 3 | CORT | Corcept Therapeutics Incorporated | 25.10 | -2.36 | -8.59 | 2887000.0 | 884617.0 | 2.587000e+09 | 31.38 |
| 4 | LTHM | Arcadium Lithium plc | 16.51 | -1.54 | -8.53 | 39179000.0 | 6916000.0 | 2.970000e+09 | 9.17 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 95 | MSTR | MicroStrategy Incorporated | 577.29 | -20.72 | -3.46 | 1352000.0 | 1070000.0 | 9.649000e+09 | 282.99 |
| 96 | SNEX | StoneX Group Inc. | 67.50 | -2.36 | -3.38 | 168401.0 | 146016.0 | 2.113000e+09 | 8.23 |
| 97 | SCBFY | Standard Chartered PLC | 16.28 | -0.57 | -3.35 | 26782.0 | 99158.0 | 2.170000e+10 | 12.62 |
| 98 | PENN | PENN Entertainment, Inc. | 24.54 | -0.84 | -3.31 | 5695000.0 | 5397000.0 | 3.724000e+09 | |
