# Retrieving data from the internet - Euribor sample

### Retrieving EURIBOR data

Goal: download daily EURIBOR rates and add them to a database. The source page is:
https://www.euribor-rates.eu/en/current-euribor-rates/1/euribor-rate-1-month/

## Step 1: look for structured data first

A link to a CSV file, a JSON document, or any other structured format (or an API call) is much easier to work with than data extracted from HTML pages. Many websites also build their HTML client-side from API calls (with Angular, Vue, or React), so it is worth checking whether such an API is reachable directly.

For example, this ECB page provides the same kind of information in a form that is easier to extract:

https://sdw.ecb.europa.eu/quickview.do?SERIES_KEY=143.FM.M.U2.EUR.RT.MM.EURIBOR3MD_.HSTA

In [1]:
import pandas as pd

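# Export link taken from the page above. The jsessionid part comes from the
# browsing session in which the link was copied.
url = ("https://sdw.ecb.europa.eu/quickviewexport.do;"
       "jsessionid=7A955208FA0C54DE9310BBC89AC61AB0?SERIES_KEY=143.FM.M.U2.EUR.RT.MM.EURIBOR3MD_.HSTA&type=xls")
# Despite type=xls, the response parses as CSV. header=5 uses row 5 (0-based)
# as the header, so the descriptive lines before it are skipped.
df = pd.read_csv(url, header=5)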
df.head(5)

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,obs. status,obs. comment
0,2022Oct,1.4277,Normal value (A),
1,2022Sep,1.0109,Normal value (A),
2,2022Aug,0.3947,Normal value (A),
3,2022Jul,0.0366,Normal value (A),
4,2022Jun,-0.2392,Normal value (A),
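The role of the `header` argument is easier to see on a small in-memory file. The sketch below is an illustration only: the preamble lines are made up (they are not the real ECB export), and only the two data rows reuse values shown above.

```python
import io
import pandas as pd

# Hypothetical export: three descriptive lines, then a header row, then data.
raw = """Data Source,Hypothetical export
Series key,EXAMPLE.KEY
Unit,Percent per annum
,Rate
2022Oct,1.4277
2022Sep,1.0109
"""

# header=3 tells pandas that row 3 (0-based) holds the column names;
# everything before it is discarded.
df = pd.read_csv(io.StringIO(raw), header=3)

# The first header cell is empty, so pandas names it "Unnamed: 0".
# Give both columns explicit names.
df.columns = ["Date", "EuriborRate"]
print(df)
```

With the real export, the right `header` value has to be found by looking at the first lines of the downloaded file.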


In [2]:
df.columns = ['Date', 'EuriborRate', 'ObservationStatus', 'ObservationComment']
df.head(5)

Unnamed: 0,Date,EuriborRate,ObservationStatus,ObservationComment
0,2022Oct,1.4277,Normal value (A),
1,2022Sep,1.0109,Normal value (A),
2,2022Aug,0.3947,Normal value (A),
3,2022Jul,0.0366,Normal value (A),
4,2022Jun,-0.2392,Normal value (A),
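The Date column still holds strings such as 2022Oct. If the monthly series were to be used, one possible way to turn them into real dates is shown below. This is a sketch that rebuilds the first rows by hand, and it assumes English month abbreviations (the `%b` directive depends on the locale).

```python
import pandas as pd

# First rows of the monthly table, rebuilt by hand for illustration.
monthly = pd.DataFrame({
    "Date": ["2022Oct", "2022Sep", "2022Aug"],
    "EuriborRate": [1.4277, 1.0109, 0.3947],
})

# "%Y%b" reads a four-digit year followed by an abbreviated month name.
# Each value becomes the first day of that month.
monthly["Date"] = pd.to_datetime(monthly["Date"], format="%Y%b")

# Index by date and sort chronologically (the export is newest-first).
monthly = monthly.set_index("Date").sort_index()
print(monthly)
```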


OK, but this series only contains monthly values, and we want daily ones.

## Step 2: extracting from HTML

pandas provides a very handy feature here: give pd.read_html a URL, and it loads every table it can find in the HTML into a list of DataFrames.

In [3]:
import pandas as pd
dfs = pd.read_html(
    'https://www.euribor-rates.eu/en/current-euribor-rates/1/euribor-rate-1-month/'
)

In [4]:
print(len(dfs))

3


In [5]:
dfs[0]

Unnamed: 0,0,1
0,11/18/2022,1.413 %
1,11/17/2022,1.425 %
2,11/16/2022,1.395 %
3,11/15/2022,1.414 %
4,11/14/2022,1.405 %
5,11/11/2022,1.362 %
6,11/10/2022,1.395 %
7,11/9/2022,1.415 %
8,11/8/2022,1.415 %
9,11/7/2022,1.405 %


In [6]:
dfs[0].info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       10 non-null     object
 1   1       10 non-null     object
dtypes: object(2)
memory usage: 288.0+ bytes


In [7]:
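def process_df(df):
    # Work on a copy so the original table in dfs is left untouched
    df = df.copy()
    # Using pd.to_datetime to convert first column to Date
    df['Date'] = pd.to_datetime(df[0])
    # Converting percentage column to numeric (strip the literal ' %' suffix)
    df['Rate'] = pd.to_numeric(df[1].str.replace(' %', '', regex=False))

    return df.drop([0, 1], axis=1)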

In [8]:
df_by_day, df_by_month, df_by_year = map(process_df, dfs)

In [9]:
df_by_day

Unnamed: 0,Date,Rate
0,2022-11-18,1.413
1,2022-11-17,1.425
2,2022-11-16,1.395
3,2022-11-15,1.414
4,2022-11-14,1.405
5,2022-11-11,1.362
6,2022-11-10,1.395
7,2022-11-09,1.415
8,2022-11-08,1.415
9,2022-11-07,1.405


In [10]:
df_by_day.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date    10 non-null     datetime64[ns]
 1   Rate    10 non-null     float64       
dtypes: datetime64[ns](1), float64(1)
memory usage: 288.0 bytes
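The dates on the page are written month first (11/9/2022 is 9 November). pd.to_datetime guesses this correctly here, but passing an explicit format removes the guesswork. The variant below is a sketch: `clean_rates` and its `date_format` parameter are hypothetical names, and the two input rows are copied from the table above.

```python
import pandas as pd

def clean_rates(df, date_format="%m/%d/%Y"):
    """Return a new frame with a datetime Date column and a float Rate column."""
    return pd.DataFrame({
        # An explicit format fails loudly if the site ever switches to day-first dates.
        "Date": pd.to_datetime(df[0], format=date_format),
        # Drop the percent sign and surrounding spaces before converting.
        "Rate": pd.to_numeric(df[1].str.replace("%", "", regex=False).str.strip()),
    })

# Two rows in the same shape as the scraped table (columns named 0 and 1).
raw = pd.DataFrame({0: ["11/18/2022", "11/9/2022"], 1: ["1.413 %", "1.415 %"]})
clean = clean_rates(raw)
print(clean.dtypes)
```

Whether the monthly and yearly tables on the page use the same date format has not been checked here, so the format may need adjusting for them.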


In [11]:
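import sqlite3
con = sqlite3.connect('euribor.db')
# to_sql raises "ValueError: Table 'EuriborByDay' already exists." when the
# cell is run a second time, because if_exists defaults to 'fail'.
# 'replace' makes the cell re-runnable; use 'append' to keep earlier rows.
df_by_day.to_sql('EuriborByDay', con, if_exists='replace')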
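Since the goal is to add each day's rates to the database, one option is to append only the dates that are not stored yet, so the notebook can be re-run safely. The following is a minimal sketch under stated assumptions: `append_new_rows` is a hypothetical helper, dates are stored as ISO text, and an in-memory database stands in for euribor.db. The rates are taken from the daily table above.

```python
import sqlite3
import pandas as pd

def append_new_rows(df, con, table="EuriborByDay"):
    """Append rows of df whose Date is not stored in table yet; return how many were added."""
    # Store dates as ISO strings so they compare as plain text with stored values.
    new = df.assign(Date=df["Date"].dt.strftime("%Y-%m-%d"))
    exists = con.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    if exists:
        # Keep only the dates that are not in the table already.
        known = pd.read_sql(f'SELECT Date FROM "{table}"', con)["Date"]
        new = new[~new["Date"].isin(known)]
    if not new.empty:
        new.to_sql(table, con, if_exists="append", index=False)
    return len(new)

con = sqlite3.connect(":memory:")  # in-memory database, for illustration only

# Two overlapping batches, as if the page had been scraped on consecutive days.
first = pd.DataFrame({"Date": pd.to_datetime(["2022-11-16", "2022-11-17"]),
                      "Rate": [1.395, 1.425]})
second = pd.DataFrame({"Date": pd.to_datetime(["2022-11-17", "2022-11-18"]),
                       "Rate": [1.425, 1.413]})

added_first = append_new_rows(first, con)    # both rows are new
added_second = append_new_rows(second, con)  # only 2022-11-18 is new
print(added_first, added_second)
```

A primary key on Date with an INSERT OR IGNORE statement would push the same check into SQLite itself. The pandas-side filter is used here only to stay close to the to_sql call above.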