# 3.1 - Macrobond web API - Aligning multiple Time Series

*Using Macrobond's web API features to align various time series on a single calendar, frequency or currency and deal with missing values when observations do not all carry the same frequency.*

This notebook aims to provide examples of how to use Macrobond's web API call methods as well as insights on the various methodologies used to align our time series for analysis.

We will focus here on the FetchUnifiedSeries POST call, which helps you do the necessary pre-work before running your analysis or model.

*Full error handling is omitted for brevity*

***

## Importing packages

In [3]:
import statsmodels.api as statsmodels_api
from sklearn import linear_model

from macrobond_financial.common.enums import SeriesFrequency
from macrobond_financial.common.types import StartOrEndPoint
from macrobond_financial.web import WebClient

***

## Get the data - fetchunifiedseries

We use the following time series in this example:
* cyinea0001 - Cyprus, Earnings, Wage Growth, Nominal
* cypric0014 - Cyprus, Consumer Price Index, Miscellaneous Goods & Services, Index
* cytour0076 - Cyprus, Income, Revenue, Total, EUR
* un_myos_cy_total - Cyprus, Human Development, Education, Mean Years of Schooling

Feel free to refer to https://api.macrobondfinancial.com/swagger/index.html to get the comprehensive list of web API endpoints and parameters used.

We want to look at data from Cyprus and conduct multiple regression analysis further down the notebook. Our dataset has the following features:

* Our dependent variable is nominal wage growth. It is collected from the Cyprus Statistical Service (CYSTAT), has an inception date of 1960 and an annual frequency.
* Our first independent variable is the Consumer Price Index for Miscellaneous Goods & Services. It is also collected from CYSTAT, has an inception date of 2000 and a monthly frequency.
* Our second independent variable is Income, Total Revenue from foreign tourism (EUR). It is collected from CYSTAT, has an inception date of 2001 and a monthly frequency.
* Our final independent variable is Education, Mean Years of Schooling. It is collected from the United Nations Development Programme (UNDP), has an inception date of 1990 and an annual frequency.

We can see immediately that the series in this data set have different start dates, frequencies and currencies. To make the data comparable, we will use the FetchUnifiedSeries endpoint, which is called with a POST request. The cells further down show the parameters we pass and how they reshape our data.
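To make the idea of aligning series concrete, here is a minimal pandas sketch of the two steps involved: converting a monthly series to annual frequency and keeping only the dates covered by every series. The data is synthetic, the series names are hypothetical, and the calendar-year mean is an assumed aggregation chosen for illustration; it is not a description of how the Macrobond API performs the conversion internally.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for real series (all values are made up for illustration).
monthly_index = pd.date_range("2000-01-01", "2004-12-01", freq="MS")
monthly = pd.Series(
    np.arange(len(monthly_index), dtype=float),
    index=monthly_index,
    name="monthly_series",
)

annual_index = pd.to_datetime([f"{year}-01-01" for year in range(1998, 2004)])
annual = pd.Series(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    index=annual_index,
    name="annual_series",
)

# Step 1: bring both series to one observation per calendar year.
# The monthly series is averaged within each year (an assumed method).
monthly_as_annual = monthly.groupby(monthly.index.year).mean()
annual_by_year = annual.groupby(annual.index.year).first()

# Step 2: keep only the years for which every series has a value.
aligned = pd.concat([annual_by_year, monthly_as_annual], axis=1, join="inner")
print(aligned)
```

The inner join plays the same role as restricting the start and end points to the range where all series have data: the annual series starts earlier and the monthly series ends later, so only the overlapping years survive.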

***

## Visualising the data
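The call below fetches the four series on a common annual frequency, converts currency values to USD and restricts the range to the dates where every series has data. We then rename the columns so that the resulting data frame is easier to read and to use further down the notebook.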

In [4]:
with WebClient() as api:
    data_frame = api.get_unified_series(
        "cyinea0001",
        "cypric0014",
        "cytour0076",
        "un_myos_cy_total",
        frequency=SeriesFrequency.ANNUAL,
        currency="USD",
        start_point=StartOrEndPoint.data_in_all_series(),
        end_point=StartOrEndPoint.data_in_all_series(),
    ).to_pd_data_frame()
data_frame.columns = [
    "Wage Growth",
    "CPI",
    "Income from Foreign Tourism",
    "Mean Years of Schooling"
]
data_frame

Date,Wage Growth,CPI,Income from Foreign Tourism,Mean Years of Schooling
2001-01-01 00:00:00+00:00,5.1,74.092784,1928952000.0,10.0973
2002-01-01 00:00:00+00:00,5.6,79.499212,1853059000.0,10.2431
2003-01-01 00:00:00+00:00,6.3,83.221561,1970487000.0,10.3888
2004-01-01 00:00:00+00:00,4.3,87.258003,2063894000.0,10.4696
2005-01-01 00:00:00+00:00,5.4,90.819964,2118079000.0,10.6611
2006-01-01 00:00:00+00:00,5.4,92.985767,2220008000.0,10.9152
2007-01-01 00:00:00+00:00,5.2,95.475094,2550603000.0,11.2244
2008-01-01 00:00:00+00:00,6.4,97.811645,2667023000.0,11.3074
2009-01-01 00:00:00+00:00,3.0,100.489693,2104405000.0,11.2861
2010-01-01 00:00:00+00:00,2.4,102.628536,2021286000.0,11.4688


***

## Multiple Regression Analysis

Now that all the variables are aligned in a single data frame, we will build our model with the linear_model module of the sklearn package and use statsmodels to print the regression summary. Let us start by defining our variables.

In [5]:
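# Independent variables (regressors) and the dependent variable
x = data_frame[["CPI", "Income from Foreign Tourism", "Mean Years of Schooling"]]
y = data_frame["Wage Growth"]

# scikit-learn model, used for the forecast further down
regr = linear_model.LinearRegression()
regr.fit(x, y)

# statsmodels needs an explicit intercept column; its summary reports the fit statistics
x = statsmodels_api.add_constant(x)
Summary = statsmodels_api.OLS(y, x).fit()
Summary.summary()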



Dep. Variable:,Wage Growth,R-squared:,0.737
Model:,OLS,Adj. R-squared:,0.685
Method:,Least Squares,F-statistic:,14.04
Date:,"Fri, 17 Jun 2022",Prob (F-statistic):,0.000126
Time:,16:58:02,Log-Likelihood:,-33.443
No. Observations:,19,AIC:,74.89
Df Residuals:,15,BIC:,78.66
Df Model:,3,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
const,60.0032,9.825,6.107,0.000,39.061,80.945
CPI,0.2442,0.114,2.145,0.049,0.002,0.487
Income from Foreign Tourism,5.411e-09,2e-09,2.707,0.016,1.15e-09,9.67e-09
Mean Years of Schooling,-8.2373,1.935,-4.257,0.001,-12.362,-4.113

Omnibus:,11.398,Durbin-Watson:,1.293
Prob(Omnibus):,0.003,Jarque-Bera (JB):,8.726
Skew:,-1.517,Prob(JB):,0.0127
Kurtosis:,4.35,Cond. No.,67400000000.0
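The very large condition number in the summary above may largely reflect the units of the regressors: tourism income is in the billions while the other columns are of order 10 to 100. The sketch below, which uses only the ten rows displayed in the data frame output earlier (not the full sample behind the regression), compares the condition number of the design matrix before and after expressing income in billions. The helper `design_matrix` is a hypothetical name introduced for this illustration.

```python
import numpy as np

# Regressors from the ten rows displayed earlier:
# CPI, income from foreign tourism, mean years of schooling.
rows = np.array([
    [74.092784, 1928952000.0, 10.0973],
    [79.499212, 1853059000.0, 10.2431],
    [83.221561, 1970487000.0, 10.3888],
    [87.258003, 2063894000.0, 10.4696],
    [90.819964, 2118079000.0, 10.6611],
    [92.985767, 2220008000.0, 10.9152],
    [95.475094, 2550603000.0, 11.2244],
    [97.811645, 2667023000.0, 11.3074],
    [100.489693, 2104405000.0, 11.2861],
    [102.628536, 2021286000.0, 11.4688],
])


def design_matrix(regressors):
    """Prepend a column of ones (the intercept) to the regressors."""
    return np.column_stack([np.ones(len(regressors)), regressors])


cond_raw = np.linalg.cond(design_matrix(rows))

# Express tourism income in billions so the columns have comparable magnitudes.
scaled = rows.copy()
scaled[:, 1] /= 1e9
cond_scaled = np.linalg.cond(design_matrix(scaled))

print(f"condition number, raw units:        {cond_raw:.3g}")
print(f"condition number, income in bn:     {cond_scaled:.3g}")
```

Rescaling a regressor does not change the fitted values or the t-statistics of an ordinary least squares model; it only changes the size of that coefficient and makes the problem better conditioned numerically.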


In [6]:
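import pandas as pd

# Reuse the training column names so scikit-learn can match the features by name
new_obs = pd.DataFrame([[100.010000, 2.994805e09, 12.1712]], columns=regr.feature_names_in_)
CYP_Wage_Growth = regr.predict(new_obs)
print("Cyprus Wage Growth Forecast")
print(CYP_Wage_Growth)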

Cyprus Wage Growth Forecast
[0.3738423]
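The forecast above is simply the fitted linear combination of the inputs: intercept plus each coefficient times its regressor. As a quick check, the sketch below recomputes it by hand from the rounded coefficients printed in the OLS summary, assuming the scikit-learn and statsmodels fits agree (both are ordinary least squares on the same data). Because the printed coefficients are rounded, the result only matches the model output approximately.

```python
# Coefficients as printed (rounded) in the OLS summary table.
intercept = 60.0032
coefficients = {
    "CPI": 0.2442,
    "Income from Foreign Tourism": 5.411e-09,
    "Mean Years of Schooling": -8.2373,
}

# The same input values that were passed to regr.predict.
inputs = {
    "CPI": 100.010000,
    "Income from Foreign Tourism": 2.994805e09,
    "Mean Years of Schooling": 12.1712,
}

# Forecast = intercept + sum of coefficient * regressor value.
manual_forecast = intercept + sum(
    coefficients[name] * inputs[name] for name in coefficients
)
print(f"Manual wage growth forecast: {manual_forecast:.4f}")
```

The hand calculation lands close to the model's 0.3738, with the small gap coming from coefficient rounding. It also shows how the forecast is the net of large offsetting terms, so it is sensitive to small changes in the inputs.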




***

## Conclusion

Here we can see how the FetchUnifiedSeries endpoint, called with a POST request, eases the workflow: we simply query the data needed for the model with the transformations already applied, rather than writing one-off transformations from scratch. Not only does this feature save a lot of time on the necessary preparatory work, it also increases consistency across the various time series and models that run on Macrobond data.