# 2.2 - Macrobond web API - Screening on a Concept

*Performing coverage checks based on Macrobond's RegionKey attribute*

This notebook aims to provide examples of how to use Macrobond's web API call methods as well as insights on the key attributes used to display the output in an understandable format.

We will focus here on using the Search method based on a **RegionKey** input. This helps you find the market standard in each region for a pre-defined concept.

*The examples use the Web API, but you can choose to use the desktop COM API and get the same result. Full error handling is omitted for brevity.*

***

## Importing packages

In [None]:
from matplotlib import pyplot

from macrobond_data_api.web import WebClient

***

## Get the data
Our typical workflow starts with the information we find in a time series' metadata. From there, we look up further information on these attributes. However, you can skip this step and go directly to the section **Searching time series carrying a similar concept**.
Let's now request a time series to have a look at its metadata.
Feel free to refer to https://api.macrobondfinancial.com/swagger/index.html to get the comprehensive list of web API endpoints and parameters used.

In the example below, we will use time series `aulama0227`: 
> **Australia, Unemployment, Total, Rate, SA**

***

## Visualising the metadata
Let's visualise the metadata in a Pandas dataframe. 
You will see below the full list of metadata attributes available for our time series. 
Feel free to visit this page for more information on these fields: https://help.macrobond.com/technical-information/common-metadata/

In [None]:
with WebClient() as api:
    pd_series = api.get_one_entity("aulama0227").metadata_to_pd_series()
pd_series

We can spot `RegionKey = labor_unemp` in the metadata response. Let's use it as a parameter in the Search method.
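Reading the value from the metadata instead of retyping it avoids spelling mistakes in the search filter. The sketch below uses a made-up pandas Series in place of the real response, and assumes the metadata Series is indexed by attribute name; `metadata`, `region_key` and `search_filter` are illustrative names only.

```python
import pandas as pd

# Made-up metadata standing in for the Series shown in the cell above.
# Assumption: the real Series is indexed by attribute name in the same way.
metadata = pd.Series(
    {
        "Name": "example_series",
        "RegionKey": "labor_unemp",
        "Frequency": "monthly",
    }
)

# .get returns None instead of raising if the attribute is absent.
region_key = metadata.get("RegionKey")

# Dictionary in the shape passed to must_have_values in the search below.
search_filter = {"RegionKey": region_key}
print(search_filter)
```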

***

## Searching time series carrying a similar concept

In [None]:
with WebClient() as api:
    # Search for active time series tagged with the unemployment RegionKey,
    # then keep only the columns needed for the coverage check.
    data_frame = api.entity_search(
        entity_types="TimeSeries",
        must_have_values={"RegionKey": "labor_unemp"},
        include_discontinued=False,
    ).to_pd_data_frame()[
        [
            "Name",
            "FullDescription",
            "Region",
            "RegionKey",
            "Frequency",
            "Source",
            "FirstRevisionTimeStamp",
        ]
    ]
data_frame.head(10)

### We will now focus on the Point-in-Time (PiT) series in this coverage check
Let's flatten the Region attribute into a single comma-separated string. While most time series carry one region only, some carry several, for instance "gb" and "gb,city_[xxx]".
Let's also reduce the first-revision timestamp to its year.

In [None]:
data_frame["RegionString"] = data_frame["Region"].apply(lambda x: ", ".join(map(str, x)))
data_frame["FirstRevisionYear"] = data_frame["FirstRevisionTimeStamp"].str[:4]
data_frame.head(3)
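To see what these two transformations do on a small scale, here is a minimal sketch on made-up rows. It assumes, as the cell above does, that `Region` holds a list of codes and that the timestamp is an ISO-style string. The `FirstRegion` column is an illustrative extra, not part of the notebook, for the case where only the first region code is wanted.

```python
import pandas as pd

# Made-up rows standing in for the search result; values are illustrative only.
sample = pd.DataFrame(
    {
        "Name": ["series_a", "series_b", "series_c"],
        "Region": [["au"], ["gb", "city_x"], ["us"]],
        "FirstRevisionTimeStamp": [
            "2019-03-14T00:00:00Z",
            None,  # series without point-in-time history
            "2007-11-02T00:00:00Z",
        ],
    }
)

# Join every region code into one comma-separated string.
sample["RegionString"] = sample["Region"].apply(lambda x: ", ".join(map(str, x)))

# Hypothetical alternative: keep only the first region code.
sample["FirstRegion"] = sample["Region"].apply(lambda x: x[0] if x else None)

# Slice the year out of the timestamp string; missing values stay missing.
sample["FirstRevisionYear"] = sample["FirstRevisionTimeStamp"].str[:4]

print(sample[["Name", "RegionString", "FirstRegion", "FirstRevisionYear"]])
```

Rows with no first-revision timestamp keep a missing year, which is why they can be dropped in the next step.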

### Displaying the new DataFrame
Let's check how our transformations have been applied by selecting a few columns by position: `df.iloc[rows,[columns]]`. Note that we are also dropping rows with NaN values in the FirstRevisionYear column: `df.dropna(subset=['FirstRevisionYear'])`

In [None]:
data_frame_final = data_frame.dropna(subset=["FirstRevisionYear"]).iloc[0:1000, [0, 1, 8, 4, 5, 7]]
data_frame_final
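Positional indices such as `[0, 1, 8, 4, 5, 7]` depend on the column order built in the earlier cells. As a sketch under that assumption, the same selection can be written with column names, which keeps working if a column is added or moved. The rows below are made up, and `wanted`, `by_position` and `by_name` are illustrative names.

```python
import pandas as pd

# Made-up rows with the same column order as the notebook's data_frame.
sample = pd.DataFrame(
    {
        "Name": ["series_a", "series_b"],
        "FullDescription": ["Example A", "Example B"],
        "Region": [["au"], ["gb"]],
        "RegionKey": ["labor_unemp", "labor_unemp"],
        "Frequency": ["monthly", "quarterly"],
        "Source": ["Source A", "Source B"],
        "FirstRevisionTimeStamp": ["2019-03-14T00:00:00Z", None],
        "RegionString": ["au", "gb"],
        "FirstRevisionYear": ["2019", None],
    }
)

wanted = ["Name", "FullDescription", "FirstRevisionYear", "Frequency", "Source", "RegionString"]

# Selection by position, as in the cell above.
by_position = sample.dropna(subset=["FirstRevisionYear"]).iloc[0:1000, [0, 1, 8, 4, 5, 7]]

# Equivalent selection by name: first 1000 rows, then the named columns.
by_name = sample.dropna(subset=["FirstRevisionYear"]).head(1000)[wanted]

print(by_position.equals(by_name))
```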

### Group the results by FirstRevisionYear and Frequency
Note that Macrobond started to systematically collect PiT data in 2018. 
PiT coverage prior to 2018 has been backfilled by leveraging the source or internal collection logs.

In [None]:
data_frame_group = (
    data_frame_final.groupby(["FirstRevisionYear", "Frequency"])["Name"].count().reset_index(name="Count")
)
data_frame_group

### Plot your results in a bar chart

In [None]:
pyplot.rcParams["figure.figsize"] = [16, 9]

Colours = [
    (27 / 255, 54 / 255, 93 / 255),
    (37 / 255, 107 / 255, 162 / 255),
    (142 / 255, 81 / 255, 168 / 255),
    (88 / 255, 162 / 255, 145 / 255),
    (205 / 255, 84 / 255, 91 / 255),
    (0 / 255, 79 / 255, 89 / 255),
]
pyplot.bar(data_frame_group["FirstRevisionYear"], data_frame_group["Count"], color=Colours)
pyplot.title("FirstRevisionYear per RegionKey", fontsize=14)
pyplot.xlabel("PiT Start Year", fontsize=14)
pyplot.ylabel("Count", fontsize=14)
pyplot.grid(False)
pyplot.autoscale()
pyplot.show()
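Because the grouped table has one row per year and frequency, rows that share a year are drawn at the same x position by `pyplot.bar` and can hide each other. One possible alternative, sketched here on made-up counts, is to pivot the table so that each frequency becomes its own column. The frequency labels and numbers are illustrative only.

```python
import pandas as pd

# Made-up counts in the same shape as data_frame_group.
counts = pd.DataFrame(
    {
        "FirstRevisionYear": ["2017", "2018", "2018", "2019"],
        "Frequency": ["monthly", "monthly", "quarterly", "monthly"],
        "Count": [2, 5, 1, 3],
    }
)

# One row per year, one column per frequency; missing combinations become 0.
wide = (
    counts.pivot(index="FirstRevisionYear", columns="Frequency", values="Count")
    .fillna(0)
    .astype(int)
)

# Total number of series per year across all frequencies.
totals = wide.sum(axis=1)

print(wide)
print(totals)
```

With matplotlib installed, `wide.plot(kind="bar", stacked=True)` draws one stacked bar per year from this table.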