# 2.1 - Macrobond web API - Categories Exploration

*Performing coverage checks based on Macrobond's Categories*

This notebook aims to provide examples of how to use Macrobond's web API call methods as well as insights on the key attributes used to display the output in an understandable format.

We will focus here on using the Search method based on a **Category** input.
Our data is arranged as a logical hierarchy of categories to help you find or narrow down related datasets quickly.

*The examples use the Web API, but you can choose to use the desktop COM API and get the same result. Full error handling is omitted for brevity.*

***

## Importing packages

In [None]:
from macrobond_data_api.web import WebClient

***

## Get the data
Feel free to refer to https://api.macrobondfinancial.com/swagger/index.html to get the comprehensive list of web API endpoints and parameters used.

In the example below, we use the Search endpoint with filters on Category `inea` and Region `gb`:
> **Income & Earnings - United Kingdom**

Feel free to use the notebook **1.1 - Macrobond web API - Metadata Navigation** to pull out a list of all available categories and regions.

***

## Visualising the data
Let's evaluate Macrobond's coverage for Income & Earnings-related time series in the United Kingdom, matching the `inea` category and `gb` region filters described above.

In [None]:
with WebClient() as api:
    data_frame = api.entity_search(
        entity_types="TimeSeries",
        must_have_values={"Region": "gb", "category": "inea"},
    ).to_pd_data_frame()[
        [
            "Name",
            "FullDescription",
            "Region",
            "Frequency",
            "Source",
            "FirstRevisionTimeStamp",
        ]
    ]
data_frame.head(10)

### We will now focus on the Point-in-Time (PiT) series in this coverage check
Let's flatten the Region attribute into a single comma-separated string. While most time series carry one region only, some can have multiple regions, for instance "gb" and "gb,city_[xxx]".

In [None]:
data_frame["RegionString"] = data_frame["Region"].apply(lambda x: ", ".join(map(str, x)))
data_frame.head(10)
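
The join step can be tried offline on a small made-up frame. This is only a sketch: the rows and the `FirstRegion` column are hypothetical, and it assumes the Region attribute arrives as a list of region codes, as the `join` call above implies.

```python
import pandas as pd

# Made-up rows standing in for the search result (not real Macrobond data).
# Assumption: each Region value is a list of region codes.
sample = pd.DataFrame(
    {
        "Name": ["series_a", "series_b"],
        "Region": [["gb"], ["gb", "city_x"]],
    }
)

# Join every region code of a row into one comma-separated string.
sample["RegionString"] = sample["Region"].apply(lambda codes: ", ".join(map(str, codes)))

# Alternative (hypothetical column): keep only the first region code of each row.
sample["FirstRegion"] = sample["Region"].apply(lambda codes: codes[0])

print(sample[["Name", "RegionString", "FirstRegion"]])
```

Joining keeps every region visible in the output, while taking the first element gives a single value that is easier to group on.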

### Let's convert the date-time to years only

In [None]:
data_frame["FirstRevisionYear"] = data_frame["FirstRevisionTimeStamp"].str[:4]
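
As a possible alternative to slicing the first four characters, the timestamps can be parsed with `pd.to_datetime`. The sketch below uses made-up values and assumes the timestamps are ISO 8601 strings, with missing entries for series that have no revision history.

```python
import pandas as pd

# Made-up timestamps; None stands for a series without revision history.
stamps = pd.Series(["2019-03-29T09:30:00Z", None, "2021-11-10T07:00:00Z"])

# String slicing, as in the cell above: the year stays a string and
# missing values stay missing.
year_str = stamps.str[:4]

# Parsing instead: invalid or missing values become NaT, and the year
# is stored as a nullable integer.
parsed = pd.to_datetime(stamps, utc=True, errors="coerce")
year_int = parsed.dt.year.astype("Int64")

print(pd.DataFrame({"as_string": year_str, "as_integer": year_int}))
```

Slicing is simpler and is enough for grouping; parsing is useful if the years need to be sorted or compared numerically.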

### Displaying the new DataFrame
Let's see how our transformations have been applied by selecting a few columns: `df.iloc[rows, [columns]]`. Note that we are also dropping rows with NaN values in the FirstRevisionYear column: `df.dropna(subset=['FirstRevisionYear'])`.

In [None]:
data_frame_final = data_frame.dropna(subset=["FirstRevisionYear"]).iloc[0:1000, [0, 1, 6, 3, 4, 7]]
data_frame_final
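
Positional indices such as `[0, 1, 6, 3, 4, 7]` depend on the column order, so selecting by name may be easier to read. The sketch below rebuilds a small made-up frame with the same column layout as above and checks that both selections return the same result; the rows are hypothetical.

```python
import pandas as pd

# Made-up frame with the same column order as the notebook's data_frame.
sample = pd.DataFrame(
    {
        "Name": ["a", "b", "c"],
        "FullDescription": ["desc a", "desc b", "desc c"],
        "Region": [["gb"], ["gb"], ["gb", "city_x"]],
        "Frequency": ["monthly", "quarterly", "monthly"],
        "Source": ["src_1", "src_1", "src_2"],
        "FirstRevisionTimeStamp": ["2019-03-29T09:30:00Z", None, "2020-01-15T07:00:00Z"],
    }
)
sample["RegionString"] = sample["Region"].apply(lambda codes: ", ".join(map(str, codes)))
sample["FirstRevisionYear"] = sample["FirstRevisionTimeStamp"].str[:4]

# Drop rows without a revision year, then select the columns by name.
columns = ["Name", "FullDescription", "RegionString", "Frequency", "Source", "FirstRevisionYear"]
by_name = sample.dropna(subset=["FirstRevisionYear"]).loc[:, columns].head(1000)

# The positional selection used in the notebook, for comparison.
by_position = sample.dropna(subset=["FirstRevisionYear"]).iloc[0:1000, [0, 1, 6, 3, 4, 7]]

print(by_name.equals(by_position))
```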

### Group the results by FirstRevisionYear and Frequency
Note that Macrobond started to systematically collect PiT data in 2018. 
PiT coverage prior to 2018 has been backfilled by leveraging the source or internal collection logs.

In [None]:
df_group = data_frame_final.groupby(["FirstRevisionYear", "Frequency"])["Name"].count().reset_index(name="Count")
df_group
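
The grouped counts can also be reshaped into a year-by-frequency table, which may be easier to scan. This is a sketch on made-up rows, not actual coverage figures.

```python
import pandas as pd

# Made-up rows; only the columns needed for the grouping are included.
sample = pd.DataFrame(
    {
        "Name": ["a", "b", "c", "d"],
        "Frequency": ["monthly", "monthly", "quarterly", "monthly"],
        "FirstRevisionYear": ["2018", "2018", "2018", "2019"],
    }
)

# Same grouping as in the cell above.
counts = (
    sample.groupby(["FirstRevisionYear", "Frequency"])["Name"]
    .count()
    .reset_index(name="Count")
)

# One row per year, one column per frequency; missing combinations become 0.
table = (
    counts.pivot(index="FirstRevisionYear", columns="Frequency", values="Count")
    .fillna(0)
    .astype(int)
)
print(table)
```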