# 5.1 - Macrobond Data API for Python - Using the Macrobond Tree Structure

*Retrieving information from the Macrobond Tree Structure*
Note that the Tree Structure changes frequently. Hardcoding parts of the tree to build and update your universe in production is therefore not recommended. The methods below are best suited to data exploration.

This notebook is designed to act as a template and guideline in which certain elements can be manipulated to get the desired outcome. Here we demonstrate how you can use the Macrobond Data API for Python to locate a Time Series in Macrobond's proprietary Tree Structure, which is used across various Macrobond products.

Time Series have been grouped and organized into a tree structure by Macrobond. They are drawn from thousands of primary sources from around the world.

**National & International Databases**

For these databases, we have reproduced the original source structure and classification in the application and web API. For the most part, these databases are from international sources that provide harmonized data across countries. However, we have chosen to reproduce the structure of certain primary sources, as well. You’ll find the datasets typically cover most, if not all, the data available from the source.

When you wish to locate a Time Series within the Tree Structure, there are three database entry points in which the Time Series can be found:
- Concept & Category
- Country & Region
- Source & Release

Note that some Time Series can be found under all three entry points, while others appear under only one or two of them. For instance, a Time Series that carries a RegionKey will be present under Concept & Category.

**Country & Region**

Each of the three main entry points organizes the structure according to a different principle. Country & Region, for example, is what you would use if you want to look at a variety of data for a particular country.

**Source & Release**

For finding data from a specific source, choose the Source & Release view, especially if you are familiar with the way the source organizes its data.

**Concept & Category**

If you are looking for one indicator to compare across several countries, unemployment for example, the Concept & Category view is the place to go.

You can find a full description of all methods and parameters used in the examples in the [documentation of the API](https://macrobond.github.io/macrobond-data-api/common/api.html).

*Full error handling is omitted for brevity*

***

## Importing packages

In [1]:
import macrobond_data_api as mda
from macrobond_data_api.web import WebClient
import pandas as pd

***

## Locating a time series in Macrobond's Tree Structure

In the example below, we will use time series `arpric0432`: 
> **Argentina, Consumer Price Index, Total, Patagonia, Index**

The method used is `find_locations`, called on `series_tree` of an open `WebClient` session.

In [2]:
with WebClient() as api:
    tree_structure = api.session.series_tree.find_locations("arpric0432")

We can now look at the various ways to locate the time series in Macrobond's Tree Structure.
Depending on the time series, it can be located under Country & Region and/or Source & Release and/or Concept & Category. The last of these exists if and only if the time series carries a RegionKey (a.k.a. Concept in the Macrobond application).

A RegionKey associates a series with a concept, such as gdp_total. These concepts are identified by Macrobond and are used to find comparable series representing equivalent concepts across regions/countries. Each concept is held by only one designated series per region/country.
Feel free to refer to notebook 2.2 - Screening on a Concept for more information.

In [3]:
tree_structure_df = pd.json_normalize(tree_structure)
tree_structure_df
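As a rough illustration of how the entry points show up in such a result, the sketch below groups location paths by their root title. It assumes each location is a nested dict with a `title` and an optional `child`, which is the shape the `treestring` helper later in this notebook relies on. The sample data and the helper names `path_titles` and `group_by_entry_point` are hypothetical, not real API output or library functions.

```python
from collections import defaultdict

# Hypothetical sample mimicking the assumed shape of a find_locations result.
sample_locations = [
    {"title": "Concept & Category",
     "child": {"title": "Prices", "child": {"title": "Consumer Price Index"}}},
    {"title": "Country & Region",
     "child": {"title": "Argentina", "child": {"title": "Prices"}}},
    {"title": "Country & Region",
     "child": {"title": "Argentina", "child": {"title": "Regional"}}},
]


def path_titles(node):
    """Return the list of titles from the root node down to the leaf."""
    titles = []
    while node is not None:
        titles.append(node["title"])
        node = node.get("child")
    return titles


def group_by_entry_point(locations):
    """Map each root title (entry point) to the paths found under it."""
    grouped = defaultdict(list)
    for location in locations:
        titles = path_titles(location)
        grouped[titles[0]].append(" > ".join(titles[1:]))
    return dict(grouped)


grouped = group_by_entry_point(sample_locations)
for entry_point, paths in grouped.items():
    print(entry_point, paths)
```

An entry point that is missing from the grouped result simply has no location for that series.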

Finding the location(s) of a time series can support the workflow from research to production: when users find time series in the Macrobond application, the locations document where these series can be found.

***

## Downloading the time series from a node within Macrobond's Tree Structure

The endpoint used is SeriesTree/GetLeafSeries

When you find time series in a specific node of the Tree but cannot download them all through a RegionKey or through search parameters on the source or region, you can instead download all the time series included in this node.

In the example below, we are using time series `arpric0403`

> **Country & Region > Argentina > Prices > Consumer Price Index > Regional > Argentina National Institute of Statistics & Censuses (INDEC) > Patagonia > All Items**


You can enter the path to the leaf of the tree as a sequence of node descriptions, encoded according to RFC 2396 and separated by '/'.

Let's convert our location into such a path:

In [4]:
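def treestring(tree, result=""):
    """Join the titles from the root node down to the leaf with '/'."""
    if "child" in tree:
        # Not at the leaf yet: append this title and descend into the child
        return treestring(tree["child"], result + tree["title"] + "/")
    # Leaf node: append the last title without a trailing separator
    return result + tree["title"]


# The index depends on the order of the locations returned for this series
String = treestring(tree_structure[3])
String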
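The helper above joins the titles as they are. If a title itself contains a `/` or another reserved character, each segment presumably needs percent-encoding before joining. The sketch below does this with `urllib.parse.quote`, which follows RFC 3986, the successor of RFC 2396. Whether `get_leaf_series` expects pre-encoded input or encodes it itself is not confirmed here, so treat that as an assumption. `encoded_path` and `find_by_root` are hypothetical helpers, and the sample data is made up; `find_by_root` picks a location by its entry point rather than by list position.

```python
from urllib.parse import quote


def encoded_path(node):
    """Join the titles from root to leaf with '/', percent-encoding each one."""
    segments = []
    while node is not None:
        # safe="" so that a '/' inside a title is encoded too
        segments.append(quote(node["title"], safe=""))
        node = node.get("child")
    return "/".join(segments)


def find_by_root(locations, root_title):
    """Return the first location whose root title matches, or None."""
    return next((loc for loc in locations if loc["title"] == root_title), None)


# Hypothetical sample, not real API output.
sample = [
    {"title": "Source & Release", "child": {"title": "INDEC"}},
    {"title": "Country & Region",
     "child": {"title": "Argentina", "child": {"title": "Imports/Exports"}}},
]

location = find_by_root(sample, "Country & Region")
path = encoded_path(location)
print(path)
```

Selecting by root title avoids depending on the order of the returned locations, which may differ from one series to another.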

In [5]:
with WebClient() as api:
    tree_structure = api.session.series_tree.get_leaf_series(String)

Below is the standard response of GetLeafSeries:

In [6]:
df2 = pd.json_normalize(tree_structure)
pd.options.display.max_colwidth = 1000
df2

We can now flatten the series names and titles:

In [7]:
df3 = pd.json_normalize(tree_structure["aspects"], ["groups", ["series"]])
pd.options.display.max_colwidth = 1000
df3
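For readers who want to see what this flattening walks through, here is a minimal sketch in plain Python. It assumes the response is shaped as the record path above suggests: `aspects` containing `groups`, each containing `series`, with the name stored under `properties` then `Name`. The sample response and the helper `leaf_series_names` are hypothetical. The de-duplication step is a precaution in case the same series is listed under more than one aspect or group.

```python
# Hypothetical response, shaped after the record path used above.
sample_response = {
    "aspects": [
        {"groups": [
            {"series": [{"properties": {"Name": "series_a"}},
                        {"properties": {"Name": "series_b"}}]},
            {"series": [{"properties": {"Name": "series_c"}}]},
        ]},
        {"groups": [
            {"series": [{"properties": {"Name": "series_a"}}]},
        ]},
    ]
}


def leaf_series_names(response):
    """Collect series names in order of first appearance, without duplicates."""
    names = []
    seen = set()
    for aspect in response.get("aspects", []):
        for group in aspect.get("groups", []):
            for series in group.get("series", []):
                name = series.get("properties", {}).get("Name")
                if name is not None and name not in seen:
                    seen.add(name)
                    names.append(name)
    return names


names = leaf_series_names(sample_response)
print(names)
```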

You can now retrieve the dates, values and metadata of the time series from the Names extracted above.
Feel free to refer to notebook 1.3 - Macrobond web API - Fetching multiple Time Series (POST) for further examples.

We now store the list of time series as a JSON string:

In [8]:
df4 = pd.DataFrame(df3["properties.Name"]).rename(columns={"properties.Name": "Name"}).apply(pd.Series.explode)
df4

In [9]:
universe = df4.to_json(orient="records")
universe
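The cell above produces a JSON string in memory. If a file on disk is wanted, one possible approach is sketched below using only the standard library. The string `universe_json` is a hypothetical stand-in for `universe`, and the file name and temporary directory are arbitrary choices for this sketch.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the `universe` string built above.
universe_json = '[{"Name": "series_a"}, {"Name": "series_b"}]'

# Write the string to a file; the location is an arbitrary choice here.
target = Path(tempfile.mkdtemp()) / "universe.json"
target.write_text(universe_json, encoding="utf-8")

# Read the file back and recover the plain list of names.
records = json.loads(target.read_text(encoding="utf-8"))
names = [record["Name"] for record in records]
print(names)
```

Since the Tree Structure changes over time, a stored list like this is a snapshot and may need to be rebuilt periodically.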