# Discovering content of interest in the Data Observatory

The Discovery API is a tool for exploring the datasets available in our data lake. Its methods let you navigate the datasets and their properties, so you can tell in advance which sources may be of interest to you before even requesting access to them.

## Catalog: the first step for discovery

The Catalog class provides the methods that serve as the starting point for your discovery. For example, it lets you get the complete list of countries related to the available datasets.


### Get the list of countries

In [3]:
from cartoframes.data.observatory.catalog import Catalog
from cartoframes.data.observatory.country import Country, Countries

countries = Catalog.countries()

countries

  country_iso_code3
0             spain
1               usa


In [4]:
categories = Catalog.categories()
categories

                                    id                name  published_in_web
id
demographics              demographics        Demographics              True
environmental            environmental       Environmental              True
financial                    financial           Financial              True
housing                        housing             Housing              True
human_mobility          human_mobility      Human Mobility              True
points_of_interest  points_of_interest  Points of Interest              True
road_traffic              road_traffic        Road Traffic              True


In [5]:
cat1 = categories.loc['demographics']
cat1

id                  demographics
name                Demographics
published_in_web            True
Name: demographics, dtype: object

In [6]:
cat1['id']

'demographics'

In [7]:
cat1.index

Index(['id', 'name', 'published_in_web'], dtype='object')

In [10]:
names = categories['name']
names

id
demographics                Demographics
environmental              Environmental
financial                      Financial
housing                          Housing
human_mobility            Human Mobility
points_of_interest    Points of Interest
road_traffic                Road Traffic
Name: name, dtype: object

In [11]:
names.loc['demographics']

'Demographics'

In [37]:
from cartoframes.data.observatory.category import Category
isinstance(names, Category)

True

In [20]:
country1 = countries.iloc[0]
country1

country_iso_code3    spain
Name: 0, dtype: object

In [None]:
isinstance(countries, Countries)

In [None]:
import pandas as pd

isinstance(countries, pd.DataFrame)

### Filter one country 

Since the list of countries is also a pandas DataFrame, we can use its familiar methods to explore the data.

In [None]:
filtered_country = countries.iloc[0]
filtered_country

In [None]:
isinstance(filtered_country, Country)

In [None]:
isinstance(filtered_country, pd.Series)
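
Positional access with `iloc` is only one of the familiar DataFrame methods. The sketch below shows value-based filtering on a hand-built stand-in DataFrame that reuses the `country_iso_code3` column seen earlier. It does not call the Data Observatory, and whether a filtered result is still a `Countries` instance is not verified here, so check it with `isinstance` if it matters to you.

```python
import pandas as pd

# Stand-in for the object returned by Catalog.countries(); built by hand
# for illustration, not fetched from the Data Observatory.
countries = pd.DataFrame({'country_iso_code3': ['spain', 'usa']})

# Boolean mask: keep only the rows whose code matches.
spain_rows = countries[countries['country_iso_code3'] == 'spain']

# isin() checks several candidate codes at once.
selected = countries[countries['country_iso_code3'].isin(['usa', 'spain'])]

# A filtered DataFrame keeps its original index labels, so reset them
# before relying on positions such as iloc[0] matching label 0.
usa_first = countries[countries['country_iso_code3'] == 'usa'].reset_index(drop=True)
```

Filtering by value avoids depending on the order in which the countries happen to be listed.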

## Get a particular country

If we already know that a particular country is present in the Catalog, we can retrieve it directly by its ID.

In [None]:
country1 = Catalog.countries().get_by_id('spain')

country1

In [None]:
isinstance(country1, Country)

In [None]:
isinstance(country1, pd.Series)

### Get the datasets for that country

Once we have a Country we can use the discovery methods to get the datasets related to that country.

In [None]:
esp_datasets = country1.datasets()
esp_datasets

In [None]:
isinstance(esp_datasets, pd.DataFrame)

In [None]:
from cartoframes.data.observatory.dataset import Datasets

isinstance(esp_datasets, Datasets)

Again, we are dealing with a pandas DataFrame:

In [None]:
d1 = esp_datasets.iloc[0]
d1

In [None]:
from cartoframes.data.observatory.dataset import Dataset
isinstance(d1, Dataset)

As with a Country, a Dataset can be used to retrieve related properties:

In [None]:
vars1 = d1.variables()
vars1

In [None]:
from cartoframes.data.observatory.variable import Variable, Variables

isinstance(vars1, Variables)
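
The steps in this notebook chain naturally: catalog, then country, then datasets, then variables. The sketch below wraps them in a single helper. The name `first_dataset_variables` is hypothetical and not part of the library. It relies only on the calls shown in the cells above, and it takes the catalog as an argument so the helper carries no imports of its own.

```python
def first_dataset_variables(catalog, country_id):
    """Return the variables of the first dataset listed for a country.

    Hypothetical helper: `catalog` is assumed to be the Catalog class
    used in this notebook, and `country_id` an id such as 'spain'.
    """
    country = catalog.countries().get_by_id(country_id)  # a Country
    datasets = country.datasets()                        # the country's datasets
    first_dataset = datasets.iloc[0]                     # a single Dataset
    return first_dataset.variables()                     # that dataset's variables
```

With the imports from the first cell in place, it could be called as `first_dataset_variables(Catalog, 'spain')`. Taking the first dataset is only for illustration; in practice you would inspect the datasets and pick one deliberately.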