## Explore the Data Observatory catalog

The Data Observatory is a spatial data platform that enables data scientists to augment their data and broaden their analysis. It offers a wide range of datasets from around the globe in a spatial data repository.

This guide is intended for those who are about to start augmenting their own data with CARTOframes and want to explore the public Data Observatory catalog in search of the datasets that best fit their use cases and analyses.

**Note: The catalog is public, so you don't need a CARTO account to search for available datasets.**

### Looking for demographics and financial data in the US in the catalog

In this guide we are going to filter the Data Observatory catalog looking for demographics and financial data in the US.

The catalog comprises thousands of curated spatial datasets, so the easiest way to find what you are looking for is to use a faceted search. A faceted (or hierarchical) search lets you narrow down the results by applying multiple filters based on the faceted classification of the catalog datasets.

Datasets are organized in three main hierarchies:

- Country
- Category
- Geography (or spatial resolution)
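
To make the idea of faceted filtering concrete, here is a minimal sketch in plain Python. It does not use CARTOframes: `Entry`, `facet_filter` and `facet_values` are hypothetical helpers, and the three records are a toy stand-in built from slugs that appear later in this guide.

```python
from collections import namedtuple

# Toy stand-in for catalog entries; the slugs are taken from this guide.
Entry = namedtuple("Entry", "country category geography dataset")

ENTRIES = [
    Entry("usa", "demographics", "ags_blockgroup_1c63771c", "ags_sociodemogr_e92b1637"),
    Entry("usa", "demographics", "ags_blockgroup_1c63771c", "ags_crimerisk_9ec89442"),
    Entry("usa", "financial", "mc_blockgroup_c4b8da4c", "mc_mrli_35402a9d"),
]


def facet_filter(entries, **facets):
    """Keep the entries whose fields match every facet given."""
    return [
        e for e in entries
        if all(getattr(e, name) == value for name, value in facets.items())
    ]


def facet_values(entries, name):
    """List the distinct values still available for one facet."""
    return sorted({getattr(e, name) for e in entries})


# Narrow down by country and category, then list the remaining geographies.
usa_demographics = facet_filter(ENTRIES, country="usa", category="demographics")
available_geographies = facet_values(usa_demographics, "geography")
print(available_geographies)
```

Each filter narrows the set of entries, and the remaining facet values tell you which filters can still be applied. The catalog calls in the rest of this guide follow the same pattern.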

For our analysis we are looking for demographics and financial datasets in the US with a spatial resolution at the level of block groups. 

First, we can discover which geographies (or spatial resolutions) are available for demographics data in the US by filtering the `catalog` by `country` and `category` and listing the available `geographies`.

Let's start exploring the available categories of data for the US:

In [1]:
from cartoframes.data.observatory import Catalog
Catalog().country('usa').categories

[<Category.get('road_traffic')>,
 <Category.get('points_of_interest')>,
 <Category.get('human_mobility')>,
 <Category.get('financial')>,
 <Category.get('demographics')>,
 <Category.get('environmental')>]

For the US, the Data Observatory provides six different categories of datasets. Let's discover the available spatial resolutions for the demographics category (which, at first sight, should contain the population data we need).

In [2]:
from cartoframes.data.observatory import Catalog
geographies = Catalog().country('usa').category('demographics').geographies
geographies

[<Geography.get('ags_blockgroup_1c63771c')>,
 <Geography.get('ags_q17_4739be4f')>,
 <Geography.get('mbi_blockgroups_1ab060a')>,
 <Geography.get('mbi_counties_141b61cd')>,
 <Geography.get('mbi_county_subd_e8e6ea23')>,
 <Geography.get('mbi_pc_5_digit_4b1682a6')>,
 <Geography.get('usct_blockgroup_f45b6b49')>,
 <Geography.get('usct_cbsa_6c8b51ef')>,
 <Geography.get('usct_censustract_bc698c5a')>,
 <Geography.get('usct_congression_b6336b2c')>,
 <Geography.get('usct_county_ec40c962')>,
 <Geography.get('usct_place_12d6699f')>,
 <Geography.get('usct_puma_b859f0fa')>,
 <Geography.get('usct_schooldistr_515af763')>,
 <Geography.get('usct_schooldistr_da72a4cb')>,
 <Geography.get('usct_schooldistr_287be4f7')>,
 <Geography.get('usct_state_4c8090b5')>,
 <Geography.get('usct_zcta5_75071016')>]

Let's keep only the geographies that contain information at the block group level. To do that, we convert the geographies to a pandas `DataFrame` and search for the string `blockgroup` in the `id` of the geographies:

In [3]:
df = geographies.to_dataframe()
df[df['id'].str.contains('blockgroup', case=False, na=False)]

Unnamed: 0,available_in,country_id,description,geom_coverage,geom_type,id,is_public_data,lang,name,provider_id,provider_name,slug,summary_json,update_frequency,version
0,[bq],usa,,0106000020E61000000800000001030000000100000009...,MULTIPOLYGON,carto-do.ags.geography_usa_blockgroup_2015,False,eng,USA Census Block Group,ags,Applied Geographic Solutions,ags_blockgroup_1c63771c,,,2015
2,,usa,MBI Digital Boundaries for USA at Blockgroups ...,01060000005A0100000103000000010000002900000013...,MULTIPOLYGON,carto-do.mbi.geography_usa_blockgroups_2019,False,eng,USA - Blockgroups,mbi,Michael Bauer International,mbi_blockgroups_1ab060a,,,2019
6,,usa,,0106000020E61000000800000001030000000100000009...,MULTIPOLYGON,carto-do-public-data.usa_carto.geography_usa_b...,True,eng,Topologically Integrated Geographic Encoding a...,usa_carto,CARTO shoreline-clipped USA Tiger geographies,usct_blockgroup_f45b6b49,,,2015
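
If you only need the slugs, the same search can be sketched without pandas. The example below works on a hard-coded subset of the slugs printed in the geography listing above, and it matches on the slug rather than on the `id`. Reading the slug prefix as a provider hint is only an observation from this listing, not a documented rule.

```python
# A subset of the slugs copied from the geography listing above.
slugs = [
    "ags_blockgroup_1c63771c",
    "ags_q17_4739be4f",
    "mbi_blockgroups_1ab060a",
    "mbi_counties_141b61cd",
    "usct_blockgroup_f45b6b49",
    "usct_censustract_bc698c5a",
]

# Case-insensitive substring match, like the pandas filter above.
blockgroup_slugs = [s for s in slugs if "blockgroup" in s.lower()]

# Assumption: the text before the first underscore hints at the provider.
provider_prefixes = sorted({s.split("_", 1)[0] for s in blockgroup_slugs})

print(blockgroup_slugs)
print(provider_prefixes)
```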


We have three available geographies from three different providers: Michael Bauer International, Open Data (CARTO's shoreline-clipped TIGER geographies) and AGS (Applied Geographic Solutions). For this example, we are going to look for demographic datasets for the AGS blockgroups geography `ags_blockgroup_1c63771c`:

In [4]:
datasets = Catalog().country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets
datasets

[<Dataset.get('ags_sociodemogr_e92b1637')>,
 <Dataset.get('ags_consumerspe_fe5d060a')>,
 <Dataset.get('ags_retailpoten_ddf56a1a')>,
 <Dataset.get('ags_consumerpro_e8344e2e')>,
 <Dataset.get('ags_businesscou_a8310a11')>,
 <Dataset.get('ags_crimerisk_9ec89442')>]

In [5]:
datasets.to_dataframe()

Unnamed: 0,available_in,category_id,category_name,country_id,data_source_id,description,geography_description,geography_id,geography_name,id,...,lang,name,provider_id,provider_name,slug,summary_json,temporal_aggregation,time_coverage,update_frequency,version
0,[bq],demographics,Demographics,usa,sociodemographic,Census and ACS sociodemographic data estimated...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_sociodemographic_usa...,...,eng,Sociodemographic,ags,Applied Geographic Solutions,ags_sociodemogr_e92b1637,"{'counts': {'rows': 217182, 'cells': 22369746,...",yearly,"[2019-01-01,2020-01-01)",,2019
1,[bq],demographics,Demographics,usa,consumerspending,The Consumer Expenditure database consists of ...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_consumerspending_usa...,...,eng,Consumer Spending,ags,Applied Geographic Solutions,ags_consumerspe_fe5d060a,"{'counts': {'rows': 217182, 'cells': 28016478,...",yearly,"[2018-01-01,2019-01-01)",,2018
2,[bq],demographics,Demographics,usa,retailpotential,The retail potential database consists of aver...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_retailpotential_usa_...,...,eng,Retail Potential,ags,Applied Geographic Solutions,ags_retailpoten_ddf56a1a,"{'counts': {'rows': 217182, 'cells': 28668024,...",yearly,"[2018-01-01,2019-01-01)",,2018
3,[bq],demographics,Demographics,usa,consumerprofiles,Segmentation of the population in sixty-eight ...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_consumerprofiles_usa...,...,eng,Consumer Profiles,ags,Applied Geographic Solutions,ags_consumerpro_e8344e2e,"{'counts': {'rows': 217182, 'cells': 31057026,...",yearly,"[2018-01-01,2019-01-01)",,2018
4,[bq],demographics,Demographics,usa,businesscounts,Business Counts database is a geographic summa...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_businesscounts_usa_b...,...,eng,Business Counts,ags,Applied Geographic Solutions,ags_businesscou_a8310a11,"{'counts': {'rows': 217182, 'cells': 25627476,...",yearly,"[2018-01-01,2019-01-01)",,2018
5,[bq],demographics,Demographics,usa,crimerisk,Using advanced statistical methodologies and a...,,carto-do.ags.geography_usa_blockgroup_2015,USA Census Block Group,carto-do.ags.demographics_crimerisk_usa_blockg...,...,eng,Crime Risk,ags,Applied Geographic Solutions,ags_crimerisk_9ec89442,"{'counts': {'rows': 217182, 'cells': 3040548, ...",yearly,"[2018-01-01,2019-01-01)",,2018
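
The `counts` entry in `summary_json` already gives a rough idea of how wide each dataset is. The sketch below assumes that `cells` equals rows times columns. That is an inference from the numbers shown above, not something this guide states, and it says nothing about whether identifier columns are included.

```python
# (slug, rows, cells) copied from the `summary_json` column above.
counts = [
    ("ags_sociodemogr_e92b1637", 217182, 22369746),
    ("ags_consumerspe_fe5d060a", 217182, 28016478),
    ("ags_retailpoten_ddf56a1a", 217182, 28668024),
    ("ags_consumerpro_e8344e2e", 217182, 31057026),
    ("ags_businesscou_a8310a11", 217182, 25627476),
    ("ags_crimerisk_9ec89442", 217182, 3040548),
]

columns_per_dataset = {}
for slug, rows, cells in counts:
    columns, remainder = divmod(cells, rows)
    # A non-zero remainder would mean the rows-times-columns assumption fails.
    columns_per_dataset[slug] = columns if remainder == 0 else None

for slug, columns in columns_per_dataset.items():
    print(slug, columns)
```

Under that assumption, a dataset with many columns per row is a good candidate for a closer look at its variables.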


They comprise different information: consumer spending, retail potential, consumer profiles, etc.

At first sight, it looks like the dataset with `data_source_id: sociodemographic` might contain the population information we are looking for. Let's get a better understanding of the data it contains by looking at its variables:

In [6]:
from cartoframes.data.observatory import Dataset
dataset = Dataset.get('ags_sociodemogr_e92b1637')
variables = dataset.variables
variables

[<Variable.get('HINCYMED65_310bc888')> #'Median Household Income: Age 65-74 (2019A)',
 <Variable.get('HINCYMED55_1a269b4b')> #'Median Household Income: Age 55-64 (2019A)',
 <Variable.get('HINCYMED45_33daa0a')> #'Median Household Income: Age 45-54 (2019A)',
 <Variable.get('HINCYMED35_4c7c3ccd')> #'Median Household Income: Age 35-44 (2019A)',
 <Variable.get('HINCYMED25_55670d8c')> #'Median Household Income: Age 25-34 (2019A)',
 <Variable.get('HINCYMED24_22603d1a')> #'Median Household Income: Age < 25 (2019A)',
 <Variable.get('HINCYGT200_e552a738')> #'Household Income > $200000 (2019A)',
 <Variable.get('HINCY6075_1933e114')> #'Household Income $60000-$74999 (2019A)',
 <Variable.get('HINCY4550_f7ad7d79')> #'Household Income $45000-$49999 (2019A)',
 <Variable.get('HINCY4045_98177a5c')> #'Household Income $40000-$44999 (2019A)',
 <Variable.get('HINCY3540_73617481')> #'Household Income $35000-$39999 (2019A)',
 <Variable.get('HINCY2530_849c8523')> #'Household Income $25000-$29999 (2019A)',
 <V

In [7]:
from cartoframes.data.observatory import Dataset
vdf = variables.to_dataframe()
vdf

Unnamed: 0,agg_method,column_name,dataset_id,db_type,description,id,name,slug,starred,summary_json,variable_group_id
0,AVG,HINCYMED65,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age 65-74 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED65,HINCYMED65_310bc888,False,"{'head': [67500, 0, 0, 50000, 0, 0, 0, 0, 0, 0...",
1,AVG,HINCYMED55,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age 55-64 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED55,HINCYMED55_1a269b4b,False,"{'head': [67500, 87500, 0, 30000, 0, 0, 0, 0, ...",
2,AVG,HINCYMED45,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age 45-54 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED45,HINCYMED45_33daa0a,False,"{'head': [67500, 0, 0, 60000, 0, 0, 0, 0, 0, 0...",
3,AVG,HINCYMED35,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age 35-44 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED35,HINCYMED35_4c7c3ccd,False,"{'head': [0, 87500, 0, 5000, 0, 0, 0, 0, 0, 0]...",
4,AVG,HINCYMED25,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age 25-34 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED25,HINCYMED25_55670d8c,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
5,AVG,HINCYMED24,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Median Household Income: Age < 25 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYMED24,HINCYMED24_22603d1a,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
6,AVG,HINCYGT200,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Household Income > $200000 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCYGT200,HINCYGT200_e552a738,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
7,AVG,HINCY6075,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Household Income $60000-$74999 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCY6075,HINCY6075_1933e114,False,"{'head': [5, 0, 0, 2, 0, 0, 0, 0, 0, 0], 'tail...",
8,AVG,HINCY4550,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Household Income $45000-$49999 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCY4550,HINCY4550_f7ad7d79,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
9,AVG,HINCY4045,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Household Income $40000-$44999 (2019A),carto-do.ags.demographics_sociodemographic_usa...,HINCY4045,HINCY4045_98177a5c,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",


Searching the variable descriptions for `pop` shows that there are several variables related to population, so this is the `Dataset` we are looking for:

In [8]:
vdf[vdf['description'].str.contains('pop', case=False, na=False)]

Unnamed: 0,agg_method,column_name,dataset_id,db_type,description,id,name,slug,starred,summary_json,variable_group_id
22,SUM,EDUCYSHSCH,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Pop 25+ 9th-12th grade no diploma (2019A),carto-do.ags.demographics_sociodemographic_usa...,EDUCYSHSCH,EDUCYSHSCH_5c444deb,False,"{'head': [0, 0, 0, 4, 4, 0, 0, 0, 0, 0], 'tail...",
23,SUM,EDUCYLTGR9,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Pop 25+ less than 9th grade (2019A),carto-do.ags.demographics_sociodemographic_usa...,EDUCYLTGR9,EDUCYLTGR9_cbcfcc89,False,"{'head': [1, 1, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
24,SUM,EDUCYHSCH,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Pop 25+ HS graduate (2019A),carto-do.ags.demographics_sociodemographic_usa...,EDUCYHSCH,EDUCYHSCH_b236c803,False,"{'head': [5, 0, 0, 8, 14, 0, 0, 0, 0, 0], 'tai...",
25,SUM,EDUCYGRAD,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Pop 25+ graduate or prof school degree (2019A),carto-do.ags.demographics_sociodemographic_usa...,EDUCYGRAD,EDUCYGRAD_d0179ccb,False,"{'head': [0, 0, 0, 1, 3, 0, 0, 0, 0, 0], 'tail...",
26,SUM,EDUCYBACH,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Pop 25+ Bachelors degree (2019A),carto-do.ags.demographics_sociodemographic_usa...,EDUCYBACH,EDUCYBACH_c2295f79,False,"{'head': [0, 0, 0, 1, 7, 0, 0, 0, 0, 0], 'tail...",
31,SUM,AGECYGT85,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Population age 85+ (2019A),carto-do.ags.demographics_sociodemographic_usa...,AGECYGT85,AGECYGT85_b9d8a94d,False,"{'head': [1, 0, 0, 2, 2, 0, 0, 0, 0, 0], 'tail...",
32,SUM,AGECYGT25,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Population Age 25+ (2019A),carto-do.ags.demographics_sociodemographic_usa...,AGECYGT25,AGECYGT25_433741c7,False,"{'head': [6, 3, 0, 18, 41, 0, 0, 0, 0, 0], 'ta...",
33,SUM,AGECYGT15,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Population Age 15+ (2019A),carto-do.ags.demographics_sociodemographic_usa...,AGECYGT15,AGECYGT15_681a1204,False,"{'head': [6, 3, 0, 20, 959, 0, 0, 0, 0, 0], 't...",
34,SUM,AGECY8084,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Population age 80-84 (2019A),carto-do.ags.demographics_sociodemographic_usa...,AGECY8084,AGECY8084_b25d4aed,False,"{'head': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'tail...",
35,SUM,AGECY7579,carto-do.ags.demographics_sociodemographic_usa...,INTEGER,Population age 75-79 (2019A),carto-do.ags.demographics_sociodemographic_usa...,AGECY7579,AGECY7579_15dcf822,False,"{'head': [0, 0, 0, 1, 0, 0, 0, 0, 0, 0], 'tail...",
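
Note that the substring `pop` also matches the education variables ("Pop 25+ ..."). If you only want the variables whose description starts with "Population", a stricter pattern helps. The sketch below uses the standard `re` module on a few `(slug, description)` pairs copied from the table above.

```python
import re

# (slug, description) pairs copied from the table above.
variables = [
    ("EDUCYSHSCH_5c444deb", "Pop 25+ 9th-12th grade no diploma (2019A)"),
    ("EDUCYHSCH_b236c803", "Pop 25+ HS graduate (2019A)"),
    ("AGECYGT85_b9d8a94d", "Population age 85+ (2019A)"),
    ("AGECYGT25_433741c7", "Population Age 25+ (2019A)"),
    ("AGECY8084_b25d4aed", "Population age 80-84 (2019A)"),
]

# Match the whole word "population" at the start of the description.
pattern = re.compile(r"^population\b", re.IGNORECASE)

population_only = [slug for slug, text in variables if pattern.search(text)]
print(population_only)
```

With pandas, the same pattern can be passed to `str.contains`, for example `vdf['description'].str.contains(r'^population\b', case=False, na=False)`.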


We can follow the very same process to discover `financial` datasets. Let's see how it works by first listing the geographies available for the `financial` category in the US:

In [9]:
Catalog().country('usa').category('financial').geographies

[<Geography.get('mc_block_9ebc626c')>,
 <Geography.get('mc_blockgroup_c4b8da4c')>,
 <Geography.get('mc_county_31cde2d')>,
 <Geography.get('mc_state_cc31b9d1')>,
 <Geography.get('mc_tract_3704a85c')>,
 <Geography.get('mc_zipcode_263079e3')>]

We can clearly identify a geography at the blockgroup resolution, provided by Mastercard:

In [10]:
from cartoframes.data.observatory import Geography
Geography.get('mc_blockgroup_c4b8da4c').to_dict()

{'id': 'carto-do.mastercard.geography_usa_blockgroup_2019',
 'slug': 'mc_blockgroup_c4b8da4c',
 'name': 'USA Census Block Groups',
 'description': None,
 'country_id': 'usa',
 'provider_id': 'mastercard',
 'provider_name': 'Mastercard',
 'lang': 'eng',
 'geom_type': 'MULTIPOLYGON',
 'update_frequency': None,
 'version': '2019',
 'is_public_data': False}

Now we can list the available datasets provided by Mastercard for the US Census blockgroups spatial resolution:

In [11]:
Catalog().country('usa').category('financial').geography('mc_blockgroup_c4b8da4c').datasets.to_dataframe()

Unnamed: 0,available_in,category_id,category_name,country_id,data_source_id,description,geography_description,geography_id,geography_name,id,...,lang,name,provider_id,provider_name,slug,summary_json,temporal_aggregation,time_coverage,update_frequency,version
0,,financial,Financial,usa,mrli,"MRLI scores validate, evaluate and benchmark t...",,carto-do.mastercard.geography_usa_blockgroup_2019,USA Census Block Groups,carto-do.mastercard.financial_mrli_usa_blockgr...,...,eng,MRLI Data for Census Block Groups,mastercard,Mastercard,mc_mrli_35402a9d,"{'counts': {'rows': 1072383, 'cells': 22520043...",monthly,,monthly,2019


Let's finally inspect the variables available in the dataset:

In [12]:
Dataset.get('mc_mrli_35402a9d').variables

[<Variable.get('transactions_st_d22b3489')> #'Same as transactions_score, but only comparing ran...',
 <Variable.get('region_id_3c7d0d92')> #'Region identifier (construction varies depending o...',
 <Variable.get('category_8c84b3a7')> #'Industry/sector categories (Total Retail, Retail e...',
 <Variable.get('month_57cd6f80')> #'Name of the month the data refers to',
 <Variable.get('region_type_d875e9e7')> #'Administrative boundary type (block, block group, ...',
 <Variable.get('stability_state_8af6b92')> #'Same as stability_score, but only comparing rankin...',
 <Variable.get('sales_score_49d02f1e')> #'Rank based on the average monthly sales for the pr...',
 <Variable.get('stability_score_6756cb72')> #'Rank based on the change in merchants between the ...',
 <Variable.get('ticket_size_sta_3bfd5114')> #'Same as ticket_size_score, but only comparing rank...',
 <Variable.get('sales_metro_sco_e088134d')> #'Same as sales_score, but only comparing ranking wi...',
 <Variable.get('transactions_

### Dataset and variables metadata

The Data Observatory catalog is not only a repository of curated spatial datasets; it also contains valuable information that helps you better understand the underlying data of every dataset, so you can make an informed decision about which data best fits your problem.

Some of the augmented metadata you can find for each dataset in the catalog is:

- `head` and `tail` methods to get a glimpse of the actual data. This helps you understand the available columns, data types, etc., so you can start modelling your problem right away.
- `geom_coverage` to visualize on a map the geographical coverage of the data in the `Dataset`.
- `counts`, `fields_by_type` and a full `describe` method with stats of the actual values in the dataset, such as: average, stdev, quantiles, min, max, median for each of the variables of the dataset.

You don't need a subscription to a dataset to query the augmented metadata; it is publicly available to anyone exploring the Data Observatory catalog.

Let's go over some of that information, starting with a glimpse of the first ten rows of the actual data in the dataset:

In [None]:
from cartoframes.data.observatory import Dataset
dataset = Dataset.get('ags_sociodemogr_e92b1637')

In [None]:
dataset.head()

Alternatively, you can get the last ten rows with `dataset.tail()`.

An overview of the coverage of the dataset:

In [None]:
dataset.geom_coverage()

Some stats about the dataset:

In [None]:
dataset.counts()

In [None]:
dataset.fields_by_type()

In [None]:
dataset.describe()

Every `Dataset` instance in the catalog contains other useful metadata:

- slug: A short ID
- name and description: Free text attributes
- country
- geography: Every dataset is related to a Geography instance
- category
- provider
- data source
- lang
- temporal aggregation
- time coverage
- update frequency
- version
- is_public_data: whether you need a license to use the dataset for enrichment purposes or not

In [None]:
dataset.to_dict()
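
The time coverage values shown in the dataset listing earlier, such as `[2019-01-01,2020-01-01)`, look like standard interval notation: a square bracket includes the endpoint and a round bracket excludes it. Assuming that reading is correct, the sketch below parses such a string with the standard library only. `parse_time_coverage` and `covers` are hypothetical helpers, not part of CARTOframes.

```python
from datetime import date


def parse_time_coverage(text):
    """Parse '[YYYY-MM-DD,YYYY-MM-DD)' into (start, end, end_inclusive)."""
    if len(text) < 2 or text[0] != "[" or text[-1] not in ")]":
        raise ValueError(f"unexpected time coverage format: {text!r}")
    start_text, end_text = text[1:-1].split(",")
    start = date.fromisoformat(start_text.strip())
    end = date.fromisoformat(end_text.strip())
    return start, end, text[-1] == "]"


def covers(text, day):
    """Tell whether `day` falls inside the coverage interval."""
    start, end, end_inclusive = parse_time_coverage(text)
    return start <= day < end or (end_inclusive and day == end)


coverage = "[2019-01-01,2020-01-01)"
print(parse_time_coverage(coverage))
print(covers(coverage, date(2019, 12, 31)))
```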

There is also some interesting metadata for each variable in the dataset:

- id
- slug: A short ID
- name and description
- column_name: Actual column name in the table that contains the data
- db_type: SQL type in the database
- dataset_id
- agg_method: Aggregation method used
- temporal aggregation and time coverage

Variables are the most important asset in the catalog. When exploring datasets in the Data Observatory catalog, it is very important to understand clearly which variables are available to enrich your own data.

For each `Variable` in each dataset, the Data Observatory provides (as it does with datasets) a set of methods and attributes to understand their underlying data.

Some of them are:

- `head` and `tail` methods to get a glimpse of the actual data and start modelling your problem right away.
- `counts`, `quantiles` and a full `describe` method with stats of the actual values in the dataset, such as: average, stdev, quantiles, min, max, median for each of the variables of the dataset.
- a `histogram` plot with the distribution of the values of each variable.

Let's overview some of that augmented metadata for the variables in the AGS population dataset.

In [None]:
from cartoframes.data.observatory import Variable
variable = Variable.get('POPPY_946f4ed6')
variable

In [None]:
variable.to_dict()

There are also some utility methods to understand the underlying data for each variable:

In [None]:
variable.head()

In [None]:
variable.counts()

In [None]:
variable.quantiles()

In [None]:
variable.histogram()

In [None]:
variable.describe()
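
To get a feel for the kind of summary these methods report, the sketch below computes a few statistics and a coarse histogram with the Python standard library. It is not how the catalog computes them. It runs on just the ten `head` values of the `AGECYGT15` variable ("Population Age 15+") shown in the `summary_json` column earlier in this guide, and `text_histogram` is a hypothetical helper.

```python
import statistics
from collections import Counter

# `head` sample of AGECYGT15, copied from the variables table in this guide.
sample = [6, 3, 0, 20, 959, 0, 0, 0, 0, 0]

summary = {
    "count": len(sample),
    "min": min(sample),
    "max": max(sample),
    "mean": statistics.mean(sample),
    "median": statistics.median(sample),
    "stdev": statistics.stdev(sample),
}


def text_histogram(values, bin_width):
    """Count the values per fixed-width bin, keyed by each bin's lower edge."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


histogram = text_histogram(sample, bin_width=100)

for name, value in summary.items():
    print(name, value)
for lower_edge, n in histogram.items():
    print(f"{lower_edge:>4}+ {'#' * n}")
```

Even on ten rows, the gap between the mean and the median shows how skewed the values are. That is the kind of signal to look for in the full `describe()` and `histogram()` output.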

### Subscribe to a Dataset in the catalog

Once you have explored the catalog and found a dataset with the variables you need for your analysis and the right spatial resolution, check its `is_public_data` attribute to find out whether you can use it directly from CARTOframes or first need to subscribe for a license.

Subscriptions to datasets allow you to use them from CARTOframes to enrich your own data or to download them. See the enrichment guides for more information about this.

Let's see the dataset and geography in our previous example:

In [None]:
dataset = Dataset.get('ags_sociodemogr_e92b1637')

In [None]:
dataset.is_public_data

In [None]:
from cartoframes.data.observatory import Geography
geography = Geography.get(dataset.geography)

In [None]:
geography.is_public_data

Neither `dataset` nor `geography` is public data, which means you need a subscription to use them to enrich your own data.
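
The check can be wrapped in a small helper that works on metadata dictionaries like the ones returned by `to_dict()`, which include an `is_public_data` key. This is a minimal sketch: `needs_subscription` is a hypothetical function, and the two dictionaries only hold the values reported in this guide for the AGS example.

```python
def needs_subscription(dataset_meta, geography_meta):
    """Return the names of the entities that are not public data."""
    entities = {"dataset": dataset_meta, "geography": geography_meta}
    return [name for name, meta in entities.items() if not meta["is_public_data"]]


# Values as reported in this guide for the AGS example.
dataset_meta = {"slug": "ags_sociodemogr_e92b1637", "is_public_data": False}
geography_meta = {"slug": "ags_blockgroup_1c63771c", "is_public_data": False}

pending = needs_subscription(dataset_meta, geography_meta)
print(pending)
```

An empty list would mean that both the dataset and its geography can be used without a license.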

**To subscribe to data in the Data Observatory catalog you need a CARTO account with access to Data Observatory**

In [None]:
from cartoframes.auth import set_default_credentials
YOUR_CARTO_USER_NAME = ''
YOUR_CARTO_API_KEY = ''
set_default_credentials(username=YOUR_CARTO_USER_NAME, api_key=YOUR_CARTO_API_KEY)
dataset.subscribe()

In [None]:
geography.subscribe()

**Licenses to data in the Data Observatory grant you the right to use the subscribed data for a period of one year. Every dataset or geography you want to use to enrich your own data requires a valid license, unless it is public data.**

You can check the current status of your subscriptions directly from the catalog.

In [None]:
Catalog().subscriptions()

### About nested filters in the Catalog instance

**Note that every time you search the catalog as in the examples above, you create a new instance of the `Catalog` class. Alternatively, when applying `country`, `category` and `geography` filters to a catalog instance, you can reuse the same instance by calling the `catalog.clean_filters()` method.**

So for example, if you've filtered the catalog this way:

In [None]:
catalog = Catalog()
catalog.country('usa').category('demographics').datasets

And now you want to retrieve the `financial` datasets for the US, you should do one of the following:

1. Create a new instance of the catalog: `catalog = Catalog()`
2. Call `catalog.clean_filters()` on the existing instance.
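
The reason is that filters stay attached to the instance they were applied to. The toy class below illustrates that behaviour in plain Python. `ToyCatalog` is a hypothetical model written for this explanation, not the CARTOframes implementation, and its two entries reuse slugs from this guide.

```python
class ToyCatalog:
    """Toy model of a catalog whose filter methods mutate and return itself."""

    def __init__(self, entries):
        self._entries = entries
        self._filters = {}

    def country(self, value):
        self._filters["country"] = value
        return self

    def category(self, value):
        self._filters["category"] = value
        return self

    def clean_filters(self):
        self._filters = {}

    @property
    def datasets(self):
        return [
            e["dataset"] for e in self._entries
            if all(e[key] == value for key, value in self._filters.items())
        ]


entries = [
    {"country": "usa", "category": "demographics", "dataset": "ags_sociodemogr_e92b1637"},
    {"country": "usa", "category": "financial", "dataset": "mc_mrli_35402a9d"},
]

catalog = ToyCatalog(entries)
demographics = catalog.country("usa").category("demographics").datasets

# The filters are still set on `catalog`, so this listing is still filtered.
still_filtered = catalog.datasets

# Clearing the filters makes the same instance usable for a new search.
catalog.clean_filters()
all_datasets = catalog.datasets
financial = catalog.country("usa").category("financial").datasets
```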

Another point worth noting is that, although nesting filters over a `Catalog` instance is a recommended way to discover data, you don't need to follow the complete hierarchy (`country`, `category`, `geography`) to list the available datasets.

Alternatively, you can just list all the datasets in the `US` or list all the datasets for the `demographics` category, and continue exploring the catalog locally with pandas.

Let's see an example of that, in which we filter public data for the `demographics` category worldwide:

In [None]:
df = Catalog().category('demographics').datasets.to_dataframe()
df[df['is_public_data'] == True]

### Learn more

We also recommend checking these resources if you want to know more about the Data Observatory catalog:

- The CARTOframes Enrichment guides and examples
- [Our public website](https://carto.com/platform/location-data-streams/)
- Your user dashboard: Under the data section
- The CARTOframes catalog reference 