# Taxonomy - Regions
This notebook retrieves the full list of Regions used by Factiva. The returned codes can be added to the Retrieval API payload.

## Code Initialisation
Dependencies and environment initialisation. Taxonomy requests require authentication.

Ensure there's a `.env` file with your credentials in the same directory as this notebook. Use the `.env.example` file as a template.
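A quick way to confirm that the `.env` file was picked up is to check for the three variables read in the Constants cell. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper and not part of `utils.py`. Run it after `load_dotenv()`.

```python
import os

# Variable names as read in the Constants cell of this notebook.
REQUIRED_VARS = ("FACTIVA_CLIENTID", "FACTIVA_USERNAME", "FACTIVA_PASSWORD")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper: `environ` defaults to the process environment and
    can be replaced by a plain dict for testing.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing credentials in .env: {', '.join(missing)}")
```

An empty list means all three values are present; it does not confirm that they are valid credentials.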

In [52]:
import os
import requests as r
import pandas as pd
from IPython.display import Markdown
import utils as u
from dotenv import load_dotenv

load_dotenv()

True

## Constants

In [53]:
API_HOST = 'api.dowjones.com'
AUTH_HOST = 'accounts.dowjones.com'
CLIENT_ID = os.getenv('FACTIVA_CLIENTID')
USERNAME = os.getenv('FACTIVA_USERNAME')
PASSWORD = os.getenv('FACTIVA_PASSWORD')
AUTH_URL = f"https://{AUTH_HOST}/oauth2/v1/token"
REG_URL = f"https://{API_HOST}/taxonomy/factiva-regions/list"

## Authentication - Generate Bearer

For details about getting the `bearer_token`, please see the `utils.py` file.

In [54]:
bearer_token = u.get_bearer_token(CLIENT_ID, USERNAME, PASSWORD, AUTH_URL)
# Short label for display only; guards against USERNAME being unset in .env
user_label = (USERNAME or 'unknown').split('@')[0].split('-')[0]
if bearer_token:
    display(Markdown(f"**Authentication Successful**: Bearer token created for user {user_label}"))
else:
    display(Markdown(f"**Authentication Failed**: Cannot obtain the Bearer token for the user {user_label}"))
    
req_headers = {
    "Authorization": f"Bearer {bearer_token}",
    "Content-Type": "application/json",
    "Accept": "application/json"
}

**Authentication Successful**: Bearer token created for user 9ZZZ159100

## Taxonomy API Request

In [55]:
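# Request the full regions list; the timeout avoids hanging on a stalled connection
reg_response = r.get(f"{REG_URL}?language=en&parts=All", headers=req_headers, timeout=30)
# Fail early on HTTP errors (e.g. an expired token) instead of a KeyError below
reg_response.raise_for_status()
reg_dict = reg_response.json()['data']['attributes']['regions']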
flat_reg = []
for item in reg_dict:
    parents = []
    if 'parent' in item and item['parent']:
        # parent can be a dict or a list of dicts
        if isinstance(item['parent'], dict):
            parents.append(item['parent'].get('code'))
        elif isinstance(item['parent'], list):
            parents = [p.get('code') for p in item['parent'] if 'code' in p]
    flat_reg.append({
        'reg_code': item.get('code'),
        'reg_name': item.get('descriptor'),
        'description': item.get('description'),
        'region_type': item.get('regionType'),
        'parents': parents
    })
reg_df = pd.DataFrame(flat_reg)
if reg_df.shape[0] > 5:
    display(Markdown("**Regions Retrieved Successfully**"))
    display(Markdown(f"Returned {reg_df.shape[0]} regions"))
else:
    display(Markdown("**Regions Retrieval Failed**"))

**Regions Retrieved Successfully**

Returned 977 regions
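The flattening loop above accepts a `parent` field that is either a single dict or a list of dicts. A minimal sketch of the same normalisation as a standalone function follows; `parent_codes` is a hypothetical name and the sample records are made up to mimic those shapes, not real API output. Unlike the loop above, this version also drops a dict parent that has no `code`.

```python
def parent_codes(parent):
    """Return a list of parent codes from a dict, a list of dicts, or None."""
    if isinstance(parent, dict):
        parent = [parent]  # treat a single parent like a one-item list
    if not isinstance(parent, list):
        return []  # missing or unexpected value: no parents
    return [p["code"] for p in parent if isinstance(p, dict) and p.get("code")]


# Made-up records mimicking the shapes handled above; not real API output.
sample = [
    {"code": "AAA", "parent": {"code": "ROOT"}},
    {"code": "BBB", "parent": [{"code": "AAA"}, {"code": "ROOT"}]},
    {"code": "ROOT"},
]
for item in sample:
    print(item["code"], parent_codes(item.get("parent")))
```

Always returning a list keeps the `parents` column uniform, which is what the `'ITALY' in x` filter in the next section relies on.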

## Displaying and Filtering Regions

### Display & Filter

In [56]:
# Show all
# reg_df
# Filter by parent
reg_df[reg_df['parents'].apply(lambda x: 'ITALY' in x)]

| | reg_code | reg_name | description | region_type | parents |
|---|---|---|---|---|---|
| 3 | ABZZO | Abruzzo | Abruzzo is a region in southern Italy. | StateProvince | [ITALY] |
| 45 | AOSTA | Aosta Valley | Aosta Valley is an autonomous region in northw... | StateProvince | [ITALY] |
| 47 | APULIA | Apulia | Apulia is a region in southern Italy. | StateProvince | [ITALY] |
| 97 | BASILC | Basilicata | Basilicata is a region in southern Italy. | StateProvince | [ITALY] |
| 156 | CALABR | Calabria | Calabria is a region in southern Italy. | StateProvince | [ITALY] |
| 161 | CAMPAN | Campania | Campania is a region in southern Italy. | StateProvince | [ITALY] |
| 276 | EMILIA | Emilia-Romagna | Emilia-Romagna is a region in northeastern Italy. | StateProvince | [ITALY] |
| 302 | FRIULI | Friuli-Venezia Giulia | Friuli-Venezia Giulia is an autonomous region ... | StateProvince | [ITALY] |
| 463 | LAZIO | Lazio | Lazio is a region in Central Italy. | StateProvince | [ITALY] |
| 471 | LIGUR | Liguria | Liguria is a region in northwestern Italy. | StateProvince | [ITALY] |
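The parent filter above returns direct children only. Since the outputs in this notebook show more than one level (provinces under a country, countries under a zone), it can be useful to collect everything below a given code. The sketch below works on a list of records shaped like `flat_reg`; `descendants` is a hypothetical helper, and the `TOP` code in the sample is made up.

```python
from collections import defaultdict, deque


def descendants(records, root):
    """Return the codes of all regions below `root`, in breadth-first order."""
    # Index: parent code -> list of direct child codes
    children = defaultdict(list)
    for rec in records:
        for parent in rec["parents"]:
            children[parent].append(rec["reg_code"])

    found, visited, queue = [], {root}, deque([root])
    while queue:
        for child in children[queue.popleft()]:
            if child not in visited:  # a region may have several parents
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


# Made-up hierarchy; in the notebook, pass `flat_reg` instead.
sample = [
    {"reg_code": "TOP", "parents": []},
    {"reg_code": "ITALY", "parents": ["TOP"]},
    {"reg_code": "LAZIO", "parents": ["ITALY"]},
    {"reg_code": "LIGUR", "parents": ["ITALY"]},
]
print(descendants(sample, "TOP"))
```

The resulting codes can be used with `reg_df[reg_df.reg_code.isin(...)]` to display the matching rows.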


### Search Keywords

In [57]:
# By reg_code
# reg_df[reg_df.reg_code == 'SPAIN']
# By reg_name
# reg_df[reg_df.reg_name.str.contains('island', case=False)].head(5)
# By description
reg_df[reg_df.description.str.contains('overseas', case=False)].head(5)

| | reg_code | reg_name | description | region_type | parents |
|---|---|---|---|---|---|
| 37 | ANGUIL | Anguilla | Anguilla is an overseas territory of the Unite... | Country | [CARIBZ] |
| 116 | BERM | Bermuda | Bermuda is a self-governing overseas territory... | Country | [NAMZ] |
| 140 | BIOT | British Indian Ocean Territory | The British Indian Ocean Territory is an overs... | Country | [SASIAZ] |
| 141 | BVI | British Virgin Islands | The British Virgin Islands is a self-governing... | Country | [CARIBZ] |
| 178 | CAYI | Cayman Islands | The Cayman Islands is an overseas territory of... | Country | [CARIBZ] |
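The searches above look at one column at a time. A minimal sketch of a combined search over `reg_name` and `description` follows; `search_regions` is a hypothetical helper and the sample data is made up. It passes `na=False` so that rows with a missing description are treated as non-matching rather than raising an error when the mask is applied.

```python
import pandas as pd


def search_regions(df, keyword, columns=("reg_name", "description")):
    """Return rows where any of `columns` contains `keyword` (case-insensitive)."""
    mask = pd.Series(False, index=df.index)
    for col in columns:
        # regex=False matches the keyword literally; na=False skips missing values
        mask |= df[col].str.contains(keyword, case=False, regex=False, na=False)
    return df[mask]


# Made-up data with the same column names as reg_df.
sample_df = pd.DataFrame({
    "reg_code": ["A1", "B2", "C3"],
    "reg_name": ["North Island", "Mainland", "Coast"],
    "description": ["A made-up region.", None, "A made-up island coast."],
})
print(search_regions(sample_df, "island")["reg_code"].tolist())
```

In the notebook, the equivalent call would be `search_regions(reg_df, 'overseas').head(5)`.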
