This notebook demonstrates how to get an aggregated set of relevant forecast-ready features across several of your locations, using the Beam API's [Analysis Groups](https://docs.predicthq.com/api/beam/analysis-groups).

Make sure you have successfully uploaded demand data to Beam for all your locations before running this notebook. The output is a list of important features aggregated across all your locations.

# Steps

* [Setup](#setup)
* [Step 1. Prepare Groups](#step-1-prepare-groups)
* [Step 2. Beam: Create Groups](#step-2-beam-create-groups)
* [Step 3. Beam: Get Aggregated Feature Importance](#step-3-beam-get-aggregated-feature-importance)

# Setup

Complete the following steps before proceeding:

1. Install `requirements.txt`
2. Update `DATA_DIR` and `OUTPUT_DIR` as necessary
3. Replace `ACCESS_TOKEN` with a valid token (for help creating an access token, see [the API Quickstart](https://docs.predicthq.com/getting-started/api-quickstart))

In [1]:
# install requirements
# %pip install --user -r requirements.txt

In [2]:
import pandas as pd
import json
import os
import requests
from predicthq import Client

import beam_api_utils as bau

In [3]:
DATA_DIR = "data"
OUTPUT_DIR = "output"

ACCESS_TOKEN = "REPLACE_WITH_ACCESS_TOKEN"

In [4]:
phq = Client(access_token=ACCESS_TOKEN)

# Step 1. Prepare Groups

**New Groups**

Grouping Analyses is valuable when you need insights across multiple Analyses. See this [guide](https://www.predicthq.com/support/grouping-analyses-in-beam) for more information, including tips and pitfalls to watch for when creating Groups.

Prepare a group config file with the following information per `group`:

1. `name`: a user-created free-form string to reference the Group in Beam

2. `analysis_ids`: a list of Analysis IDs to be included in the Group

You may already have a `config` file with `location`-level information, or you may have `analysis_ids` defined elsewhere. See the example `group_config` below for how this should look.

In [5]:
# load example config file
with open(os.path.join(OUTPUT_DIR, "config_with_features.json"), "r") as f:
    config = json.load(f)

# example locations
locations = list(config.keys())
analysis_ids = [info["analysis_id"] for info in config.values()]

# example group config
group_config = {
    "group_A": {
        "name": "group_A_analysis",
        "locations": locations[:2],
        "analysis_ids": analysis_ids[:2],
    },
    "group_B": {
        "name": "group_B_analysis",
        "locations": locations[1:],
        "analysis_ids": analysis_ids[1:],
    },
}

group_config

{'group_A': {'name': 'group_A_analysis',
  'locations': ['store_0', 'store_1'],
  'analysis_ids': ['ho8jzrnJLwU', 'KjQmR4C5wGo']},
 'group_B': {'name': 'group_B_analysis',
  'locations': ['store_1', 'store_2'],
  'analysis_ids': ['KjQmR4C5wGo', 'kpm5DOSBU_4']}}
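
Before creating Groups, it can help to sanity-check that each entry has the two fields described above. The following is a minimal sketch: `validate_group_config` is a hypothetical helper (it is not part of `beam_api_utils`), and the checks reflect assumptions about what a sensible config looks like rather than documented Beam API constraints.

```python
def validate_group_config(group_config):
    """Return a list of human-readable problems found in a group config."""
    problems = []
    for group, info in group_config.items():
        # `name` should be a non-empty free-form string
        name = info.get("name")
        if not isinstance(name, str) or not name.strip():
            problems.append(f"{group}: 'name' must be a non-empty string")

        # `analysis_ids` should be a non-empty list without repeated IDs
        analysis_ids = info.get("analysis_ids")
        if not isinstance(analysis_ids, list) or not analysis_ids:
            problems.append(f"{group}: 'analysis_ids' must be a non-empty list")
        elif len(set(analysis_ids)) != len(analysis_ids):
            problems.append(f"{group}: 'analysis_ids' contains duplicates")
    return problems


# Illustrative config: group_B is deliberately incomplete.
example_config = {
    "group_A": {
        "name": "group_A_analysis",
        "analysis_ids": ["ho8jzrnJLwU", "KjQmR4C5wGo"],
    },
    "group_B": {"name": "", "analysis_ids": []},
}

for problem in validate_group_config(example_config):
    print(problem)
```

An empty list means no problems were found, so the check can be used as a guard before calling the Beam API.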

**Existing Groups**

If you have existing groups in Beam, prepare a group config file with the following information:

1. `group_id`: the Group ID

Then skip to [Step 3. Beam: Get Aggregated Feature Importance](#step-3-beam-get-aggregated-feature-importance).

In [6]:
# # example group config
# group_config = {
#     "group_A": {
#         "group_id": "abc123",
#     },
#     "group_B": {
#         "group_id": "def456",
#     },
# }

# # get Group name and analysis_ids
# for group, info in group_config.items():
#     group_info = bau.get_group(group_id=info["group_id"])
#     group_config[group].update(group_info)

# group_config

# Step 2. Beam: Create Groups

**New Groups**

This step involves using the Beam API's [Analysis Groups](https://docs.predicthq.com/api/beam/analysis-groups) to:

1. Create a `group_id` for each Group
2. Check `group_status` of the Group and make sure it is `ready` before proceeding

For more info on the Beam API and other functionality such as updating and deleting Analysis Groups, see the [PredictHQ Docs](https://docs.predicthq.com/api/beam/analysis-groups).

In [7]:
# create Groups
for group, info in group_config.items():
    print(f"Creating Group for {group}...")

    try:
        response = bau.create_group(
            name=info["name"],
            analysis_ids=info["analysis_ids"],
            access_token=ACCESS_TOKEN,
        )
        info["group_id"] = response["group_id"]

        print("--- group created successfully")

    except Exception as e:
        print(f"--- an error occurred: {e}")
        continue

Creating Group for group_A...
--- group created successfully
Creating Group for group_B...
--- group created successfully


Groups need to be `ready` and Feature Importance processing needs to be completed before proceeding to the next steps. Refresh as needed to get the latest status.

In [8]:
# check Group status
for group, info in group_config.items():
    print(f"Group status for {group}...")
    status = bau.group_status(group_id=info["group_id"], access_token=ACCESS_TOKEN)
    print(f"--- {status}")
    group_config[group]["group_status"] = status

Group status for group_A...
--- {'readiness_status': 'ready', 'feature_importance_processing_completed': True}
Group status for group_B...
--- {'readiness_status': 'ready', 'feature_importance_processing_completed': True}
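
Instead of re-running the status cell by hand, the check can be wrapped in a small polling loop. This is a sketch only: `is_group_ready` and `wait_until_ready` are hypothetical helpers, the timeout and interval are arbitrary defaults, and the only status fields assumed are the two shown in the output above (`readiness_status` and `feature_importance_processing_completed`).

```python
import time


def is_group_ready(status):
    """True when the Group is ready and feature importance processing is done."""
    return (
        status.get("readiness_status") == "ready"
        and status.get("feature_importance_processing_completed") is True
    )


def wait_until_ready(fetch_status, timeout_s=600, interval_s=15, sleep=time.sleep):
    """Poll fetch_status() until the Group is ready or the timeout elapses.

    fetch_status is any zero-argument callable returning a status dict.
    Returns the ready status; raises TimeoutError if it is never reached.
    """
    waited = 0
    while True:
        status = fetch_status()
        if is_group_ready(status):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"Group not ready after {waited}s: {status}")
        sleep(interval_s)
        waited += interval_s
```

In this notebook, `fetch_status` could be `lambda: bau.group_status(group_id=info["group_id"], access_token=ACCESS_TOKEN)`. Passing the fetcher as a callable keeps the loop independent of how the status is retrieved.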


# Step 3. Beam: Get Aggregated Feature Importance

Feature Importance results can be retrieved for all Groups via their `group_id`.

**All Groups**


In [9]:
# get Group-level feature importance
for group, info in group_config.items():
    print(f"Getting aggregated feature importance for {group}...")

    try:
        group_feature_importance = bau.get_group_feature_importance(
            group_id=info["group_id"], access_token=ACCESS_TOKEN
        )
        important_features = [
            item
            for feature in group_feature_importance["feature_importance"]
            if feature["important"]
            for item in feature["features"]
        ]
        info["important_features"] = important_features
        info["feature_importance"] = group_feature_importance["feature_importance"]

        print("--- feature importance retrieved")

    except Exception as e:
        print(f"--- an error occurred: {e}")

# save group config file
with open(os.path.join(OUTPUT_DIR, "group_config_with_features.json"), "w") as f:
    json.dump(group_config, f, indent=4)

Getting aggregated feature importance for group_A...
--- feature importance retrieved
Getting aggregated feature importance for group_B...
--- feature importance retrieved
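
The nested comprehension above can be easier to follow on a small input. The sketch below reproduces the same filtering as a hypothetical standalone function, `extract_important_features`, and additionally drops duplicate names. Only the `important` and `features` keys are taken from the notebook code; the sample entries are made up for illustration and are not real Beam output.

```python
def extract_important_features(feature_importance):
    """Flatten feature names from entries flagged as important.

    Order is preserved and duplicate names are dropped.
    """
    seen = set()
    result = []
    for entry in feature_importance:
        # skip entries that are not flagged as important
        if not entry.get("important"):
            continue
        for name in entry.get("features", []):
            if name not in seen:
                seen.add(name)
                result.append(name)
    return result


# Hypothetical payload: values are illustrative only.
sample = [
    {"important": True, "features": ["phq_attendance_concerts"]},
    {"important": False, "features": ["phq_attendance_sports"]},
    {
        "important": True,
        "features": ["phq_rank_public_holidays", "phq_attendance_concerts"],
    },
]

print(extract_important_features(sample))
# ['phq_attendance_concerts', 'phq_rank_public_holidays']
```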


In [10]:
group_config

{'group_A': {'name': 'group_A_analysis',
  'locations': ['store_0', 'store_1'],
  'analysis_ids': ['ho8jzrnJLwU', 'KjQmR4C5wGo'],
  'group_id': 'BEnSsmK0CGg',
  'group_status': {'readiness_status': 'ready',
   'feature_importance_processing_completed': True},
  'important_features': ['phq_attendance_concerts',
   'phq_attendance_conferences',
   'phq_attendance_expos',
   'phq_attendance_festivals',
   'phq_rank_observances',
   'phq_attendance_performing_arts',
   'phq_rank_public_holidays',
   'phq_attendance_school_holidays',
   'phq_impact_severe_weather_air_quality_retail',
   'phq_impact_severe_weather_blizzard_retail',
   'phq_impact_severe_weather_cold_wave_retail',
   'phq_impact_severe_weather_cold_wave_snow_retail',
   'phq_impact_severe_weather_cold_wave_storm_retail',
   'phq_impact_severe_weather_dust_retail',
   'phq_impact_severe_weather_dust_storm_retail',
   'phq_impact_severe_weather_flood_retail',
   'phq_impact_severe_weather_heat_wave_retail',
   'phq_impact_sever