# Hands-on: Generate Summary of Planning Analytics Data 

## Overview

This Jupyter Notebook provides an example of how to:

1. Connect to a Planning Analytics server and retrieve a dataset from a Cube and View as a dataframe using TM1py.

2. Construct a prompt and pass the tabular dataset to a Large Language Model (LLM) to generate a summary of the data.

In [1]:
# Install library
%pip install TM1py

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import libraries
import json
import os
import requests

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# WML python SDK
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes, DecodingMethods

from TM1py.Services import TM1Service
from TM1py.Exceptions import TM1pyException

## 1. Setup Connection to Planning Analytics

To connect to the Planning Analytics server, you need the following information:
- address
- port
- user
- password
- namespace

Your connection is successful when you see the *Server Name* and *Product Version* information.
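
Rather than typing credentials into the notebook, the connection parameters listed above could be read from environment variables. The sketch below is one way to do that. The `PA_*` variable names and the `load_pa_connection` helper are hypothetical and not part of TM1py.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_KEYS = ("address", "port", "user", "password", "namespace")


def load_pa_connection(env=None):
    """Collect TM1Service keyword arguments from PA_* environment variables."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(f"PA_{key.upper()}")]
    if missing:
        raise KeyError("Missing connection settings: " + ", ".join(missing))

    params = {key: env[f"PA_{key.upper()}"] for key in REQUIRED_KEYS}
    params["port"] = int(params["port"])  # the port is passed as a number
    # SSL stays off unless PA_SSL is set to "true", matching the cell below.
    params["ssl"] = env.get("PA_SSL", "false").lower() == "true"
    return params
```

With such a helper, the connection could be opened as `with TM1Service(**load_pa_connection()) as tm1:`, which keeps passwords out of the saved notebook.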

In [3]:
# Set up connection to Planning Analytics server
try:
    with TM1Service(
        address='<YOUR PA SERVER ADDRESS HERE>',
        port=<YOUR PA PORT NUMBER HERE>,
        ssl=False,
        user="pm",
        password="IBMDem0s",
        namespace='Harmony LDAP'
    ) as tm1:
        print("Server Name:", tm1.server.get_server_name())
        print("Product Version:", tm1.server.get_product_version())

# Error handling
except TM1pyException as e:
    if e.status_code == 401:
        print('Wrong credentials')
    elif e.status_code == 404:
        print('Wrong connection')
    else:
        print('Something else went wrong. Check error code:', str(e))

Server Name: 24Retail
Product Version: 11.8.02300.10


## 2. Retrieve dataset from Cube and View

Specify the names of an existing Cube and View, retrieve the view as a dataframe, then convert the dataframe to Markdown.

In [4]:
# Define Cube and View name to be retrieved
cube_name = 'Revenue'
view_name = 'Input'

In [5]:
# Retrieve data from the Cube and View as a dataframe
# Read more here: https://code.cubewise.com/blog/getting-data-from-tm1-with-python/
df = tm1.cubes.cells.execute_view_dataframe(cube_name=cube_name, 
                                            view_name=view_name, 
                                            private=False)
# Display the dataframe
df

|    | Channel       | Month   |      Value |
|---:|:--------------|:--------|-----------:|
|  0 | Channel Total | Jan     | 388.216048 |
|  1 | Channel Total | Feb     | 408.470769 |
|  2 | Channel Total | Mar     | 397.142034 |
|  3 | Channel Total | Apr     | 393.118312 |
|  4 | Channel Total | May     | 396.867192 |
|  5 | Channel Total | Jun     | 400.012776 |
|  6 | Channel Total | Jul     | 386.708433 |
|  7 | Channel Total | Aug     | 389.41607  |
|  8 | Channel Total | Sep     | 352.259358 |
|  9 | Channel Total | Oct     | 350.727883 |


In [6]:
# Convert dataframe to markdown
df_md = df.to_markdown(index=False)
print(df_md)

| Channel       | Month   |   Value |
|:--------------|:--------|--------:|
| Channel Total | Jan     | 388.216 |
| Channel Total | Feb     | 408.471 |
| Channel Total | Mar     | 397.142 |
| Channel Total | Apr     | 393.118 |
| Channel Total | May     | 396.867 |
| Channel Total | Jun     | 400.013 |
| Channel Total | Jul     | 386.708 |
| Channel Total | Aug     | 389.416 |
| Channel Total | Sep     | 352.259 |
| Channel Total | Oct     | 350.728 |
| Channel Total | Nov     | 348.037 |
| Channel Total | Dec     | 352.958 |
| Channel Total | Year    | 376.546 |
| 10            | Jan     | 382.746 |
| 10            | Feb     | 394.207 |
| 10            | Mar     | 372.614 |
| 10            | Apr     | 365.049 |
| 10            | May     | 370.822 |
| 10            | Jun     | 376.801 |
| 10            | Jul     | 390.453 |
| 10            | Aug     | 396.828 |
| 10            | Sep     | 330.063 |
| 10            | Oct     | 328.06  |
| 10            | Nov     | 324.078 |
| 10        
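
`DataFrame.to_markdown` depends on the optional `tabulate` package. Where that package is unavailable, a pipe table can be assembled by hand. The following is a minimal sketch under that assumption. `records_to_markdown` is a hypothetical helper that takes plain dictionaries (for example from `df.to_dict('records')`), and it does not reproduce the column alignment shown above.

```python
def records_to_markdown(records, float_digits=3):
    """Render a list of dicts as a simple pipe-delimited Markdown table."""
    if not records:
        return ""
    columns = list(records[0].keys())

    def fmt(value):
        # Round floats so the table stays compact in the prompt.
        if isinstance(value, float):
            return f"{value:.{float_digits}f}"
        return str(value)

    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in records:
        lines.append("| " + " | ".join(fmt(row[col]) for col in columns) + " |")
    return "\n".join(lines)


# Two rows taken from the dataframe output above.
rows = [
    {"Channel": "Channel Total", "Month": "Jan", "Value": 388.216048},
    {"Channel": "Channel Total", "Month": "Feb", "Value": 408.470769},
]
print(records_to_markdown(rows))
```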

## 3. Creating Prompt

Construct your prompt which will be passed to Large Language Model (LLM).

In [7]:
prompt = f"Extract the key findings from the table regarding units sold and describe their development.\n\n{df_md}"
print(prompt)

Extract the key findings from the table regarding units sold and describe their development.

| Channel       | Month   |   Value |
|:--------------|:--------|--------:|
| Channel Total | Jan     | 388.216 |
| Channel Total | Feb     | 408.471 |
| Channel Total | Mar     | 397.142 |
| Channel Total | Apr     | 393.118 |
| Channel Total | May     | 396.867 |
| Channel Total | Jun     | 400.013 |
| Channel Total | Jul     | 386.708 |
| Channel Total | Aug     | 389.416 |
| Channel Total | Sep     | 352.259 |
| Channel Total | Oct     | 350.728 |
| Channel Total | Nov     | 348.037 |
| Channel Total | Dec     | 352.958 |
| Channel Total | Year    | 376.546 |
| 10            | Jan     | 382.746 |
| 10            | Feb     | 394.207 |
| 10            | Mar     | 372.614 |
| 10            | Apr     | 365.049 |
| 10            | May     | 370.822 |
| 10            | Jun     | 376.801 |
| 10            | Jul     | 390.453 |
| 10            | Aug     | 396.828 |
| 10            | Sep     | 330.
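
When the same analysis is run against several views, it can help to keep the instruction and the table separate. The sketch below is one possible arrangement. `build_prompt` and its `max_chars` guard are hypothetical and not part of the original notebook, and the guard is a rough character limit, not a token count.

```python
def build_prompt(instruction, table_md, max_chars=20000):
    """Join an instruction and a Markdown table into a single prompt string."""
    instruction = instruction.strip()
    table_md = table_md.strip()
    if not instruction:
        raise ValueError("instruction must not be empty")
    prompt = f"{instruction}\n\n{table_md}"
    # Fail early instead of sending a table the model is unlikely to fit.
    if len(prompt) > max_chars:
        raise ValueError(
            f"prompt has {len(prompt)} characters, limit is {max_chars}; "
            "consider a smaller view"
        )
    return prompt


example_table = (
    "| Channel | Month | Value |\n"
    "|:--|:--|--:|\n"
    "| Channel Total | Jan | 388.216 |"
)
prompt_text = build_prompt(
    "Extract the key findings from the table and describe their development.",
    example_table,
)
```

In the notebook, `df_md` would take the place of `example_table`.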

## 4. Configuring watsonx.ai

The following section defines the input to the Large Language Model (LLM).

Provide the credentials for watsonx.ai as indicated below:

1. `watsonx_project_id` - The watsonx.ai **Project ID**, found in your watsonx.ai project -> Manage -> Project ID
2. `api_key` - The **API Key**, found in IBM Cloud -> Manage -> API Key

In [8]:
# URL of the hosted LLMs is hardcoded because at this time all LLMs share the same endpoint
url = "https://us-south.ml.cloud.ibm.com"

# Replace with your watsonx project id (look up in the project Manage tab)
watsonx_project_id = "<YOUR WATSONX.AI PROJECT ID HERE>"

# Replace with your IBM Cloud key
api_key = "<YOUR IBM CLOUD API KEY HERE>"

In [9]:
# Cache the model so that repeated calls reuse the same instance
model_init = None

# Initialize the watsonx model
def get_model(model_type, max_tokens, min_tokens, decoding, temperature):

    generate_params = {
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.DECODING_METHOD: decoding,
        GenParams.TEMPERATURE: temperature,
    }
    global model_init
    # Note: the cached model keeps the parameters from the first call;
    # reset model_init to None before calling again with new values.
    if model_init is None:
        model_init = Model(
            model_id=model_type,
            params=generate_params,
            credentials={
                "apikey": api_key,
                "url": url
            },
            project_id=watsonx_project_id
        )

    return model_init

The following block sets the parameters for the LLM. In a PoX, you may want to vary these values to show a client how they can get the best results.

1. **model_type** specifies the LLM being used. In the example below it is the llama-2-70b-chat model. You can change it to other models. Note that the size of the model has implications for resource usage. You may wish to try some of the other models in a PoX and see whether they provide different results. The block below lists five models, four of them commented out, so llama-2-70b-chat is used; uncomment a different one to try it. Refer [here](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-model-ids.html?context=wx&audience=wdp) for the exhaustive list of supported models.

2. **max_tokens** specifies the maximum number of output tokens. Keep in mind that 1 token does not equal 1 word. As a rough rule of thumb for English text, estimate about 4 tokens for every 3 words; numbers and tables typically need more.

3. **min_tokens** specifies the minimum number of output tokens.

4. **decoding** specifies the decoding method. The example below uses **greedy** decoding. You can also choose **sampling** decoding, in which case you can specify more parameters (such as **Top-P** and **Top-K**). More information on these additional parameters can be found in the watsonx.ai Technical Sales Level 3 class (https://learn.ibm.com/course/view.php?id=13452).

5. **temperature** specifies how conservative or creative the model will be. The lower it is, the more conservative the output. The range is from 0 to 2. Temperature applies to sampling decoding; with greedy decoding it has no effect.
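
To illustrate how greedy and sampling decoding differ in the parameters they use, here is a small sketch that builds the parameter dictionary with plain string keys. The key names are assumed to match the watsonx.ai text generation parameter names that `GenParams` maps to. The `build_generate_params` helper and its default values are hypothetical.

```python
def build_generate_params(decoding, max_tokens, min_tokens,
                          temperature=0.7, top_p=1.0, top_k=50):
    """Build a generation parameter dict for 'greedy' or 'sample' decoding."""
    if decoding not in ("greedy", "sample"):
        raise ValueError("decoding must be 'greedy' or 'sample'")
    if min_tokens > max_tokens:
        raise ValueError("min_tokens must not exceed max_tokens")

    params = {
        "decoding_method": decoding,
        "max_new_tokens": max_tokens,
        "min_new_tokens": min_tokens,
    }
    if decoding == "sample":
        # Sampling-only settings; the check follows the 0 to 2 range above.
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        params.update(
            {"temperature": temperature, "top_p": top_p, "top_k": top_k}
        )
    return params


greedy_params = build_generate_params("greedy", 1000, 50)
sample_params = build_generate_params("sample", 1000, 50, temperature=0.7)
```

Comparing `greedy_params` and `sample_params` side by side in a PoX makes it easy to see which settings actually change between runs.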

In [11]:
# Set up watsonx model and parameters
model_type = "meta-llama/llama-2-70b-chat"
# model_type = "google/flan-t5-xxl"
# model_type = "ibm/granite-13b-chat-v1"
# model_type = "ibm/granite-13b-instruct-v1"
# model_type = "ibm/mpt-7b-instruct2"
max_tokens = 1000
min_tokens = 50
decoding = DecodingMethods.GREEDY
temperature = 0.7

# Get the watsonx model
model = get_model(model_type, max_tokens, min_tokens, decoding, temperature)

## 5. Summary Generation

This block generates a summary based on the input prompt and the specified parameters.

In [12]:
# Send a prompt to model
generated_response = model.generate(prompt)
response_text = generated_response['results'][0]['generated_text']

# Print model response
print("--------------------------------- Generated response -----------------------------------")
print(response_text)

--------------------------------- Generated response -----------------------------------


The table shows the sales data for three different channels (10, 20, and 30) over a period of 12 months. The data includes the total value sold for each channel in each month, as well as the yearly total.

The key findings from the table regarding units sold are:

1. Channel 10: The total value sold for Channel 10 ranges from 358.097 in January to 396.828 in August, with a yearly total of 376.546.
2. Channel 20: The total value sold for Channel 20 ranges from 405.93 in January to 368.681 in August, with a yearly total of 388.22.
3. Channel 30: The total value sold for Channel 30 ranges from 353.966 in January to 462.833 in March, with a yearly total of 438.881.

In terms of development, it appears that Channel 10 has had a steady increase in sales over the year, with a peak in August. Channel 20 has had a similar trend, with a peak in June. Channel 30 has had a more inconsistent trend, with a pea
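
Besides `generated_text`, each entry in `results` is assumed here to carry a `stop_reason` and a `generated_token_count` field. Checking them is one way to tell whether a response was cut off by the `max_tokens` limit. The helpers below are a hypothetical sketch, and the sample dictionary is hand-written for illustration, not actual model output.

```python
def extract_generation(response):
    """Pull the generated text and stop metadata out of a generate() response."""
    results = response.get("results") or []
    if not results:
        raise ValueError("response contains no results")
    first = results[0]
    return {
        "text": first.get("generated_text", "").strip(),
        # Assumed field names; they may be absent in other SDK versions.
        "stop_reason": first.get("stop_reason"),
        "token_count": first.get("generated_token_count"),
    }


def was_truncated(generation):
    """Return True when the model stopped because it hit the token limit."""
    return generation["stop_reason"] == "max_tokens"


# Hand-written sample shaped like the response used in the cell above.
sample_response = {
    "results": [
        {
            "generated_text": "\n\nThe table shows the sales data",
            "generated_token_count": 1000,
            "stop_reason": "max_tokens",
        }
    ]
}
generation = extract_generation(sample_response)
if was_truncated(generation):
    print("Response hit the token limit; consider raising max_tokens.")
```

Because a generated summary can misread the table, it is worth comparing any figures it quotes with the dataframe before sharing it.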