# Hands-on: Question Answering by Querying Planning Analytics Data

## Overview

This Jupyter Notebook provides an example of how to:

1. Develop a question-answering application that answers a user's question from Planning Analytics (PA) data.

2. Construct a one-shot prompt and pass it to a Large Language Model (LLM) to generate an MDX statement from the user's question.

3. Construct a second prompt that passes the retrieved tabular dataset to the LLM, which generates an answer based on the PA data.
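
The three steps above chain together as question -> MDX -> table -> answer. The sketch below shows only that control flow. The function name `answer_question` and the three callables are hypothetical stand-ins for the cells that follow, not part of TM1py or the watsonx SDK.

```python
from typing import Any, Callable


def answer_question(
    question: str,
    generate_mdx: Callable[[str], str],
    run_query: Callable[[str], Any],
    generate_answer: Callable[[str, Any], str],
) -> str:
    """Chain the three stages: question -> MDX -> table -> answer."""
    mdx = generate_mdx(question).strip()      # stage 1: the LLM writes the MDX statement
    table = run_query(mdx)                    # stage 2: Planning Analytics returns the data
    return generate_answer(question, table)   # stage 3: the LLM answers from the table


# Stand-in callables so the sketch runs without a PA server or an LLM.
demo_answer = answer_question(
    "How many units did we sell?",
    generate_mdx=lambda q: " SELECT {[Revenue].[Revenue].[Units Sold]} ON 0 FROM [Revenue] ",
    run_query=lambda mdx: [("Units Sold", 42)],
    generate_answer=lambda q, table: f"{table[0][1]} units",
)
print(demo_answer)
```

In the notebook itself, each callable corresponds to one or more cells: the MDX model call, the TM1py query, and the answer model call.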

In [None]:
# Install library
%pip install TM1py

In [None]:
# Import libraries
import json
import os
import random
import requests
from TM1py.Services import TM1Service
from TM1py.Exceptions import TM1pyException
from TM1py.Utils.Utils import build_pandas_dataframe_from_cellset
from TM1py.Utils.Utils import build_cellset_from_pandas_dataframe

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# WML python SDK
from ibm_watson_machine_learning.foundation_models import Model
from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams
from ibm_watson_machine_learning.foundation_models.utils.enums import ModelTypes, DecodingMethods

## 1. Setup Connection to Planning Analytics

To connect to the Planning Analytics server, you need the following information:
- address
- port
- user
- password
- namespace

Your connection is successful when you see the *Server Name* and *Product Version* information.

In [None]:
# Set up connection to Planning Analytics server.
# The session is kept open (no `with` block) because `tm1` is reused in step 7;
# call tm1.logout() when you are done with the notebook.
try:
    tm1 = TM1Service(
        address="<YOUR PA SERVER ADDRESS HERE>",
        port=<YOUR PA PORT NUMBER HERE>,
        ssl=False,
        user="pm",
        password="IBMDem0s",
        namespace="Harmony LDAP"
    )
    print("Server Name:", tm1.server.get_server_name())
    print("Product Version:", tm1.server.get_product_version())

# Error handling
except TM1pyException as e:
    # Not every TM1py exception carries an HTTP status code
    status_code = getattr(e, "status_code", None)
    if status_code == 401:
        print('Wrong credentials')
    elif status_code == 404:
        print('Wrong connection')
    else:
        print('Something else went wrong. Check error code:', str(e))
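
The connection cell above keeps the user name and password in the notebook. As a minimal sketch of an alternative, the five connection values can be read from environment variables. The `PA_*` variable names and the helper `pa_connection_kwargs` are assumptions made for illustration; only the keyword names (`address`, `port`, `ssl`, `user`, `password`, `namespace`) come from the cell above.

```python
import os
from typing import Mapping


def pa_connection_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build the keyword arguments for TM1Service from environment variables.

    The PA_* names are hypothetical; use whatever names suit your environment.
    """
    required = ["PA_ADDRESS", "PA_PORT", "PA_USER", "PA_PASSWORD", "PA_NAMESPACE"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "address": env["PA_ADDRESS"],
        "port": int(env["PA_PORT"]),                        # the port must be an integer
        "ssl": env.get("PA_SSL", "false").lower() == "true",
        "user": env["PA_USER"],
        "password": env["PA_PASSWORD"],
        "namespace": env["PA_NAMESPACE"],
    }


# Demonstration with a plain dict instead of the real environment.
demo_env = {
    "PA_ADDRESS": "pa.example.com",
    "PA_PORT": "8010",
    "PA_USER": "demo_user",
    "PA_PASSWORD": "demo_password",
    "PA_NAMESPACE": "Demo LDAP",
}
demo_kwargs = pa_connection_kwargs(demo_env)
print(demo_kwargs["address"], demo_kwargs["port"], demo_kwargs["ssl"])
```

With the variables set, the connection could then be opened as `TM1Service(**pa_connection_kwargs())`.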

## 2. User's Question

Enter the user's question in natural language.

In [None]:
# User's question
question = "How many units of product 21002 did we sell in year 1 in organization 102 through channel 10?"

## 3. Configuring watsonx.ai

The following section defines the input to the Large Language Model (LLM).
Provide the credentials for watsonx.ai as indicated below:

1. `watsonx_project_id`: the watsonx.ai **Project ID**, found in your watsonx.ai project under Manage -> Project ID
2. `api_key`: the **API key**, created in IBM Cloud under Manage -> Access (IAM) -> API keys

In [None]:
project_id = os.environ["PROJECT_ID"]

In [None]:
# URL of the hosted LLMs is hardcoded because at this time all LLMs share the same endpoint
url = "https://us-south.ml.cloud.ibm.com"

# Replace with your watsonx project id (look up in the project Manage tab)
#watsonx_project_id = "<YOUR WATSONX.AI PROJECT ID HERE>"
watsonx_project_id = project_id

# Replace with your IBM Cloud key
api_key = "<YOUR IBM CLOUD API KEY HERE>"

In [None]:
# Function for model to generate MDX statements.
# A new Model object is built on every call: caching one global instance would
# make every later call return the first model, whatever arguments it receives.
def get_model_mdx(model_type, max_tokens, min_tokens, decoding, stop_sequences):
    generate_params = {
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.DECODING_METHOD: decoding,
        GenParams.STOP_SEQUENCES: stop_sequences
    }
    return Model(
        model_id=model_type,
        params=generate_params,
        credentials={
            "apikey": api_key,
            "url": url
        },
        project_id=watsonx_project_id
    )

In [None]:
# Function for model to generate the answer
def get_model(model_type, max_tokens, min_tokens, decoding, repetition_penalty):
    generate_params = {
        GenParams.MAX_NEW_TOKENS: max_tokens,
        GenParams.MIN_NEW_TOKENS: min_tokens,
        GenParams.DECODING_METHOD: decoding,
        GenParams.REPETITION_PENALTY: repetition_penalty
    }
    return Model(
        model_id=model_type,
        params=generate_params,
        credentials={
            "apikey": api_key,
            "url": url
        },
        project_id=watsonx_project_id
    )
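
This notebook uses two differently configured models, one for MDX and one for the answer, so any reuse of model objects has to distinguish between them. The sketch below, under stated assumptions, caches one object per distinct combination of model ID and parameters. `get_cached_model` and `fake_factory` are hypothetical names; in the notebook the factory would construct a watsonx `Model`.

```python
import json
from typing import Any, Callable, Dict

_model_cache: Dict[str, Any] = {}


def get_cached_model(model_id: str, params: dict, factory: Callable[[str, dict], Any]) -> Any:
    """Return one model object per distinct (model_id, params) combination."""
    # default=str makes non-JSON values (for example enum members) usable in the key.
    key = json.dumps({"model_id": model_id, "params": params}, sort_keys=True, default=str)
    if key not in _model_cache:
        _model_cache[key] = factory(model_id, params)
    return _model_cache[key]


# Stand-in factory so the sketch runs without watsonx credentials.
def fake_factory(model_id: str, params: dict) -> dict:
    return {"model_id": model_id, "params": params}


mdx_model = get_cached_model("coding-model", {"max_new_tokens": 200}, fake_factory)
answer_model = get_cached_model("instruct-model", {"max_new_tokens": 1000}, fake_factory)
print(mdx_model is answer_model)  # the two configurations get separate objects
```

Keying the cache on the full configuration means a request for the answer model can never return the object that was built for MDX generation.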

## 4. Creating Prompt to generate MDX statement to query data in PA

- Construct a one-shot prompt, which contains a single worked example, and pass it to the LLM.
- From the user's question in natural language, the LLM generates an MDX statement.
- The MDX statement is then used to query data in Planning Analytics.

In [None]:
prompt_mdx = """Create an MDX Statement for a Planning Analytics View to display the desired data

Input:
How many units of product 21001 did we sell in year 2 in organization 101 through channel 10?

Output:
SELECT {[Revenue].[Revenue].[Units Sold]} ON 0, {[product].[product].[21001]}*{DRILLDOWNMEMBER({[Year].[Year].[Y2]} , {[Year].[Year].[Y2]})}*{TM1SubsetToSet([Month].[Month],"MY","public")} ON 1 FROM [Revenue] WHERE ([organization].[organization].[101], [Channel].[Channel].[10], [Version].[Version].[Actual])

Input:
""" + question + """

Output:
"""
print(prompt_mdx)
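
The prompt above hard-codes its single example. As a sketch, the same layout can be produced by a small helper that accepts any number of (question, MDX) pairs, which makes it easy to experiment with more than one example. `build_mdx_prompt` is a hypothetical helper, and the example MDX below is shortened for illustration.

```python
from typing import Iterable, Tuple

INSTRUCTION = "Create an MDX Statement for a Planning Analytics View to display the desired data"


def build_mdx_prompt(question: str, examples: Iterable[Tuple[str, str]]) -> str:
    """Assemble the instruction, the worked examples, and the new question."""
    parts = [INSTRUCTION, ""]
    for example_question, example_mdx in examples:
        parts += ["Input:", example_question, "", "Output:", example_mdx, ""]
    # The prompt ends with an empty "Output:" slot for the model to complete.
    parts += ["Input:", question, "", "Output:", ""]
    return "\n".join(parts)


demo_examples = [
    (
        "How many units of product 21001 did we sell in year 2?",
        "SELECT {[Revenue].[Revenue].[Units Sold]} ON 0 FROM [Revenue]",
    )
]
demo_prompt = build_mdx_prompt("How many units of product 21002 did we sell in year 1?", demo_examples)
print(demo_prompt)
```

Each example ends with a blank line, so a stop sequence of two newlines still ends generation after the first generated statement.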

## 5. Model Parameter for LLM to generate MDX statement to query data in PA

The following block specifies the settings for the LLM. In a PoX, you may want to vary these values to show a client how to get the best results.

1. **model_type** specifies the LLM being used. The example below uses **codellama/codellama-34b-instruct-hf**, a model suited to code generation. You can change it to another coding model. Note that the size of the model affects resource usage. You may wish to try other models in a PoX to see whether they give different results. See [the list of supported models](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-model-ids.html?context=wx&audience=wdp) for all available model IDs.

2. **max_tokens** specifies the maximum number of output tokens. Keep in mind that 1 token does not equal 1 word. A common rule of thumb for English prose is about four tokens for every three words, but the ratio depends on the model's tokenizer and is usually higher for code such as MDX.

3. **min_tokens** specifies the minimum number of output tokens.

4. **decoding** specifies the decoding method. You can also choose **sampling** decoding, in which case you can specify more parameters (such as **Top-P** and **Top-K**). More information on these additional parameters can be found in the watsonx.ai Technical Sales Level 3 class (https://learn.ibm.com/course/view.php?id=13452).

5. **stop_sequences** specifies sequences of tokens that, when encountered, will cause the model to stop generating further tokens. This is useful for controlling the end of the output and preventing the generation of unwanted or irrelevant content.
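
To make the relationship between these parameters concrete, here is a minimal sketch that builds a parameter dictionary for either greedy or sampling decoding. It uses plain string keys, which to my understanding match the names the `GenParams` constants resolve to; treat that as an assumption. The helper name `build_generation_params` and the default values are illustrative only.

```python
from typing import List, Optional


def build_generation_params(
    decoding: str = "greedy",
    max_new_tokens: int = 200,
    min_new_tokens: int = 10,
    stop_sequences: Optional[List[str]] = None,
    temperature: float = 0.7,
    top_p: float = 1.0,
    top_k: int = 50,
) -> dict:
    """Return generation parameters; sampling options are added only when used."""
    if decoding not in ("greedy", "sample"):
        raise ValueError("decoding must be 'greedy' or 'sample'")
    if min_new_tokens > max_new_tokens:
        raise ValueError("min_new_tokens cannot exceed max_new_tokens")
    params = {
        "decoding_method": decoding,
        "max_new_tokens": max_new_tokens,
        "min_new_tokens": min_new_tokens,
    }
    if stop_sequences:
        params["stop_sequences"] = list(stop_sequences)
    if decoding == "sample":
        # Temperature, Top-P and Top-K only influence sampling decoding.
        params.update({"temperature": temperature, "top_p": top_p, "top_k": top_k})
    return params


greedy_params = build_generation_params(stop_sequences=["\n\n"])
sampling_params = build_generation_params(decoding="sample", temperature=0.5)
print(greedy_params)
print(sampling_params)
```

Greedy decoding always picks the most likely next token, which is usually what you want for a query language where the output must be exact.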

In [None]:
# Set up watsonx model and parameters
model_type = "codellama/codellama-34b-instruct-hf" # Coding model
max_tokens = 200
min_tokens = 10
decoding = DecodingMethods.GREEDY
stop_sequences = ["\n\n"]

# Get the watsonx model
model_mdx = get_model_mdx(model_type, max_tokens, min_tokens, decoding, stop_sequences)

## 6. LLM Inferencing to generate MDX statement to query data in PA

In [None]:
# Send a prompt to model
generated_response = model_mdx.generate(prompt_mdx)
response_mdx = generated_response['results'][0]['generated_text']

# Print model response
print("--------------------------------- Generated response -----------------------------------")
print(response_mdx)
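
The generated text is passed to Planning Analytics as a query, so a light sanity check before executing it can catch obviously malformed output early. The sketch below is a heuristic only, not an MDX parser: it trims whitespace, checks for `SELECT` and `FROM [`, and compares bracket counts. `clean_generated_mdx` is a hypothetical helper. The bracket check would wrongly reject member names that contain escaped brackets.

```python
import re


def clean_generated_mdx(text: str) -> str:
    """Trim an LLM-generated MDX string and apply a few heuristic checks."""
    mdx = text.strip()
    if not re.match(r"select\b", mdx, flags=re.IGNORECASE):
        raise ValueError("Generated text does not start with SELECT")
    if not re.search(r"\bfrom\s*\[", mdx, flags=re.IGNORECASE):
        raise ValueError("Generated text has no FROM [cube] clause")
    # Unbalanced delimiters usually mean the output was cut off or garbled.
    for open_char, close_char in ("[]", "{}", "()"):
        if mdx.count(open_char) != mdx.count(close_char):
            raise ValueError(f"Unbalanced {open_char}{close_char} in generated text")
    return mdx


raw_output = (
    "  SELECT {[Revenue].[Revenue].[Units Sold]} ON 0 "
    "FROM [Revenue] WHERE ([Version].[Version].[Actual])\n\n"
)
checked_mdx = clean_generated_mdx(raw_output)
print(checked_mdx)
```

A statement that passes these checks can still be rejected by the server, for example when it references a member that does not exist, so error handling around the query remains necessary.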

## 7. Query data in PA based on the AI-generated MDX statement

In [None]:
# Retrieve data from the Cube and View as a dataframe
# Reference: https://code.cubewise.com/blog/getting-data-from-tm1-with-python/
data = tm1.cubes.cells.execute_mdx(mdx=response_mdx, private=False, use_compact_json=True)
df = build_pandas_dataframe_from_cellset(data, multiindex=False, sort_values=False)
df

In [None]:
# Convert the dataframe to markdown format
df = df.rename({'Revenue': 'KPI'}, axis=1)
df_md = df.to_markdown(index=False)
print(df_md)
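
Note that `DataFrame.to_markdown` relies on the optional `tabulate` package. If it is not installed in your environment, a table in a similar pipe-delimited layout can be produced with plain Python. The sketch below is a swapped-in alternative, not the notebook's method. `rows_to_markdown` is a hypothetical helper, and the column names and values are made up for illustration.

```python
from typing import Sequence


def rows_to_markdown(columns: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Render column names and rows as a pipe-delimited Markdown table."""
    header = "| " + " | ".join(str(c) for c in columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
    return "\n".join([header, divider, *body])


demo_table = rows_to_markdown(
    ["Month", "KPI", "Value"],
    [["Jan", "Units Sold", 10], ["Feb", "Units Sold", 12]],
)
print(demo_table)
```

For the dataframe above, the call would be `rows_to_markdown(list(df.columns), df.values.tolist())`.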

## 8. Model Parameter for LLM to generate answer based on the data queried from PA

The following block specifies the settings for the LLM. In a PoX, you may want to vary these values to show a client how to get the best results.

1. **model_type** specifies the LLM being used. The example below uses the **ibm/granite-13b-instruct-v2** model. You can change it to another model. Note that the size of the model affects resource usage. You may wish to try other models in a PoX to see whether they give different results. See [the list of supported models](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-api-model-ids.html?context=wx&audience=wdp) for all available model IDs.

2. **max_tokens** specifies the maximum number of output tokens. Keep in mind that 1 token does not equal 1 word. A common rule of thumb for English prose is about four tokens for every three words, but the ratio depends on the model's tokenizer.

3. **min_tokens** specifies the minimum number of output tokens.

4. **decoding** specifies the decoding method. You can also choose **sampling** decoding, in which case you can specify more parameters (such as **Top-P** and **Top-K**). More information on these additional parameters can be found in the watsonx.ai Technical Sales Level 3 class (https://learn.ibm.com/course/view.php?id=13452).

5. **repetition_penalty** controls the model's tendency to repeat the same phrases or tokens in its output. A higher repetition penalty discourages the model from generating repetitive content, which can enhance the diversity and readability of the output. This is particularly useful for improving the quality of generated text by reducing redundancy.
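
To illustrate the idea, the sketch below applies one common formulation of a repetition penalty to a toy set of token scores: the score of any token that has already been generated is divided by the penalty if positive and multiplied by it if negative. This notebook does not say how watsonx.ai implements the parameter, so treat the formula as an assumption. `apply_repetition_penalty` is a hypothetical helper.

```python
from typing import Dict, Iterable


def apply_repetition_penalty(
    logits: Dict[str, float], generated: Iterable[str], penalty: float
) -> Dict[str, float]:
    """Lower the scores of tokens that already appear in the generated output."""
    seen = set(generated)
    adjusted = {}
    for token, score in logits.items():
        if token in seen:
            # Both branches push the score down, making a repeat less likely.
            score = score / penalty if score > 0 else score * penalty
        adjusted[token] = score
    return adjusted


toy_logits = {"the": 2.0, "cat": 1.0, "sat": -1.0}
penalised = apply_repetition_penalty(toy_logits, generated=["the", "sat"], penalty=2.0)
unchanged = apply_repetition_penalty(toy_logits, generated=["the", "sat"], penalty=1.0)
print(penalised)
print(unchanged)
```

Under this formulation a penalty of 1 leaves every score untouched, which is why a value of 1 corresponds to no penalty.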

In [None]:
# Set up watsonx model and parameters
model_type = 'ibm/granite-13b-instruct-v2'
max_tokens = 1000
min_tokens = 10  # same minimum as for the MDX model
decoding = DecodingMethods.GREEDY
repetition_penalty = 1

# Get the watsonx model
model_anwser = get_model(model_type, max_tokens, min_tokens, decoding, repetition_penalty)

## 9. Creating Prompt to generate answer based on the data queried from PA

- Construct a second prompt that contains the user's question and the table retrieved from Planning Analytics.
- Pass the prompt to the LLM, which answers the question using only the data in the table.
- If the table does not contain the answer, the prompt instructs the LLM to reply 'unanswerable'.

In [None]:
# Separate the question from the table so the two do not run together in the prompt
prompt = """<s>[INST] <<SYS>> Answer the question with the information contained in the following table. If the question is unanswerable, say 'unanswerable'. <</SYS>> Question: """ + question + "\n\nTable:\n" + df_md + "\n[/INST]"

print(prompt)

## 10. LLM Inferencing to generate answer based on the data queried from PA

In [None]:
# Send a prompt to model
generated_response = model_anwser.generate(prompt)
response_text = generated_response['results'][0]['generated_text']

# Print model response
print("--------------------------------- Generated response -----------------------------------")
print(response_text)
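
The prompt asks the model to reply 'unanswerable' when the table does not contain the answer, so an application built on this notebook would likely treat that reply differently from a real answer. A minimal sketch follows, assuming the model returns the word with at most some surrounding whitespace, quotes, or a trailing period. `interpret_answer` is a hypothetical helper.

```python
from typing import Optional


def interpret_answer(response_text: str) -> Optional[str]:
    """Return the cleaned answer, or None when the model says 'unanswerable'."""
    answer = response_text.strip()
    # Ignore surrounding quotes and a trailing period when checking the sentinel.
    if answer.strip(" .'\"").lower() == "unanswerable":
        return None
    return answer


for sample in ["  We sold 1,200 units. ", "'Unanswerable'."]:
    result = interpret_answer(sample)
    print(result if result is not None else "No answer found in the retrieved data.")
```

Returning `None` rather than the raw text lets the caller decide how to respond, for example by asking the user to rephrase the question.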