# POC for ExecutionDefinition generation.

This notebook explores options for generating an ExecutionDefinition from written text using Large Language Models (LLMs).
- It currently supports OpenAI's GPT-4 and GPT-3.5 APIs.

Most of the actual code is 'hidden' in the `./agent.py` and `./tools.py` files, which keeps this notebook as user-friendly as possible.

This solution works out of the box with GoodData CE. However, with minor tweaks, it should also be compatible with Tiger/Panther instances.

## Before You Start

1. Set the default profile in `./profiles.yaml` to your desired credentials.
    - You can also use the provided profile, which works 'out-of-the-box' for the Community Edition.
2. In `.env`, set `OPENAI_API_KEY`.
3. Ensure your GoodData (GD) Docker container is running.
4. If you're not using the Community Edition, be aware that larger workspaces might not work 'out-of-the-box' because of the 8K-token context limit.
5. Ensure you have at least one workspace. You can specify the workspace ID in the `get_workspace_id()` function in `tools.py`.
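The `.env` step in the checklist above can be verified before running any cell. This is a minimal sketch, not code from `agent.py` or `tools.py`: `parse_env` and `missing_settings` are hypothetical helpers, and the parser only understands plain `KEY=value` lines.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_settings(env: dict[str, str], required=("OPENAI_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# Check the .env file in the current directory, if there is one.
env_path = Path(".env")
env = parse_env(env_path.read_text()) if env_path.exists() else {}
print("Missing settings:", missing_settings(env) or "none")
```

If the output lists `OPENAI_API_KEY`, the agent will most likely fail on its first request.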

## Usage

I opted for simplicity in this design; I'm focusing solely on attributes and metrics. I've ensured both are mandatory, and I've hardcoded aspects such as `filter=[]`. I've also enriched the dimensions with `measureGroup`.
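To make these design choices concrete, here is a rough sketch of how a list of attributes and a list of metrics could be combined into an execution payload. The dictionary layout is a simplified stand-in, not the exact structure that `tools.py` or the GoodData SDK produces, and `build_execution` is a hypothetical name.

```python
def build_execution(attributes: list[str], metrics: list[str]) -> dict:
    """Assemble a simplified execution payload from attribute and metric ids."""
    if not attributes or not metrics:
        # Both lists are mandatory in this POC.
        raise ValueError("at least one attribute and one metric are required")
    return {
        "attributes": list(attributes),
        "metrics": list(metrics),
        "filter": [],  # hardcoded: filtering is out of scope here
        "dimensions": [
            list(attributes),  # first dimension: the attributes (assumed layout)
            ["measureGroup"],  # second dimension: all metrics grouped together
        ],
    }


print(build_execution(["region"], ["revenue"]))
```

Rejecting empty lists early keeps malformed LLM answers from reaching the plotting step.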

Since this `.ipynb` file is intended for a technical audience, no visuals are included. To pose questions, first create an `Agent` instance from `agent.py` and pass it the `open_ai_model` and `method` parameters.

Here are two concise tables that explain each of these parameters:

**Models:**

| Enum | Value | Description |
|---|---|---|
| Model.GPT_4 | "gpt-4" | Raw GPT-4 model, which does not support function calling. |
| Model.GPT_4_FUNC | "gpt-4-0613" | GPT-4 model, which supports function calling. |
| Model.GPT_3 | "gpt-3.5-turbo" | Raw GPT-3.5 model, which does not support function calling. |
| Model.GPT_3_FUNC | "gpt-3.5-turbo-0613" | GPT-3.5 model, which supports function calling. |

**Methods:**

| Enum | Supported Models | Description |
|---|---|---|
| AIMethod.RAW | All | Calls the OpenAI chatCompletions API without any extras. |
| AIMethod.FUNC | GPT_4_FUNC, GPT_3_FUNC | Calls the OpenAI chatCompletions API with a pre-defined .json structure, using function calling. |
| AIMethod.LANGCHAIN | All | Uses LangChain to create an embedding from the documents provided in the "./data/" folder, which is then used to enrich the prompt. |
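The two tables can be mirrored in code as a pair of enums plus a compatibility check. This is a standalone sketch: in the notebook the enums are reached through `Agent` (for example `Agent.Model.GPT_3_FUNC`), the function-calling method is named `FUNC` here to match the code cells, the string values of `AIMethod` are assumptions, and `is_supported` is a hypothetical helper.

```python
from enum import Enum


class Model(Enum):
    GPT_4 = "gpt-4"
    GPT_4_FUNC = "gpt-4-0613"
    GPT_3 = "gpt-3.5-turbo"
    GPT_3_FUNC = "gpt-3.5-turbo-0613"


class AIMethod(Enum):
    # The values below are placeholders; only the member names come from the notebook.
    RAW = "raw"
    FUNC = "func"
    LANGCHAIN = "langchain"


# Only the function-calling method restricts the model choice.
FUNCTION_CALLING_MODELS = {Model.GPT_4_FUNC, Model.GPT_3_FUNC}


def is_supported(model: Model, method: AIMethod) -> bool:
    """Return True if the tables list `model` as valid for `method`."""
    if method is AIMethod.FUNC:
        return model in FUNCTION_CALLING_MODELS
    return True


print(is_supported(Model.GPT_4, AIMethod.FUNC))
```

A check like this lets an invalid combination fail early, before any tokens are spent.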

To generate a plot, simply ask a question phrased the way you would title your visualization, in plain and literal wording so that GPT can interpret it, such as `Clothing, Electronics and Home revenue per region`.

### Questions to try

When posing "questions", aim for precision in the desired information. For example:
- Revenue per month
- Revenue per customer per year
- Number of orders and revenue per product name

You can also experiment with less conventional queries:
- Faulty descriptions:
    - Money by people
    - Monthly money by user
    - Revenue by customer
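- Non-English descriptions:
    - Měsíční peníze na zákazníka (Czech: "Monthly money per customer")
    - Výdělek jednotlivých kampaní (Czech: "Earnings of individual campaigns")
    - Výdělek a útrata jednotlivých kampaní (Czech: "Earnings and spend of individual campaigns")
    - Prachy na kampaň (Czech slang: "Cash per campaign")
    - 每月收入 (Chinese: "Monthly revenue")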


### Disclaimer

Since the entire process depends on an LLM, the results might not always be accurate. The examples provided are not 100% reliable and may occasionally raise exceptions. If you encounter any bugs or typos, please contact us. We would be more than happy to fix them!


### Generate using function calls

Function calls are a killer feature that enables programmers to specify the desired outcome in a pre-defined .json structure.

The structure we are using in this approach is as follows:

```json
{
    "name": "ExecutionDefinition",
    "description": "Create ExecutionDefinition for data visualization",
    "parameters": {
        "type": "object",
        "properties": {
            "attributes": {
                "type": "array",
                "items": {
                    "type": "string",
                    "description": "local_id of an attribute"
                },
                "description": "List of local_id of attributes to be used in the visualization"
            },
            "metrics": {
                "type": "array",
                "items": {
                    "type": "string",
                    "description": "local_id of a metric"
                },
                "description": "List of local_id of metrics to be used in the visualization"
            }
        },
        "required": ["attributes", "metrics"]
    }
}
```
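With function calling, the API returns the arguments of the call as a JSON-encoded string, so a light sanity check against the structure above is cheap to add. The following is a sketch under that assumption; `parse_function_arguments` is a hypothetical helper and not part of `agent.py`.

```python
import json


def parse_function_arguments(arguments: str) -> dict:
    """Decode the JSON `arguments` string and check it against the schema."""
    payload = json.loads(arguments)  # raises ValueError on malformed JSON
    if not isinstance(payload, dict):
        raise ValueError("arguments must decode to a JSON object")
    for key in ("attributes", "metrics"):
        value = payload.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
    return {"attributes": payload["attributes"], "metrics": payload["metrics"]}


print(parse_function_arguments('{"attributes": ["region"], "metrics": ["revenue"]}'))
```

Anything that fails this check can be retried or reported, instead of surfacing later as a plotting error.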

To try it for yourself, simply run the following snippet:

```python
from agent import Agent
from tools import answer_to_plot

question = "Clothing, Electronics and Home revenue by state"

# Function calling requires one of the *_FUNC models.
ag = Agent(
    open_ai_model=Agent.Model.GPT_3_FUNC,
    method=Agent.AIMethod.FUNC,
)
answer = ag.ask(question)
answer_to_plot(answer)
```

One significant advantage of this approach is its speed (4-5 s per request with GPT-4) and reliability. Unlike "raw" GPT-4, it can be constrained to always return output in the pre-defined .json structure.

However, like many aspects of life, there are some drawbacks:

- Currently, the models that support function calling cannot be fine-tuned.
    - The context must therefore be provided in either the prompt or the system message.
    - This makes the approach harder to use, because every request has to carry the full context.
- Despite its benefits, it is still quite expensive (approximately $0.03 per request with GPT-4).
- You may need to disclose potentially sensitive information to a third party.
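The cost figure can be sanity-checked with simple arithmetic. In the sketch below, the default prices (USD 0.03 per 1K prompt tokens and USD 0.06 per 1K completion tokens) are assumed list prices for the 8K GPT-4 model and should be checked against current pricing; the token counts are illustrative rather than measured, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price_per_1k: float = 0.03,      # assumed GPT-4 8K prompt price, USD
    completion_price_per_1k: float = 0.06,  # assumed GPT-4 8K completion price, USD
) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    return (
        prompt_tokens / 1000 * prompt_price_per_1k
        + completion_tokens / 1000 * completion_price_per_1k
    )


# Illustrative request: about 900 prompt tokens (context plus question)
# and about 50 completion tokens (the .json answer).
print(f"${estimate_cost(900, 50):.3f}")  # prints "$0.030"
```

Because the workspace context dominates the prompt, the cost grows roughly linearly with workspace size.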

### Generate through "raw" chatCompletions

This method is somewhat more complex, as we can't always guarantee a .json format response.

I've experimented with various approaches and discovered three key strategies to maximize success rates and produce the best outcomes:

- Inject the context directly into the prompt, enclosing it as `context:"""_the actual context_"""`
- Label your content extensively with a distinctive identifier. I chose `AHAHA_workspace` for the workspace.
- Ensure the entire .json format is thoroughly documented. If not, there may be comprehension difficulties.
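The three strategies above could be combined as follows. This is only a sketch: the real prompt lives in `agent.py` and may be worded quite differently, and `FORMAT_DOC` and `build_prompt` are hypothetical names.

```python
WORKSPACE_TAG = "AHAHA_workspace"  # distinctive label mentioned in the text

# Assumed wording that documents the expected .json format.
FORMAT_DOC = (
    "Answer with a single JSON object and nothing else:\n"
    '{"attributes": [<attribute ids>], "metrics": [<metric ids>]}\n'
    '"attributes" and "metrics" are lists of ids taken from ' + WORKSPACE_TAG + "."
)


def build_prompt(question: str, workspace_context: str) -> str:
    """Combine the labelled context, the format description, and the question."""
    return (
        f'{WORKSPACE_TAG} context:"""{workspace_context}"""\n\n'
        f"{FORMAT_DOC}\n\n"
        f"Question: {question}"
    )


print(build_prompt("Revenue per month", "metric: revenue; attribute: month"))
```

The triple quotes mark where the context starts and ends, and the repeated tag gives the model an unambiguous name for it.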

Now, let's put theory into practice and try it out!

```python
from agent import Agent
from tools import answer_to_plot

# Czech slang, roughly "Show me the money".
question = "Ukaž prachy"

ag = Agent(
    open_ai_model=Agent.Model.GPT_3,
    method=Agent.AIMethod.RAW,
)
answer = ag.ask(question)
answer_to_plot(answer)
```

Pros:
- Even faster than function calls
- Prompts can easily be enriched from a vector database, which makes this approach scale significantly better than function calls.

Cons:
- Still quite expensive (especially with GPT-4)
- Less reliable (approximately 90%, as a rough estimate)
- You may still need to provide potentially sensitive information to a third party.
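Since a "raw" answer may wrap the .json in extra text, one way to cope with the lower reliability is to extract the first JSON object defensively. This is a sketch of that idea, not the parsing used in `agent.py`; `extract_json` is a hypothetical helper.

```python
import json
from typing import Optional


def extract_json(answer: str) -> Optional[dict]:
    """Return the first JSON object found in `answer`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = answer.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest.
            obj, _ = decoder.raw_decode(answer, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = answer.find("{", start + 1)
    return None


reply = 'Sure! Here it is: {"attributes": ["region"], "metrics": ["revenue"]} Hope it helps.'
print(extract_json(reply))
```

When the function returns `None`, the caller can ask the question again or fall back to another method.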

In conclusion, the raw chatCompletions method is comparable in quality to function calls, though it is somewhat less reliable.


### Generate with LangChain

This method may be somewhat controversial. While LangChain has its merits, for a task as simple as this, it is the least efficient of the three options.

The primary issue is the time it takes to execute. This is likely due to my decision not to persist the vector database, resulting in each snippet creating a new one from scratch.
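The build-once, reuse-later idea behind persistence can be illustrated without LangChain. In this sketch a word-count table stands in for the real embeddings, the index is cached as a JSON file, and `build_index` and `load_or_build_index` are hypothetical names rather than LangChain APIs.

```python
import json
from collections import Counter
from pathlib import Path


def build_index(data_dir: Path) -> dict[str, dict[str, int]]:
    """Toy stand-in for embedding: word counts per .txt file in `data_dir`."""
    return {
        path.name: dict(Counter(path.read_text(encoding="utf-8").lower().split()))
        for path in sorted(data_dir.glob("*.txt"))
    }


def load_or_build_index(data_dir: Path, cache_file: Path) -> dict[str, dict[str, int]]:
    """Reuse the cached index when present; otherwise build it and save it."""
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    index = build_index(data_dir)
    cache_file.write_text(json.dumps(index), encoding="utf-8")
    return index
```

Note that this cache is never invalidated: if the documents change, the cache file has to be deleted by hand.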

A second issue is that generation will always be slower, even with persistence. Since there's no way to force the agent to output only the .json format, the answers are longer, as they contain a lengthy explanation.

However, occasionally it can return just the .json format, nearly matching the speed of the other methods when it does so.

This approach would work best when combined with function calls. However, this combination isn't supported 'out-of-the-box' and is well beyond the scope of this proof of concept.

The current workflow involves the entire "intelligence" residing in the `./data/` folder. LangChain then ingests this folder, enriches a very basic prompt, and sends it to the OpenAI chatCompletions API.
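Conceptually, the workflow retrieves the documents most relevant to the question and prepends them to the prompt. The sketch below imitates that with bag-of-words cosine similarity instead of real embeddings; it does not use LangChain's API, and `similarity` and `enrich_prompt` are hypothetical helpers.

```python
import math
from collections import Counter


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def enrich_prompt(question: str, documents: dict[str, str], top_k: int = 1) -> str:
    """Prepend the `top_k` documents most similar to the question."""
    query = Counter(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda name: similarity(query, Counter(documents[name].lower().split())),
        reverse=True,
    )
    context = "\n".join(documents[name] for name in ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}"


docs = {
    "metrics.txt": "revenue is a metric for money earned",
    "attrs.txt": "region is an attribute of customer",
}
print(enrich_prompt("total revenue earned", docs))
```

Only the selected documents reach the model, which is why this style of prompting scales to larger workspaces.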


```python
from agent import Agent
from tools import answer_to_plot

question = "Clothing, Electronics and Home revenue per region"

# LangChain enriches the prompt from the documents in ./data/.
ag = Agent(
    open_ai_model=Agent.Model.GPT_3,
    method=Agent.AIMethod.LANGCHAIN,
)
answer = ag.ask(question)
answer_to_plot(answer)
```

One significant observation from using LangChain is that most prompt engineering best practices fail in this context.

With additional effort, it might be possible to enhance its performance and smoothness.

Pros:
- More affordable
- Scales better

Cons:
- Slower
- Even less predictable than "raw"

## Conclusion

The most shocking conclusion from this proof of concept (POC) is that the quality of the prompting is as crucial as, if not more crucial than, the tooling that surrounds it.

In fact, `agent.py` contains as much prompt text as it does code. This can be attributed to the relative maturity of the OpenAI tooling, despite the recent inception of this technological boom. It also underscores the fact that when providing prompts, specificity is often required.

However, the quality of the responses in this case was, at best, questionable. There is significant room for improvement, and if we were to launch this in its current form, it would likely be met with laughter.

It's also important to note that the data this POC relies on is simplistic, and this simplicity is further magnified by the reduced number of options from which GPT can choose.

In conclusion, while LLMs are far from perfect, they can still be useful for tasks of moderate complexity. Personally, I believe it's more important to acknowledge and highlight these flaws rather than indulge in marketing BS, as many sadly do, which could only backfire in the long run.