<a href="https://colab.research.google.com/github/eekaiboon/gen_ai/blob/main/export_error.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Goal

Provide suggestions on how to resolve an export error.

# Scenario

We try to export the following expense to Quickbooks Online (QBO) as a journal entry:

```
{
    "time": "2015-06-29T12:43:42.132-07:00",
    "JournalEntry": {
        "SyncToken": "0",
        "domain": "QBO",
        "TxnDate": "2015-06-29",
        "sparse": false,
        "Line": [
            {
                "Description": "Four sprinkler heads damaged",
                "JournalEntryLineDetail": {
                    "PostingType": "Debit",
                    "AccountRef": {
                        "name": "Job Expenses:Job Materials:Fountain and Garden Lighting",
                        "value": "65"
                    },
                    "Entity": {
                        "Type": "Customer",
                        "EntityRef": {
                            "name": "Amy's Bird Sanctuary",
                            "value": "1"
                        }
                    }
                },
                "DetailType": "JournalEntryLineDetail",
                "ProjectRef": {
                    "value": "39298034"
                },
                "Amount": 25.54,
                "Id": "0"
            },
            {
                "JournalEntryLineDetail": {
                    "PostingType": "Credit",
                    "AccountRef": {
                        "name": "Notes Payable",
                        "value": "44"
                    }
                },
                "DetailType": "JournalEntryLineDetail",
                "Amount": 25.54,
                "Id": "1",
                "Description": "Sprinkler Hds - Sprinkler Hds Inventory Adjustment"
            }
        ],
        "Adjustment": false,
        "Id": "227",
        "TxnTaxDetail": {},
        "MetaData": {
            "CreateTime": "2015-06-29T12:33:57-07:00",
            "LastUpdatedTime": "2015-06-29T12:33:57-07:00"
        }
    }
}
```

The export fails with the following error:

```
Business Validation Error: When you use Accounts Payable, you must choose a vendor in the Name field.
```
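
The error message suggests a validation rule: a journal entry line that posts to an Accounts Payable account must carry a vendor in its `Entity` field. Below is a minimal pre-flight sketch of that rule, not QBO's actual validation logic. The function name `lines_missing_vendor` and the `ap_account_ids` input are hypothetical; the payload does not include account types, so which account IDs count as Accounts Payable would have to come from the chart of accounts.

```python
def lines_missing_vendor(journal_entry, ap_account_ids):
    """Return the Ids of lines that post to an A/P account without a Vendor entity."""
    flagged = []
    for line in journal_entry.get("Line", []):
        detail = line.get("JournalEntryLineDetail", {})
        account_id = detail.get("AccountRef", {}).get("value")
        entity_type = detail.get("Entity", {}).get("Type")
        if account_id in ap_account_ids and entity_type != "Vendor":
            flagged.append(line.get("Id"))
    return flagged


# Trimmed copy of the scenario payload (only the fields the check reads).
entry = {
    "Line": [
        {
            "Id": "0",
            "JournalEntryLineDetail": {
                "AccountRef": {"value": "65"},
                "Entity": {"Type": "Customer"},
            },
        },
        {"Id": "1", "JournalEntryLineDetail": {"AccountRef": {"value": "44"}}},
    ]
}

# Assumption for illustration only: account "44" is of type Accounts Payable.
print(lines_missing_vendor(entry, ap_account_ids={"44"}))
```

A check like this could run before the export call so that the user sees a friendly message instead of the raw API error.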



# Takeaways

*   There are different LLMs (e.g. OpenAI gpt-3, OpenAI gpt-4, etc.) and different prompting techniques (e.g. RAG, ReAct, OpenAI Assistant API, etc.) that we can use to tackle this problem. Each combination yields different results.
*   Hence, I think it is vital for us to invest in evaluation early so that we can make an informed decision about which LLM, which technique, and which other dimensions not covered here to consider when applying LLMs in our product.
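
To make the evaluation point concrete, here is a minimal sketch of a comparison harness. Everything in it is an assumption for illustration: the candidate names, the stubbed answer functions (stand-ins for real model and technique pipelines), and the keyword-based score, which is a crude proxy for a real quality metric.

```python
def evaluate(candidates, cases, score):
    """Return the average score per candidate over all test cases."""
    results = {}
    for name, answer_fn in candidates.items():
        scores = [score(answer_fn(case["prompt"]), case["expected"]) for case in cases]
        results[name] = sum(scores) / len(scores)
    return results


def keyword_score(answer, expected_keywords):
    """Fraction of expected keywords found in the answer (case-insensitive)."""
    answer = answer.lower()
    hits = sum(1 for keyword in expected_keywords if keyword.lower() in answer)
    return hits / len(expected_keywords)


# Stand-ins for real (model, technique) pipelines; replace with actual calls.
candidates = {
    "model_a+rag": lambda prompt: "Add a Vendor entity to the Accounts Payable line.",
    "model_b+react": lambda prompt: "Retry the request later.",
}

cases = [
    {
        "prompt": "When you use Accounts Payable, you must choose a vendor in the Name field.",
        "expected": ["vendor", "entity"],
    }
]

print(evaluate(candidates, cases, keyword_score))
```

With a table like this per combination, the choice of model and technique can be based on measured results rather than on a single manual run.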

# Approach #1 - LlamaIndex Query engine approach: LLM response based on provided API documentation

*   In this example, we have two sets of API documentation: (1) the QBO API documentation and (2) the Netsuite API documentation. Both are stored in our "knowledge base". We let the LLM decide which documentation to reference when resolving an export error for a specific ERP (e.g. when resolving a QBO export error, it should reference the QBO API documentation).
*   We probably don't need to provide the API documentation in the prompt if it is publicly available on the internet (i.e. not behind a paywall) and the API version that we use predates the LLM's knowledge cutoff date (e.g. the knowledge cutoff date for OpenAI GPT-4 Turbo is April 2023).
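
The routing idea in the first bullet can be sketched without any library: each tool has a description, a selector picks one tool for the query, and the query is forwarded to it. This is a toy illustration, not how LlamaIndex implements routing. The `route` and `keyword_choose` names are hypothetical, and the keyword-overlap selector is only a stand-in for the LLM's choice.

```python
def route(query, tools, choose):
    """Pick one tool based on its description and forward the query to it."""
    descriptions = {name: tool["description"] for name, tool in tools.items()}
    selected = choose(query, descriptions)
    return selected, tools[selected]["answer"](query)


def keyword_choose(query, descriptions):
    """Stand-in for the LLM selector: pick the description sharing the most words."""
    query_words = set(query.lower().split())
    return max(
        descriptions,
        key=lambda name: len(query_words & set(descriptions[name].lower().split())),
    )


# Stub tools; the real ones would query the per-ERP documentation indexes.
tools = {
    "qbo_api": {
        "description": "Quickbooks Online journal entry API",
        "answer": lambda query: "answer from QBO docs",
    },
    "netsuite_api": {
        "description": "Netsuite journal entry API",
        "answer": lambda query: "answer from Netsuite docs",
    },
}

print(route("Why does my Quickbooks Online journal entry export fail?", tools, keyword_choose))
```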

In [32]:
!pip install --upgrade llama-index

Defaulting to user installation because normal site-packages is not writeable


In [1]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

In [2]:
from llama_index import StorageContext, load_index_from_storage

# Try to load previously persisted indexes; fall back to rebuilding them below.
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/export_error/qbo_api"
    )
    qbo_api_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/export_error/netsuite"
    )
    netsuite_api_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    # Nothing has been persisted yet (or loading failed).
    index_loaded = False

print(index_loaded)

False


In [3]:
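!mkdir -p 'data/export_error/'
# Caveat: these are Google Drive "view" links, so wget saves the HTML viewer page
# (note `text/html` in the output below) rather than the raw Markdown file.
!wget 'https://drive.google.com/file/d/1Z6Gyv6rghFTrVECEEqOBfGOkWNtCLTE-/view?usp=sharing' -O 'data/export_error/qbo_journal_entry.md'
!wget 'https://drive.google.com/file/d/1FYfbclnIooUN7mqlZErURHlovKyz7d9p/view?usp=sharing' -O 'data/export_error/netsuite_journal_entry.md'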

--2023-11-09 15:13:34--  https://drive.google.com/file/d/1Z6Gyv6rghFTrVECEEqOBfGOkWNtCLTE-/view?usp=sharing
Resolving drive.google.com (drive.google.com)... 2607:f8b0:4005:80c::200e, 142.250.189.238
Connecting to drive.google.com (drive.google.com)|2607:f8b0:4005:80c::200e|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘data/export_error/qbo_journal_entry.md’

data/export_error/q     [ <=>                ]  82.44K  --.-KB/s    in 0.03s   

2023-11-09 15:13:35 (2.60 MB/s) - ‘data/export_error/qbo_journal_entry.md’ saved [84422]

--2023-11-09 15:13:35--  https://drive.google.com/file/d/1FYfbclnIooUN7mqlZErURHlovKyz7d9p/view?usp=sharing
Resolving drive.google.com (drive.google.com)... 2607:f8b0:4005:80c::200e, 142.250.189.238
Connecting to drive.google.com (drive.google.com)|2607:f8b0:4005:80c::200e|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘data/export_error/ne

In [4]:
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex

if not index_loaded:
    # Load the API documentation files downloaded above.
    qbo_api_docs = SimpleDirectoryReader(
        input_files=["./data/export_error/qbo_journal_entry.md"]
    ).load_data()
    netsuite_api_docs = SimpleDirectoryReader(
        input_files=["./data/export_error/netsuite_journal_entry.md"]
    ).load_data()

    # Build one vector index per ERP.
    qbo_api_index = VectorStoreIndex.from_documents(qbo_api_docs)
    netsuite_api_index = VectorStoreIndex.from_documents(netsuite_api_docs)

    # Persist both indexes so that the load cell above can find them on the next run.
    qbo_api_index.storage_context.persist(
        persist_dir="./storage/export_error/qbo_api"
    )
    netsuite_api_index.storage_context.persist(
        persist_dir="./storage/export_error/netsuite"
    )

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.open

In [26]:
journal_entry_payload = """
{
    "time": "2015-06-29T12:43:42.132-07:00",
    "JournalEntry": {
        "SyncToken": "0",
        "domain": "QBO",
        "TxnDate": "2015-06-29",
        "sparse": false,
        "Line": [
            {
                "Description": "Four sprinkler heads damaged",
                "JournalEntryLineDetail": {
                    "PostingType": "Debit",
                    "AccountRef": {
                        "name": "Job Expenses:Job Materials:Fountain and Garden Lighting",
                        "value": "65"
                    },
                    "Entity": {
                        "Type": "Customer",
                        "EntityRef": {
                            "name": "Amy's Bird Sanctuary",
                            "value": "1"
                        }
                    }
                },
                "DetailType": "JournalEntryLineDetail",
                "ProjectRef": {
                    "value": "39298034"
                },
                "Amount": 25.54,
                "Id": "0"
            },
            {
                "JournalEntryLineDetail": {
                    "PostingType": "Credit",
                    "AccountRef": {
                        "name": "Notes Payable",
                        "value": "44"
                    }
                },
                "DetailType": "JournalEntryLineDetail",
                "Amount": 25.54,
                "Id": "1",
                "Description": "Sprinkler Hds - Sprinkler Hds Inventory Adjustment"
            }
        ],
        "Adjustment": false,
        "Id": "227",
        "TxnTaxDetail": {},
        "MetaData": {
            "CreateTime": "2015-06-29T12:33:57-07:00",
            "LastUpdatedTime": "2015-06-29T12:33:57-07:00"
        }
    }
}
"""

export_error = "Business Validation Error: When you use Accounts Payable, you must choose a vendor in the Name field."

system_prompt = """
You are an ERP integration expert who is familiar with integrating financial products like Brex with ERP solutions like Quickbooks Online, Netsuite, etc.
"""

user_prompt = f"""
We are trying to create the following journal entry object in Quickbooks Online using API call.

Journal entry object request payload:
```json
{journal_entry_payload}
```

However, we ran into the following error when creating the journal entry object via the API.

API error response:
{export_error}

How should I fix the API error?

Please do 2 things:
1. Please provide an updated request payload and show me which part of the request payload needs to be updated with a comment.
2. Please provide a user friendly error that I can use to display in my web application.
""".replace(
    "{", "{{"
).replace(
    "}", "}}"
)

print(user_prompt)


We are trying to create the following journal entry object in Quickbooks Online using API call.

Journal entry object request payload:
```json

{{
    "time": "2015-06-29T12:43:42.132-07:00",
    "JournalEntry": {{
        "SyncToken": "0",
        "domain": "QBO",
        "TxnDate": "2015-06-29",
        "sparse": false,
        "Line": [
            {{
                "Description": "Four sprinkler heads damaged",
                "JournalEntryLineDetail": {{
                    "PostingType": "Debit",
                    "AccountRef": {{
                        "name": "Job Expenses:Job Materials:Fountain and Garden Lighting",
                        "value": "65"
                    }},
                    "Entity": {{
                        "Type": "Customer",
                        "EntityRef": {{
                            "name": "Amy's Bird Sanctuary",
                            "value": "1"
                        }}
                    }}
                }},
            
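
The `.replace` calls at the end of the cell above double every brace, which is why the printed prompt shows `{{` and `}}`. Presumably this is so that a downstream prompt template using `str.format`-style substitution treats the JSON braces as literals; that is an assumption about the library's templating, since the notebook does not state the reason. A minimal standard-library illustration of the effect:

```python
raw = 'Payload: {"Id": "227"}'

# Unescaped braces are parsed as replacement fields, so formatting fails.
try:
    raw.format()
except (KeyError, IndexError, ValueError) as exc:
    print("format failed:", type(exc).__name__)

# Doubling the braces makes str.format emit them as literal characters.
escaped = raw.replace("{", "{{").replace("}", "}}")
print(escaped.format())
```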

In [8]:
from llama_index import LLMPredictor, ServiceContext
from llama_index.llms import OpenAI
from llama_index.tools import QueryEngineTool, ToolMetadata

llm = OpenAI(model="gpt-3.5-turbo-1106")
llm_predictor = LLMPredictor(system_prompt=system_prompt, llm=llm)

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# Define tools.
qbo_tool = QueryEngineTool(
    query_engine=qbo_api_index.as_query_engine(service_context=service_context),
    metadata=ToolMetadata(
            name="qbo_api",
            description=("""
              Use this query engine to answer Quickbooks Online related API questions.
              However, only use this query engine for journal entry related questions since
              we only have journal entry API documentation in the knowledge base"""
            ),
        ),
)
netsuite_tool = QueryEngineTool(
    query_engine=netsuite_api_index.as_query_engine(service_context=service_context),
    metadata=ToolMetadata(
            name="netsuite_api",
            description=("""
              Use this query engine to answer Netsuite related API questions.
              However, only use this query engine for journal entry related questions since
              we only have journal entry API documentation in the knowledge base"""
            ),
        ),
)

In [None]:
from llama_index.query_engine import RouterQueryEngine
from llama_index.selectors.llm_selectors import LLMSingleSelector

# Define a RouterQueryEngine over the tools.
# The selector here is an LLMSingleSelector, which uses the LLM to choose the
# best sub-index to route the query to, given the tool descriptions.
query_engine = RouterQueryEngine.from_defaults(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=[qbo_tool, netsuite_tool]
)

response = query_engine.query(user_prompt)
print(response)

# Approach #2 - OpenAI Assistant API

*   This is very similar to approach #1, but it uses the OpenAI Assistant API, which handles tool selection automatically.
*   However, the LlamaIndex `OpenAIAssistantAgent` library seems to make many more calls to OpenAI before returning the final response, which makes this approach more costly. (This could just be how the library behaves; I didn't dig deeper to understand why there are so many API calls to OpenAI.)
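
One way to put a number on "many more calls" is to count the `httpx` log records that mention `api.openai.com`, since the notebook's logs show one such record per request. The sketch below uses only the standard `logging` module. The `RequestCounter` class is a hypothetical helper, and the single log call at the end simulates a request so the example runs on its own.

```python
import logging


class RequestCounter(logging.Handler):
    """Counts log records whose message mentions api.openai.com."""

    def __init__(self):
        super().__init__(level=logging.INFO)
        self.count = 0

    def emit(self, record):
        if "api.openai.com" in record.getMessage():
            self.count += 1


counter = RequestCounter()
httpx_logger = logging.getLogger("httpx")
httpx_logger.setLevel(logging.INFO)
httpx_logger.addHandler(counter)

# Simulated record in the same shape as the notebook's log output;
# in practice, run the query engine or agent call here instead.
httpx_logger.info('HTTP Request: POST https://api.openai.com/v1/threads "HTTP/1.1 200 OK"')

print(counter.count)
```

Resetting `counter.count` to zero before each approach and reading it afterwards would give a per-approach request count to compare.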


In [11]:
from llama_index.agent import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name="ERP integration expert",
    instructions="You are an ERP integration expert who is familiar with integrating financial products like Brex with ERP solutions like Quickbooks Online, Netsuite, etc.",
    tools=[qbo_tool, netsuite_tool],
    verbose=True,
    run_retrieve_sleep_time=1.0,
)

response = agent.chat(user_prompt)
print(response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/assistants "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/assistants "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/threads "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/threads "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyVSiZj/messages "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyVSiZj/messages "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyVSiZj/runs "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyVSiZj/runs "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyVSiZj/runs/run_4aPEvvlkssYYWG4GRrxG8bxl "HTTP/1.1 200 OK"
HTTP Request: GET https://api.openai.com/v1/threads/thread_eAsApJmlERhNmA9IMmyV

# Approach #3 - ReAct (Reason+Act)

*   In this example, we use the `SELF_ASK_WITH_SEARCH` agent in LangChain. This agent performs Google searches to answer the question.
*   With LangChain, we could have opted to create a custom ReAct chain with more tools (e.g. a Google search tool, a vector index retrieval tool, etc.). For demonstration purposes, we just use a simple chain with the Google search tool, which is available out of the box in LangChain.
*   This approach doesn't work well with gpt-3, but it works fine with gpt-4.
*   We probably need to do some prompt engineering so that the agent can reach a conclusive response. Here, I stopped the reason + act loop after 2 iterations (`max_iterations=2`). Otherwise, it keeps looping until the default maximum number of iterations, since the LLM thinks that the response is still incomplete.
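
The iteration cap mentioned in the last bullet can be illustrated with a toy loop. This is a sketch of the general reason + act pattern under stated assumptions, not LangChain's implementation: `run_agent` and both stub step functions are hypothetical, and a real step would call the LLM and a search tool.

```python
def run_agent(question, step, max_iterations=2):
    """Call step() until it returns a final answer or the iteration cap is hit."""
    scratchpad = []
    for _ in range(max_iterations):
        kind, text = step(question, scratchpad)
        if kind == "final":
            return text
        scratchpad.append(text)  # keep the intermediate thought or observation
    return "Agent stopped due to iteration limit or time limit."


def never_done(question, scratchpad):
    """Stub step that always asks another follow-up and never concludes."""
    return ("intermediate", f"follow-up #{len(scratchpad) + 1}")


def done_after_one(question, scratchpad):
    """Stub step that concludes once it has seen one intermediate result."""
    if scratchpad:
        return ("final", "Add a Vendor entity to the Accounts Payable line.")
    return ("intermediate", "look up the QBO JournalEntry docs")


print(run_agent("How do I fix the error?", never_done))
print(run_agent("How do I fix the error?", done_after_one))
```

The first call hits the cap and returns the fallback message, while the second returns a final answer on its second iteration.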

In [27]:
!pip install google-search-results
!pip install --upgrade langchain
!pip install --upgrade openai

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable


In [2]:
!pip list

Package                   Version
------------------------- ------------
aiohttp                   3.8.6
aiosignal                 1.3.1
aiostream                 0.5.2
altgraph                  0.17.2
annotated-types           0.6.0
anyio                     3.7.1
appnope                   0.1.3
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 2.4.1
async-lru                 2.0.4
async-timeout             4.0.3
attrs                     23.1.0
Babel                     2.13.1
beautifulsoup4            4.12.2
bleach                    6.1.0
certifi                   2023.7.22
cffi                      1.16.0
charset-normalizer        3.3.1
click                     8.1.7
comm                      0.1.4
dataclasses-json          0.5.14
datasets                  2.14.6
debugpy                   1.8.0
decorator                 5.1.1
defusedxml                0.7.1
Deprecated           

In [28]:
#from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SerpAPIWrapper
from langchain.agents import Tool
from langchain.agents import AgentType

llm = ChatOpenAI(model_name='gpt-4-1106-preview')
search = SerpAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="Useful for when you need to ask with search",
    )
]

In [29]:
from langchain.agents import initialize_agent

self_ask_with_search = initialize_agent(
    tools,
    llm,
    agent=AgentType.SELF_ASK_WITH_SEARCH,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=2,
)
self_ask_with_search.run(user_prompt)



> Entering new AgentExecutor chain...
Could not parse output: Yes, follow-up questions are needed here to clarify the requirements for the error message.

Follow-up question: What is the specific message you would like to display to the user when they encounter this error?

Based on the error received from the Quickbooks API, it appears that the issue is with the account type used in the journal entry line that is supposed to credit an account. When crediting an Accounts Payable type account, Quickbooks requires a vendor to be specified. This means we need to update the payload to include a "Vendor" reference in the "Entity" section of the "JournalEntryLineDetail" where the "AccountRef" with a name of "Notes Payable" is used.

Here is the updated request payload with the necessary changes:

```json
{
    "time": "2015-06-29T12:43:42.132-07:00",
    "JournalEntry": {
        "SyncToken": "0",
        "domain": "QBO",
        "TxnDate": "2015-06-29",
        "spars

'Agent stopped due to iteration limit or time limit.'

# Approach #4 - Ensemble query engine

*   Use different query tools to answer the user prompt, and let the LLM derive the final result from the responses of all query tools.
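
A minimal sketch of the ensemble idea, assuming each query tool can be treated as a function from prompt to answer. The names `ensemble_query` and `join_answers` and both stub engines are hypothetical; in a real setup the synthesis step would be an LLM call rather than string concatenation.

```python
def ensemble_query(prompt, engines, synthesize):
    """Ask every engine, then let `synthesize` merge the candidate answers."""
    candidates = {name: engine(prompt) for name, engine in engines.items()}
    return synthesize(prompt, candidates)


# Stubs standing in for the real query tools.
engines = {
    "api_docs": lambda prompt: "Set Entity.Type to Vendor on the A/P line.",
    "web_search": lambda prompt: "QBO requires a vendor when posting to Accounts Payable.",
}


def join_answers(prompt, candidates):
    """Stand-in for the LLM synthesis step: list each answer with its source."""
    return "\n".join(f"[{name}] {answer}" for name, answer in sorted(candidates.items()))


print(ensemble_query("How do I fix the export error?", engines, join_answers))
```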

# Reference


1.   [Simple Website Loader](https://llamahub.ai/l/web-simple_web)
2.   [LlamaIndex OpenAI Colab](https://github.com/run-llama/llama_index/blob/main/docs/examples/llm/openai.ipynb)
3.   [Install wget on M1](https://stackoverflow.com/questions/64963370/error-cannot-install-in-homebrew-on-arm-processor-in-intel-default-prefix-usr)
4.   [OpenAI Agent with Query Engine Tools](https://github.com/run-llama/llama_index/blob/2d30e96671d2b53a5299801969f39a7d336cf9f8/docs/examples/agent/openai_agent_with_query_engine.ipynb)
5.   [QBO JournalEntry API doc](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/all-entities/journalentry)
6.   [LlamaIndex Q&A patterns](https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/q_and_a.html#routing-over-heterogeneous-data)
7.   [OpenAI Assistant Agent](https://gpt-index.readthedocs.io/en/stable/examples/agent/openai_assistant_agent.html)
8.   [Ensemble Query Engine](https://gpt-index.readthedocs.io/en/stable/examples/query_engine/ensemble_query_engine.html)
9.   [Self-ask with Search](https://python.langchain.com/docs/modules/agents/agent_types/self_ask_with_search)