## Assignment Week 3

### Using the OpenAI API to Generate JSON from Invoice Text

Specific, clear instructions in a prompt help an LLM generate more relevant answers. The process of crafting prompts to get the right output from an LLM is known as `prompt engineering`. With `prompt engineering`, we give the LLM precise instructions, examples, the necessary context, and the expected format and wording, which helps improve the quality and accuracy of the output. In the activities below, prompt engineering is applied in its simplest form to generate better-quality JSON data from an invoice using an OpenAI LLM (`gpt-4o-mini`).

In [1]:
from openai import OpenAI
import json
import os

In [2]:
# get api key from file
with open("../../apikeys/openai-keys.json", "r") as key_file:
    api_key = json.load(key_file)["default_api_key"]
os.environ["OPENAI_API_KEY"] = api_key

In [3]:
# get client for API call
client = OpenAI()

In [4]:
invoice_text = """321 Avenue A Date: 6/28/2024
Portland, OR 12345 Invoice # 1111
Phone: (206) 555-1163 For PO # 123456
Fax: (206) 555-1164
someone@websitegoeshere.com
Quantity Description Unit price Amount Discount applied
1 Item Number 1 $ 2.00 2.00 $
1 Item Number 2 $ 2.00 2.00 $
1 Item Number 3 $ 2.00 2.00 $
$ -
$ -
$ -
$ -
$ -
$ -
$ -
$ -
Subtotal 6.00 $
Credit $ 1,000.00
Tax 9.80%
Additional discount 12%
Balance due (994.20) $
Company Name
INVOICE
If you have any questions concerning this invoice, contact
<Name> at <phone or email>.
Make all checks payable to <Company name>.
Thank you for your business!
Bill To:
Natasha Jones
Central Beauty
123 Main St.
% discount 10%
Manhattan, NY 98765
(321) 555-1234
Items over this amount qualify for an additional discount $100"""

#### Prompt 1

The OpenAI `chat completion` API accepts multiple messages with different `roles`, which influence how the model interprets the input and generates output.

This prompt uses the `user` role, which is used to ask for a particular response from the LLM. We can think of the `user` role as analogous to the input we type into the ChatGPT web interface.

A few prompt engineering strategies used in the prompt are:
- provided the context of the task by mentioning that `the input data is an invoice in unstructured text format`
- placed the input invoice text on a `separate line` so that the model can clearly distinguish the actual invoice data from the instructions
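
The separation between instructions and data can be made more explicit with delimiters. The sketch below is only an illustration under stated assumptions: the helper name `build_invoice_prompt`, the triple-quote delimiter, and the shortened sample input are hypothetical and are not part of the notebook's own code.

```python
def build_invoice_prompt(invoice_text: str) -> str:
    """Return a prompt that states the task first and fences off the raw invoice."""
    instruction = "Convert unstructured text from the invoice to json data."
    context = "The input data is an invoice in unstructured text format."
    # Triple quotes are an assumed delimiter convention; they mark where the
    # invoice data starts and ends so it cannot be confused with instructions.
    return (
        f"{instruction}\n"
        f"{context}\n"
        "Invoice text:\n"
        f'"""\n{invoice_text.strip()}\n"""'
    )


sample = "Invoice # 1111\nSubtotal 6.00 $"  # shortened, hypothetical input
prompt = build_invoice_prompt(sample)
print(prompt)
```

The returned string could be passed as the `content` of a `user` message in the same way as the f-string used in the API call.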

In [6]:
# API call using prompt 1
response1 = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user", 
            "content": f"Convert unstructured text from the invoice to json data.\n Here is the unstructured text invoice: {invoice_text}"
        }
    ],
)

print(f"Output json:\n {response1.choices[0].message.content}")

Output json:
 Based on the provided unstructured text from the invoice, here is an organized JSON representation of the data:

```json
{
  "invoice_details": {
    "invoice_number": "1111",
    "date": "2024-06-28",
    "bill_to": {
      "name": "Natasha Jones",
      "company": "Central Beauty",
      "address": {
        "street": "123 Main St.",
        "city": "Manhattan",
        "state": "NY",
        "zip": "98765"
      },
      "phone": "(321) 555-1234"
    },
    "contact": {
      "phone": "(206) 555-1163",
      "fax": "(206) 555-1164",
      "email": "someone@websitegoeshere.com"
    },
    "items": [
      {
        "quantity": 1,
        "description": "Item Number 1",
        "unit_price": 2.00,
        "amount": 2.00,
        "discount": 0
      },
      {
        "quantity": 1,
        "description": "Item Number 2",
        "unit_price": 2.00,
        "amount": 2.00,
        "discount": 0
      },
      {
        "quantity": 1,
        "description": "Item Number 3"

In [7]:
# show token usage of the API call
print(response1.usage)

CompletionUsage(completion_tokens=669, prompt_tokens=279, total_tokens=948, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))


The response to `prompt 1` is very promising. The output JSON contains **most of the information** from the actual invoice, and it is interesting to see how the model created nested JSON for the sub-levels of `bill to information` and `contact information`. The output also explains the structure and the information represented by each object in the JSON. However, one important point is that the output is missing key information about the **billing company**. Hence, it might be better to provide a sample JSON format that tells the LLM which information needs to be present in the output.
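
Because the reply mixes explanatory prose with a fenced JSON block, it cannot be passed to `json.loads` directly. A minimal sketch of one way to pull the JSON out is given here; the helper name `extract_json` and the shortened sample reply are assumptions made for illustration, and real replies may need more defensive handling.

```python
import json
import re


def extract_json(reply: str) -> dict:
    """Parse the first JSON object found in a model reply."""
    # Prefer the contents of a fenced block (optionally tagged "json") if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", reply, re.DOTALL)
    candidate = fenced.group(1) if fenced else reply
    # Keep only the span between the first "{" and the last "}".
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in reply")
    return json.loads(candidate[start : end + 1])


fence = "`" * 3  # built at runtime so the sample stays easy to embed
reply = (
    "Here is an organized JSON representation of the data:\n"
    f"{fence}json\n"
    '{"invoice_number": "1111", "date": "2024-06-28"}\n'
    f"{fence}\n"
    "Each object groups related fields."
)
data = extract_json(reply)
print(data["invoice_number"])  # prints 1111
```

In the notebook, the same helper could be applied to `response1.choices[0].message.content`.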

#### Prompt 2

This prompt uses both the `system` role and the `user` role. The `system` role provides top-level instructions to the model and typically describes what the model is supposed to do and how it should generally behave and respond, which proactively gives the model the context of the task.
As we have already seen in `Prompt 1`, the `user` role provides the input, which is the invoice text in this case.

A few prompt engineering strategies used in the prompt are:
- explicitly provided the output format of the JSON
- provided the context of the task using the `system` role and added an instruction for the model to generate output within that context: `You convert unstructured text from invoices to json data`


In [20]:
json_schema = {'invoice': {'invoice_number': '',
             'invoice_for': '',
             'invoice_date': '12/12/2024',
             'company_info': {'name': '',
                              'street': '',
                              'city': '',
                              'state': '',
                              'zip': '',
                              'fax': '',
                              'phone': '',
                              'email': ''},
             'billing_information': {'bill_to': {'name': '',
                                                 'company': '',
                                                 'street': '',
                                                 'city': '',
                                                 'state': '',
                                                 'zip': '',
                                                 'phone': ''},
                                     'discount': '',
                                     'items': [{'quantity': 0,
                                                'description': '',
                                                'unit_price': '',
                                                'amount': 0.0,
                                                'discount': ''}],
                                     'subtotal': 0.0,
                                     'credit': 0.0,
                                     'tax': '',
                                     'additional_discount': '',
                                     'balance_due': 0.0,
                                     'notes': ''}}}
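
A template like this can also serve as a checklist for the model's reply. The sketch below assumes the reply has already been parsed into a Python dictionary; the helper name `missing_keys` and the small `template` and `reply` samples are hypothetical and only illustrate the idea.

```python
def missing_keys(template, candidate, path=""):
    """Return dotted paths of template keys that are absent from candidate."""
    missing = []
    if isinstance(template, dict):
        for key, sub_template in template.items():
            key_path = f"{path}.{key}" if path else key
            if not isinstance(candidate, dict) or key not in candidate:
                missing.append(key_path)
            else:
                missing.extend(missing_keys(sub_template, candidate[key], key_path))
    elif isinstance(template, list) and template:
        # A one-element list in the template describes every item of the list.
        items = candidate if isinstance(candidate, list) else []
        for index, item in enumerate(items):
            missing.extend(missing_keys(template[0], item, f"{path}[{index}]"))
    return missing


template = {"invoice": {"invoice_number": "", "items": [{"quantity": 0, "amount": 0.0}]}}
reply = {"invoice": {"invoice_number": "1111", "items": [{"quantity": 1}]}}
print(missing_keys(template, reply))  # ['invoice.items[0].amount']
```

An empty list means every key in the template is present in the reply; the check looks only at keys, not at value types or formats.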


In [21]:
# api call using prompt2 for the completion task
response2 = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system", 
            "content": f"You convert unstructured text from invoices to json data \nHere is an example of output json data: \n {json.dumps(json_schema)}"
        },
        {
            "role": "user", 
            "content": f"Here is the unstructured text: {invoice_text}"
        }
    ],
)

print(f"Output json:\n {response2.choices[0].message.content}")

Output json:
 ```json
{
  "invoice": {
    "invoice_number": "1111",
    "invoice_for": "PO # 123456",
    "invoice_date": "6/28/2024",
    "company_info": {
      "name": "Company Name",
      "street": "321 Avenue A",
      "city": "Portland",
      "state": "OR",
      "zip": "12345",
      "fax": "(206) 555-1164",
      "phone": "(206) 555-1163",
      "email": "someone@websitegoeshere.com"
    },
    "billing_information": {
      "bill_to": {
        "name": "Natasha Jones",
        "company": "Central Beauty",
        "street": "123 Main St.",
        "city": "Manhattan",
        "state": "NY",
        "zip": "98765",
        "phone": "(321) 555-1234"
      },
      "discount": "10%",
      "items": [
        {
          "quantity": 1,
          "description": "Item Number 1",
          "unit_price": "$2.00",
          "amount": 2.00,
          "discount": ""
        },
        {
          "quantity": 1,
          "description": "Item Number 2",
          "unit_price": "$2.00",


In [22]:
# show token usage of the API call
print(response2.usage)

CompletionUsage(completion_tokens=424, prompt_tokens=466, total_tokens=890, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))


The issue of missing `company` information is no longer present in the prompt 2 output. This is because we provided the output format of the JSON as a guide for the model. The model generated the JSON output in exactly the same schema as provided in the prompt. Prompt 2 used more input tokens (`466`) than prompt 1 (`279`) because we added text to the input to specify the JSON format. This is important to consider when using APIs because the API is billed based on the number of tokens in the request and response. However, prompt 2 produced fewer output tokens (`424`) than prompt 1 (`669`), because prompt 1 generated additional explanations alongside the actual JSON data. As output tokens are billed at a higher rate than input tokens, prompt 2 would have a **lower cost** in this case. *This highlights an important aspect of prompt engineering: by crafting prompts well, we can achieve better results at reduced cost.*
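
The cost comparison can be worked through with the token counts reported above. The per-token prices in this sketch are assumptions (USD 0.15 per million input tokens and USD 0.60 per million output tokens); actual prices change over time and should be taken from the current OpenAI pricing page. The function name `estimate_cost` is also hypothetical.

```python
# Assumed prices in USD per 1M tokens; verify against current OpenAI pricing.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost in USD of one API call from its token counts."""
    return (
        prompt_tokens * INPUT_PRICE_PER_M + completion_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


cost1 = estimate_cost(prompt_tokens=279, completion_tokens=669)  # prompt 1
cost2 = estimate_cost(prompt_tokens=466, completion_tokens=424)  # prompt 2
print(f"prompt 1: ${cost1:.6f}")
print(f"prompt 2: ${cost2:.6f}")
print(f"saving:   {100 * (cost1 - cost2) / cost1:.1f}%")
```

Under these assumed prices, prompt 2 comes out roughly 27% cheaper than prompt 1 for this invoice, even though its input is longer.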