In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

import os
import sys
# Add the parent directory of this notebook to sys.path.
# __file__ is not defined inside a notebook, so use the current working directory instead.
notebook_dir = os.getcwd()
parent_dir = os.path.dirname(notebook_dir)
sys.path.append(parent_dir)

from openai import OpenAI
import json

# used to load environment variables from the .env file
from dotenv import load_dotenv
load_dotenv()

# 1. Introduction
## 1.1 Overview
This notebook builds a chatbot that maps a user's natural-language request to one of a small set of pre-defined operations for managing information in a property portfolio:

- extend_lease
- sell_property
- change_erv
- get_rent
- change_area
- change_rent

The general idea in this notebook is to let the LLM perform the NLP task end-to-end through prompt engineering. Specifically, the traditional NLP steps (matching the operation, extracting its parameters, and normalising numeric and currency notation) are written into the prompt instructions given to the LLM of choice, OpenAI's GPT-4o model.


## 1.2 Functional Design
This exercise assumes that the output of the LLM is a JSON object, which allows precise computational control in later steps of the property portfolio management process. For example, the JSON output can be used to drive API calls that perform database-level operations.

Based on this, the main functional objectives are to:
- take a user input in natural language that linguistically matches one of the pre-defined operation types, along with any additional information needed to carry out the operation
- perform the necessary disambiguation of numerical notation and missing values as part of the prompt engineering process
- demonstrate basic conversational flow control: user input -> LLM processing -> clarification and abnormal-output handling if necessary -> output
- produce an output that consists of two parts: the name of the matched operation and a JSON object containing the parameters for that operation
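
As a rough illustration of how the two-part output could drive later steps, the sketch below routes a (matched operation, JSON parameters) pair to a handler function. The handlers `handle_get_rent` and `handle_change_rent`, the `HANDLERS` table, and the `dispatch` helper are hypothetical names introduced here, not part of this notebook; in practice the handlers would wrap the real API calls.

```python
import json

# Hypothetical handlers: stand-ins for real API or database calls.
def handle_get_rent(property_name, floor=None):
    return f"lookup rent for {property_name} (floor={floor})"

def handle_change_rent(property_name, rent, rent_unit=None, time_unit=None):
    return f"set rent of {property_name} to {rent} {rent_unit} per {time_unit}"

# Map each operation name to the function that carries it out.
HANDLERS = {
    "get_rent": handle_get_rent,
    "change_rent": handle_change_rent,
}

def dispatch(matched_operation, extracted_params):
    """Route the chatbot output to the handler registered for the operation."""
    handler = HANDLERS.get(matched_operation)
    if handler is None:
        raise ValueError(f"No handler registered for {matched_operation!r}")
    # The chatbot functions return the parameters as a JSON string.
    params = json.loads(extracted_params)
    return handler(**params)

result = dispatch("get_rent", '{"property_name": "10 jason street", "floor": "1st"}')
print(result)
```

Keeping the operation name and the parameters separate means the routing table can grow without touching the prompt logic.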


# Initial setup

In [2]:
oai_key2 = os.getenv('OPENAI_API_KEY2')
OAI_MODEL = "gpt-4o"

# Definition of available operations

In [125]:
available_operations = [
    {
        "name": "extend_lease",
        "description": "Extend the lease for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "years": {"type": "integer", "description": "Number of years to extend the lease"},
                "date": {"type": "string", "description": "Date of the lease extension (YYYY-MM)"}
            },
            "required": ["property_name", "years"]
        }
    },
    {
        "name": "sell_property",
        "description": "Sell a property unit",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name or description of the property unit"},
                "price": {"type": "number", "description": "numeric value of the price"},
                "price_unit": {"type": "string", "description": "unit of the price"},
                "date": {"type": "string", "description": "Date of the sale (YYYY-MM)"}
            },
            "required": ["property_name", "price", "date"]
        }
    },

    {
        "name": "change_erv",
        "description": "Change the ERV (Estimated Rental Value) for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "erv": {"type": "number", "description": "numeric value of the new ERV"},
                "erv_unit": {"type": "string", "description": "unit of the ERV value"}
            },
            "required": ["property_name", "erv"]
        }
    },
    {
        "name": "get_rent", 
        "description": "Get the current rent for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name or description of the property"},
                "floor": {"type": "string", "description": "Floor number or location in the building"}
            },
            "required": ["property_name"]
        }
    },
    {
        "name": "change_area",
        "description": "Change the area of a property",
        "parameters": {
            "type": "object", 
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "area": {"type": "number", "description": "numeric value of the new area"},
                "area_unit": {"type": "string", "description": "unit of measurement for the area"},
                "start_date": {"type": "string", "description": "Start date for the area change (YYYY-MM)"}
            },
            "required": ["property_name", "area"]
        }
    },
    {
        "name": "change_rent",
        "description": "Change the rent for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "rent": {"type": "number", "description": "numeric value of the new rent"},
                "rent_unit": {"type": "string", "description": "unit of the rent value"},
                "time_unit": {"type": "string", "description": "unit of the time period for the rent - week, month, or year"}
            },
            "required": ["property_name", "rent"]
        }
    }


    
]
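
Each operation lists its `required` parameters, which makes it possible to check an extracted parameter set before acting on it. The following is a minimal sketch under that assumption; `missing_required` is a hypothetical helper, and the `sell_property` entry is copied from the list above and trimmed to the fields the check needs.

```python
def missing_required(operation, params):
    """Return the required parameter names that are absent or null in params."""
    required = operation["parameters"].get("required", [])
    return [name for name in required if params.get(name) is None]

# One entry copied from available_operations, trimmed to the fields used here.
sell_property = {
    "name": "sell_property",
    "parameters": {
        "type": "object",
        "required": ["property_name", "price", "date"],
    },
}

# Parameters of the kind the LLM returns when no sale date is given.
params = {"property_name": "this house", "price": 2500, "price_unit": "USD", "date": None}
missing = missing_required(sell_property, params)
print(missing)  # ['date']
```

A non-empty result could be used to ask the user a follow-up question instead of passing incomplete parameters downstream.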

# 3. Prompt Engineering Principles
The promp


## 3.1 Prompt template definition

In [119]:
# These are the two prompts used in the basic two-step prompting process:
# - prompt_template performs the operation matching step
# - parameter_extraction_prompt_template performs the parameter extraction step

prompt_template = """
     Given the following action types and descriptions:
     {available_operations}

     What are the possible action types for the following user input: '{user_input}'? If there are multiple possibilities, list them all. If there are no matches, say 'No match'.

     Provide the answer as the exact name of the matched operation, or 'No match'.

"""



parameter_extraction_prompt_template = """
     Given the user input: '{user_input}', extract the following parameters for the '{matched_operation}' operation:
     
     {parameter_content}
     

     for numeric values convert them into standard decimal representation:
     - 1k or 1 thousand -> 1000
     - 1m or 1 million -> 1000000
     - 1b or 1 billion -> 1000000000
     etc

     for money signs, do the following conversion:
     - £ -> GBP
     - $ -> USD
     - € -> EUR
     If no money sign is found, assume the value is in GBP.

     Respond in JSON format containing the extracted parameters:
     {{
         "param_1": value of param_1,
         "param_2": value of param_2,
         ...
     }}

     If a parameter can't be extracted, set its value to null.
     
"""
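
The numeric and currency rules in the prompt can also be applied deterministically, for example as a post-processing check on what the LLM returns. The sketch below is one possible fallback, not something this notebook does; `normalise_amount`, `MULTIPLIERS` and `CURRENCIES` are hypothetical names, and the parser only covers the notations listed in the prompt (no thousands separators).

```python
import re
from decimal import Decimal

# Suffixes and currency signs taken from the rules in the prompt above.
MULTIPLIERS = {
    "k": 10**3, "thousand": 10**3,
    "m": 10**6, "million": 10**6,
    "b": 10**9, "billion": 10**9,
}
CURRENCIES = {"£": "GBP", "$": "USD", "€": "EUR"}

# Optional currency sign, a decimal number, then an optional suffix word.
_AMOUNT = re.compile(r"^\s*([£$€]?)\s*(\d+(?:\.\d+)?)\s*([a-z]*)\s*$", re.IGNORECASE)

def normalise_amount(text, default_currency="GBP"):
    """Convert strings such as '£2.5m' into (Decimal value, currency code)."""
    match = _AMOUNT.match(text)
    if match is None:
        raise ValueError(f"Unrecognised amount: {text!r}")
    symbol, number, suffix = match.groups()
    suffix = suffix.lower()
    if suffix and suffix not in MULTIPLIERS:
        raise ValueError(f"Unrecognised suffix: {suffix!r}")
    # Decimal avoids binary floating-point rounding when scaling the value.
    value = Decimal(number) * MULTIPLIERS.get(suffix, 1)
    return value, CURRENCIES.get(symbol, default_currency)

print(normalise_amount("£2.5 million"))
print(normalise_amount("$1000k"))
print(normalise_amount("2.5m"))
```

Comparing such a result with the LLM's own conversion is a cheap way to flag cases where the model left a value like `2.5` with unit `m` unconverted.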


In [122]:
# Composite prompt template that combines operation matching and parameter extraction
# into a single prompt. It showcases the option of folding multiple NLP steps into one
# LLM call, which saves the cost of multiple calls but generally gives less control over
# each individual step of the NLP task.
# Note: process_input_with_clarification_composite_prompt (section 4.2) builds its own
# inline prompt and does not format this template.

composite_prompt_template = """
     Given the following action types and descriptions:
     {available_operations}

     For the user input: '{user_input}'

     1. First determine which action type matches the input. If there are multiple possibilities, list them all. If there are no matches, say 'No match'
     2. If there is a match, extract the required parameters for the matched operation:
     {parameter_content}

     for numeric values convert them into standard decimal representation:
     - 1k or 1 thousand -> 1000
     - 1m or 1 million -> 1000000
     - 1b or 1 billion -> 1000000000
     etc

     for money signs, do the following conversion:
     - £ -> GBP
     - $ -> USD
     - € -> EUR
     If no money sign is found, assume the value is in GBP.

     Respond in JSON format:
     {{
         "matched_operation": "exact text matching that of the matched operation or 'no match'",
         "parameters": {{
             "param_1": "value of param_1",
             "param_2": "value of param_2",
             ...
         }}
     }}

     If no match is found, return null for parameters.
     If a parameter can't be extracted, set its value to null.
"""


# 4. Chatbot implementation
## 4.1 chatbot with two-step prompting

In [120]:
def process_input_with_clarification(user_input):
    query = prompt_template.format(user_input=user_input, available_operations=available_operations)

    # client = OpenAI(api_key=oai_key, organization=oai_org) 
    client = OpenAI(api_key=oai_key2) 


    messages = [
        {"role": "user", "content": query},
    ]    
    kwargs = dict(model=OAI_MODEL, messages=messages)
    completion = client.chat.completions.create(**kwargs)
    response = completion.choices[0].message.content
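    # Strip surrounding whitespace, quotes, backticks and a trailing full stop before comparing
    matched_operation = response.strip().strip("'\"`. ").lower()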
    available_operations_names = [op['name'] for op in available_operations]
    
    # first check if the operation is matched by LLM
    if matched_operation == "no match":
        while True:
            print("I couldn't determine which function you want to use. Please choose from the following options:")
            print("\nAvailable operations:")
            for i, func in enumerate(available_operations):
                print(f"{i+1}. {func['name']}: {func['description']}")
            print("\nOther options:")
            print(f"{len(available_operations)+1}. Re-enter your command")
            print(f"{len(available_operations)+2}. Exit")
            
            try:
                choice = int(input("\nEnter your choice number: "))
                if 1 <= choice <= len(available_operations):
                    matched_operation = available_operations[choice-1]['name']
                    break
                elif choice == len(available_operations) + 1:
                    reentry_input = input("Please re-enter your command: ")
                    print(f"re-entered command: {reentry_input}")
                    return process_input_with_clarification(reentry_input)
                elif choice == len(available_operations) + 2:
                    print("Exiting...")
                    return None, None
                else:
                    print("Invalid choice. Please try again.")
            except ValueError:
                print("Please enter a valid number.")
    
    # if the operation is matched by LLM, then extract the parameters for future functional calls
    if matched_operation in available_operations_names:
        for operation in available_operations:
            if operation['name'] == matched_operation:
                parameter_content = ""

                for param, details in operation['parameters']['properties'].items():
                    parameter_content += f"- {param}: {details['description']}\n"

                parameter_extraction_prompt = parameter_extraction_prompt_template.format(user_input=user_input, matched_operation=matched_operation, parameter_content=parameter_content)

                parameter_extraction_response = client.chat.completions.create(model=OAI_MODEL, messages=[{"role": "user", "content": parameter_extraction_prompt}], response_format={"type": "json_object"})

                extracted_params = parameter_extraction_response.choices[0].message.content
                return matched_operation, extracted_params
    else:
        return None, None
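
The matching prompt asks the LLM to list every possible operation, but the function above only proceeds when the reply is exactly one operation name. One way to handle replies that list several names, or wrap a name in extra text, is sketched below; `parse_matched_operations` is a hypothetical helper and not part of the function above.

```python
import re

def parse_matched_operations(response_text, valid_names):
    """Return the valid operation names mentioned in the reply, in order of first appearance."""
    found = []
    # Operation names consist of lowercase letters and underscores only.
    for token in re.findall(r"[a-z_]+", response_text.lower()):
        if token in valid_names and token not in found:
            found.append(token)
    return found

names = ["extend_lease", "sell_property", "change_erv", "get_rent", "change_area", "change_rent"]
print(parse_matched_operations("change_rent, get_rent", names))
print(parse_matched_operations("No match", names))
```

An empty list corresponds to the "no match" branch, a single entry can be used directly, and several entries could trigger a clarification question that offers only those candidates.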


## 4.2 Chatbot with composite prompt

In [116]:
def process_input_with_clarification_composite_prompt(user_input):
    # Combine operation matching and parameter extraction into a single prompt
    composite_prompt = f"""Given the user input: "{user_input}"

Available operations:
{[{'name': op['name'], 'description': op['description'], 'parameters': op['parameters']} for op in available_operations]}

1. First determine which operation best matches the user's intent. If no operation matches, return "no match".
2. If an operation matches, extract the required parameters for that operation based on its parameter specifications.

Return your response in the following JSON format:
{{
    "matched_operation": "operation_name or no match",
    "parameters": {{}} // Only include if operation matched
}}"""

    # client = OpenAI(api_key=oai_key, organization=oai_org)
    client = OpenAI(api_key=oai_key2)

    messages = [
        {"role": "user", "content": composite_prompt},
    ]
    
    completion = client.chat.completions.create(
        model=OAI_MODEL, 
        messages=messages,
        response_format={"type": "json_object"}
    )
    
    response = completion.choices[0].message.content
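    response_json = json.loads(response)
    # Treat a missing or null "matched_operation" as "no match" instead of raising
    matched_operation = str(response_json.get("matched_operation") or "no match").strip().lower()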
    available_operations_names = [op['name'] for op in available_operations]

    # Handle no match case
    if matched_operation == "no match":
        while True:
            print("I couldn't determine which function you want to use. Please choose from the following options:")
            print("\nAvailable operations:")
            for i, func in enumerate(available_operations):
                print(f"{i+1}. {func['name']}: {func['description']}")
            print("\nOther options:")
            print(f"{len(available_operations)+1}. Re-enter your command")
            print(f"{len(available_operations)+2}. Exit")
            
            try:
                choice = int(input("\nEnter your choice number: "))
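                if 1 <= choice <= len(available_operations):
                    # Use the operation the user picked and extract its parameters directly.
                    # Re-running the composite prompt would ignore the choice and end up here again.
                    chosen = available_operations[choice-1]
                    parameter_content = "".join(
                        f"- {param}: {details['description']}\n"
                        for param, details in chosen['parameters']['properties'].items()
                    )
                    extraction_prompt = parameter_extraction_prompt_template.format(
                        user_input=user_input, matched_operation=chosen['name'], parameter_content=parameter_content)
                    extraction = client.chat.completions.create(
                        model=OAI_MODEL,
                        messages=[{"role": "user", "content": extraction_prompt}],
                        response_format={"type": "json_object"})
                    return chosen['name'], extraction.choices[0].message.content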
                elif choice == len(available_operations) + 1:
                    reentry_input = input("Please re-enter your command: ")
                    print(f"re-entered command: {reentry_input}")
                    return process_input_with_clarification_composite_prompt(reentry_input)
                elif choice == len(available_operations) + 2:
                    print("Exiting...")
                    return None, None
                else:
                    print("Invalid choice. Please try again.")
            except ValueError:
                print("Please enter a valid number.")

    # Handle matched operation case
    if matched_operation in available_operations_names:
        # .get() avoids a KeyError when the LLM omits "parameters"
        return matched_operation, json.dumps(response_json.get("parameters"))
    else:
        return None, None
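
Because the composite approach depends on the LLM returning a well-formed JSON object with the expected keys, it can help to isolate the parsing and validation in one place. The sketch below shows one way this might look; `parse_composite_response` is a hypothetical helper and is not called by the function above.

```python
import json

def parse_composite_response(raw, valid_names):
    """Return (operation, params), or (None, None) when the reply is unusable."""
    try:
        data = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return None, None
    if not isinstance(data, dict):
        return None, None
    # A missing or null "matched_operation" is treated the same as an unknown one.
    operation = str(data.get("matched_operation") or "").strip().lower()
    if operation not in valid_names:
        return None, None
    params = data.get("parameters")
    if not isinstance(params, dict):
        params = {}
    return operation, params

names = ["extend_lease", "sell_property", "change_erv", "get_rent", "change_area", "change_rent"]
raw = '{"matched_operation": "sell_property", "parameters": {"property_name": "this house", "price": 2500}}'
print(parse_composite_response(raw, names))
print(parse_composite_response('{"matched_operation": "no match", "parameters": null}', names))
```

Returning `(None, None)` for every unusable reply keeps the calling code to a single check before it falls back to the clarification menu.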


In [123]:
user_input = "Sell the Ground & Lower Ground unit for 2.5m in December 2026."
# user_input = "modify the area of the Stott & May Professional Search Limited to 4200 sqft starting from July 2021"
# user_input = "Change the ERV of happy house to $1000k"
# user_input = "set the rent of b5 of emerson house to twice of the current rent of 500 per month"
# user_input = "What is the rent for the tenant on the unit 1 of 1st floor of 10 jason street"
# user_input = "I want to play in the park"
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print(matched_operation) 
print(extracted_params)

sell_property
{"property_name": "Ground & Lower Ground unit", "price": 2.5, "price_unit": "m", "date": "2026-12"}


In [126]:
# user_input = "Sell the Ground & Lower Ground unit for £2.5m in December 2026."

user_input = "I want to play in the park"
matched_operation, extracted_params = process_input_with_clarification(user_input)
print(matched_operation) 
print(extracted_params)

I couldn't determine which function you want to use. Please choose from the following options:

Available operations:
1. extend_lease: Extend the lease for a property
2. sell_property: Sell a property unit
3. change_erv: Change the ERV (Estimated Rental Value) for a property
4. get_rent: Get the current rent for a property
5. change_area: Change the area of a property
6. change_rent: Change the rent for a property

Other options:
7. Re-enter your command
8. Exit
re-entered command: extend the lease of the horselance by 3 years from dec 2025
extend_lease
{
    "property_name": "horselance",
    "years": 3,
    "date": "2025-12"
}


In [None]:
# matched_operation, extracted_params = process_input(user_input)


In [94]:
# user_input = "Sell the Ground & Lower Ground unit for £2.5 million in December 2026."
# user_input = "modify the area of the Stott & May Professional Search Limited to 4200 sqft starting from July 2021"
# user_input = "Change the ERV of happy house to $1000k"
# user_input = "What is the rent for the tenant on the third floor of 10 jason street"
user_input = "I want to play in the park"
matched_operation, extracted_params = process_input_with_clarification(user_input)
print(matched_operation) 
print(extracted_params)

I couldn't determine which function you want to use. Please choose from the following options:

Available operations:
1. extend_lease: Extend the lease for a property
2. sell_property: Sell a property unit
3. change_erv: Change the ERV (Estimated Rental Value) for a property
4. get_rent: Get the current rent for a property
5. change_area: Change the area of a property
6. change_rent: Change the rent for a property

Other options:
7. Re-enter your command
8. Exit
re-entered command: sell this house for half the price of 5000 usd
sell_property

{
    "property_name": "this house",
    "price": 2500,
    "price_unit": "USD",
    "date": null
}






# 5. Future work
Please see the end of the notebook for additional thoughts on:
- observation and comparison of the two prompting methods

- future work:
    - further work on data structure design
    - improving the economics of utilising the LLM
    - improving prompt quality
    - generating additional data assets for further evaluation, RAG and fine-tuning