In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

import os
import sys
# Add the parent directory of this notebook to sys.path.
# __file__ is not defined inside a notebook, so the working directory is used instead.
notebook_dir = os.getcwd()
parent_dir = os.path.dirname(notebook_dir)
sys.path.append(parent_dir)

from openai import OpenAI
import json

# used to load environment variables from the .env file
from dotenv import load_dotenv
load_dotenv()

# initial setup with OpenAI API and model specification
oai_key2 = os.getenv('OPENAI_API_KEY2')
OAI_MODEL = "gpt-4o"

True

# 1. Introduction
## 1.1 Overview
This notebook builds a chatbot that maps a user's natural-language request to one of a small set of pre-defined operations for managing information in a property portfolio:

- extend_lease
- sell_property
- change_erv
- get_rent
- change_area
- change_rent

The general idea is to let the LLM perform the NLP task end-to-end by means of prompt engineering. Specifically, the traditional NLP steps of operation matching and parameter extraction are written into the prompts as instructions to the chosen LLM, OpenAI's GPT-4o model.


## 1.2 Functional Design
The assumption in this exercise is that the output of the LLM is a JSON object that allows precise computational control in later steps of the property portfolio management process. This would include using the JSON output to drive API calls that perform database-level operations.

Based on this, the main functional objectives are to:
- accept a user input in natural language that linguistically matches one of the pre-defined operation types and carries the additional information needed to perform the operation
- perform the necessary disambiguation of numerical notation and missing values as part of the prompt engineering process
- demonstrate basic conversational flow control: user input -> LLM processing -> clarification and abnormal-output handling if necessary -> output
- produce an output that consists of two parts: the matched operation and a JSON object containing the parameters for that operation
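As a rough illustration of how the two-part output could drive later steps, the sketch below routes a matched operation and its JSON parameters to a handler function. The handlers, the `HANDLERS` table and `dispatch` are hypothetical stand-ins for real API or database calls and are not part of this notebook's implementation.

```python
import json

# Hypothetical handlers standing in for real API or database calls.
def sell_property(property_name, price, date, price_unit="GBP"):
    return f"Listing {property_name} for {price} {price_unit} in {date}"

def extend_lease(property_name, years, date=None):
    return f"Extending lease of {property_name} by {years} years"

# Hypothetical routing table: operation name -> handler.
HANDLERS = {"sell_property": sell_property, "extend_lease": extend_lease}

def dispatch(matched_operation, params_json):
    """Route the LLM output to the handler registered for the operation."""
    handler = HANDLERS.get(matched_operation)
    if handler is None:
        raise ValueError(f"Unsupported operation: {matched_operation}")
    params = json.loads(params_json)
    # Drop null values so handler defaults apply to missing optional parameters.
    params = {k: v for k, v in params.items() if v is not None}
    return handler(**params)

result = dispatch(
    "sell_property",
    '{"property_name": "Ground & Lower Ground unit", "price": 2500000, '
    '"price_unit": "GBP", "date": "2026-12"}',
)
print(result)
```

Keeping the routing in a plain dictionary means an unsupported operation fails loudly instead of silently doing nothing.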


# 2. Definition of available operations
The operations available to the chatbot are defined as a list of Python dicts, each with the following keys:
- name: the name of the operation
- description: the description of the operation
- parameters: the parameters for the operation, including their types, properties and which of them are required

They are formatted so that they can be injected as parameterized components into the prompts sent to the LLM, so that the model can:
- match the user input to the available operations
- extract the parameters for the matched operation


In [3]:
available_operations = [
    {
        "name": "extend_lease",
        "description": "Extend the lease for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "years": {"type": "integer", "description": "Number of years to extend the lease"},
                "date": {"type": "string", "description": "Date of the lease extension (YYYY-MM)"}
            },
            "required": ["property_name", "years"]
        }
    },
    {
        "name": "sell_property",
        "description": "Sell a property unit",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name or description of the property unit"},
                "price": {"type": "number", "description": "numeric value of the price"},
                "price_unit": {"type": "string", "description": "unit of the price"},
                "date": {"type": "string", "description": "Date of the sale (YYYY-MM)"}
            },
            "required": ["property_name", "price", "date"]
        }
    },

    {
        "name": "change_erv",
        "description": "Change the ERV (Estimated Rental Value) for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "erv": {"type": "number", "description": "numeric value of the new ERV"},
                "erv_unit": {"type": "string", "description": "unit of the ERV value"}
            },
            "required": ["property_name", "erv"]
        }
    },
    {
        "name": "get_rent", 
        "description": "Get the current rent for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name or description of the property"},
                "floor": {"type": "string", "description": "Floor number or location in the building"}
            },
            "required": ["property_name"]
        }
    },
    {
        "name": "change_area",
        "description": "Change the area of a property",
        "parameters": {
            "type": "object", 
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "area": {"type": "number", "description": "numeric value of the new area"},
                "area_unit": {"type": "string", "description": "unit of measurement for the area"},
                "start_date": {"type": "string", "description": "Start date for the area change (YYYY-MM)"}
            },
            "required": ["property_name", "area"]
        }
    },
    {
        "name": "change_rent",
        "description": "Change the rent for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name of the property"},
                "rent": {"type": "number", "description": "numeric value of the new rent"},
                "rent_unit": {"type": "string", "description": "unit of the rent value"},
                "time_unit": {"type": "string", "description": "unit of the time period for the rent - week, month, or year"}
            },
            "required": ["property_name", "rent"]
        }
    }
]
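The notebook injects this list into the prompts as its Python representation. As one possible alternative, the sketch below renders an operation into a compact text block. `render_operations` and `sample_operations` are hypothetical names, and the sample entry is copied from the list above so that the snippet runs on its own.

```python
# One entry copied from the list above, so the snippet is self-contained.
sample_operations = [
    {
        "name": "get_rent",
        "description": "Get the current rent for a property",
        "parameters": {
            "type": "object",
            "properties": {
                "property_name": {"type": "string", "description": "Name or description of the property"},
                "floor": {"type": "string", "description": "Floor number or location in the building"},
            },
            "required": ["property_name"],
        },
    }
]

def render_operations(operations):
    """Render each operation as 'name: description' followed by its parameters."""
    lines = []
    for op in operations:
        required = set(op["parameters"]["required"])
        lines.append(f"{op['name']}: {op['description']}")
        for param, details in op["parameters"]["properties"].items():
            flag = "required" if param in required else "optional"
            lines.append(f"  - {param} ({details['type']}, {flag}): {details['description']}")
    return "\n".join(lines)

rendered = render_operations(sample_operations)
print(rendered)
```

Whether a rendering like this helps the model more than the raw dict is an open question that would need testing.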

# 3. Prompt Engineering Principles
The prompts follow two principles:
- first, focus on matching the user command to one of the available operations defined in section 2
- second, focus on extracting the parameters for the matched operation

The prompts are set up as templates, so that dynamic content can be injected as parameters at the point of user input and LLM calls.

The following additional instructions are included to disambiguate numerical notation and handle missing values:
- convert numeric values into standard decimal representation
- convert currency symbols into standard currency codes: GBP, USD, EUR
- if no currency symbol is found, assume the price is in GBP
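The rules above are delegated to the LLM through the prompt. For cross-checking its output, the same rules could also be applied deterministically, as in the sketch below. `normalise_amount` is a hypothetical helper that only inspects the first amount in a short text fragment, so it is not a replacement for the prompt-based extraction.

```python
import re

# Multipliers and currency codes mirror the rules listed above.
MULTIPLIERS = {
    "k": 1_000, "thousand": 1_000,
    "m": 1_000_000, "million": 1_000_000,
    "b": 1_000_000_000, "billion": 1_000_000_000,
}
CURRENCIES = {"£": "GBP", "$": "USD", "€": "EUR"}

AMOUNT_RE = re.compile(
    r"(?P<sign>[£$€])?\s*(?P<number>\d[\d,]*(?:\.\d+)?)\s*"
    r"(?P<suffix>thousand|million|billion|k|m|b)?\b",
    re.IGNORECASE,
)

def normalise_amount(text):
    """Return (value, currency_code) for the first amount in text, or None."""
    match = AMOUNT_RE.search(text)
    if match is None:
        return None
    number = float(match.group("number").replace(",", ""))
    suffix = (match.group("suffix") or "").lower()
    value = number * MULTIPLIERS.get(suffix, 1)
    # No currency symbol: fall back to GBP, as the prompt instructs.
    currency = CURRENCIES.get(match.group("sign"), "GBP")
    return value, currency

print(normalise_amount("£2.5 million"))
print(normalise_amount("$2500k"))
print(normalise_amount("250,000"))
```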

Two prompt engineering approaches are implemented:
- two-step prompting: each prompt focuses on one aspect of the overall NLP task, at the cost of multiple LLM calls
- composite prompting: operation matching and parameter extraction are combined into a single prompt, which makes more efficient use of the LLM

The composite approach is generally preferable because it is more efficient. However, it takes more effort to arrive at a good prompt template that caters for the different aspects of the underlying NLP tasks.

## 3.1 Two-step prompting template definition

In [4]:
# these are the two prompts used to perform the basic two-step prompting process
# prompt_template is used to perform the operation matching step
# parameter_extraction_prompt_template is used to perform the parameter extraction step

prompt_template = """ 
     Given the following action types and descriptions:\n
     {available_operations}

     What are the possible action types for the following user input: '{user_input}'? If there are multiple possibilities, list them all. If there are no matches, say 'No match'

     provide the answer in exact text matching that of the matched operation or "no match"

"""



parameter_extraction_prompt_template = """
     Given the user input: '{user_input}', extract the following parameters for the '{matched_operation}' operation:
     
     {parameter_content}
     

     for numeric values convert them into standard decimal representation:
     - 1k or 1 thousand -> 1000
     - 1m or 1 million -> 1000000
     - 1b or 1 billion -> 1000000000
     etc

     for money sign, do the following conversion:
     - £ -> GBP
     - $ -> USD
     - € -> EUR     
    if no money sign is found, then the price is in GBP
     
     Respond in JSON format containing the extracted parameters:
     {{
         param_1: value of param_1
         param_2: value of param_2
         ...
     }}
     
     If a parameter can't be extracted, set its value to null.     
     
"""


## 3.2 Composite prompting template definition

In [27]:
# Composite prompt template that combines operation matching and parameter extraction
# into a single prompt. It is used by process_input_with_clarification_composite_prompt
# to showcase folding multiple NLP steps into one prompt: this saves the cost of
# multiple LLM calls, but generally gives less control over each individual step.

composite_prompt_template = """
     Given the following action types and descriptions:
     {available_operations}

     For the user input: '{user_input}'

     1. First determine which action type matches the input. If there are multiple possibilities, list them all. If there are no matches, say 'No match'
     
     2. If an operation matches, extract the required parameters for that operation based on its parameter specifications.

     for numeric values convert them into standard decimal representation:
     - 1k or 1 thousand -> 1000
     - 1m or 1 million -> 1000000
     - 1b or 1 billion -> 1000000000
     etc

     for money sign, do the following conversion:
     - £ -> GBP
     - $ -> USD
     - € -> EUR
    if no money sign is found, then the price is in GBP

     Respond in JSON format:
     {{
        "matched_operation": "exact text matching that of the matched operation or 'no match'",
        "parameters": {{
             "param_1": "value of param_1",
             "param_2": "value of param_2",
             ...
         }} // Only include if operation matched
     }}

     If no match is found, return null for parameters.
     If a parameter can't be extracted, set its value to null.
"""


# 4. Chatbot implementation
Two chatbot functions are implemented:
- process_input_with_clarification: uses the two-step prompt templates, first matching the operation and then extracting the parameters
- process_input_with_clarification_composite_prompt: uses the composite prompt


Both functions implement a basic conversational flow:
- user input: provided as the text of the user's command
- LLM processing: the user input is checked against the available operations, and parameters are extracted if an operation is matched
- if no matching operation is found, the user is asked to choose an operation, re-enter the command, or exit the chatbot
- if a matching operation is found or specified by the user, the matched operation and its parameters are returned, with missing parameters set to null


## 4.1 Chatbot with two-step prompting

In [6]:
def process_input_with_clarification(user_input):
    query = prompt_template.format(user_input=user_input, available_operations=available_operations)

    client = OpenAI(api_key=oai_key2)

    messages = [
        {"role": "user", "content": query},
    ]    
    kwargs = dict(model=OAI_MODEL, messages=messages)
    completion = client.chat.completions.create(**kwargs)
    response = completion.choices[0].message.content
    matched_operation = response.strip().lower()
    available_operations_names = [op['name'] for op in available_operations]
    
    # first check if the operation is matched by LLM
    if matched_operation == "no match":
        while True:
            print("I couldn't determine which function you want to use. Please choose from the following options:")
            print("\nAvailable operations:")
            for i, func in enumerate(available_operations):
                print(f"{i+1}. {func['name']}: {func['description']}")
            print("\nOther options:")
            print(f"{len(available_operations)+1}. Re-enter your command")
            print(f"{len(available_operations)+2}. Exit")
            
            try:
                choice = int(input("\nEnter your choice number: "))
                if 1 <= choice <= len(available_operations):
                    matched_operation = available_operations[choice-1]['name']
                    break
                elif choice == len(available_operations) + 1:
                    reentry_input = input("Please re-enter your command: ")
                    print(f"re-entered command: {reentry_input}")
                    return process_input_with_clarification(reentry_input)
                elif choice == len(available_operations) + 2:
                    print("Exiting...")
                    return None, None
                else:
                    print("Invalid choice. Please try again.")
            except ValueError:
                print("Please enter a valid number.")
    
    # if the operation is matched by LLM, then extract the parameters for future functional calls
    if matched_operation in available_operations_names:
        for operation in available_operations:
            if operation['name'] == matched_operation:
                parameter_content = ""

                for param, details in operation['parameters']['properties'].items():
                    parameter_content += f"- {param}: {details['description']}\n"

                parameter_extraction_prompt = parameter_extraction_prompt_template.format(user_input=user_input, matched_operation=matched_operation, parameter_content=parameter_content)

                parameter_extraction_response = client.chat.completions.create(model=OAI_MODEL, messages=[{"role": "user", "content": parameter_extraction_prompt}], response_format={"type": "json_object"})

                extracted_params = parameter_extraction_response.choices[0].message.content
                return matched_operation, extracted_params
    else:
        return None, None
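The matching prompt asks the model to list every candidate when several operations fit, whereas the function above compares the reply with a single exact name. A more tolerant parse could look like the sketch below. `parse_matched_operations` is a hypothetical helper, and how to resolve several candidates (for example by asking the user) is left open.

```python
import re

def parse_matched_operations(response, operation_names):
    """Return the known operation names mentioned in the reply, in order of appearance."""
    found = []
    # Operation names consist of lowercase letters and underscores only.
    for token in re.findall(r"[a-z_]+", response.lower()):
        if token in operation_names and token not in found:
            found.append(token)
    return found

names = ["extend_lease", "sell_property", "change_erv", "get_rent", "change_area", "change_rent"]
print(parse_matched_operations("sell_property", names))
print(parse_matched_operations("Possible matches: change_rent, change_erv", names))
print(parse_matched_operations("No match", names))
```

An empty list plays the role of "no match", and a list with more than one entry signals that clarification is needed.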


In [7]:
# test run
user_input = "Sell the Ground & Lower Ground unit for  2.5m in December 2026."
matched_operation, extracted_params = process_input_with_clarification(user_input)
print(matched_operation) 
print(extracted_params)


sell_property
{
    "property_name": "Ground & Lower Ground unit",
    "price": 2500000,
    "price_unit": "GBP",
    "date": "2026-12"
}


## 4.2 Chatbot with composite prompt

In [32]:
def process_input_with_clarification_composite_prompt(user_input):
    client = OpenAI(api_key=oai_key2)
    composite_prompt = composite_prompt_template.format(user_input=user_input, available_operations=available_operations)
    messages = [
        {"role": "user", "content": composite_prompt},
    ]
    
    completion = client.chat.completions.create(
        model=OAI_MODEL, 
        messages=messages,
        response_format={"type": "json_object"}
    )
    
    response = completion.choices[0].message.content
    response_json = json.loads(response)
    matched_operation = response_json["matched_operation"].strip().lower()
    available_operations_names = [op['name'] for op in available_operations]

    # Handle no match case
    if matched_operation == "no match":
        while True:
            print("I couldn't determine which function you want to use. Please choose from the following options:")
            print("\nAvailable operations:")
            for i, func in enumerate(available_operations):
                print(f"{i+1}. {func['name']}: {func['description']}")
            print("\nOther options:")
            print(f"{len(available_operations)+1}. Re-enter your command")
            print(f"{len(available_operations)+2}. Exit")
            
            try:
                choice = int(input("\nEnter your choice number: "))
                if 1 <= choice <= len(available_operations):
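                    chosen_operation = available_operations[choice-1]['name']
                    # Re-run with the chosen operation named in the input, so that the
                    # model extracts parameters for it instead of reporting "no match" again
                    return process_input_with_clarification_composite_prompt(
                        f"{user_input} (requested operation: {chosen_operation})"
                    )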
                elif choice == len(available_operations) + 1:
                    reentry_input = input("Please re-enter your command: ")
                    print(f"re-entered command: {reentry_input}")
                    return process_input_with_clarification_composite_prompt(reentry_input)
                elif choice == len(available_operations) + 2:
                    print("Exiting...")
                    return None, None
                else:
                    print("Invalid choice. Please try again.")
            except ValueError:
                print("Please enter a valid number.")

    # Handle matched operation case
    if matched_operation in available_operations_names:
        return matched_operation, json.dumps(response_json["parameters"])
    else:
        return None, None
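The function above assumes that the reply is valid JSON with a string under "matched_operation". Since the prompt also allows several candidates to be listed, a more defensive parse may be worth considering. The sketch below uses a hypothetical `parse_composite_response` helper and treats anything unexpected as "no match".

```python
import json

def parse_composite_response(raw):
    """Return (matched_operation, parameters), tolerating malformed or partial replies."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "no match", None
    if not isinstance(data, dict):
        return "no match", None
    matched = data.get("matched_operation")
    if isinstance(matched, list):
        # Several candidates: take the first and leave disambiguation to the caller.
        matched = matched[0] if matched else None
    if not isinstance(matched, str):
        return "no match", None
    return matched.strip().lower(), data.get("parameters")

print(parse_composite_response('{"matched_operation": "Sell_Property", "parameters": {"price": 1}}'))
print(parse_composite_response('{"matched_operation": null}'))
print(parse_composite_response("not json"))
```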


# 5. Running of chatbots
The following cells show test runs of the chatbots implemented in sections 4.1 and 4.2. They cover the operations defined in section 2, and a variety of input formats are tested to observe how well the chatbots handle linguistic variation and edge cases.


In [28]:
# extend_lease
user_input = "Extend the lease for Drake & Morgan Limited by 5 years"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
extend_lease
{
    "property_name": "Drake & Morgan Limited",
    "years": 5,
    "date": null
}
--------------------------------
one-step LLM calling
extend_lease
{"property_name": "Drake & Morgan Limited", "years": 5, "date": null}


In [31]:
# sell_property -1 
user_input = "Sell the Ground & Lower Ground unit for £2.5 million in December 2026"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)


two-step LLM calling
sell_property
{
    "property_name": "Ground & Lower Ground unit",
    "price": 2500000,
    "price_unit": "GBP",
    "date": "2026-12"
}
--------------------------------
one-step LLM calling
sell_property
{"property_name": "Ground & Lower Ground unit", "price": 2500000, "price_unit": "GBP", "date": "2026-12"}


In [33]:
# sell_property 2 - trying various input formats
user_input = "let go the Ground & Lower Ground unit for $2500k in December 2026"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)


two-step LLM calling
sell_property
{
    "property_name": "Ground & Lower Ground unit",
    "price": 2500000,
    "price_unit": "USD",
    "date": "2026-12"
}
--------------------------------
one-step LLM calling
sell_property
{"property_name": "Ground & Lower Ground unit", "price": 2500000, "price_unit": "USD", "date": "2026-12"}


In [35]:
# sell_property 3 - trying various input formats
user_input = "put on the market unit 3 of 10 jason street for 250,000 in January 2026"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)


two-step LLM calling
sell_property
{
    "property_name": "unit 3 of 10 jason street",
    "price": 250000,
    "price_unit": "GBP",
    "date": "2026-01"
}
--------------------------------
one-step LLM calling
sell_property
{"property_name": "unit 3 of 10 jason street", "price": 250000, "price_unit": "GBP", "date": "2026-01"}


In [34]:
# change_area
user_input = "modify the area of the Stott & May Professional Search Limited to 4200 sqft starting from July 2021"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
change_area

{
    "property_name": "Stott & May Professional Search Limited",
    "area": 4200,
    "area_unit": "sqft",
    "start_date": "2021-07"
}
--------------------------------
one-step LLM calling
change_area
{"property_name": "Stott & May Professional Search Limited", "area": 4200, "area_unit": "sqft", "start_date": "2021-07"}


In [36]:
# change_erv
user_input = "Change the ERV of happy house to £1000k"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
change_erv

{
    "property_name": "happy house",
    "erv": 1000000,
    "erv_unit": "GBP"
}
--------------------------------
one-step LLM calling
change_erv
{"property_name": "happy house", "erv": 1000000, "erv_unit": "GBP"}


In [37]:
# change_erv - 2 - trying various input formats
user_input = "increase the Est. Rental Value of number 20 prince street to 50000"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
change_erv
{
    "property_name": "number 20 prince street",
    "erv": 50000,
    "erv_unit": "GBP"
}
--------------------------------
one-step LLM calling
change_erv
{"property_name": "number 20 prince street", "erv": 50000, "erv_unit": "GBP"}


In [38]:
# change_rent
# in this example, the one-step call captures the house unit (b5) better
user_input = "set the rent of b5 of emerson house to twice of the current rent of 500 per month"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
change_rent
{
    "property_name": "Emerson House",
    "rent": 1000,
    "rent_unit": "GBP",
    "time_unit": "month"
}
--------------------------------
one-step LLM calling
change_rent
{"property_name": "b5 of emerson house", "rent": 1000, "rent_unit": "GBP", "time_unit": "month"}


In [39]:
# get_rent
user_input = "What is the rent for the tenant on the third floor?"

# two-step LLM calling with two-step prompting
matched_operation, extracted_params = process_input_with_clarification(user_input)
print("two-step LLM calling")
print(matched_operation) 
print(extracted_params)

print("--------------------------------")

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

two-step LLM calling
get_rent
{
    "property_name": null,
    "floor": 3
}
--------------------------------
one-step LLM calling
get_rent
{"property_name": null, "floor": "third"}
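In the run above the two approaches return the floor as `3` and as `"third"`, while section 2 declares `floor` as a string. A light check of extracted values against the declared types could flag such differences. The sketch below assumes a hypothetical `type_mismatches` helper and covers only the three JSON types used in this notebook.

```python
# Mapping from the JSON schema types used in section 2 to Python types.
JSON_TYPES = {"string": str, "integer": int, "number": (int, float)}

def type_mismatches(params, properties):
    """Return {parameter: expected_type} for non-null values of the wrong type."""
    mismatches = {}
    for name, value in params.items():
        if value is None or name not in properties:
            continue
        expected = properties[name]["type"]
        # bool is a subclass of int in Python; no field here is boolean, so reject it.
        if isinstance(value, bool) or not isinstance(value, JSON_TYPES[expected]):
            mismatches[name] = expected
    return mismatches

# Types copied from the get_rent definition in section 2.
get_rent_properties = {
    "property_name": {"type": "string"},
    "floor": {"type": "string"},
}
print(type_mismatches({"property_name": None, "floor": 3}, get_rent_properties))
print(type_mismatches({"property_name": None, "floor": "third"}, get_rent_properties))
```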


In [40]:
# user conversation flow control - only showing one-step calling.
user_input = "I want to play in the park" # deliberately not matching any of the available operations

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

I couldn't determine which function you want to use. Please choose from the following options:

Available operations:
1. extend_lease: Extend the lease for a property
2. sell_property: Sell a property unit
3. change_erv: Change the ERV (Estimated Rental Value) for a property
4. get_rent: Get the current rent for a property
5. change_area: Change the area of a property
6. change_rent: Change the rent for a property

Other options:
7. Re-enter your command
8. Exit
re-entered command: I want to sell the park
one-step LLM calling
sell_property
{"property_name": "the park", "price": null, "price_unit": "GBP", "date": null}


In [41]:
# more complex user input
# As shown here, the name of the property is actually missing from the input, yet the
# LLM still fills it in ("that house"). This limitation could be addressed by prompting
# the user for the missing entity or other information.

user_input = "I don't want to have that house under my name anymore, help be get rid of it at half ot the price, I got it for £2.5m, please sell it for me by end of the year"

# one-step LLM calling with composite prompt
matched_operation, extracted_params = process_input_with_clarification_composite_prompt(user_input)
print("one-step LLM calling")
print(matched_operation) 
print(extracted_params)

one-step LLM calling
sell_property
{"property_name": "that house", "price": 1250000, "price_unit": "GBP", "date": "2023-12"}


# 6. Observation & Future Work 
## 6.1 observation and comparison of the two prompting methods

The outputs of the two prompting methods are comparable in the examples above, with one case (change_rent) where the two-step prompting missed the house unit. Only a limited number of examples were tested, so it is too early to draw a strong conclusion about which method is functionally better.

On this evidence, one-step prompting appears at least as capable as the two-step approach, and it is more efficient in terms of the number of LLM calls. It therefore looks like the better candidate for further development of both the prompt content and the chatbot's response management.

## 6.2 Future Work
Given the limited time and the experimental nature of this exercise, there is plenty of scope to extend the work in several directions, listed below.

### 6.2.1 further work on data structure design
<b> property data structure </b> It would be desirable to include in the prompt templates a general representation of the data structure of the properties under management. Typically this could have the following attributes:

- property_id - for unique identification of property in underlying database 
- property_name
- area
- erv
- rent
- status - rented/listed/recently_sold/recently_rented/unlisted
- value - for_rent/for_sale/rented/sold_value 
- operation_target_date - target date of the last valid operation 
- operation_date - date of the last valid operation 
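As a minimal sketch, the attributes above could be captured in a dataclass. The field types, the defaults and the example values (including the `P-001` identifier) are assumptions made for illustration and are not taken from an existing schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class PropertyRecord:
    """Hypothetical record for a property under management."""
    property_id: str                              # unique identifier in the underlying database
    property_name: str
    area: Optional[float] = None
    erv: Optional[float] = None
    rent: Optional[float] = None
    status: str = "unlisted"                      # rented/listed/recently_sold/recently_rented/unlisted
    value: Optional[float] = None                 # for-rent, for-sale, rented or sold value
    operation_target_date: Optional[str] = None   # target date of the last valid operation
    operation_date: Optional[str] = None          # date of the last valid operation

record = PropertyRecord(
    property_id="P-001",
    property_name="Ground & Lower Ground unit",
    status="listed",
    value=2500000,
)
print(asdict(record))
```

`asdict` gives a plain dictionary that could be serialised to JSON and injected into a prompt template.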

<b> operation data structure </b> The operation data structure can be refined further to be consistent with the signatures of API calls or other function calls, which would provide a more explicit link between the chatbot and the underlying property management system. This could also allow multi-step operations to be performed within a single user chat session, with feedback provided to the user on the execution status of previous operations.


### 6.2.2 improvement of conversational control 
Ideally this chatbot should be implemented in conjunction with a conversational AI framework such as [Rasa](https://rasa.com/) for a more professional integration, or with an easy-to-integrate chat application such as Telegram for experimental purposes.

Within the technical exercise above, conversational control could be much improved by tying it more closely to the LLM's response and offering options based on that response. For instance, if the LLM returns null for certain required parameters, the user should be given the option to fill in the missing information.
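A first step towards this could be to compare the extracted parameters with the `required` list of the matched operation. The sketch below uses a hypothetical `missing_required` helper, a trimmed copy of the sell_property definition, and the JSON output from the "I want to sell the park" run in section 5.

```python
import json

def missing_required(operation, extracted_params_json):
    """List required parameters that the LLM left out or returned as null."""
    params = json.loads(extracted_params_json) or {}
    return [
        name
        for name in operation["parameters"]["required"]
        if params.get(name) is None
    ]

# Trimmed copy of the sell_property definition from section 2.
sell_property_spec = {
    "name": "sell_property",
    "parameters": {"required": ["property_name", "price", "date"]},
}
llm_output = '{"property_name": "the park", "price": null, "price_unit": "GBP", "date": null}'

missing = missing_required(sell_property_spec, llm_output)
print(missing)
```

The resulting list could then drive follow-up questions to the user, one per missing parameter.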

### 6.2.3 implementation of non-proprietary LLM 
With a proprietary LLM API such as OpenAI's, one can build a state-of-the-art chatbot with little overhead in the form of model fine-tuning, and can expect increasingly better performance, at least in the short to mid term, as long as compute scaling laws continue to hold. This, however, comes with the drawback of high API cost, especially when the chatbot application is to be used by a large number of users. Additionally, we have no direct control over downtime or quality degradation. Also, the fine-tuning options of such APIs are sometimes limited to selected model versions and often come with additional technical constraints.

With an open-source LLM such as Llama 3.1, especially the [smaller version](https://huggingface.co/meta-llama/Llama-3.1-8B), we have the option to tailor the chatbot's behaviour to proprietary property information by running RAG over text that is focused on property information. Additionally, with more user/chatbot conversation data, we can perform efficient fine-tuning with algorithms such as [QLoRA](https://www.entrypointai.com/blog/lora-fine-tuning/).


### 6.2.4 evaluation and improvement of conversation quality 
For any LLM app, including a chatbot, it is critical to have a well-established workflow for setting up benchmarks of conversational quality and continuously evaluating each chatbot/model iteration. This matters both for continuous improvement and for filling gaps in the chatbot's awareness of niche topics.

The following concrete actions should be considered before the project moves beyond early versions:
- Set up a process to curate a chat dataset tailored to property management information, drawing on existing documentation, human expert knowledge, and synthetically generated data

- Work on evaluation metrics that reflect the chatbot's ability to support conversations on property management topics, potentially balancing its ability to facilitate accurate actions against general chat quality, that is, understanding inputs of diverse format and quality and responding properly

- Generate a labelled chat dataset according to the above evaluation metrics, perhaps with a blend of human labelling, LLM labelling and, depending on the context, other automatic marking mechanisms

- Set up experiment control and tracking, perhaps starting with simple tracking using a package such as [mlflow](https://mlflow.org/docs/latest/index.html)

- Systematically improve prompt quality: build up a repository of quality prompts, and also use automatic packages such as [TextGrad](https://github.com/zou-group/textgrad)

- Explore the explainability of the chatbot's responses. This area is still underdeveloped, but one can investigate existing packages suitable for NLP, such as [LIME](https://github.com/marcotcr/lime), as well as devising explanation mechanisms from basic principles, such as the statistical relationship between core vocabulary, language style, and the chatbot's response quality
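To make the evaluation idea concrete, the sketch below scores a predictor on a small labelled set using two simple metrics: operation accuracy and exact match of the parameters. The `evaluate` function, the labels and the canned predictor are all hypothetical. In practice `predict` would wrap one of the chatbot functions from section 4 and parse its JSON string into a dict.

```python
def evaluate(predict, labelled_examples):
    """Return operation accuracy and exact-match rate over the labelled examples."""
    operation_hits = 0
    exact_hits = 0
    for example in labelled_examples:
        operation, params = predict(example["input"])
        if operation == example["operation"]:
            operation_hits += 1
            # Exact match requires the right operation and identical parameters.
            if params == example["parameters"]:
                exact_hits += 1
    total = len(labelled_examples)
    return {"operation_accuracy": operation_hits / total, "exact_match": exact_hits / total}

# Hypothetical labels for two inputs taken from section 5.
labelled_examples = [
    {
        "input": "Extend the lease for Drake & Morgan Limited by 5 years",
        "operation": "extend_lease",
        "parameters": {"property_name": "Drake & Morgan Limited", "years": 5, "date": None},
    },
    {
        "input": "What is the rent for the tenant on the third floor?",
        "operation": "get_rent",
        "parameters": {"property_name": None, "floor": "third"},
    },
]

# Canned predictions standing in for real LLM calls.
canned = {
    labelled_examples[0]["input"]: ("extend_lease", {"property_name": "Drake & Morgan Limited", "years": 5, "date": None}),
    labelled_examples[1]["input"]: ("get_rent", {"property_name": None, "floor": 3}),
}

scores = evaluate(lambda text: canned[text], labelled_examples)
print(scores)
```

Scores like these could then be logged per prompt version with whichever experiment tracker is chosen.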










