# Custom Prompt using LangChain

The following code snippets use LangChain, a handy library that provides convenient functions for building advanced chatbots.

We are going to generate fake property listings and create an artificial real estate agent that recommends listings based on the user's input. All with just a few lines of code (!).

## Data Generation

We need a set of listings so that the artificial real estate agent can make suggestions to potential buyers based on their preferences.

Instead of writing the listings ourselves, we can have ChatGPT generate them.

In [1]:
from langchain.prompts import PromptTemplate
from langchain_openai.llms import OpenAI

OPENAI_API_KEY = "<YOUR API KEY HERE>"

completion_model_name = "gpt-3.5-turbo-instruct"
temperature = 1.2
llm = OpenAI(model_name=completion_model_name, temperature=temperature, max_tokens = 1000, openai_api_key = OPENAI_API_KEY)

prompt_template = PromptTemplate.from_template(
    """Act as a real estate agent and generate {number_of_listings} listings of houses in csv format using ; as delimiter.
       Do not use ; within the descriptions.
       The csv file requires to contain the following columns:
       Price, Bedrooms, House Size, Description, Neighborhood Description
    """
)

output = llm.invoke(prompt_template.format(number_of_listings = 10))
#print(output)
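
Before spending tokens, it can help to look at the exact text that will be sent to the model. The sketch below uses Python's built-in `str.format` as a stand-in for `PromptTemplate.format`; this assumes the template uses f-string-style placeholders such as `{number_of_listings}`. The shortened template text and the `placeholder_names` helper are illustrative and not part of LangChain.

```python
from string import Formatter

# Shortened copy of the listing prompt, for illustration only.
template = (
    "Act as a real estate agent and generate {number_of_listings} listings "
    "of houses in csv format using ; as delimiter."
)

def placeholder_names(template_text):
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template_text) if name]

# Check which variables the template expects, then fill them in.
print(placeholder_names(template))
rendered = template.format(number_of_listings=10)
print(rendered)
```

Rendering the prompt this way makes stray characters or missing placeholders visible before any API call is made.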

Next, we store the data as a CSV file. Since the output is a plain string, we first need to parse it as CSV.

In [2]:
import csv
import pandas as pd
reader = csv.reader(output.lstrip().split('\n'), delimiter=';')
listings_df = pd.DataFrame(list(reader))
listings_df

Unnamed: 0,0
0,"Price, Bedrooms, House Size, Description, Neig..."
1,
2,"200,000; 4; 2000 sqft; Beautiful ranch style h..."
3,"350,000; 5; 3000 sqft; Spacious Colonial home ..."
4,"500,000; 6; 4000 sqft; Stunning Mediterranean ..."
5,"175,000; 3; 1500 sqft; Charming townhouse with..."
6,"250,000; 4; 2200 sqft; Updated split-level hom..."
7,"450,000; 4; 3400 sqft; Immaculate custom Colon..."
8,"300,000; 5; 2900 sqft; Classic Victorian home ..."
9,"600,000; 6; 4500 sqft; Luxurious estate home w..."


=> That looks like an interesting and, moreover, quite realistic set of listings. Let's store it in a file.


In [3]:
listings_df.to_csv("data/listings.csv")
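
The cell above writes to `data/listings.csv`, which assumes that the `data` folder already exists. The parsing step can also be made a little more defensive by promoting the header row to column names and skipping blank lines. Here is a minimal sketch using only the standard library. The `sample_output` string is made up and stands in for `output`, and `parse_listings` and `save_listings` are hypothetical helpers, not part of the original notebook.

```python
import csv
import os
import tempfile

# Made-up stand-in for the model's `output` string (assumption: the model
# followed the prompt and used ';' as the delimiter).
sample_output = """
Price; Bedrooms; House Size; Description; Neighborhood Description

200,000; 4; 2000 sqft; Ranch style home; Quiet suburban street
350,000; 5; 3000 sqft; Spacious Colonial home; Close to highways
"""

def parse_listings(raw_text, delimiter=";"):
    """Parse delimiter-separated text into a list of dicts keyed by the header row."""
    # Drop blank lines so they do not turn into empty rows.
    lines = [line for line in raw_text.splitlines() if line.strip()]
    reader = csv.reader(lines, delimiter=delimiter, skipinitialspace=True)
    rows = [[cell.strip() for cell in row] for row in reader]
    header, records = rows[0], rows[1:]
    # Keep only rows whose column count matches the header.
    return [dict(zip(header, row)) for row in records if len(row) == len(header)]

def save_listings(listings, path):
    """Write the listings to a CSV file, creating the target folder if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(listings[0].keys()))
        writer.writeheader()
        writer.writerows(listings)

listings = parse_listings(sample_output)
# A temporary folder is used here; in the notebook this would be "data/listings.csv".
target = os.path.join(tempfile.mkdtemp(), "data", "listings.csv")
save_listings(listings, target)
print(len(listings), listings[0]["Price"])
```

The resulting list of dictionaries can also be passed to `pd.DataFrame(listings)` if a DataFrame is preferred.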

## Data Storage

In order to properly utilize the information provided in the CSV, we need to make it "machine readable" (or "LLM readable"), which means converting the contents into a vector representation. Depending on the amount of data, this step can require a lot of computational and memory resources. Hence, using a database is an efficient approach.

Let's create a vector database that reads the information from the simple CSV file once, adds the corresponding embeddings, and serves the content on demand.



In [4]:
from langchain.document_loaders.csv_loader import CSVLoader # import the handy loader from LangChain
from langchain_openai import OpenAIEmbeddings # import the OpenAI embeddings

from langchain.vectorstores import Chroma # we are using the Chroma vector database

csv_file = CSVLoader(file_path = "data/listings.csv").load()

open_ai_embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
local_db = Chroma.from_documents(csv_file, open_ai_embeddings)
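
To get a feeling for what a vector database does with these embeddings, here is a toy sketch in plain Python. The three-dimensional vectors are made up, and cosine similarity is used as one common choice of metric; the metric Chroma actually applies depends on its configuration. The names `toy_store` and `most_similar` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 3-dimensional "embeddings"; real embeddings have far more
# dimensions and come from the embedding model.
toy_store = {
    "Quiet suburban ranch home": [0.9, 0.1, 0.2],
    "Downtown loft near nightlife": [0.1, 0.9, 0.3],
    "Family home close to schools": [0.8, 0.2, 0.5],
}

def most_similar(query_vector, store, k=2):
    """Return the k stored texts whose vectors are most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Pretend embedding of a query such as "calm area for a family".
query_vector = [0.85, 0.15, 0.3]
print(most_similar(query_vector, toy_store))
```

The database stores each text next to its vector once, so a query only needs one new embedding plus a nearest-neighbor lookup.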

## Chat Implementation

So far, we have created a dataset and converted it into a useful representation. Time for the fun part.

Next is the implementation of a user conversation where we would like to:

1. get an initial preference from the user regarding their living arrangements
2. make ChatGPT respond with the listing that best matches their preference from the available listings

For this, we need a few questions that ask the client about their preferences.

In [5]:
# let's define a set of questions first
questions = [
    "How big do you want your house to be?",
    "What are 3 most important things for you in choosing this property?",
    "Which amenities would you like?",
    "Which transportation options are important to you?",
    "How urban do you want your neighborhood to be?",
]

### Next, we collect the preferences.

In [7]:
# let's get input from the user
prompt_template_welcome = PromptTemplate.from_template(
    """You are a kind and welcoming real estate agent.\n
       A potential client has entered your office and is keen to become a property owner.\n
       Introduce yourself using a name and company of your choice and let him know that you need to ask him a few questions.\n
    """
)

# run this cell in your Jupyter Notebook to modify answers. The input() function enables an input prompt.
print(llm.invoke(prompt_template_welcome.format()))
print("\n")
answers = []
for item in questions:
    print("\n")
    print(item)
    current_answer = input()
    answers.append(current_answer)

"Hello, my name is Amanda Jones and I am an agent with Jones Realty. Welcome to our office! I am thrilled to hear that you are interested in becoming a property owner. Before we get started, may I ask you a few questions to understand your needs better?" 




How big do you want your house to be?


 A comfortable three-bedroom house with a spacious kitchen and a cozy living room.




What are 3 most important things for you in choosing this property?


 A quiet neighborhood, good local schools, and convenient shopping options.




Which amenities would you like?


 A backyard for gardening, a two-car garage, and a modern, energy-efficient heating system.




Which transportation options are important to you?


 Easy access to a reliable bus line, proximity to a major highway, and bike-friendly roads.




How urban do you want your neighborhood to be?


 A balance between suburban tranquility and access to urban amenities like restaurants and theaters.


### And request a recommendation

In [11]:
query = "Based on the answers in the context, recommend a listing. Make the recommendation sound appealing."

pr = PromptTemplate(
    template = "{query}\n Context: {context}",
    input_variables = ["query", "context"]
)

context = local_db.similarity_search(" ".join(answers))

from langchain.chains.question_answering import load_qa_chain

chain = load_qa_chain(llm, prompt = pr, chain_type="stuff")

print(chain.run(input_documents = context, query = query))



I would recommend the listing for $350,000 with 5 bedrooms, 3000 sqft, and a stunning two story foyer. Perfect for a growing family, this spacious Colonial home boasts an open floor plan, a deck for entertaining, and an inviting master suite with a soaking tub and walk-in closet. Not only is the home impressive, but it's also located in a highly sought-after neighborhood with easy access to highways and amenities. Don't miss out on the opportunity to make this listing your new dream home!
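
In the cell above, the answers are only used to retrieve similar listings; the prompt itself receives the query and the retrieved documents. One possible refinement is to put the question and answer pairs into the prompt as well. Below is a sketch in plain Python. The template wording, the helper names `format_preferences` and `build_prompt`, and the listing text are made up for illustration, and joining the documents with a blank line is an assumption about how a "stuff"-style chain combines them.

```python
# Shortened, made-up examples of the collected questions and answers.
sample_questions = [
    "How big do you want your house to be?",
    "Which amenities would you like?",
]
sample_answers = [
    "A comfortable three-bedroom house.",
    "A backyard and a two-car garage.",
]
# Made-up stand-ins for the text of the retrieved documents.
retrieved_listings = [
    "Price: 200,000 | Bedrooms: 4 | Ranch style home with backyard",
    "Price: 175,000 | Bedrooms: 3 | Townhouse with two-car garage",
]

TEMPLATE = (
    "Recommend the listing that best matches the client's preferences. "
    "Make the recommendation sound appealing.\n\n"
    "Preferences:\n{preferences}\n\n"
    "Listings:\n{context}"
)

def format_preferences(questions, answers):
    """Pair each question with its answer so the model sees both."""
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))

def build_prompt(questions, answers, listings):
    """Fill the template with the preferences and the retrieved listings."""
    return TEMPLATE.format(
        preferences=format_preferences(questions, answers),
        context="\n\n".join(listings),
    )

final_prompt = build_prompt(sample_questions, sample_answers, retrieved_listings)
print(final_prompt)
```

With the preferences spelled out in the prompt, the model can weigh them against each retrieved listing instead of relying on the retrieval step alone.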


## Summary

We have generated some data, converted it into a vector representation, and created a custom request to and a custom response from ChatGPT.

The entire implementation takes fewer than 60 lines of code and performs surprisingly well. The generated data as well as the responses seem very realistic and natural. There is a vast number of potential extensions for this little example. For instance:
1. Extend the chatbot with a conversational history to enable more fluent conversations
2. Place the code into a Vercel AI Chatbot (https://vercel.com/templates/next.js/nextjs-ai-chatbot)
3. ...
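
As a starting point for the first extension, a small history buffer could be prepended to each new prompt. This is a minimal sketch in plain Python: the `ConversationHistory` class is hypothetical and not a LangChain API, and the `max_turns` limit and the "Client"/"Agent" labels are illustrative choices.

```python
from collections import deque

class ConversationHistory:
    """Keep the most recent turns and render them as a prompt prefix."""

    def __init__(self, max_turns=5):
        # Each turn stores one client message and one agent reply;
        # the deque drops the oldest turn once max_turns is reached.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, client_message, agent_reply):
        self.turns.append((client_message, agent_reply))

    def as_prompt(self, new_message):
        """Render the stored turns followed by the new client message."""
        lines = []
        for client_message, agent_reply in self.turns:
            lines.append(f"Client: {client_message}")
            lines.append(f"Agent: {agent_reply}")
        lines.append(f"Client: {new_message}")
        lines.append("Agent:")
        return "\n".join(lines)

history = ConversationHistory(max_turns=2)
history.add_turn("I want a quiet area.", "Noted, a quiet neighborhood.")
history.add_turn("And a garage.", "A garage is on the list.")
history.add_turn("Near schools.", "Schools nearby, understood.")
prompt_with_history = history.as_prompt("What do you recommend?")
print(prompt_with_history)
```

Limiting the number of stored turns keeps the prompt within the model's token budget, at the cost of forgetting older parts of the conversation.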