# Reffie Take-Home Interview: Building a Smart Reply Feature

Trung Le

May 8, 2024

In [47]:
## Import necessary libraries

from langchain_community.chat_models import ChatOllama
from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.prompts.chat import ChatMessage
from langchain.memory import ChatMessageHistory

To implement the smart reply feature with LLMs, I use the following external libraries:
- `ollama` for running LLMs locally
- `langchain` for modularizing the LLM-building workflow and supporting multiple LLMs
- `pydantic` for coercing LLM output into a structured format

### Defining the `pydantic` model

The Smart Reply feature requires exactly three reply suggestions. I therefore define a Pydantic model, `SmartReplies`, to structure and validate the LLM output as a class whose instances have exactly three fields. The model makes validation and serialization straightforward, which matters because its output is what the frontend consumes.

`PydanticOutputParser` parses the LLM output into an instance of `SmartReplies`, or raises an error if the output cannot form a valid model (LangChain surfaces Pydantic's `ValidationError` as an `OutputParserException`).

In [37]:
# Defining the Pydantic class
class SmartReplies(BaseModel):
    reply_1: str = Field(description="Smart Reply 1")
    reply_2: str = Field(description="Smart Reply 2")
    reply_3: str = Field(description=" Smart Reply 3")


parser = PydanticOutputParser(pydantic_object=SmartReplies)
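
To make the parse-or-fail behaviour concrete without calling a model, here is a minimal, dependency-free sketch. `ReplySketch` and `parse_replies` are hypothetical stand-ins written for illustration; they are not part of `langchain` or `pydantic`, and the key check here may be stricter than the real parser (it also rejects extra keys).

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ReplySketch:
    """Hypothetical stand-in for SmartReplies: exactly three string fields."""
    reply_1: str
    reply_2: str
    reply_3: str


def parse_replies(raw: str) -> ReplySketch:
    """Parse raw model text into a ReplySketch, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {"reply_1", "reply_2", "reply_3"}
    if set(data) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("every reply must be a string")
    return ReplySketch(**data)


# A well-formed completion parses into a structured object.
good = parse_replies(
    '{"reply_1": "Sure!", "reply_2": "One moment.", "reply_3": "It is 9AM."}'
)
payload = asdict(good)  # plain dict, ready to serialize for a frontend

# A completion with only two replies is rejected.
try:
    parse_replies('{"reply_1": "Sure!", "reply_2": "One moment."}')
    rejected = False
except ValueError:
    rejected = True
```

The point of the design is that downstream code never sees a partially valid result: it either receives all three replies or an exception it can handle.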

The parser also generates format instructions that can be passed to the LLM to guide its output:

In [50]:
print(parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"reply_1": {"title": "Reply 1", "description": "Smart Reply 1", "type": "string"}, "reply_2": {"title": "Reply 2", "description": "Smart Reply 2", "type": "string"}, "reply_3": {"title": "Reply 3", "description": " Smart Reply 3", "type": "string"}}, "required": ["reply_1", "reply_2", "reply_3"]}
```


### Building our LLM workflow using `langchain`

`langchain` provides a framework that makes it easy to build LLM applications. One of its strengths is chains, i.e. sequences of components composed together. Here I use a simple chain: prompt + model + output parser.
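
As a rough illustration of how a `prompt | model | parser` chain composes, the sketch below implements the `|` operator on a tiny hypothetical `Step` class. It is a simplified stand-in, not LangChain's implementation, and the "model" is a canned response rather than a real LLM call.

```python
import json
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a chain component; not LangChain's Runnable."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Prompt step: fill a template from a dict of variables.
prompt = Step(lambda variables: "Suggest 3 replies to: {message}".format(**variables))

# Model step: a canned response instead of a real LLM call.
fake_model = Step(
    lambda text: '{"reply_1": "Yes", "reply_2": "No", "reply_3": "Maybe"}'
)

# Parser step: turn the model's text into structured data.
to_dict = Step(json.loads)

sketch_chain = prompt | fake_model | to_dict
result = sketch_chain.invoke({"message": "Are you free today?"})
```

Each step only needs to agree with its neighbours on input and output types, which is why a prompt, a model, or a parser can be swapped without touching the rest of the chain.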

#### Prompt

The prompt consists of a system message and a human message. The system message asks for exactly three distinct suggestions under a word limit (`num_words`), and injects the conversation history (`chat_history`) and the parser's format instructions (`format_instruction`). The human message carries the latest incoming `message`.



#### Model

For the model, I use Llama 3, one of the strongest openly available LLMs at the time of writing. Ollama makes it simple to run the model on a local machine.



In [40]:
# Define model, prompt template, and chain

model = ChatOllama(model = 'llama3', temperature=0.5)

system_prompt = """You are a helpful assistant to the responder. Your role is to suggest EXACTLY 3 distinct responses for the responder in a conversation with the human.

Each suggestion contains fewer than {num_words} words. Each suggestion represents what the responder would most likely send to the other person, based on the conversation history.

If the human is requesting truths, please suggest truthfully. If the human is asking open-ended questions, please suggest creatively.

The conversation history is below: \n{chat_history}\n. Follow this format:\n {format_instruction}. Do not include "properties" from schema.
"""

template = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human",  "{message}")
])


chain = template | model | parser

In [41]:
# Add memory retrieval

def parse_history(chat_history: ChatMessageHistory) -> str:
    """Flatten the stored messages into one 'speaker: text' line each."""
    result = []
    for message in chat_history.messages:
        if hasattr(message, 'type'):
            if message.type == 'human':
                result.append(f"human: {message.content}")
            elif message.type == 'ai':
                result.append(f"ai: {message.content}")
            elif message.type == 'chat':
                # Generic ChatMessage: use its custom role (e.g. "responder")
                result.append(f"{message.role}: {message.content}")
    return '\n'.join(result)


def get_replies(message: str, chat_history: ChatMessageHistory, save_to_history: bool = False) -> SmartReplies:
    """Suggest three replies to `message`, given the conversation so far."""
    response = chain.invoke({"message": message,
                  "num_words": 10,
                  "format_instruction": parser.get_format_instructions(),
                  "chat_history": parse_history(chat_history)})
    # Store the incoming message only after the call, so it is not sent twice
    if save_to_history:
        chat_history.add_user_message(message)
    return response

def reply(message: str, chat_history: ChatMessageHistory):
    """Record the reply the responder actually sent."""
    chat_history.add_message(ChatMessage(role = "responder", content = message))

## Test run

In [42]:
# Test run

memory_1 = ChatMessageHistory()

get_replies(message="Hi. Can I see what time it is?", chat_history=memory_1, save_to_history=True)

SmartReplies(reply_1="I'm happy to help! The current time is...", reply_2="Let me check that for you... It's currently...", reply_3="Time request received! That's...")

In [43]:
reply(message="It's 9AM", chat_history=memory_1)

In [44]:
get_replies(message="So is it nighttime or morning time?", chat_history=memory_1, save_to_history=True)

SmartReplies(reply_1="It's morning, specifically 9AM.", reply_2="Morning time! It's 9 o'clock.", reply_3='Good morning! The current time is 9AM.')

In [45]:
reply(message="Morning time", chat_history=memory_1)

In [46]:
get_replies(message="What is current time in 24hr format?", chat_history=memory_1)

SmartReplies(reply_1='09:00', reply_2="It's morning time!", reply_3='Current time is 09:00')
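
The prompt asks for fewer than `num_words` words per suggestion, but the model is not guaranteed to comply. A small post-check along the following lines could flag replies that break the limit. `over_limit` is a hypothetical helper, and it counts words by splitting on whitespace, which is only an approximation.

```python
from typing import Dict, List


def over_limit(replies: Dict[str, str], num_words: int = 10) -> List[str]:
    """Return the field names whose reply has num_words or more words."""
    return [
        name
        for name, text in replies.items()
        if len(text.split()) >= num_words
    ]


# Replies copied from the last test run above.
last_run = {
    "reply_1": "09:00",
    "reply_2": "It's morning time!",
    "reply_3": "Current time is 09:00",
}
too_long = over_limit(last_run)

# A hypothetical reply that breaks the limit (12 words).
flagged = over_limit(
    {"reply_1": "one two three four five six seven eight nine ten eleven twelve"}
)
```

With an actual `SmartReplies` instance, `response.dict()` (the Pydantic v1 export method) should produce a dict of the same shape to pass in.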