# Customer support chatbot

Below is an example of a customer support chatbot modeled as a state machine. It uses the simpler `MessageGraph` version of LangGraph, and is designed to work with smaller models by narrowing the set of decisions any single LLM call has to make.

The entrypoint is a node containing a chain that we have prompted to answer basic questions, but to delegate questions related to billing or technical support to other "teams".

Depending on this entry node's response, the edge from that node will use an LLM call to determine whether to respond directly to the user or invoke either the `billing_support` or `technical_support` nodes.

- The technical support node attempts to answer the user's question with a more focused prompt.
- The billing support node can either answer the user's question or authorize a refund (which currently just returns an acknowledgement directly to the user).
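
As a rough illustration, the routing described above can be written down as a plain transition table. This is a minimal sketch rather than LangGraph code: the `transitions` table, the `END_STATE` marker, and the `canTransition` helper are hypothetical names introduced here, and only the node names come from this example.

```typescript
// Hypothetical sketch: the allowed routes between nodes as a lookup table.
// Only the node names are taken from this example; nothing here is a LangGraph API.
type NodeName =
  | "initial_support"
  | "billing_support"
  | "technical_support"
  | "refund_tool";

// Hypothetical marker meaning "return to the user".
const END_STATE = "end" as const;
type Target = NodeName | typeof END_STATE;

const transitions: Record<NodeName, Target[]> = {
  initial_support: ["billing_support", "technical_support", END_STATE],
  billing_support: ["refund_tool", END_STATE],
  technical_support: [END_STATE],
  refund_tool: [END_STATE],
};

// Returns true if the state machine allows moving from `from` to `to`.
function canTransition(from: NodeName, to: Target): boolean {
  return transitions[from].includes(to);
}
```

A table like this can serve as a sanity check while wiring up edges: any route it does not list, such as technical support handing off to the refund tool, should never occur.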

![Diagram](./diagram.png)

This is intended as a sample, proof-of-concept architecture. You could extend it by giving individual nodes the ability to perform retrieval or call other tools, adding human-in-the-loop steps that prompt the user for responses, delegating to more powerful models at deeper stages, and so on.

Let's dive in!

## Setup

First we need to install the required packages. We'll use Cloudflare's Workers AI to run the required inference.

```bash
yarn add @langchain/langgraph @langchain/cloudflare
```

You'll also need to set the following environment variables. You can get their values from your Cloudflare dashboard:

```ini
CLOUDFLARE_ACCOUNT_ID=
CLOUDFLARE_API_TOKEN=
```
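
It can help to fail fast when one of these variables is missing. The helper below is a small sketch under that assumption; `REQUIRED_VARS` and `missingEnvVars` are hypothetical names, and the function takes the environment as a plain object so it does not depend on any particular runtime.

```typescript
// Hypothetical helper: report which of the variables listed above are unset or blank.
const REQUIRED_VARS = ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"];

function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In Node.js you might call `missingEnvVars(process.env)` before constructing the model and throw if the returned list is not empty.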

## Initializing the model

First, we define the LLM we'll use for all calls and the LangGraph state. We'll use a chat fine-tuned version of Mistral 7B called `neural-chat-7b-v3-1-awq`:

```typescript
import { ChatCloudflareWorkersAI } from "@langchain/cloudflare";
import { MessageGraph } from "@langchain/langgraph";

const model = new ChatCloudflareWorkersAI({
  model: "@hf/thebloke/neural-chat-7b-v3-1-awq",
  temperature: 0,
});

const graph = new MessageGraph();
```

As an exercise, let's see what happens with a naive attempt to get the model to answer questions:

```typescript
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const naivePrompt = ChatPromptTemplate.fromTemplate(
  `You are an expert support specialist, able to answer any question about LangCorp, a company that sells computers.`
);

const chain = naivePrompt.pipe(model).pipe(new StringOutputParser());

const res = await chain.invoke("I've changed my mind and I want a refund for order #182818!");

console.log(res);
```

```text
LangCorp is a company that specializes in selling computers and related accessories. They offer a wide range of products, including laptops, desktops, monitors, keyboards, mice, and other peripherals. Their goal is to provide customers with high-quality, reliable, and affordable technology solutions to meet their needs.

LangCorp's team of experts is always ready to assist customers in finding the perfect computer setup for their requirements. They offer personalized advice, help with configuration, and provide support throughout the entire purchasing process. Additionally, LangCorp ensures that their products are backed by warranties and after-sales services to guarantee customer satisfaction.

In summary, LangCorp is a company that focuses on selling computers and related accessories, aiming to provide customers with the best technology solutions for their needs. They offer a variety of products, expert advice, and excellent customer support to ensure a seamless shopping experience.
```


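Not super helpful: the model describes LangCorp instead of addressing the refund request. (Note that the naive prompt template has no placeholder for the user's input, so the question itself likely never reaches the model.) We can do better!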

## Laying out the graph

Now let's start defining our nodes. Each node's return value will be added to the graph state, which for `MessageGraph` is a list of messages. This state will be passed to the next executed node, or returned if execution has finished.
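
To make the state handling concrete, here is a simplified model of that accumulation. It is only a sketch of the idea described above, not `MessageGraph`'s actual implementation; the `ChatMessage` type and the `addMessages` function are hypothetical stand-ins for LangChain's message classes and the graph's internal state update.

```typescript
// Hypothetical stand-in for a chat message.
type ChatMessage = { role: "human" | "ai"; content: string };

// Append a node's return value (one message or several) to the running state.
// Returns a new array so the previous state is left untouched.
function addMessages(
  state: ChatMessage[],
  update: ChatMessage | ChatMessage[]
): ChatMessage[] {
  return state.concat(update);
}
```

Under this model, a graph run starts with the user's message as the only entry, and each executed node adds its reply to the end of the list before the next node sees it.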

Let's define our entrypoint node. This will be modeled after a secretary who can handle incoming questions and respond conversationally or route to a more specialized team:

```typescript
import { MessagesPlaceholder } from "@langchain/core/prompts";
import type { BaseMessage } from "@langchain/core/messages";

graph.addNode("initial_support", async (state: BaseMessage[]) => {
  const SYSTEM_TEMPLATE = `You are frontline support staff for LangCorp, a company that sells computers.
Be concise in your responses.
You can chat with customers and help them with basic questions, but if the customer is having a billing or technical problem,
do not try to answer the question directly or gather information.
Instead, immediately transfer them to the billing or technical team by asking the user to hold for a moment.
Otherwise, just respond conversationally.`;

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", SYSTEM_TEMPLATE],
    new MessagesPlaceholder("messages"),
  ]);

  return prompt.pipe(model).invoke({ messages: state });
});

graph.setEntryPoint("initial_support");
```

Next, we add the nodes representing billing and technical support. The billing prompt includes special instructions telling the model that it can authorize refunds by routing to another agent:

```typescript
graph.addNode("billing_support", async (state: BaseMessage[]) => {
  const SYSTEM_TEMPLATE = `You are an expert billing support specialist for LangCorp, a company that sells computers.
Help the user to the best of your ability, but be concise in your responses.
You have the ability to authorize refunds, which you can do by transferring the user to another agent who will collect the required information.
If you do, assume the other agent has all necessary information about the customer and their order.
You do not need to ask the user for more information.`;

  let messages = state;
  // Make the user's question the most recent message in the history.
  // This helps small models stay focused.
  if (messages[messages.length - 1]._getType() === "ai") {
    messages = state.slice(0, -1);
  }

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", SYSTEM_TEMPLATE],
    new MessagesPlaceholder("messages"),
  ]);
  return prompt.pipe(model).invoke({ messages });
});

graph.addNode("technical_support", async (state: BaseMessage[]) => {
  const SYSTEM_TEMPLATE = `You are an expert at diagnosing technical computer issues. You work for a company called LangCorp that sells computers.
Help the user to the best of your ability, but be concise in your responses.`;

  let messages = state;
  // Make the user's question the most recent message in the history.
  // This helps small models stay focused.
  if (messages[messages.length - 1]._getType() === "ai") {
    messages = state.slice(0, -1);
  }

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", SYSTEM_TEMPLATE],
    new MessagesPlaceholder("messages"),
  ]);
  return prompt.pipe(model).invoke({ messages });
});
```
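
Both nodes above repeat the same history-trimming step, so it could be factored into a helper. The version below is a sketch that uses a simplified, hypothetical `HistoryMessage` type instead of LangChain's `BaseMessage`, and a hypothetical `dropTrailingAiMessage` function; unlike the inline check, it also tolerates an empty history.

```typescript
// Hypothetical stand-in for a chat message; the nodes above use BaseMessage.
type HistoryMessage = { role: "human" | "ai"; content: string };

// If the history ends with an AI message (for example the frontline agent's
// "please hold" reply), drop it so the user's question is the latest message.
function dropTrailingAiMessage(messages: HistoryMessage[]): HistoryMessage[] {
  const last = messages[messages.length - 1];
  return last !== undefined && last.role === "ai"
    ? messages.slice(0, -1)
    : messages;
}
```

With real `BaseMessage` objects, the same check would use `_getType() === "ai"` as in the nodes above.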

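Finally, we add a node that handles refunds. The logic is stubbed out here since it's not a real system:

```typescript
import { AIMessage } from "@langchain/core/messages";

// Stub: a real implementation would look up the order and issue the refund.
// The incoming state is unused here.
graph.addNode("refund_tool", async (_state: BaseMessage[]) => {
  return new AIMessage("Refund processed!");
});
```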

## Connecting the nodes

Great! Now let's move on to the edges. These edges will evaluate the current state of the graph created by the return values of the individual nodes and route execution accordingly.

First, we want our `initial_support` node to delegate to the billing node, delegate to the technical node, or just respond directly to the user. Here's one example of how we might do that:

```typescript
import { END } from "@langchain/langgraph";

graph.addConditionalEdges("initial_support", async (state: BaseMessage[]) => {
  const SYSTEM_TEMPLATE = `You are an expert customer support routing system.
Your job is to detect whether a customer support representative is routing a user to a billing team or a technical team, or if they are just responding conversationally.`;
  const HUMAN_TEMPLATE = `The previous conversation is an interaction between a customer support representative and a user.
Extract whether the representative is routing the user to a billing or technical team, or whether they are just responding conversationally.

If they want to route the user to the billing team, respond only with the word "BILLING".
If they want to route the user to the technical team, respond only with the word "TECHNICAL".
Otherwise, respond only with the word "RESPOND".

Remember, only respond with one of the above words.`;
  const prompt = ChatPromptTemplate.fromMessages([
    ["system", SYSTEM_TEMPLATE],
    new MessagesPlaceholder("messages"),
    ["human", HUMAN_TEMPLATE],
  ]);
  const chain = prompt
    .pipe(model)
    .pipe(new StringOutputParser());
  // Ask the model to classify the conversation so far.
  const rawCategorization = await chain.invoke({ messages: state });
  // Map the model's one-word answer onto a key of the routing table below.
  if (rawCategorization.includes("BILLING")) {
    return "billing";
  } else if (rawCategorization.includes("TECHNICAL")) {
    return "technical";
  } else {
    return "conversational";
  }
}, {
  billing: "billing_support",
  technical: "technical_support",
  conversational: END
});
```

**Note:** We do not use function/tool calling here for extraction because our model does not support it, but this would be a reasonable time to use that if your model does.
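
Because the routing decision relies on matching keywords in free-form model output, it can be worth isolating that step in a small pure function that is easy to test. The sketch below is one way to do it; `parseRoute` and the `Route` type are hypothetical names, and the case-insensitive matching is an added assumption that the routing function above does not make.

```typescript
// Keys of the routing table used by the conditional edge.
type Route = "billing" | "technical" | "conversational";

// Hypothetical helper: map raw model output to a route.
// Anything that mentions neither keyword falls back to "conversational".
function parseRoute(rawCategorization: string): Route {
  const normalized = rawCategorization.trim().toUpperCase();
  if (normalized.includes("BILLING")) return "billing";
  if (normalized.includes("TECHNICAL")) return "technical";
  return "conversational";
}
```

Falling back to the conversational route means an unexpected model reply ends the run with the frontline answer rather than misrouting the user.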

Let's continue. We add an edge making the technical support node always end, since it has no tools to call. The billing support node uses a conditional edge since it can either call the refund tool or end.

```typescript
graph.addEdge("technical_support", END);

graph.addConditionalEdges("billing_support", async (state: BaseMessage[]) => {
  // Only the billing agent's latest reply is needed to detect a refund.
  const mostRecentMessage = state[state.length - 1];
  const SYSTEM_TEMPLATE = `Your job is to detect whether a billing support representative wants to refund the user.`;
  const HUMAN_TEMPLATE = `The following text is a response from a customer support representative.
Extract whether they want to refund the user or not.
If they want to refund the user, respond only with the word "REFUND".
Otherwise, respond only with the word "RESPOND".

Here is the text:

<text>
{text}
</text>

Remember, only respond with one word.`;
  const prompt = ChatPromptTemplate.fromMessages([
    ["system", SYSTEM_TEMPLATE],
    ["human", HUMAN_TEMPLATE],
  ]);
  const chain = prompt
    .pipe(model)
    .pipe(new StringOutputParser());
  const response = await chain.invoke({ text: mostRecentMessage.content });
  if (response.includes("REFUND")) {
    return "refund";
  } else {
    return "end";
  }
}, {
  refund: "refund_tool",
  end: END
});

graph.addEdge("refund_tool", END);
```
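
To see how the fixed and conditional edges combine, it can help to trace a path through a toy version of the graph. The sketch below is not LangGraph code: `DONE`, `edges`, and `tracePath` are hypothetical names, and simple keyword checks stand in for the LLM calls that make the routing decisions above.

```typescript
// Hypothetical marker for "stop and return to the user".
const DONE = "__done__";

// An edge is either a fixed target or a function of the node's reply.
type Edge = string | ((reply: string) => string);

// Keyword checks below are stand-ins for the LLM-based routers.
const edges: Record<string, Edge> = {
  initial_support: (reply: string) =>
    reply.includes("billing")
      ? "billing_support"
      : reply.includes("technical")
        ? "technical_support"
        : DONE,
  billing_support: (reply: string) =>
    reply.includes("refund") ? "refund_tool" : DONE,
  technical_support: DONE,
  refund_tool: DONE,
};

// Follow edges from `start`, using canned replies per node, and list visited nodes.
function tracePath(
  replies: Record<string, string>,
  start: string = "initial_support"
): string[] {
  const path: string[] = [];
  let current = start;
  while (current !== DONE) {
    path.push(current);
    const edge: Edge | undefined = edges[current];
    if (edge === undefined) throw new Error(`Unknown node: ${current}`);
    current = typeof edge === "string" ? edge : edge(replies[current] ?? "");
  }
  return path;
}
```

For example, passing a frontline reply that mentions the billing team and a billing reply that mentions a refund yields the path `initial_support`, `billing_support`, `refund_tool`.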

Let's lock it in by calling `.compile()`:

```typescript
const runnable = graph.compile();
```

And now let's test it!

We can get the value returned by each executed node as soon as it is produced by using the `.stream()` runnable method. (We could go even more granular and get output as it is generated using `.streamEvents()`, but this requires a bit more parsing.)

Here's an example with a billing-related refund query. Because we are using `MessageGraph`, the input must be a message (or a list of messages) representing the user's question:

```typescript
import { HumanMessage } from "@langchain/core/messages";

// Named per example so all three runs can live in the same script.
const refundStream = await runnable.stream(
  new HumanMessage("I've changed my mind and I want a refund for order #182818!")
);

for await (const value of refundStream) {
  // Each node returns only one message
  const [nodeName, output] = Object.entries(value)[0];
  if (nodeName !== END) {
    console.log("---STEP---");
    console.log(nodeName, output.content);
    console.log("---END STEP---");
  }
}
```

```text
---STEP---
initial_support To request a refund for order #182818, please hold for a moment while I transfer you to our billing team.
---END STEP---
---STEP---
billing_support To process your refund, please transfer the call to our refunds team. They will guide you through the necessary steps.
---END STEP---
---STEP---
refund_tool Refund processed!
---END STEP---
```


[Click here to see a LangSmith trace of the above run](https://smith.langchain.com/public/08fb80d9-4ec2-4460-a62d-f7fdc1a21f96/r)
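
The loop above assumes each streamed chunk is an object keyed by node name whose value carries a `content` field. Under that assumption, the formatting step can be sketched as a standalone function; `StepChunk` and `formatStep` are hypothetical names, and the plain object type stands in for the real message objects.

```typescript
// Hypothetical shape of one streamed chunk: node name -> that node's output.
type StepChunk = Record<string, { content: string }>;

// Turn a chunk into printable lines, one per node in the chunk.
function formatStep(chunk: StepChunk): string[] {
  return Object.entries(chunk).map(
    ([nodeName, output]) => `${nodeName}: ${output.content}`
  );
}
```

Mapping over every entry, rather than reading only the first one, keeps the helper usable if a chunk ever contains more than one node's output.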

Now, let's try a technical question:

```typescript
const technicalStream = await runnable.stream(
  new HumanMessage("My LangCorp computer isn't turning on because I dropped it in water.")
);

for await (const value of technicalStream) {
  // Each node returns only one message
  const [nodeName, output] = Object.entries(value)[0];
  if (nodeName !== END) {
    console.log("---STEP---");
    console.log(nodeName, output.content);
    console.log("---END STEP---");
  }
}
```

```text
---STEP---
initial_support I'm sorry to hear that. Please hold for a moment while I transfer you to our technical team who can help you with this issue.
---END STEP---
---STEP---
technical_support Unfortunately, your computer is likely damaged beyond repair due to water exposure. You should consider purchasing a new one. Contact LangCorp for assistance if needed.

Note: Always seek professional help for water-damaged electronics. Drying them out may not be enough to fix the issue.
---END STEP---
```


[Click here to see a LangSmith trace of the above run](https://smith.langchain.com/public/787fd20a-dea8-426b-bfb2-ebb0aed21505/r)

We can see the query gets correctly routed to the technical support node!

Finally, let's try a simple conversational response:

```typescript
const conversationalStream = await runnable.stream(
  new HumanMessage("How are you? I'm Cobb.")
);

for await (const value of conversationalStream) {
  // Each node returns only one message
  const [nodeName, output] = Object.entries(value)[0];
  if (nodeName !== END) {
    console.log("---STEP---");
    console.log(nodeName, output.content);
    console.log("---END STEP---");
  }
}
```

```text
---STEP---
initial_support Hi Cobb, I'm doing well. How can I help you today?
---END STEP---
```


[Click here to see a LangSmith trace of the above run](https://smith.langchain.com/public/095dd3af-19d6-4377-95f3-4c219c47a87d/r)