
# Lesson 6: Shipping as a web API

In [None]:
import "dotenv/config";

[Module: null prototype] { default: {} }

In [None]:
import {
  loadAndSplitChunks,
  initializeVectorstoreWithDocuments
} from "./lib/helpers.ts";

const splitDocs = await loadAndSplitChunks({
  chunkSize: 1536,
  chunkOverlap: 128,
});

const vectorstore = await initializeVectorstoreWithDocuments({
  documents: splitDocs,
});

const retriever = vectorstore.asRetriever();

In [None]:
import {
  createDocumentRetrievalChain,
  createRephraseQuestionChain
} from "./lib/helpers.ts";

const documentRetrievalChain = createDocumentRetrievalChain();
const rephraseQuestionChain = createRephraseQuestionChain();

In [None]:
import { ChatPromptTemplate, MessagesPlaceholder } from "@langchain/core/prompts";

const ANSWER_CHAIN_SYSTEM_TEMPLATE = `You are an experienced researcher,
expert at interpreting and answering questions based on provided sources.
Using the below provided context and chat history,
answer the user's question to the best of your ability
using only the resources provided. Be verbose!

<context>
{context}
</context>`;

const answerGenerationChainPrompt = ChatPromptTemplate.fromMessages([
  ["system", ANSWER_CHAIN_SYSTEM_TEMPLATE],
  new MessagesPlaceholder("history"),
  [
    "human",
    `Now, answer this question using the previous context and chat history:

    {standalone_question}`
  ]
]);

In [None]:
import {
  RunnablePassthrough,
  RunnableSequence
} from "@langchain/core/runnables";
import { ChatOpenAI } from "@langchain/openai";

const conversationalRetrievalChain = RunnableSequence.from([
  RunnablePassthrough.assign({
    standalone_question: rephraseQuestionChain,
  }),
  RunnablePassthrough.assign({
    context: documentRetrievalChain,
  }),
  answerGenerationChainPrompt,
  new ChatOpenAI({ modelName: "gpt-3.5-turbo-1106" }),
]);

In [None]:
import { HttpResponseOutputParser } from "langchain/output_parsers";

// "text/event-stream" is also supported
const httpResponseOutputParser = new HttpResponseOutputParser({
  contentType: "text/plain"
});

In [None]:
import { RunnableWithMessageHistory } from "@langchain/core/runnables";
import { ChatMessageHistory } from "langchain/stores/message/in_memory";

const messageHistory = new ChatMessageHistory();

const finalRetrievalChain = new RunnableWithMessageHistory({
  runnable: conversationalRetrievalChain,
  getMessageHistory: (_sessionId) => messageHistory,
  historyMessagesKey: "history",
  inputMessagesKey: "question",
}).pipe(httpResponseOutputParser);

The chain above hands the same history object to every request, regardless of session ID. Users should not share chat histories, so we'll create a separate history object for each session instead:

In [None]:
const messageHistories = {};

const getMessageHistoryForSession = (sessionId) => {
    if (messageHistories[sessionId] !== undefined) {
        return messageHistories[sessionId];
    }
    const newChatSessionHistory = new ChatMessageHistory();
    messageHistories[sessionId] = newChatSessionHistory;
    return newChatSessionHistory;
};
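One detail worth noting about the lookup above: a plain object also answers for inherited keys, so a session ID such as `"constructor"` would not return `undefined`. The following is a minimal sketch of the same per-session idea built on a `Map`, which has no inherited keys. `InMemoryHistory` and `getHistoryForSession` are hypothetical stand-ins so the example runs on its own; they are not part of LangChain.

```typescript
// Hypothetical stand-in for ChatMessageHistory so the sketch runs on its own.
class InMemoryHistory {
  private messages: string[] = [];

  addMessage(message: string): void {
    this.messages.push(message);
  }

  getMessages(): string[] {
    return [...this.messages];
  }
}

// A Map only contains the keys we set, unlike a plain object literal.
const histories = new Map<string, InMemoryHistory>();

// Returns the existing history for a session, creating one on first use.
const getHistoryForSession = (sessionId: string): InMemoryHistory => {
  let history = histories.get(sessionId);
  if (history === undefined) {
    history = new InMemoryHistory();
    histories.set(sessionId, history);
  }
  return history;
};

getHistoryForSession("1").addMessage("What are the prerequisites?");
const sessionOneCount = getHistoryForSession("1").getMessages().length;
const sessionTwoCount = getHistoryForSession("2").getMessages().length;
console.log(sessionOneCount, sessionTwoCount); // prints "1 0"
```

Session `"2"` starts empty even though session `"1"` already holds a message, which is the isolation we want between users.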

We'll recreate our final chain so that it looks up the history with this new function:

In [None]:
const finalRetrievalChain = new RunnableWithMessageHistory({
  runnable: conversationalRetrievalChain,
  getMessageHistory: getMessageHistoryForSession,
  inputMessagesKey: "question",
  historyMessagesKey: "history",
}).pipe(httpResponseOutputParser);

In [None]:
const port = 8087;

In [None]:
const handler = async (request: Request): Promise<Response> => {
  const body = await request.json();
  const stream = await finalRetrievalChain.stream({
    question: body.question
  }, { configurable: { sessionId: body.session_id } });

  return new Response(stream, {
    status: 200,
    headers: {
      "Content-Type": "text/plain"
    },
  });
};
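The handler above trusts whatever JSON the client sends. As a hedged sketch of one way to check the body before it reaches the chain, the function below validates the two fields the handler reads. The names `ChatRequestBody`, `ParseResult`, and `parseChatRequestBody` are assumptions made for illustration, not part of the course helpers.

```typescript
// Hypothetical shape of the JSON body the handler expects.
interface ChatRequestBody {
  question: string;
  session_id: string;
}

type ParseResult =
  | { ok: true; value: ChatRequestBody }
  | { ok: false; error: string };

// Checks an untrusted, already-parsed JSON value before it reaches the chain.
function parseChatRequestBody(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object." };
  }
  const { question, session_id } = body as Record<string, unknown>;
  if (typeof question !== "string" || question.trim() === "") {
    return { ok: false, error: "`question` must be a non-empty string." };
  }
  if (typeof session_id !== "string" || session_id === "") {
    return { ok: false, error: "`session_id` must be a non-empty string." };
  }
  return { ok: true, value: { question, session_id } };
}

const good = parseChatRequestBody({ question: "What is this course?", session_id: "1" });
const bad = parseChatRequestBody({ question: 42 });
console.log(good.ok, bad.ok); // prints "true false"
```

In a handler, a failed result could be turned into a `400` response carrying the error message, so malformed requests never invoke the model.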

In [None]:
Deno.serve({ port }, handler);

Listening on http://localhost:8087/


{
  finished: Promise { <pending> },
  shutdown: [AsyncFunction: shutdown],
  ref: [Function: ref],
  unref: [Function: unref],
  [Symbol(Symbol.asyncDispose)]: [Function: [Symbol.asyncDispose]]
}

In [None]:
const decoder = new TextDecoder();

// readChunks() reads from the provided reader and yields the results into an async iterable
function readChunks(reader) {
  return {
    async* [Symbol.asyncIterator]() {
      let readResult = await reader.read();
      while (!readResult.done) {
        yield decoder.decode(readResult.value);
        readResult = await reader.read();
      }
    },
  };
}

const sleep = async () => {
  return new Promise((resolve) => setTimeout(resolve, 500));
}
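One caveat when decoding a byte stream chunk by chunk: a multi-byte UTF-8 character can be split across two chunks, and decoding each chunk independently then produces replacement characters. The sketch below uses only the standard `TextEncoder` and `TextDecoder` APIs to compare both modes; `decodeChunks` is a hypothetical helper written for this illustration.

```typescript
// "é" takes two bytes in UTF-8, so "café" is five bytes long.
const bytes = new TextEncoder().encode("café");

// Split the bytes in the middle of "é" to simulate an unlucky chunk boundary.
const splitChunks: Uint8Array[] = [bytes.slice(0, 4), bytes.slice(4)];

// Decodes all chunks with a fresh decoder, with or without streaming mode.
function decodeChunks(chunks: Uint8Array[], stream: boolean): string {
  const decoder = new TextDecoder();
  let text = "";
  for (const chunk of chunks) {
    // With { stream: true } the decoder buffers an incomplete character
    // until the next chunk supplies the remaining bytes.
    text += decoder.decode(chunk, { stream });
  }
  // A final call with no input flushes anything still buffered.
  text += decoder.decode();
  return text;
}

const streamed = decodeChunks(splitChunks, true);
const independent = decodeChunks(splitChunks, false);
console.log(streamed === "café", independent === "café"); // prints "true false"
```

For mostly ASCII answers the difference rarely shows, but it matters as soon as responses contain accented characters, emoji, or non-Latin scripts.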

In [None]:
const response = await fetch(`http://localhost:${port}`, {
    method: "POST",
    headers: {
        "content-type": "application/json",
    },
    body: JSON.stringify({
        question: "What are the prerequisites for this course?",
        session_id: "1", // Should randomly generate/assign
    })
});

// response.body is a ReadableStream
const reader = response.body?.getReader();

for await (const chunk of readChunks(reader)) {
  console.log("CHUNK:", chunk);
}

await sleep();

CHUNK: Based o
CHUNK: n the provided c
CHUNK: ontext an
CHUNK: d chat his
CHUNK: tory, the requ
CHUNK: irements fo
CHUNK: r the course t
CHUNK: aught by
CHUNK:  instructor Andr
CHUNK: ew Ng includ
CHUNK: e familiarity 
CHUNK: with basi
CHUNK: c probability 
CHUNK: and statistics. Thi
CHUNK: s includes knowledge
CHUNK:  of rand
CHUNK: om varia
CHUNK: bles, expectati
CHUNK: on, variance,
CHUNK:  and other fo
CHUNK: undationa
CHUNK: l concepts. The
CHUNK:  instructor mentions that mos
CHUNK: t undergr
CHUNK: aduate
CHUNK:  statistics class
CHUNK: es, such as
CHUNK:  Stat 116 at
CHUNK:  Stanford, wil
CHUNK: l cover the nece
CHUNK: ssary mater
CHUNK: ial.

Further
CHUNK: more, basic 
CHUNK: familiarity with linear algebra i
CHUNK: s also assum
CHUNK: ed. This includes underst
CHUNK: anding of ma
CHUNK: trices, v
CHUNK: ectors, matrix 
CHUNK: multiplic
CHUNK: ation, and matrix inver
CHUNK: se. Courses like Math
CHUNK:  51, 103,
CHUNK:  Math 113, or 
CHUNK: CS205 at Stanford are
CHUNK:  menti
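The request above hardcodes `session_id: "1"` and notes that it should be generated. As a minimal sketch of one option, assuming a runtime that supports the `node:crypto` module (both Node.js and Deno do), a random UUID can serve as the session ID. The `buildChatRequestBody` helper is hypothetical and only exists for this illustration.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper: builds the JSON payload sent to the chat endpoint.
function buildChatRequestBody(question: string, sessionId: string): string {
  return JSON.stringify({ question, session_id: sessionId });
}

// Generate the ID once per conversation and reuse it for follow-up questions,
// so the server can find the matching chat history.
const sessionId = randomUUID();

const firstBody = buildChatRequestBody("What are the prerequisites for this course?", sessionId);
const followUpBody = buildChatRequestBody("Can you list them in bullet point format?", sessionId);

console.log(JSON.parse(firstBody).session_id === JSON.parse(followUpBody).session_id); // prints "true"
```

Reusing the same ID keeps follow-up questions in one conversation, while a fresh ID starts a conversation with an empty history.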

In [None]:
const response = await fetch(`http://localhost:${port}`, {
  method: "POST",
  headers: {
    "content-type": "application/json",
  },
  body: JSON.stringify({
    question: "Can you list them in bullet point format?",
    session_id: "1", // Should randomly generate/assign
  })
});

// response.body is a ReadableStream
const reader = response.body?.getReader();

for await (const chunk of readChunks(reader)) {
  console.log("CHUNK:", chunk);
}

await sleep();

CHUNK: Certainly! Base
CHUNK: d on the
CHUNK:  provided
CHUNK:  context 
CHUNK: and chat his
CHUNK: tory, here 
CHUNK: are the prerequisit
CHUNK: es for th
CHUNK: is course in
CHUNK:  bullet poin
CHUNK: t format:

- Famili
CHUNK: arity with bas
CHUNK: ic probabil
CHUNK: ity and statistics
CHUNK: , including kno
CHUNK: wledge of ra
CHUNK: ndom var
CHUNK: iables, e
CHUNK: xpectation
CHUNK: , and variance
CHUNK: . Most u
CHUNK: ndergradu
CHUNK: ate statistics class
CHUNK: es, such as Stat
CHUNK:  116 at Sta
CHUNK: nford, cove
CHUNK: r the necessary mater
CHUNK: ial.
- Bas
CHUNK: ic familiarity with lin
CHUNK: ear algebra
CHUNK: , including understan
CHUNK: ding of matric
CHUNK: es, vectors,
CHUNK:  matrix multipl
CHUNK: ication, and matri
CHUNK: x inverse. C
CHUNK: ourses like Math
CHUNK:  51, 103,
CHUNK:  Math 113, or CS
CHUNK: 205 at Stanford are mentioned
CHUNK:  as providing sufficie
CHUNK: nt backgr
CHUNK: ound for
CHUNK:  this requirement
CHUNK: .
- The ability to
CHUNK:  understand

In [None]:
const response = await fetch(`http://localhost:${port}`, {
  method: "POST",
  headers: {
    "content-type": "application/json",
  },
  body: JSON.stringify({
    question: "What did I just ask you?",
    session_id: "2", // Should randomly generate/assign
  })
});

// response.body is a ReadableStream
const reader = response.body?.getReader();

for await (const chunk of readChunks(reader)) {
  console.log("CHUNK:", chunk);
}

await sleep();

CHUNK: Based on the cont
CHUNK: ext and chat 
CHUNK: history provi
CHUNK: ded, you 
CHUNK: didn't exp
CHUNK: licitly ask a q
CHUNK: uestion,
CHUNK:  but you did re
CHUNK: quest for the oth
CHUNK: er person to
CHUNK:  repeat w
CHUNK: hat you had s
CHUNK: aid. It 
CHUNK: seems like
CHUNK:  there may h
CHUNK: ave been so
CHUNK: me background nois
CHUNK: e or an is
CHUNK: sue with the cl
CHUNK: arity of
CHUNK:  the communica
CHUNK: tion. The conve
CHUNK: rsation also t
CHUNK: ouched on the topic
CHUNK:  of audio recording and 
CHUNK: separating v
CHUNK: oices, so 
CHUNK: it's possible that
CHUNK:  there was a t
CHUNK: echnical discu
CHUNK: ssion relate
CHUNK: d to that.

In gen
CHUNK: eral, ask
CHUNK: ing for repetition
CHUNK:  in a con
CHUNK: versation can occur du
CHUNK: e to various reasons
CHUNK:  such as noise 
CHUNK: interfer
CHUNK: ence, un
CHUNK: clear sp
CHUNK: eech, or a
CHUNK:  simple
CHUNK:  request for clarification. It's 
CHUNK: importan
CHUNK: t to ensure 
CHUNK: clear commu