# LangChain Async

One of the biggest pain points developers raise when building useful LLM applications is latency: these applications often make multiple calls to LLM APIs, each taking a few seconds. Staring at a loading spinner for more than a couple of seconds is a frustrating user experience. Streaming reduces this perceived latency by returning the LLM's output token by token instead of all at once.
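To make the difference between total latency and perceived latency concrete, here is a minimal sketch that uses only the standard library. `fake_llm_stream` and `measure` are hypothetical names invented for this illustration, and the delay is a stand-in for real generation time; no LLM API is called.

```python
import asyncio
import time


async def fake_llm_stream(tokens, delay=0.01):
    """Yield tokens one at a time, pausing to mimic generation time."""
    for token in tokens:
        await asyncio.sleep(delay)
        yield token


async def measure(tokens, delay=0.01):
    """Return (time to first token, total time, full text)."""
    start = time.perf_counter()
    first = None
    received = []
    async for token in fake_llm_stream(tokens, delay):
        if first is None:
            # With streaming, the user sees something at this point.
            first = time.perf_counter() - start
        received.append(token)
    # Without streaming, the user would wait until here.
    total = time.perf_counter() - start
    return first, total, "".join(received)


if __name__ == "__main__":
    first, total, text = asyncio.run(measure(["Hello", ",", " ", "world", "!"]))
    print(f"first token after {first:.3f}s, full answer after {total:.3f}s")
```

Inside a notebook, where an event loop is already running, you would write `await measure(...)` rather than calling `asyncio.run`.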

This notebook demonstrates how to monitor a LangChain streaming app with TruLens.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/trulens_eval/examples/quickstart/langchain_async.ipynb)

### Import from LangChain and TruLens

In [1]:
# ! pip install trulens_eval==0.18.1 "langchain>=0.0.342"

In [2]:
import asyncio

from langchain.prompts import PromptTemplate
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.chains import LLMChain
from langchain.chat_models.openai import ChatOpenAI
from langchain.llms.openai import OpenAI
from langchain.memory import ConversationSummaryBufferMemory

from trulens_eval import Feedback
from trulens_eval import feedback
from trulens_eval import Tru
import trulens_eval.utils.python  # makes sure asyncio gets instrumented

## Setup
### Add API keys
For this example you will need Hugging Face and OpenAI API keys.

In [3]:
import os
os.environ["HUGGINGFACE_API_KEY"] = "hf_..."
os.environ["OPENAI_API_KEY"] = "sk-..."
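A missing key usually surfaces only at the first API call, so it can help to check up front. The sketch below is an optional convenience, not part of TruLens or LangChain; `missing_keys` and `REQUIRED_KEYS` are hypothetical names chosen for this example.

```python
import os

# The two environment variables this notebook sets above.
REQUIRED_KEYS = ("HUGGINGFACE_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=os.environ, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
```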

### Create Async Application

In [4]:
# Set up an async callback.
callback = AsyncIteratorCallbackHandler()

chatllm = ChatOpenAI(
    temperature=0.0,
    streaming=True,  # important
)
llm = OpenAI(
    temperature=0.0,
)

memory = ConversationSummaryBufferMemory(
    memory_key="chat_history",
    input_key="human_input",
    llm=llm,
    max_token_limit=50
)

# Set up a simple question/answer chain with streaming ChatOpenAI.
prompt = PromptTemplate(
    input_variables=["human_input", "chat_history"],
    template='''
    You are having a conversation with a person. Make small talk.
    {chat_history}
        Human: {human_input}
        AI:'''
)

chain = LLMChain(llm=chatllm, prompt=prompt, memory=memory)
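To see roughly what the model receives once the template is filled in, the sketch below applies plain `str.format` to the same template text. This assumes the template uses f-string style placeholders, as the one above does; it only approximates what `PromptTemplate.format` produces and does not call LangChain. `fill_prompt` is a hypothetical helper for this illustration.

```python
# Same template text as in the cell above.
TEMPLATE = '''
    You are having a conversation with a person. Make small talk.
    {chat_history}
        Human: {human_input}
        AI:'''


def fill_prompt(human_input, chat_history=""):
    """Substitute both variables into the template with str.format."""
    return TEMPLATE.format(human_input=human_input, chat_history=chat_history)


if __name__ == "__main__":
    # On the first turn the memory is empty, so chat_history is blank.
    print(fill_prompt("Hi. How are you?"))
```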

### Set up a language match feedback function

In [5]:
tru = Tru()
hugs = feedback.Huggingface()
f_lang_match = Feedback(hugs.language_match).on_input_output()

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.
✅ In language_match, input text1 will be set to __record__.main_input or `Select.RecordInput` .
✅ In language_match, input text2 will be set to __record__.main_output or `Select.RecordOutput` .


### Set up evaluation and tracking with TruLens

In [6]:
# Example of how to also get filled-in prompt templates in timeline:
from trulens_eval.instruments import instrument
instrument.method(PromptTemplate, "format")

tc = tru.Chain(
    chain,
    feedbacks=[f_lang_match],
    app_id="chat_with_memory"
)

In [7]:
tc.print_instrumented()

Components:
	TruChain (Other) at 0x142d7b310 with path __app__
	LLMChain (Other) at 0x132d31860 with path __app__.app
	ConversationSummaryBufferMemory (Other) at 0x142f342d0 with path __app__.app.memory
	OpenAI (Other) at 0x13274a490 with path __app__.app.memory.llm
	PromptTemplate (Custom) at 0x132d33ca0 with path __app__.app.memory.prompt
	ChatMessageHistory (Other) at 0x142f34150 with path __app__.app.memory.chat_memory
	PromptTemplate (Custom) at 0x142eb3160 with path __app__.app.prompt
	ChatOpenAI (Other) at 0x142eb2120 with path __app__.app.llm

Methods:
Object at 0x132d31860:
	<function Chain.__call__ at 0x118a17600> with path __app__.app
	<function LLMChain._call at 0x118a45bc0> with path __app__.app
	<function LLMChain._acall at 0x118a46160> with path __app__.app
	<function Chain.acall at 0x118a176a0> with path __app__.app
	<function Chain._call at 0x118a174c0> with path __app__.app
	<function Chain._acall at 0x118a17560> with path __app__.app
Object at 0x142f342d0:
	<function

### Start the TruLens dashboard

In [8]:
tru.run_dashboard()

Starting dashboard ...


Dashboard started at http://192.168.4.23:8501 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

### Use the application

In [9]:
message = "Hi. How are you?"

with tc as recording:
    task = asyncio.create_task(
        chain.acall(
            inputs=dict(human_input=message, chat_history=[]),
            callbacks=[callback]
        )
    )

# Note: the record becomes available only after you either consume every
# callback iteration or await the task.

async for token in callback.aiter():
    print(token, end="")

# Make sure task was completed:
await task
record = recording.get()

Hello! I'm an AI, so I don't have feelings, but I'm here to chat with you. How can I assist you today?
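The pattern in the cell above (start the call as a task, consume tokens as they arrive, then await the task) can be sketched with the standard library alone. Everything here is a stand-in: `produce` plays the role of the chain call and `consume` the role of iterating the callback handler. This is not how LangChain implements `AsyncIteratorCallbackHandler`, only an illustration of why the final result is available once the task has finished.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the stream


async def produce(queue, tokens):
    """Stand-in for the chain call: push tokens, then return the full answer."""
    for token in tokens:
        await asyncio.sleep(0)  # yield control, as a network call would
        await queue.put(token)
    await queue.put(_DONE)
    return "".join(tokens)


async def consume(queue):
    """Stand-in for iterating the callback: yield tokens until the sentinel."""
    while True:
        token = await queue.get()
        if token is _DONE:
            break
        yield token


async def main():
    queue = asyncio.Queue()
    # Schedule the producer without waiting for it, like asyncio.create_task
    # around chain.acall above.
    task = asyncio.create_task(produce(queue, ["Hi", " ", "there"]))
    streamed = [token async for token in consume(queue)]
    # The return value exists only after the task has completed.
    result = await task
    return streamed, result


if __name__ == "__main__":
    print(asyncio.run(main()))
```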