# LangChain Quickstart

In this quickstart you will create a simple LLM Chain and learn how to log it and get feedback on an LLM response.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/trulens_eval/examples/frameworks/langchain/langchain_quickstart.ipynb)

## Setup
### Add API keys
For this quickstart you will need OpenAI and Hugging Face API keys.

In [1]:
import os
os.environ["OPENAI_API_KEY"] = "..."
os.environ["HUGGINGFACE_API_KEY"] = "..."
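
Before going further, it can help to confirm that both keys are actually set. The sketch below is one way to do that with the standard library only. `missing_keys` is a hypothetical helper, not part of TruLens or LangChain, and treating `"..."` as unset is an assumption based on the placeholder used above.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "HUGGINGFACE_API_KEY"]


def missing_keys(required, env=None):
    """Return the names in `required` that are unset, blank, or still the "..." placeholder."""
    env = os.environ if env is None else env
    return [
        name for name in required
        if env.get(name, "").strip() in ("", "...")
    ]


missing = missing_keys(REQUIRED_KEYS)
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
```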

### Import from LangChain and TruLens

In [2]:
from IPython.display import JSON

# Imports main tools:
from trulens_eval import TruChain, Feedback, Huggingface, Tru, OpenAI as fOpenAI, Select
tru = Tru()

# Imports from langchain to build app. You may need to install langchain first
# with the following:
# ! pip install "langchain>=0.0.170"
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts.chat import ChatPromptTemplate, PromptTemplate
from langchain.prompts.chat import HumanMessagePromptTemplate
from langchain.memory import ConversationBufferMemory

🦑 Tru initialized with db url sqlite:///default.sqlite .


### Create Simple LLM Application

This example uses the LangChain framework and an OpenAI LLM.

In [4]:
template = """You a friendly AI assistant having a conversation with a human. You love to go on irrelevant tangents.

{chat_history}
Human: {human_input}
Chatbot:"""

prompt = PromptTemplate(
    input_variables=["chat_history", "human_input"], template=template
)
memory = ConversationBufferMemory(memory_key="chat_history")

llm = OpenAI()
llm_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True,
    memory=memory,
)
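
To make the data flow concrete, the sketch below mimics in plain Python what the prompt and memory do together: the stored transcript fills `{chat_history}` and the new message fills `{human_input}`. `SimpleBuffer`, `render_prompt`, and `demo_template` are hypothetical stand-ins for illustration, not LangChain's `ConversationBufferMemory` or `PromptTemplate`. The `Human:` and `AI:` prefixes are taken from the formatted prompt printed later in this notebook.

```python
demo_template = """{chat_history}
Human: {human_input}
Chatbot:"""


class SimpleBuffer:
    """Hypothetical stand-in for a conversation buffer: stores turns as plain text."""

    def __init__(self):
        self.turns = []

    def save(self, human_input, ai_output):
        self.turns.append(f"Human: {human_input}")
        self.turns.append(f"AI: {ai_output}")

    def history(self):
        return "\n".join(self.turns)


def render_prompt(buffer, human_input):
    """Fill the template with the transcript so far and the new message."""
    return demo_template.format(chat_history=buffer.history(), human_input=human_input)


buffer = SimpleBuffer()
first_prompt = render_prompt(buffer, "Hello!")  # history is empty on the first turn
buffer.save("Hello!", "Hi there.")
second_prompt = render_prompt(buffer, "Tell me more.")  # now includes the first exchange
print(second_prompt)
```

This is why the second request later in the notebook shows the first exchange above the new `Human:` line.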

### Send your first request

In [5]:
# Alternative prompt to try: "tell me a story about max and his dog and their trip to target in Japanese"
prompt_input = "Discuss the parallels between the works of Shakespeare and the evolution of modern pop culture, taking into account the ways in which archetypal characters and narrative structures continue to shape our understanding of human nature and storytelling."

## Initialize Feedback Function(s)

In [6]:
# Initialize feedback function collection classes:
fopenai = fOpenAI()

f_conciseness = Feedback(fopenai.conciseness, name = "Conciseness").on_output()
f_relevance = Feedback(fopenai.relevance, name = "QA Relevance").on(Select.Record.calls[0].args.inputs.human_input).on_output()

✅ In Conciseness, input text will be set to *.__record__.main_output or `Select.RecordOutput` .
✅ In QA Relevance, input prompt will be set to *.__record__.calls[0].args.inputs.human_input .
✅ In QA Relevance, input response will be set to *.__record__.main_output or `Select.RecordOutput` .
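
At its core, a feedback function is a callable that takes text and returns a score between 0 and 1. The toy function below illustrates that shape only. `toy_conciseness` and its 50-word budget are hypothetical, and this is not how `fopenai.conciseness` computes its score.

```python
def toy_conciseness(text: str, word_budget: int = 50) -> float:
    """Score 1.0 for responses within `word_budget` words, decaying toward 0 as they grow."""
    n_words = len(text.split())
    if n_words <= word_budget:
        return 1.0
    return word_budget / n_words


short_answer = "Shakespeare's archetypes still shape modern stories."
long_answer = " ".join(["word"] * 200)

print(toy_conciseness(short_answer))
print(toy_conciseness(long_answer))
```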


## Instrument chain for logging with TruLens

In [7]:
truchain = TruChain(llm_chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[f_conciseness, f_relevance])
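
Conceptually, instrumenting an app means wrapping it so that each call is recorded while the caller still gets the original result. The sketch below shows that idea with a hypothetical `RecordingWrapper` around a stand-in function. It is not TruChain's implementation, which, per this guide, also captures chain details and evaluates the configured feedback functions.

```python
class RecordingWrapper:
    """Hypothetical wrapper: delegates calls to `app` and keeps a log of inputs and outputs."""

    def __init__(self, app, app_id):
        self.app = app
        self.app_id = app_id
        self.records = []

    def __call__(self, prompt):
        output = self.app(prompt)  # behave like the wrapped app
        self.records.append({"app_id": self.app_id, "input": prompt, "output": output})
        return output


def echo_app(prompt):
    """Stand-in for an LLM chain so the sketch runs without API keys."""
    return {"text": f"You said: {prompt}"}


wrapped = RecordingWrapper(echo_app, app_id="Demo_App")
response = wrapped("hello")
print(response["text"])
print(len(wrapped.records))
```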

In [8]:
# Instrumented chain can operate like the original:
llm_response = truchain(prompt_input)

display(llm_response)



> Entering new LLMChain chain...
Prompt after formatting:
You a friendly AI assistant having a conversation with a human. You love to go on irrelevant tangents.


H: Discuss the parallels between the works of Shakespeare and the evolution of modern pop culture, taking into account the ways in which archetypal characters and narrative structures continue to shape our understanding of human nature and storytelling.
Chatbot:

> Finished chain.


{'human_input': 'Discuss the parallels between the works of Shakespeare and the evolution of modern pop culture, taking into account the ways in which archetypal characters and narrative structures continue to shape our understanding of human nature and storytelling.',
 'chat_history': '',
 'text': " Wow, that's a really fascinating topic. It's amazing how Shakespeare's works continue to shape our culture even today. For example, I recently read an article that argued that the story of Romeo and Juliet is still very relevant in modern pop culture, with the idea of two star-crossed lovers coming from different worlds still resonating with audiences. Even some of the characters themselves have become archetypal figures in our culture, such as the wise fool and the loyal friend. It's really interesting to think about how the same story structures and characters can be used in a variety of different contexts to tell different stories, while still resonating with the same themes. On a more 

In [9]:
tru.get_records_and_feedback(None)[0][["Conciseness","QA Relevance"]]

|   | Conciseness | QA Relevance |
|---|-------------|--------------|
| 0 | 0.3         | 0.7          |


In [10]:
def feedback_retry_command(dataframe):
    if dataframe.empty:
        return "The DataFrame is empty."

    column_names = dataframe.columns.tolist()
    last_row_values = dataframe.iloc[-1].tolist()

    bullet_points = []
    for col, val in zip(column_names, last_row_values):
        bullet_points.append(f"- {col}: {val}")

    bulleted_list = "\n".join(bullet_points)
    retry_command = ("The following is an evaluation of your last response in the following format:\n"
    "- CRITERIA: SCORE\n"
    "- CRITERIA: SCORE\n"
    ", with 1 being the highest possible SCORE and 0 being the lowest. "
    "Please try responding again with the goal of improving all SCORES. You will be rewarded for higher SCORES.\n"
    "Only respond with the new answer and do not say anything about this evaluation.\n" + bulleted_list)
    return retry_command


In [12]:
retry_command = feedback_retry_command(tru.get_records_and_feedback(None)[0][["Conciseness","QA Relevance"]])

In [13]:
retry_command

'The following is an evaluation of your last response in the following format:\n- CRITERIA: SCORE\n- CRITERIA: SCORE\n, with 1 being the highest possible SCORE and 0 being the lowest. Please try responding again with the goal of improving all SCORES. You will be rewarded for higher SCORES.\nOnly respond with the new answer and do not say anything about this evaluation.\n- Conciseness: 0.3\n- QA Relevance: 0.7'

In [14]:
# Send the retry command through the instrumented chain:
llm_response = truchain(retry_command)

display(llm_response["text"])



> Entering new LLMChain chain...
Prompt after formatting:
You a friendly AI assistant having a conversation with a human. You love to go on irrelevant tangents.

Human: Discuss the parallels between the works of Shakespeare and the evolution of modern pop culture, taking into account the ways in which archetypal characters and narrative structures continue to shape our understanding of human nature and storytelling.
AI:  Wow, that's a really fascinating topic. It's amazing how Shakespeare's works continue to shape our culture even today. For example, I recently read an article that argued that the story of Romeo and Juliet is still very relevant in modern pop culture, with the idea of two star-crossed lovers coming from different worlds still resonating with audiences. Even some of the characters themselves have become archetypal figures in our culture, such as the wise fool and the loyal friend. It's really interesting to think about how the same story structure

" Ah, I see. Well, I'll try to be more concise and relevant in my response. I think it's interesting to consider how Shakespeare's works continue to influence our modern culture, particularly in the way that archetypal characters and themes have been adapted and re-imagined in various forms. For example, the story of Romeo and Juliet has been re-interpreted in a variety of mediums, from film to theater to video games. This speaks to the timelessness of Shakespeare's stories and characters, and how they can be used to explore different themes and ideas."

In [15]:
tru.get_records_and_feedback(None)[0][["Conciseness","QA Relevance"]]

|   | Conciseness | QA Relevance |
|---|-------------|--------------|
| 0 | 0.3         | 0.7          |
| 1 | 0.5         | 0.5          |
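
The single retry above generalizes to a loop: score the response, stop if every score clears a threshold, otherwise feed the scores back and try again. The sketch below shows only that control flow, using offline stand-ins. `retry_until_good`, `fake_generate`, `fake_score`, the 0.6 threshold, and the attempt limit are all hypothetical; in this notebook the real counterparts would be the instrumented chain and the recorded feedback values.

```python
def retry_until_good(generate, score, prompt, threshold=0.6, max_attempts=3):
    """Call `generate` until all scores reach `threshold` or attempts run out.

    Returns the last response and a list of score dicts, one per attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    history = []
    current_prompt = prompt
    for _ in range(max_attempts):
        response = generate(current_prompt)
        scores = score(response)
        history.append(scores)
        if all(value >= threshold for value in scores.values()):
            break
        # Feed the scores back as a bulleted list, as the retry command above does.
        feedback = "\n".join(f"- {name}: {value}" for name, value in scores.items())
        current_prompt = "Please improve on these scores:\n" + feedback
    return response, history


# Stand-ins so the sketch runs offline: each call returns a shorter answer.
answers = iter(["a very long rambling answer", "a shorter answer", "short"])


def fake_generate(_prompt):
    return next(answers)


def fake_score(response):
    # Toy scoring: fewer words means a higher score.
    return {"Conciseness": round(1.0 / len(response.split()), 2)}


final, history = retry_until_good(fake_generate, fake_score, "Discuss Shakespeare.")
print(final)
print(history)
```

Capping the number of attempts matters because a retry does not guarantee improvement on every metric: in the run above, Conciseness rose from 0.3 to 0.5 while QA Relevance fell from 0.7 to 0.5.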


## Explore in a Dashboard

In [14]:
tru.run_dashboard() # open a local streamlit app to explore

# tru.stop_dashboard() # stop if needed

Starting dashboard ...


Dashboard started at http://192.168.4.23:8501 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

Alternatively, you can run `trulens-eval` from a command line in the same folder to start the dashboard.

### Chain Leaderboard

Understand how your LLM application is performing at a glance. Once you've set up logging and evaluation in your application, you can view key performance statistics, including cost and average feedback value, across all of your LLM apps using the chain leaderboard. As you iterate on new versions of your LLM application, you can compare their performance across all of the quality metrics you've set up.

Note: Average feedback values are returned and displayed in a range from 0 (worst) to 1 (best).

![Chain Leaderboard](https://www.trulens.org/Assets/image/Leaderboard.png)

To dive deeper on a particular chain, click "Select Chain".
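
As a rough illustration of the average feedback value the leaderboard reports, the sketch below averages per-metric scores across records using plain Python. The two records reuse the scores from the results table earlier in this notebook. `average_feedback` is a hypothetical helper, and the dashboard's own aggregation may differ in detail.

```python
from collections import defaultdict


def average_feedback(records):
    """Average each feedback metric across a list of {metric: score} dicts."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        for metric, value in record.items():
            totals[metric] += value
            counts[metric] += 1
    return {metric: totals[metric] / counts[metric] for metric in totals}


records = [
    {"Conciseness": 0.3, "QA Relevance": 0.7},
    {"Conciseness": 0.5, "QA Relevance": 0.5},
]
averages = average_feedback(records)
for metric, value in averages.items():
    print(f"{metric}: {value:.2f}")
```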

### Understand chain performance with Evaluations
 
To learn more about the performance of a particular chain or LLM, select it to view its evaluations at the record level. LLM quality is assessed with feedback functions: extensible methods for determining the quality of LLM responses that can be applied to any downstream LLM task. Out of the box we provide a number of feedback functions for assessing model agreement, sentiment, relevance and more.

The evaluations tab provides record-level metadata and feedback on the quality of your LLM application.

![Evaluations](https://www.trulens.org/Assets/image/Leaderboard.png)

### Deep dive into full chain metadata

Click on a record to dive deep into all of the details of your chain stack and underlying LLM, captured by `truchain`.

![Explore a Chain](https://www.trulens.org/Assets/image/Chain_Explore.png)

If you prefer the raw format, you can quickly get it using the "Display full chain json" or "Display full record json" buttons at the bottom of the page.

Note: Feedback functions evaluated in deferred mode can be seen on the "Progress" page of the TruLens dashboard.

## Or view results directly in your notebook

In [None]:
tru.get_records_and_feedback(app_ids=[])[0] # pass an empty list of app_ids to get all