# LangChain Quickstart

In this quickstart you will create a simple LLM Chain and learn how to log it and get feedback on an LLM response.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/trulens_eval/examples/quickstart/langchain_quickstart.ipynb)

## Setup
### Add API keys
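For this quickstart you will need an OpenAI API key. A Hugging Face key is only required if you use the `Huggingface` feedback provider, which is imported below but not used in this notebook.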

In [2]:
import trulens_eval
print(trulens_eval.__version__)

0.21.0


In [3]:
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder: set your own key and never publish a real one
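
Writing a key directly into a notebook cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming you prefer to read the key from the environment and prompt only when it is missing, something like the following could be used. `ensure_api_key` is a hypothetical helper written for this example; it is not part of TruLens or LangChain.

```python
import os
from getpass import getpass


def ensure_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of TruLens or LangChain.
    """
    key = os.environ.get(name)
    if not key:
        # getpass hides the typed value so it never appears in cell output.
        key = getpass(f"Enter {name}: ")
        os.environ[name] = key
    return key
```

Calling `ensure_api_key()` once before building the chain would leave `OPENAI_API_KEY` set for the rest of the session.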

### Import from LangChain and TruLens

In [4]:
# Imports main tools:
from trulens_eval import TruChain, Feedback, Huggingface, Tru
from trulens_eval.schema import FeedbackResult
tru = Tru()
tru.reset_database()

# Imports from langchain to build app
import bs4
from langchain import hub
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import StrOutputParser
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain_core.runnables import RunnablePassthrough

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.


### Load documents

In [5]:
from langchain.document_loaders import UnstructuredMarkdownLoader
from langchain.text_splitter import MarkdownHeaderTextSplitter

# Load the markdown source as a single string
file_path = "../../Data/Scraping_Bocconi_converted_no_dup_check.md"
with open(file_path, 'r', encoding="utf-8") as file:
    markdown_content = file.read()

# Markdown header levels to split on, with the metadata key assigned to each
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
    ("####", "Header 4"),
]


### Create Vector Store

In [6]:
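# Split the markdown on its headers; each chunk keeps its headers as metadata
markdown_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=headers_to_split_on)
splits = markdown_splitter.split_text(markdown_content)

# Embed the chunks with OpenAI embeddings and index them in Chroma
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())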
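
To make the role of `headers_to_split_on` more concrete, the sketch below is a simplified, pure-Python stand-in for header-based splitting: it starts a new chunk at each recognised header and records the headers in force as metadata. It only illustrates the idea and is not LangChain's implementation; `split_by_headers` and the sample text are hypothetical.

```python
def split_by_headers(text, headers_to_split_on):
    """Split markdown into chunks, tagging each with its enclosing headers."""
    levels = {name: len(marker) for marker, name in headers_to_split_on}
    chunks, active, body = [], {}, []

    def flush():
        # Emit the text collected so far, together with the current headers.
        content = "\n".join(body).strip()
        if content:
            chunks.append({"page_content": content, "metadata": dict(active)})
        body.clear()

    for line in text.splitlines():
        for marker, name in headers_to_split_on:
            if line.startswith(marker + " "):
                flush()
                # A new header closes any headers at the same or deeper level.
                for other in [n for n in active if levels[n] >= levels[name]]:
                    del active[other]
                active[name] = line[len(marker):].strip()
                break
        else:
            body.append(line)
    flush()
    return chunks


sample = "# Housing\n## Check-in\nBring an ID.\n## Guests\nAllowed until midnight."
for chunk in split_by_headers(sample, [("#", "Header 1"), ("##", "Header 2")]):
    print(chunk)
```

Each printed chunk carries the text under one header plus a `metadata` dictionary keyed by `Header 1`, `Header 2`, and so on, which is the shape the later metadata-based retrieval relies on.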



### Create RAG

In [7]:
retriever = vectorstore.as_retriever()

prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
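
The pipe expression in this cell can be read as a sequence of steps: retrieve documents, format them into one context string, fill the prompt, and call the model. The sketch below walks through those steps with stand-in components so it runs without LangChain or an API key. `Doc`, `fake_retriever`, `fake_llm`, and `run_chain` are hypothetical names, and the prompt template is a placeholder rather than the prompt pulled from the hub.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a LangChain document; only page_content is modelled."""
    page_content: str


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


def fake_retriever(question):
    # Hypothetical retriever: returns fixed documents regardless of the question.
    return [Doc("Check-in is at the reception."), Doc("Bring a valid ID.")]


def fake_llm(prompt_text):
    # Hypothetical model: reports the prompt size instead of calling an API.
    return f"(answer based on {len(prompt_text)} prompt characters)"


def run_chain(question):
    # Step 1: build the prompt inputs; the question passes through unchanged.
    inputs = {"context": format_docs(fake_retriever(question)), "question": question}
    # Step 2: fill a prompt template (a placeholder for the hub prompt).
    prompt_text = "Context:\n{context}\n\nQuestion: {question}".format(**inputs)
    # Step 3: call the model; its string output is the chain's result.
    return inputs, fake_llm(prompt_text)


inputs, answer = run_chain("How do I check in?")
print(inputs["context"])
print(answer)
```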



In [26]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Build prompt
template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.  Keep the answer as concise as possible. Always say "Se hai bisogno di ulteriori informazioni, non esitare a chiedere!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(template)


llm_name = "gpt-3.5-turbo"
llm = ChatOpenAI(model_name=llm_name, temperature=0)

# Chains with different retrievers
# Baseline: plain vector-store retriever

rqa_base = RetrievalQA.from_chain_type(
    llm,
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": QA_CHAIN_PROMPT}
)

In [23]:
rqa_base.invoke("Come funziona l'ingresso in residenza?")['result']

'Per accedere in residenza, è necessario presentarsi alla reception con un documento di identità in corso di validità. Non ci sono limitazioni orarie di ingresso o di uscita per gli studenti assegnatari di posto alloggio nelle residenze. Se hai bisogno di ulteriori informazioni, non esitare a chiedere!'

### VDP - Create your own RAG

In [9]:
def pretty_print_docs(docs):
    print(f"\n{'-' * 100}\n".join([f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]))

In [10]:
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain.retrievers import ContextualCompressionRetriever

compressor = LLMChainExtractor.from_llm(llm)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectorstore.as_retriever()
)

rag_chain_compressed = (
    {"context": compression_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
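
Contextual compression wraps a base retriever and then trims each retrieved passage down to the parts relevant to the query. The sketch below illustrates that two-stage flow in plain Python. The word-overlap extractor is a toy stand-in for the LLM-based extractor used above, and `base_retrieve`, `extract_relevant`, and `compressed_retrieve` are hypothetical names.

```python
def base_retrieve(query):
    # Hypothetical base retriever returning two unfiltered passages.
    return [
        "Guests may visit until midnight. Residents check in at the reception.",
        "The gym opens early. Breakfast is served daily.",
    ]


def extract_relevant(passage, query):
    """Keep only sentences that share a word with the query (toy extractor)."""
    query_words = set(query.lower().split())
    kept = [
        sentence.strip()
        for sentence in passage.split(".")
        if query_words & set(sentence.lower().split())
    ]
    return ". ".join(kept)


def compressed_retrieve(query):
    # Retrieve first, then compress; passages with nothing relevant are dropped.
    compressed = (extract_relevant(p, query) for p in base_retrieve(query))
    return [c for c in compressed if c]


print(compressed_retrieve("check in reception"))
```

Because the compressor discards text, a passage that merely shares vocabulary with the question can end up dominating the context, so it is worth inspecting the compressed documents when an answer looks off.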

### Self-query retriever ([LangChain docs](https://python.langchain.com/docs/modules/data_connection/retrievers/self_query/))

In [39]:
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import AttributeInfo
from langchain_openai import ChatOpenAI

metadata_field_info = [
    AttributeInfo(
        name="Header 1",
        description="a primary category or a general topic. It introduces the broader theme under which more specific information is grouped. In a retrieval task, it acts as the first level of data filtering or organization, offering a broad overview of the context or subject area.",
        type="string",
    ),
    AttributeInfo(
        name="Header 2",
        description="This is a subtheme or subcategory of Header 1. It provides a further level of detail, focusing on a specific aspect of the main theme. It serves to refine the search or understanding within the general topic defined by Header 1, guiding the user towards more targeted information.",
        type="string",
    ),
    AttributeInfo(
        name="Header 3",
        description="This represents an even more specific subdivision of Header 2. This level may contain rules, guidelines, or particular details concerning the subtheme. In a retrieval task, this header helps to focus on very specific aspects within the subcategory, making the search even more targeted. ",
        type="string",
    ),
    AttributeInfo(
        name="Header 4",
        description="This is the most specific level, typically formulated as a question or a very precise statement. It serves to direct the user or the retrieval system towards a highly detailed and specific answer or information, often of a practical or operational nature. It's the level that directly responds to the user's questions or needs.",
        type="string",
    ),
]

document_content_description = "Frequently asked questions"

llm = ChatOpenAI(temperature=0)
self_retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,
    document_content_description,  # what the documents contain
    metadata_field_info,           # metadata fields the query may filter on
    verbose=True,
)
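
A self-query retriever asks the model to turn a natural-language question into a search string plus a metadata filter, and then applies both to the vector store. The sketch below shows only the second half of that process, with the structured query written by hand and word overlap standing in for vector similarity. All names and documents here are hypothetical.

```python
docs = [
    {"page_content": "Show a valid ID at the reception.",
     "metadata": {"Header 1": "Residences", "Header 2": "Check-in"}},
    {"page_content": "Open the online form at the stated time.",
     "metadata": {"Header 1": "Residences", "Header 2": "Application"}},
]

# In the real retriever an LLM would produce this; here it is written by hand.
structured_query = {"query": "ID reception", "filter": {"Header 2": "Check-in"}}


def apply_structured_query(docs, structured_query):
    """Filter on metadata first, then keep documents sharing a query word."""
    wanted = structured_query["filter"]
    words = set(structured_query["query"].lower().split())
    matches = []
    for doc in docs:
        metadata_ok = all(doc["metadata"].get(k) == v for k, v in wanted.items())
        text_words = set(doc["page_content"].lower().rstrip(".").split())
        if metadata_ok and words & text_words:
            matches.append(doc)
    return matches


print(apply_structured_query(docs, structured_query))
```

The quality of the filter depends on how well the attribute descriptions explain each header level, which is why the descriptions above are fairly detailed.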

### Send your first request

In [11]:
rag_chain.invoke("Come posso fare l'ingresso in residenza? ")

"Per fare l'ingresso in residenza, devi presentarti alla reception con un documento di identità in corso di validità per effettuare il check-in e ricevere le chiavi della tua stanza. Al termine della tua permanenza, dovrai riconsegnare le chiavi alla reception dopo aver liberato la stanza dai tuoi effetti personali."

In [12]:
rag_chain_compressed.invoke("Come posso fare l'ingresso in residenza?")



"Per fare l'ingresso in residenza, devi avvisare telefonicamente il residente e recarti alla reception per compilare il registro con i dati dell'ospite esterno. L'ospite esterno dovrà firmare e depositare un documento di identità in reception."

In [13]:
compressed_docs = compression_retriever.get_relevant_documents("Come posso fare l'ingresso in residenza?")
pretty_print_docs(compressed_docs)



Document 1:

la presenza di ospiti esterni in residenza è consentita esclusivamente tra le ore 7.00 e le ore 24.00;
all'arrivo dell'ospite esterno, il residente viene avvisato telefonicamente e deve recarsi alla reception, dove dovrà compilare l'apposito registro indicando nome, cognome del proprio ospite e anche numero di matricola se bocconiano. Qualora vi fossero più ospiti contemporaneamente, il residente avrà cura di compilare una riga del registro per ogni ospite esterno. Il residente dovrà infine apporre la propria firma sul registro;
l'ospite esterno firma e deposita in reception un documento di identità (carta di identità, passaporto o patente), che potrà ritirare all'uscita. A questo punto è autorizzato a entrare in residenza;
all'uscita, l'ospite esterno si reca in reception per ritirare il proprio documento di identità e firmare il registro;
l'ospite esterno - così come il residente - si impegna a un comportamento civile e rispettoso, in particolar modo assicurandosi di non

In [40]:
self_retriever.invoke(" Come posso fare l'ingresso in residenza?")

[Document(page_content='Prima di accedere alla domanda prendi visione dei documenti utili (Regolamento Residenze Bocconi a.a. 2023-24 e Informativa privacy) disponibili al seguente link.  \nTieni a portata di mano le credenziali di accesso, cerca una connessione internet veloce e utilizza un unico dispositivo per accedere alla domanda online nel momento dell’apertura. All’apertura della domanda online, segui tutti i passaggi previsti:  \n> ACCEDI al link all\'orario di apertura indicato: ti troverai in una "waiting room" virtuale. Quando arriva il tuo turno, accedi inserendo le credenziali Bocconi (matricola/username e password).  \n> ENTRA NELLA SEZIONE "Accommodation choice"  \n> SELEZIONA LA RESIDENZA  \nSe non visualizzi alcuna opzione significa che i posti disponibili sono esauriti.  \n> SELEZIONA LA TIPOLOGIA DI CAMERA  \nSe non visualizzi alcuna opzione significa che i posti disponibili sono esauriti.  \n> CLICCA SU "SAVE" (salva) in fondo alla sezione  \nSe non riesci a cliccar

## Initialize Feedback Function(s)

In [28]:
from trulens_eval.feedback.provider import OpenAI
import numpy as np

# Initialize provider class
openai = OpenAI()

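# Select the context to be used in feedback. The location of the context is
# app specific: this selector is derived from `rqa_base`, so an app with a
# different structure (such as `rag_chain`) may need its own selector.
from trulens_eval.app import App
context = App.select_context(rqa_base)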

from trulens_eval.feedback import Groundedness
grounded = Groundedness(groundedness_provider=OpenAI())
# Define a groundedness feedback function
f_groundedness = (
    Feedback(grounded.groundedness_measure_with_cot_reasons)
    .on(context.collect()) # collect context chunks into a list
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)

# Question/answer relevance between overall question and answer.
f_qa_relevance = Feedback(openai.relevance).on_input_output()
# Question/statement relevance between question and each context chunk.
f_context_relevance = (
    Feedback(openai.qs_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
    )

✅ In groundedness_measure_with_cot_reasons, input source will be set to __record__.app.retriever.get_relevant_documents.rets.collect() .
✅ In groundedness_measure_with_cot_reasons, input statement will be set to __record__.main_output or `Select.RecordOutput` .
✅ In relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In qs_relevance, input question will be set to __record__.main_input or `Select.RecordInput` .
✅ In qs_relevance, input statement will be set to __record__.app.retriever.get_relevant_documents.rets .
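
The context relevance function above scores the question against each retrieved chunk and then reduces the per-chunk scores to one number with `np.mean`. The sketch below mirrors that score-then-aggregate pattern in plain Python. `overlap_score` is a toy word-overlap measure used as a stand-in for an LLM-graded score; it is not how TruLens computes relevance, and the chunks are made up.

```python
from statistics import mean


def overlap_score(question, statement):
    """Toy relevance in [0, 1]: fraction of question words found in the statement.

    Stand-in for an LLM-graded score; not how TruLens computes relevance.
    """
    question_words = set(question.lower().split())
    statement_words = set(statement.lower().split())
    if not question_words:
        return 0.0
    return len(question_words & statement_words) / len(question_words)


def context_relevance(question, chunks, aggregate=mean):
    # One score per retrieved chunk, then a single aggregated value.
    return aggregate(overlap_score(question, chunk) for chunk in chunks)


chunks = ["check in at the reception", "the gym opens early"]
print(context_relevance("where do i check in", chunks))
```

Swapping `aggregate` for `max` or `min` changes what a low score means: a mean penalises any irrelevant chunk, while `max` only asks that at least one chunk be relevant.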


## Instrument chain for logging with TruLens

In [15]:
tru_recorder = TruChain(rag_chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[f_qa_relevance, f_context_relevance, f_groundedness])

In [16]:
with tru_recorder as recording:
    llm_response = rag_chain.invoke("What is the purpose of the source?")

display(llm_response)

'The purpose of the source is to provide information and instructions for students regarding the temporary housing agreement and the application process for student residences. It also outlines the rules and regulations that residents must adhere to and the potential consequences for violating them. Additionally, it provides guidance on how to navigate the online application system and what to do if all available spots are filled.'

In [24]:
rag_chain.invoke("What is the purpose of the source?")

'The purpose of the source is to provide information and guidelines regarding the rules and regulations of the Bocconi residences. It also provides instructions and steps for students to follow when making a reservation for accommodation. Additionally, it explains the role of the Fees, Funding, and Housing Office in facilitating and supporting students in their temporary housing arrangements.'

In [29]:
tru_recorder2 = TruChain(rqa_base,
    app_id='Chain2_ChatApplication',
    feedbacks=[f_qa_relevance, f_context_relevance, f_groundedness])

In [30]:
with tru_recorder2 as recording:
    llm_response = rqa_base.invoke("Qual'è lo scopo delle resources")['result']

display(llm_response)

'Lo scopo delle risorse è soddisfare la richiesta di alloggio da parte degli studenti.'

## Retrieve records and feedback

In [15]:
# The record of the app invocation can be retrieved from the `recording`:

rec = recording.get() # use .get if only one record
# recs = recording.records # use .records if multiple

display(rec)

Record(record_id='record_hash_6efbf3747fe4eac0949a2e01d1777f23', app_id='Chain1_ChatApplication', cost=Cost(n_requests=0, n_successful_requests=0, n_classes=0, n_tokens=0, n_stream_chunks=0, n_prompt_tokens=0, n_completion_tokens=0, cost=0.0), perf=Perf(start_time=datetime.datetime(2024, 2, 9, 20, 53, 43, 825821), end_time=datetime.datetime(2024, 2, 9, 20, 53, 57, 497758)), ts=datetime.datetime(2024, 2, 9, 20, 53, 57, 514870), tags='-', meta=None, main_input='What is the purpose of the source?', main_output='The purpose of the source is to provide information and guidelines regarding the allocation and management of student accommodations at Bocconi University. It also outlines the rules and regulations that students must adhere to while living in the residences. Additionally, it explains the role of the Fees, Funding, and Housing Office in facilitating temporary housing arrangements for students.', main_error=None, calls=[RecordAppCall(stack=[RecordAppCallMethod(path=Lens().app, metho

In [16]:
# The results of the feedback functions can be retrieved from the record. These
# are `Future` instances (see `concurrent.futures`). You can use `as_completed`
# to wait until they have finished evaluating.
rec = recording.get()

from concurrent.futures import as_completed

for feedback_future in as_completed(rec.feedback_results):
    feedback, feedback_result = feedback_future.result()

    feedback: Feedback
    feedback_result: FeedbackResult

    display(feedback.name, feedback_result.result)


'relevance'

0.8

'qs_relevance'

0.2

'groundedness_measure_with_cot_reasons'

0.9
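
For readers less familiar with `concurrent.futures`, the sketch below shows the same `as_completed` pattern in isolation: several slow evaluations are submitted to a thread pool and each result is handled as soon as it is ready. The `evaluate` function, its delays, and the scores are hypothetical and only imitate feedback evaluation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def evaluate(name, score, delay):
    # Hypothetical evaluation: sleeps to imitate a slow call to a grading model.
    time.sleep(delay)
    return name, score


jobs = [("relevance", 0.8, 0.03), ("qs_relevance", 0.2, 0.01)]

results = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(evaluate, *job) for job in jobs]
    # as_completed yields each future as soon as it finishes,
    # so the order can differ from submission order.
    for future in as_completed(futures):
        name, score = future.result()
        results[name] = score

print(results)
```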

In [17]:
records, feedback = tru.get_records_and_feedback(app_ids=["Chain1_ChatApplication"])

records.head()

| | app_id | app_json | type | record_id | input | output | tags | record_json | cost_json | perf_json | ts | relevance | qs_relevance | groundedness_measure_with_cot_reasons | relevance_calls | qs_relevance_calls | groundedness_measure_with_cot_reasons_calls | latency | total_tokens | total_cost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Chain1_ChatApplication | {"tru_class_info": {"name": "TruChain", "modul... | RunnableSequence(langchain_core.runnables.base) | record_hash_6efbf3747fe4eac0949a2e01d1777f23 | "What is the purpose of the source?" | "The purpose of the source is to provide infor... | - | {"record_id": "record_hash_6efbf3747fe4eac0949... | {"n_requests": 0, "n_successful_requests": 0, ... | {"start_time": "2024-02-09T20:53:43.825821", "... | 2024-02-09T20:53:57.514870 | 0.8 | 0.2 | 0.9 | [{'args': {'prompt': 'What is the purpose of t... | [{'args': {'question': 'What is the purpose of... | [{'args': {'source': [[{'page_content': "Le\xa... | 13 | 0 | 0.0 |


In [20]:
tru.get_leaderboard(app_ids=["Chain2_ChatApplication"])

| app_id | relevance | groundedness_measure_with_cot_reasons | qs_relevance | latency | total_cost |
| --- | --- | --- | --- | --- | --- |
| Chain2_ChatApplication | 1.0 | 0.0 | 0.2 | 2.0 | 0.000748 |
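
A leaderboard of this kind summarises many records into one row per app. Assuming the summary is a per-app average of each numeric column, the aggregation can be sketched in plain Python as follows. The records and the `leaderboard` function are hypothetical and the values are made up; this is not the TruLens implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-record feedback scores; the values are made up.
records = [
    {"app_id": "app_a", "relevance": 0.5, "latency": 2.0},
    {"app_id": "app_a", "relevance": 1.0, "latency": 4.0},
    {"app_id": "app_b", "relevance": 0.25, "latency": 1.0},
]


def leaderboard(records, columns=("relevance", "latency")):
    """Average each column per app_id."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["app_id"]].append(record)
    return {
        app_id: {col: mean(r[col] for r in rows) for col in columns}
        for app_id, rows in grouped.items()
    }


print(leaderboard(records))
```

With only one recorded call per app, as in this notebook, each leaderboard row simply repeats that single record's scores, so more invocations are needed before the comparison between chains becomes meaningful.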


## Explore in a Dashboard

In [18]:
tru.run_dashboard() # open a local streamlit app to explore

# tru.stop_dashboard() # stop if needed

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.


Dashboard started at http://10.10.130.79:8502 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

Alternatively, you can run `trulens-eval` from a command line in the same folder to start the dashboard.

Note: Feedback functions evaluated in deferred mode can be seen on the "Progress" page of the TruLens dashboard.
