# Ground Truth Evaluations

In this quickstart you will evaluate a _LangChain_ app using ground truth. Ground truth evaluation is especially useful during early LLM experiments, when you have a small set of example queries that are critical to get right.

Ground truth evaluation works by measuring how similar an LLM response is to the verified response for the same query.
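
As a rough illustration of the idea, the sketch below looks up the verified response for a query and scores a candidate response against it. It uses the standard library's `difflib.SequenceMatcher` as a stand-in string similarity. That choice is an assumption made for illustration, not the measure TruLens computes; the `GroundTruthAgreement` feedback used later in this quickstart relies on an LLM provider. The names `toy_golden_set` and `toy_agreement` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical golden set: each entry pairs a query with its verified response.
toy_golden_set = [
    {"query": "who invented the lightbulb?", "response": "Thomas Edison"},
]


def toy_agreement(query, response, golden_set):
    """Return a 0-1 similarity between `response` and the verified response.

    Returns None when the query has no entry in the golden set.
    """
    expected = next(
        (row["response"] for row in golden_set if row["query"] == query), None
    )
    if expected is None:
        return None
    # Case-insensitive character-level similarity; a stand-in metric only.
    return SequenceMatcher(None, expected.lower(), response.lower()).ratio()


exact = toy_agreement("who invented the lightbulb?", "thomas edison", toy_golden_set)
wrong = toy_agreement("who invented the lightbulb?", "Nikola Tesla", toy_golden_set)
print(exact, wrong)
```

A character-level metric like this one penalizes correct answers that are merely phrased differently, which is one reason a semantic comparison is usually preferable in practice.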

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/examples/expositional/frameworks/langchain/langchain_groundtruth.ipynb)

### Import from _LangChain_ and TruLens

In [None]:
# !pip install --pre trulens trulens-apps-langchain trulens-providers-huggingface trulens-providers-openai "langchain>=0.0.342" langchain_community

In [None]:
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate
from langchain.prompts import HumanMessagePromptTemplate
from langchain.prompts import PromptTemplate
from langchain_community.llms import OpenAI
from trulens.core import Feedback
from trulens.core import TruSession
from trulens.feedback import GroundTruthAgreement
from trulens.providers.huggingface import Huggingface
from trulens.providers.openai import OpenAI as fOpenAI

tru = TruSession()

### Add API keys
For this quickstart, you will need OpenAI and Hugging Face API keys.

In [None]:
import os

os.environ["HUGGINGFACE_API_KEY"] = "hf_..."
os.environ["OPENAI_API_KEY"] = "sk-..."
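
Before going further, it can be useful to confirm that both keys are actually set. The helper below is a minimal, hypothetical sketch (`missing_keys` is not part of TruLens or LangChain); it only checks that each variable is present and non-empty, not that the key is valid.

```python
import os


def missing_keys(names, env=None):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


required = ["HUGGINGFACE_API_KEY", "OPENAI_API_KEY"]
missing = missing_keys(required)
if missing:
    print("Missing API keys:", ", ".join(missing))
```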

### Create Simple LLM Application

This example uses _LangChain_ with an OpenAI LLM.

In [None]:
full_prompt = HumanMessagePromptTemplate(
    prompt=PromptTemplate(
        template="Provide an answer to the following: {prompt}",
        input_variables=["prompt"],
    )
)

chat_prompt_template = ChatPromptTemplate.from_messages([full_prompt])

llm = OpenAI(temperature=0.9, max_tokens=128)

chain = LLMChain(llm=llm, prompt=chat_prompt_template, verbose=True)

## Initialize Feedback Function(s)

In [None]:
golden_set = [
    {"query": "who invented the lightbulb?", "response": "Thomas Edison"},
    {"query": "¿quien invento la bombilla?", "response": "Thomas Edison"},
]

f_groundtruth = Feedback(
    GroundTruthAgreement(golden_set, provider=fOpenAI()).agreement_measure,
    name="Ground Truth",
).on_input_output()

# Define a language match feedback function using Hugging Face.
hugs = Huggingface()
f_lang_match = Feedback(hugs.language_match).on_input_output()
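
The queries sent to the chain later in this quickstart match the golden set entries verbatim, so it can help to sanity-check the list before running an evaluation. The helper below is a hypothetical sketch, not part of TruLens. It assumes only the `query`/`response` dictionary shape used in `golden_set`, and it treats duplicate queries as a problem on the assumption that entries are looked up by their exact query text.

```python
def check_golden_set(golden_set):
    """Return a list of human-readable problems found in a golden set."""
    problems = []
    seen = set()
    for i, row in enumerate(golden_set):
        # Every entry needs non-empty string values for both keys.
        for key in ("query", "response"):
            value = row.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"entry {i}: missing or empty '{key}'")
        # Flag repeated queries, which would make the expected answer ambiguous.
        query = row.get("query")
        if isinstance(query, str):
            if query in seen:
                problems.append(f"entry {i}: duplicate query {query!r}")
            seen.add(query)
    return problems


example_set = [
    {"query": "who invented the lightbulb?", "response": "Thomas Edison"},
    {"query": "¿quien invento la bombilla?", "response": "Thomas Edison"},
]
print(check_golden_set(example_set))  # prints []
```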

## Instrument chain for logging with TruLens

In [None]:
from trulens.apps.langchain import TruChain

tc = TruChain(chain, feedbacks=[f_groundtruth, f_lang_match])

In [None]:
# The instrumented chain can operate as a context manager:
with tc as recording:
    chain("¿quien invento la bombilla?")
    chain("who invented the lightbulb?")
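
To make the recording pattern concrete, here is a toy context manager that collects calls made inside a `with` block. It illustrates the pattern only and says nothing about how `TruChain` is implemented. `ToyRecorder`, its `records` attribute, and `fake_app` are invented for this sketch.

```python
class ToyRecorder:
    """Wrap a callable and record its calls only while inside a `with` block."""

    def __init__(self, func):
        self.func = func
        self.records = []
        self._active = False

    def __enter__(self):
        self._active = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self._active = False
        return False  # do not suppress exceptions

    def __call__(self, prompt):
        output = self.func(prompt)
        if self._active:
            self.records.append({"input": prompt, "output": output})
        return output


def fake_app(prompt):
    # Stand-in for an LLM call.
    return prompt.upper()


recorder = ToyRecorder(fake_app)
with recorder as recording:
    recorder("hello")
recorder("not recorded")  # outside the block, so nothing is stored
print(recording.records)
```

Unlike this toy, the quickstart calls the original `chain` inside the block rather than the wrapper object.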

## Explore in a Dashboard

In [None]:
from trulens.dashboard import run_dashboard

run_dashboard(tru)  # open a local streamlit app to explore

# stop_dashboard(tru)  # stop if needed; requires importing stop_dashboard from trulens.dashboard