## Evaluating a RAG pipeline with e5 embeddings and Mixtral 8x7B using the HHEM evaluator

In this quickstart, you'll learn to build and evaluate a RAG (retrieval-augmented generation) pipeline with Hugging Face models, using:

1. **intfloat/multilingual-e5-large-instruct** as the embedding function.
2. **Mixtral-8x7B** (the Nous-Hermes-2 DPO fine-tune) as the LLM (Large Language Model).
3. **ChromaDB** as the vector database.
4. **Vectara HHEM** as the evaluator.


# !pip install langchain==0.0.354 langchain-community==0.0.20 langchain-core==0.1.23

In [87]:
import sys
# Use a forward slash: in "trulens\trulens_eval" the "\t" is parsed as a tab character
sys.path.append("trulens/trulens_eval")
from trulens_eval.feedback.provider.hugs import Huggingface


In [1]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
import os
from langchain.document_loaders import DirectoryLoader
import json

## Load your documents from the data directory

In [2]:
loader = DirectoryLoader('./data/', glob="./*.txt", loader_cls=TextLoader)
documents = loader.load()


## Split the document text into chunks to feed into ChromaDB

In [3]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
texts = text_splitter.split_documents(documents)
print(len(texts))

108
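To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification and not the algorithm used by `RecursiveCharacterTextSplitter`, which also tries to break on separators such as paragraphs and sentences. The function name `chunk_text` and the sample string are hypothetical, chosen only for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, so text that straddles a chunk boundary still appears intact in at least one chunk.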


In [None]:
import getpass

inference_api_key = getpass.getpass("Enter your HF Inference API Key:\n\n")


## Add the Hugging Face e5 embedding model

In [10]:
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings

embedding_function = HuggingFaceInferenceAPIEmbeddings(
    api_key=inference_api_key, model_name="intfloat/multilingual-e5-large-instruct"
)
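The instruct variants of e5 are typically queried with a one-line task description prepended to the query, while documents are embedded as-is. The sketch below shows that formatting step only. The helper name `format_instruct_query` and the task wording are assumptions for illustration, and whether the prefix improves retrieval on this dataset has not been tested here.

```python
def format_instruct_query(task_description: str, query: str) -> str:
    """Prefix a query with a task instruction (hypothetical helper)."""
    return f"Instruct: {task_description}\nQuery: {query}"


# Illustrative task wording; adjust it to describe your own retrieval task
task = "Given a question, retrieve passages that answer the question"
formatted_query = format_instruct_query(task, "what is SpaceX")
print(formatted_query)
```

If you try this, pass the formatted string only when searching; the stored document chunks would stay unprefixed.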



In [11]:
db = Chroma.from_documents(texts, embedding_function)

## Get the relevant context docs for the query

In [88]:
query = "what is SpaceX"
docs = db.similarity_search(query)
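Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors are closest to it. The brute-force sketch below ranks toy 2-dimensional vectors by cosine similarity. It is only an illustration: Chroma uses an index and its own configured distance metric, and the function names and toy vectors here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy embeddings standing in for real e5 vectors
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
toy_query = [1.0, 0.0]
ranked = top_k(toy_query, toy_docs, k=2)
print(ranked)
# [0, 2]
```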

In [89]:
# Concatenate the retrieved chunks into a single context string
content = "".join(doc.page_content for doc in docs)

## Query the Mixtral 8x7B LLM with the retrieved context

In [90]:
import requests

def query_model(content, query):
    url = "https://api-inference.huggingface.co/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO"
    headers = {
        # Reuse the API key entered earlier in the notebook
        "Authorization": f"Bearer {inference_api_key}",
        "Content-Type": "application/json"
    }

    data = {
        "inputs": f"answer the following question from the information given Question:{query}\nInformation:{content}\n"
    }

    try:
        response = requests.post(url, headers=headers, json=data)
        response.raise_for_status()
        response_data = response.json()

        # Extract the generated text from the response
        generated_text = response_data[0]['generated_text']
        # Remove the input text from the generated text
        response_text = generated_text[len(data['inputs']):]

        return response_text
    except requests.exceptions.RequestException as e:
        print("Error:", e)
        return None

# Example usage:
context_info = content
question = "what is SpaceX"
result = query_model(context_info, question)
if result:
    print("Response:", result)
else:
    print("No response received.")


Response: 
Dorsey responded to more questions, including one where he discussed Twitter founder Jack Dorsey’s view on Musk’s ownership of Twitter. When asked if there is anything Dorsey admires about Musk, he answered, “The people he sometimes listens to.”

Musk said in a recent TED interview that he can’t guarantee free speech on Twitter, despite his claims it’s the “digital town square.” Dorsey said he’s not surprised
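`query_model` slices the prompt off the front of `generated_text` unconditionally, which assumes the endpoint always echoes the prompt back. A slightly more defensive sketch checks for the echo first. The helper names and the stand-in context and response below are hypothetical and do not come from the notebook's data.

```python
def build_prompt(query: str, content: str) -> str:
    """Build the same prompt string that query_model sends."""
    return (
        "answer the following question from the information given "
        f"Question:{query}\nInformation:{content}\n"
    )


def strip_prompt_echo(generated_text: str, prompt: str) -> str:
    """Remove the prompt only if the model actually echoed it back."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


# Stand-in values that mimic the shape of the API response
prompt = build_prompt("what is SpaceX", "SpaceX is a rocket company.")
fake_response = [{"generated_text": prompt + "SpaceX builds rockets."}]
answer = strip_prompt_echo(fake_response[0]["generated_text"], prompt)
print(answer)
# SpaceX builds rockets.
```

With this check, a response that does not repeat the prompt is returned whole instead of losing its first characters.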


## Pass the retrieved context and the LLM response to the HHEM evaluator to get a score between 0 and 1

In [92]:
huggingface_provider = Huggingface()
score = huggingface_provider.hallucination_evaluator(result,content)
print(score)

0.258838027715683
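Scores closer to 1 indicate that the response is better supported by the retrieved context, and scores closer to 0 suggest hallucination. A minimal sketch for turning the score into a label follows. The 0.5 cut-off and the helper name `interpret_hhem_score` are assumptions for illustration; pick a threshold that suits your own tolerance for unsupported answers.

```python
def interpret_hhem_score(score: float, threshold: float = 0.5) -> str:
    """Map a 0-to-1 consistency score to a coarse label (assumed threshold)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "consistent" if score >= threshold else "possible hallucination"


label = interpret_hhem_score(0.258838027715683)
print(label)
# possible hallucination
```

Under this assumed threshold, the score printed above would be flagged for review.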
