<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q">Community</a>
    </p>
</center>
<h1 align="center">Using a Local LLM</h1>

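This notebook shows how to run Phoenix evals with a locally hosted LLM, served here by [Ollama](https://ollama.com/). The cells below assume an Ollama server is listening on `http://localhost:11434` and that the `llama3` model is available to it.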

In [2]:
!pip install -qq "arize-phoenix-evals>=0.0.5" "litellm"

In [6]:
import nest_asyncio

nest_asyncio.apply()

In [7]:
import os

from phoenix.evals import LiteLLMModel

os.environ["OLLAMA_API_BASE"] = "http://localhost:11434"

model = LiteLLMModel(model="ollama/llama3")
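
Before calling the model, it can help to confirm that the Ollama server is actually reachable at the configured base URL. The sketch below is not part of Phoenix or LiteLLM: `model_names_from_tags` and `list_local_models` are hypothetical helper names, and the code assumes Ollama's `/api/tags` endpoint returns JSON with a top-level `models` list whose entries carry a `name` field.

```python
import json
import urllib.error
import urllib.request


def model_names_from_tags(payload):
    """Extract model names from a parsed /api/tags response (assumed shape)."""
    return [entry["name"] for entry in payload.get("models", [])]


def list_local_models(base_url="http://localhost:11434", timeout=5.0):
    """Return locally available model names, or None if the server cannot be queried."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names_from_tags(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply from something else on the port.
        return None
```

Calling `list_local_models()` returns `None` when nothing usable is listening on the port. Otherwise the returned names can be checked for an entry starting with `llama3`.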

In [5]:
model("Hello, world!")

"Hello there! It's great to meet you. I'm your friendly AI assistant, here to help with any questions or topics you'd like to discuss. What brings you to this corner of the internet today?"

In [13]:
from phoenix.evals import download_benchmark_dataset

df = download_benchmark_dataset(
    task="binary-relevance-classification", dataset_name="wiki_qa-train"
)
df.head()

|  | query_id | query_text | document_title | document_text | document_text_with_emphasis | relevant |
|---|---|---|---|---|---|---|
| 0 | Q1 | how are glacier caves formed? | Glacier cave | A partly submerged glacier cave on Perito More... | A partly submerged glacier cave on Perito More... | True |
| 1 | Q10 | how an outdoor wood boiler works | Outdoor wood-fired boiler | The outdoor wood boiler is a variant of the cl... | The outdoor wood boiler is a variant of the cl... | False |
| 2 | Q100 | what happens to the light independent reactio... | Light-independent reactions | The simplified internal structure of a chlorop... | The simplified internal structure of a chlorop... | True |
| 3 | Q1000 | where in the bible that palestine have no land... | Philistines | The Philistine cities of Gaza, Ashdod, Ashkelo... | The Philistine cities of Gaza, Ashdod, Ashkelo... | False |
| 4 | Q1001 | what are the test scores on asvab | Armed Services Vocational Aptitude Battery | The Armed Services Vocational Aptitude Battery... | The Armed Services Vocational Aptitude Battery... | False |


In [16]:
from phoenix.evals import (
    RAG_RELEVANCY_PROMPT_RAILS_MAP,
    RAG_RELEVANCY_PROMPT_TEMPLATE,
    llm_classify,
)

N_EVAL_SAMPLE_SIZE = 100

df_sample = df.sample(n=N_EVAL_SAMPLE_SIZE)

df_sample = df_sample.rename(
    columns={
        "query_text": "input",
        "document_text": "reference",
    },
)

rails = list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values())

relevance_df = llm_classify(
    dataframe=df_sample,
    template=RAG_RELEVANCY_PROMPT_TEMPLATE,
    model=model,
    rails=rails,
    concurrency=20,
)
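
The `rails` argument restricts the classifier to a fixed set of labels, which matters because a chat model may reply with a full sentence rather than a bare label. The sketch below illustrates the idea only. It is not Phoenix's implementation: `snap_to_rails` and the `NOT_PARSABLE` placeholder are hypothetical names, and the rails are assumed to be `relevant` and `unrelated`.

```python
import re

NOT_PARSABLE = "NOT_PARSABLE"  # hypothetical placeholder for output that matches no single rail


def snap_to_rails(raw_output, rails):
    """Return the one rail mentioned as a whole word in raw_output, else NOT_PARSABLE."""
    text = raw_output.lower()
    hits = [
        rail
        for rail in rails
        if re.search(rf"\b{re.escape(rail.lower())}\b", text)
    ]
    # Zero matches or several matches are both treated as ambiguous.
    return hits[0] if len(hits) == 1 else NOT_PARSABLE


example_rails = ["relevant", "unrelated"]  # assumed rail values
print(snap_to_rails("The document is unrelated to the question.", example_rails))
```

Whole-word matching keeps a reply such as "irrelevant" from being counted as `relevant`. In this sketch it falls through to the placeholder instead.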


In [18]:
relevance_df.head(20)

|  | label |
|---|---|
| 87 | relevant |
| 1632 | relevant |
| 2069 | relevant |
| 1974 | relevant |
| 39 | relevant |
| 1379 | relevant |
| 1947 | relevant |
| 1862 | relevant |
| 147 | relevant |
| 973 | relevant |
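
Every label visible in the output above is `relevant`, so it is worth comparing the predictions with the ground-truth `relevant` column of the benchmark before trusting them. The following is a minimal sketch in plain Python: `LABEL_FOR`, `label_distribution`, and `accuracy` are hypothetical names, the mapping from booleans to label strings is an assumption, and the toy lists stand in for real data.

```python
from collections import Counter

# Assumed mapping from the boolean ground truth to the label strings used as rails.
LABEL_FOR = {True: "relevant", False: "unrelated"}


def label_distribution(predicted):
    """Count how often each predicted label occurs."""
    return Counter(predicted)


def accuracy(predicted, relevant_flags, label_for=LABEL_FOR):
    """Fraction of predictions equal to the label implied by the ground-truth flag."""
    pairs = list(zip(predicted, relevant_flags))
    if not pairs:
        raise ValueError("no predictions to score")
    correct = sum(pred == label_for[bool(flag)] for pred, flag in pairs)
    return correct / len(pairs)


# Toy data for illustration only, not results from the notebook.
toy_predicted = ["relevant", "relevant", "relevant", "unrelated"]
toy_truth = [True, False, True, False]
print(label_distribution(toy_predicted))
print(accuracy(toy_predicted, toy_truth))
```

With the notebook's frames, the equivalent call would be `accuracy(relevance_df["label"], df_sample["relevant"])`, assuming the rows of `relevance_df` are in the same order as those of `df_sample`.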
