# phospho Quickstart

In this quickstart, we will use the `lab` from the `phospho` package to run an event extraction task on a dataset.
First, we will run on a subset of the dataset with several models:
- the OpenAI API
- the Mistral AI API
- a local Ollama model

Then, we will use the `lab` optimizer to find the best model and hyperparameters for the task in terms of performance, speed, and price.

Finally, we will use the `lab` to run the best model on the full dataset and compare the results with the subset.

Feel free to use only the APIs or Ollama models you want.

## Installation and setup

You will need:
- an OpenAI API key (find yours [here](https://platform.openai.com/api-keys))
- a Mistral AI API key (find yours [here](https://console.mistral.ai/api-keys/))
- Ollama running on your local machine, with the Mistral 7B model installed. You can find the installation instructions for Ollama [here](https://ollama.com)

In [26]:
!pip install -q python-dotenv phospho

In [2]:
# Load and check env variables
import os
from dotenv import load_dotenv

load_dotenv()

from phospho import config

# Check the environment variables
assert config.OPENAI_API_KEY is not None, "You need to set the OPENAI_API_KEY environment variable" 
assert config.MISTRAL_API_KEY is not None, "You need to set the MISTRAL_API_KEY environment variable"
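
If you prefer to see every missing key at once instead of stopping at the first failed assertion, a minimal sketch along these lines may help. It reads `os.environ` directly rather than going through `phospho.config`, and the helper name `missing_keys` is hypothetical, not part of phospho.

```python
import os

# Key names taken from the assertions above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "MISTRAL_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty.

    `missing_keys` is a hypothetical helper, not a phospho function.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All API keys are set.")
```

Passing a plain dict as `env` makes the check easy to try without touching your real environment.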



In [27]:
!pip install -q ollama

In [32]:
import ollama

try:
  # Let's check we can reach your local Ollama API
  response = ollama.chat(model='mistral', messages=[
    {
      'role': 'user',
      'content': 'What is the best French cheese? Keep your answer short.',
    },
  ])
  print(response['message']['content'])
except Exception as e:
  print(f"Error: {e}")
  print("You need to have a local Ollama server running to continue and the mistral model downloaded. \nRemove references to Ollama otherwise.")

Error: [Errno 61] Connection refused
You need to have a local Ollama server running to continue and the mistral model downloaded. 
Remove references to Ollama otherwise.


## Define the phospho workload and jobs

In [5]:
from phospho import lab
from typing import Literal

# Create a workload in our lab
workload = lab.Workload()

# Setup the configs for our job
# Models are ordered from the least desired to the most desired
class EventConfig(lab.JobConfig):
    event_name: str
    event_description: str
    model_id: Literal["openai:gpt-4", "mistral:mistral-large-latest", "mistral:mistral-small-latest", "ollama:mistral-7B"] = "openai:gpt-4"

# Add our job to the workload
workload.add_job(
    lab.Job(
        name="sync_event_detection",
        id="question_answering",
        config=EventConfig(
            event_name="Question Answering",
            event_description="User asks a question to the assistant",
            model_id="openai:gpt-4"
        )
    )
)




## Loading a message dataset

Let's load a dataset of messages from Hugging Face, so we can run our extraction job on it.

In [6]:
!pip install -q datasets

In [7]:
from datasets import load_dataset

dataset = load_dataset("daily_dialog")

In [8]:
# Generate a sub dataset with 30 messages
sub_dataset = dataset["train"].select(range(30))

# Let's print one of the messages
print(sub_dataset[0]["dialog"][0])

# Build the message list for our lab
messages = []
for row in sub_dataset:
    text = row["dialog"][0]
    messages.append(lab.Message(content=text))

Say , Jim , how about going for a few beers after dinner ? 
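
The loop above keeps only the first utterance of each dialog. As a rough sketch, the same extraction can be written as a small helper that also skips rows with an empty dialog. The rows below are plain dicts that mimic the `dialog` field. Only the first utterance comes from the output above; the other rows are made up for illustration. Stripping whitespace is an assumption on my side; the original loop keeps the text as-is.

```python
def first_utterances(rows):
    """Collect the first utterance of each dialog, skipping empty ones.

    Hypothetical helper; the whitespace stripping is an added assumption.
    """
    texts = []
    for row in rows:
        dialog = row.get("dialog") or []
        if dialog and dialog[0].strip():
            texts.append(dialog[0].strip())
    return texts


# Illustrative rows shaped like the dataset rows used above.
rows = [
    {"dialog": ["Say , Jim , how about going for a few beers after dinner ? "]},
    {"dialog": []},  # made-up empty dialog, skipped
    {"dialog": ["Made-up first line .", "Made-up reply ."]},
]

print(first_utterances(rows))
```

Each returned string could then be wrapped in `lab.Message(content=...)` as in the cell above.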


In [9]:
# Run the lab on it
# The job will be run with the default model (openai:gpt-4)
workload_results = await workload.async_run(messages=messages, executor_type="parallel")
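
The top-level `await` above works because notebooks already run an event loop. In a plain Python script you would typically wrap the call in a coroutine and start it with `asyncio.run`. The sketch below shows only that pattern: `run_workload` is a hypothetical stand-in for the `workload.async_run(...)` call and does not use phospho.

```python
import asyncio


async def run_workload(messages):
    """Hypothetical stand-in for `await workload.async_run(messages=...)`."""
    await asyncio.sleep(0)  # yield to the event loop, as a real async call would
    return {index: message for index, message in enumerate(messages)}


def main():
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(run_workload(["hello", "how are you?"]))


results = main()
print(len(results))
```

In a real script, the body of `run_workload` would contain the `await workload.async_run(...)` line from the cell above.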

### Compute the results with the alternative configurations

In [10]:
# Compute alternative results with the Mistral API and Ollama
await workload.async_run_on_alternative_configurations(messages=messages, executor_type="parallel")

In [11]:
workload.jobs[0].config.model_id

'openai:gpt-4'

### Apply the optimizer to the pipeline

For the purpose of this demo, we consider a configuration good enough if it matches gpt-4 on at least 80% of the dataset. Good old Pareto.
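
To make "matches gpt-4 on at least 80% of the dataset" concrete, here is a minimal sketch of an agreement rate between two lists of boolean outputs. This is an illustration of the idea, not phospho's actual implementation; `agreement_rate` is a hypothetical helper and the labels are made up.

```python
def agreement_rate(reference, candidate):
    """Fraction of positions where the candidate output equals the reference."""
    if len(reference) != len(candidate):
        raise ValueError("Both lists must have the same length")
    if not reference:
        raise ValueError("Cannot compute agreement on empty lists")
    matches = sum(ref == cand for ref, cand in zip(reference, candidate))
    return matches / len(reference)


# Made-up labels for 30 messages: the candidate disagrees on 5 of them.
reference = [True] * 15 + [False] * 15
candidate = [True] * 12 + [False] * 3 + [False] * 13 + [True] * 2

rate = agreement_rate(reference, candidate)
print(f"agreement: {rate:.3f}, good enough: {rate >= 0.8}")
```

With 25 matching outputs out of 30, the rate is 25/30, which clears the 0.8 threshold.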

In [13]:
workload.optimize_jobs(accuracy_threshold=0.8)

2024-02-28 20:30:22,652 INFO phospho.lab.lab: Found a less costly config with accuracy of 0.8333333333333334. Swapping to it.


accuracies: [0.8333333333333334, 0.8333333333333334, 0.6666666666666666]


In [15]:
# let's check the new model_id (if it has changed)
workload.jobs[0].config.model_id

'mistral:mistral-small-latest'
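
One way to read this result: walk the model list from the most desired to the least desired and keep the first alternative that meets the threshold. The sketch below illustrates that rule under stated assumptions. It is not phospho's implementation, `pick_model` is a hypothetical helper, and mapping the three logged accuracies to the three alternative models in declaration order is my assumption.

```python
def pick_model(model_ids, accuracies, threshold, reference_id):
    """Return the most desired model whose accuracy meets the threshold.

    `model_ids` is ordered from least desired to most desired.
    Falls back to the reference model if no alternative qualifies.
    """
    for model_id in reversed(model_ids):
        if model_id == reference_id:
            continue
        if accuracies.get(model_id, 0.0) >= threshold:
            return model_id
    return reference_id


model_ids = [
    "openai:gpt-4",
    "mistral:mistral-large-latest",
    "mistral:mistral-small-latest",
    "ollama:mistral-7B",
]

# Assumed mapping of the logged accuracies to the alternative models.
accuracies = {
    "mistral:mistral-large-latest": 0.8333333333333334,
    "mistral:mistral-small-latest": 0.8333333333333334,
    "ollama:mistral-7B": 0.6666666666666666,
}

best = pick_model(model_ids, accuracies, threshold=0.8, reference_id="openai:gpt-4")
print(best)
```

Under these assumptions the local Ollama model falls below the threshold, so the rule lands on `mistral:mistral-small-latest`, consistent with the output above.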

## Run our workload on the full dataset, with optimized parameters

For the purpose of this demo, we will only run the optimal configuration on a fraction of the dataset, to save time and money.

In [16]:
sub_dataset = dataset["train"].select(range(200)) # Here you can just leave it as dataset["train"] if you want to use the whole dataset

# Build the message list for our lab
messages = []
for row in sub_dataset:
    text = row["dialog"][0]
    messages.append(lab.Message(content=text))

In [17]:
# The job will be run with the best model (mistral:mistral-small-latest in our case)
workload_results = await workload.async_run(messages=messages, executor_type="parallel")

## Analyze the results

In [25]:
boolean_result = []

# Go through the dict
for key, value in workload_results.items():
    result = value['question_answering'].value
    boolean_result.append(result)

# Let's count the number of True and False
true_count = boolean_result.count(True)
false_count = boolean_result.count(False)

print(f"In the dataset, {true_count/len(boolean_result)*100}% of the messages are a question. The rest are not.")


In the dataset, 44.5% of the messages are a question. The rest are not.
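
The percentage above is simply the number of `True` results divided by the total. As a small sketch, the same computation can be wrapped in a helper that guards against an empty result list and formats the output. `share_true` is a hypothetical name, and the list below is synthetic, sized to reproduce the 44.5% figure on 200 messages.

```python
def share_true(values):
    """Percentage of entries that are exactly True."""
    if not values:
        raise ValueError("No results to analyze")
    return 100 * sum(1 for value in values if value is True) / len(values)


# Synthetic results: 89 True out of 200, matching the share reported above.
values = [True] * 89 + [False] * 111

print(f"{share_true(values):.1f}% of the messages are questions.")
```

Checking `value is True` means that missing or non-boolean results count as "not a question" rather than raising an error.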


## Going further

You can use the `lab` to run other tasks, such as:
- Named Entity Recognition
- Sentiment Analysis
- Evaluations
- And more!

You can also play around with different models, different hyperparameters, and different datasets.

Want this kind of analysis on your own LLM app, in real time? Check out the cloud-hosted version of phospho, available on [phospho.ai](https://phospho.ai).