<div class="align-center">
<a href="https://oumi.ai/"><img src="https://oumi.ai/docs/en/latest/_static/logo/header_logo.png" height="200"></a>

[![Documentation](https://img.shields.io/badge/Documentation-latest-blue.svg)](https://oumi.ai/docs/en/latest/index.html)
[![Discord](https://img.shields.io/discord/1286348126797430814?label=Discord)](https://discord.gg/oumi)
[![GitHub Repo stars](https://img.shields.io/github/stars/oumi-ai/oumi)](https://github.com/oumi-ai/oumi)
<a target="_blank" href="https://colab.research.google.com/github/oumi-ai/oumi/blob/main/configs/projects/halloumi/halloumi_classifier_inference_notebook.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
</div>

👋 Welcome to Open Universal Machine Intelligence (Oumi)!

🚀 Oumi is a fully open-source platform that streamlines the entire lifecycle of foundation models - from [data preparation](https://oumi.ai/docs/en/latest/resources/datasets/datasets.html) and [training](https://oumi.ai/docs/en/latest/user_guides/train/train.html) to [evaluation](https://oumi.ai/docs/en/latest/user_guides/evaluate/evaluate.html) and [deployment](https://oumi.ai/docs/en/latest/user_guides/launch/launch.html). Whether you're developing on a laptop, launching large-scale experiments on a cluster, or deploying models in production, Oumi provides the tools and workflows you need.

🤝 Make sure to join our [Discord community](https://discord.gg/oumi) to get help, share your experiences, and contribute to the project! If you are interested in joining one of the community's open-science efforts, check out our [open collaboration](https://oumi.ai/community) page.

⭐ If you like Oumi and you would like to support it, please give it a star on [GitHub](https://github.com/oumi-ai/oumi).

# HallOumi Inference

This notebook demonstrates how to run inference locally with the HallOumi 8B classifier.

Note that this is the **classifier (non-generative)** flavor of the HallOumi family.
This model lacks the conveniences of the generative HallOumi (per-sentence classification, citations, and explanations), but is much more computationally efficient.
It can be a good alternative when compute costs and latency are important for your use case.
If you are interested in the generative version of HallOumi, please see [this notebook](https://github.com/oumi-ai/oumi/blob/main/configs/projects/halloumi/halloumi_inference_notebook.ipynb) instead.

For more details on HallOumi, please read our [GitHub documentation](https://github.com/oumi-ai/oumi/blob/main/configs/projects/halloumi/README.md) and our [technical overview](https://oumi.ai/blog/posts/introducing-halloumi).

## Prerequisites

Please install the following packages before the inference walkthrough.

In [None]:
%pip install transformers
%pip install torch
%pip install scipy

## Inference Walkthrough

### Dataset

Let's start by defining a toy dataset, where each example consists of a `context` and a `claim`. 

In [1]:
toy_dataset = [
    {
        "context": "Today is a sunny day.",
        "claim": "It is not raining today.",
    },
    {
        "context": "James is a software engineer. He works at a tech company.",
        "claim": "James loves his tech job.",
    },
]

We then convert these examples to a list of prompts. To do so, we take the HallOumi classifier's prompt template (`PROMPT_TEMPLATE`, shown below) and fill its `{context}` and `{claim}` placeholders with each example's values.

In [2]:
PROMPT_TEMPLATE = "<context>\n{context}\n</context>\n\n<claims>\n{claim}\n</claims>"

prompts = []
for example in toy_dataset:
    prompt = PROMPT_TEMPLATE.format(context=example["context"], claim=example["claim"])
    prompts.append(prompt)
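
To make the template concrete, the following minimal sketch renders the first toy example and prints the resulting prompt. It reuses only the `PROMPT_TEMPLATE` string and the example values defined above, and nothing else is assumed.

```python
# Same template as in the cell above.
PROMPT_TEMPLATE = "<context>\n{context}\n</context>\n\n<claims>\n{claim}\n</claims>"

# First example from the toy dataset.
example = {
    "context": "Today is a sunny day.",
    "claim": "It is not raining today.",
}

prompt = PROMPT_TEMPLATE.format(context=example["context"], claim=example["claim"])
print(prompt)
# <context>
# Today is a sunny day.
# </context>
#
# <claims>
# It is not raining today.
# </claims>
```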

### Inference

#### Loading Model

Next, we load the [HallOumi classifier model](https://huggingface.co/oumi-ai/HallOumi-8B-classifier) (`oumi-ai/HallOumi-8B-classifier`) from Hugging Face, together with its corresponding tokenizer.

In [3]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained(
    "oumi-ai/HallOumi-8B-classifier"
)
tokenizer = AutoTokenizer.from_pretrained("oumi-ai/HallOumi-8B-classifier")

#### Tokenizing

Using the tokenizer instantiated above, we tokenize the prompts to query HallOumi.

In [4]:
inputs = tokenizer(prompts, padding=True, truncation=True, return_tensors="pt")
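
Because the two prompts have different lengths, `padding=True` pads the shorter one so that the batch forms a rectangular tensor, and the attention mask marks which positions are real tokens. The toy sketch below illustrates the idea in plain Python. The `pad_batch` helper, the token IDs, and the pad ID of `0` are all made up for illustration; the actual tokenizer decides the pad token and the padding side from its own configuration.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad lists of token IDs to equal length and build attention masks.

    Hypothetical helper for illustration only; not the tokenizer's implementation.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token IDs, not real tokenizer output.
batch = pad_batch([[11, 12, 13], [21, 22, 23, 24, 25]])
print(batch["input_ids"])       # [[11, 12, 13, 0, 0], [21, 22, 23, 24, 25]]
print(batch["attention_mask"])  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```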

#### Running Inference

The following snippet runs inference, extracts the logits, and applies softmax to convert them into class probabilities.

In [5]:
import torch
from scipy.special import softmax

with torch.no_grad():
    outputs = model(**inputs)

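# Cast to float32 and move to CPU so that the NumPy conversion also works
# for half-precision or GPU tensors.
logits = outputs.logits.float().cpu()
probabilities = softmax(logits.numpy(), axis=-1)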
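
For intuition about the softmax step, here is a small standard-library sketch that turns one pair of logits into probabilities. The logit values are hypothetical and were chosen only to make the arithmetic easy to follow; the local `softmax` function is a stand-in for `scipy.special.softmax` applied to a single row.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    max_logit = max(logits)  # Subtracting the max avoids overflow in exp().
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one example: [non-hallucination, hallucination].
probs = softmax([2.0, 0.0])
print([round(p, 3) for p in probs])  # [0.881, 0.119]
```

The probabilities in each row sum to 1, so the hallucination probability and the non-hallucination probability are complements of each other.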

### Inspecting the results

The last step is to iterate over the results. As shown below, HallOumi correctly identifies the first example as a non-hallucination (with probability 100% - 10% = 90%) and the second example as a hallucination (with probability 99%).

In [6]:
for probability, example in zip(probabilities, toy_dataset):
    # Class index 1 is the hallucination (positive) class.
    hallucination_prob = probability[1]
    prediction = probability.argmax()
    prediction_str = "Hallucination" if prediction == 1 else "Non-Hallucination"

    # Print the results for inspection.
    print(f"Context: {example['context']}")
    print(f"Claim: {example['claim']}")
    print(f"Prediction: `{prediction_str}`")
    print(f"Hallucination probability: {hallucination_prob:.0%}\n")

Context: Today is a sunny day.
Claim: It is not raining today.
Prediction: `Non-Hallucination`
Hallucination probability: 10%

Context: James is a software engineer. He works at a tech company.
Claim: James loves his tech job.
Prediction: `Hallucination`
Hallucination probability: 99%
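
With two classes, taking the `argmax` amounts to a 0.5 cutoff on the hallucination probability. If your use case calls for a stricter or more lenient cutoff, one option is to threshold the probability directly. The sketch below is only an illustration: the `label_from_probability` helper and its `threshold` parameter are hypothetical and not part of HallOumi or Oumi, and the probabilities are the rounded values printed above.

```python
def label_from_probability(hallucination_prob, threshold=0.5):
    """Map a hallucination probability to a label using a custom cutoff.

    Hypothetical helper for illustration only.
    """
    if hallucination_prob >= threshold:
        return "Hallucination"
    return "Non-Hallucination"


# Rounded probabilities from the two toy examples above.
for prob in (0.10, 0.99):
    print(f"{prob:.0%} -> {label_from_probability(prob)}")

# A much stricter cutoff would also flag the first example.
print(label_from_probability(0.10, threshold=0.05))
```

A lower threshold flags more claims as hallucinations, trading false negatives for false positives; the right value depends on your application.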

