JudgmentLabs/osiris-detection

Osiris: Lightweight Hallucination Evaluation Model

🔥 News

Usage

from src.data.perturb_musique import DatasetPerturbator

perturbator = DatasetPerturbator(
    dataset_path="/path/to/your/dataset.jsonl",
    output_dir="/path/to/save/perturbed/dataset"
)

perturbator.perturb()
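
Before running the perturbation, it can help to confirm that the input file is well-formed JSON Lines. The helper below is a minimal sketch and is not part of this repository: `count_jsonl_records` is a hypothetical name, and it assumes only that the dataset stores one JSON value per line, as the `.jsonl` extension suggests. It makes no assumption about the fields inside each record.

```python
import json
from pathlib import Path


def count_jsonl_records(path):
    """Count records in a JSON Lines file (hypothetical helper, not from this repo).

    Raises ValueError on the first line that is not valid JSON.
    """
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines between records
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}: line {line_number} is not valid JSON"
                ) from exc
            count += 1
    return count
```

Calling `count_jsonl_records("/path/to/your/dataset.jsonl")` before constructing `DatasetPerturbator` surfaces a malformed file early instead of partway through a run.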

Evaluation

# Navigate to the evaluation directory
cd src/data/evaluation

# Run the RAGTruth benchmark on models defined in this script
# This evaluates how well models detect hallucinations in RAG contexts
bash ragtruth_predict.sh

# Format the RAGTruth benchmark results into structured JSON files if necessary
# This prepares the data for analysis and visualization
bash format_predictions.sh

# Calculate and display evaluation metrics
# Shows Recall, Precision, and F1 scores for hallucination detection
python show_results.py
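
For reference, the three reported metrics can be computed from binary labels as sketched below. This is an illustrative example under stated assumptions, not the code in `show_results.py`: the function name `precision_recall_f1` and the convention that `1` marks a hallucinated response and `0` a faithful one are assumptions made for the example.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall, and F1 for the positive (hallucinated) class.

    gold and predicted are equal-length sequences of 0/1 labels, where 1 is
    assumed to mean "hallucinated". This is a sketch, not the repo's scorer.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    true_pos = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    false_pos = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    false_neg = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy labels for illustration only; these are not benchmark results.
gold = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"Precision={precision:.3f} Recall={recall:.3f} F1={f1:.3f}")
```

Recall measures how many hallucinated responses the detector catches, while precision measures how many of its flags are correct; F1 is their harmonic mean.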

Training

We used LLaMA-Factory for efficient fine-tuning. A sample configuration can be found here.
