A research implementation of Entropy Production Rate (EPR) for detecting hallucinations in Large Language Models (LLMs). This tool visualizes the uncertainty of the model at the token level, helping to identify potential fabrication or reasoning errors.
- Real-time Entropy Visualization: See the model's uncertainty for every generated token.
- Interactive Charts: Explore the entropy flow and token probability distribution.
- Risk Scoring: Automatic classification of tokens as Low, Medium, or High risk.
- Generation Control: Adjust Temperature, Top-K, and Seed to experiment with model behavior.
- Modular Architecture: Clean separation between the Flask backend and the core algorithmic logic.
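
As a rough illustration of the Risk Scoring feature, the sketch below maps a normalized entropy value in [0, 1] to a Low, Medium, or High label. The function name `classify_risk` and the 0.33 / 0.66 thresholds are assumptions made for this example, not the values used in processing.py.

```python
def classify_risk(normalized_entropy, low_max=0.33, high_min=0.66):
    """Map a normalized entropy in [0, 1] to a risk label.

    The thresholds are illustrative assumptions, not the project's values.
    """
    if not 0.0 <= normalized_entropy <= 1.0:
        raise ValueError("normalized_entropy must be in [0, 1]")
    if normalized_entropy < low_max:
        return "Low"
    if normalized_entropy < high_min:
        return "Medium"
    return "High"


# Hypothetical per-token normalized entropies for a short generation.
tokens = [("Paris", 0.05), ("is", 0.40), ("purple", 0.90)]
labels = [(token, classify_risk(h)) for token, h in tokens]
```

Keeping the thresholds as parameters makes it easy to tune them per model, since different models can show different baseline entropy levels.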
- Python 3.8+
- Node.js 16+ (for the frontend)
- Ollama (running locally)
- Download and install Ollama from ollama.com.
- Start the Ollama server (usually runs in the background).
- Pull a model (I use cas/ministral-8b-instruct-2410_q4km):
    ollama pull cas/ministral-8b-instruct-2410_q4km
  Note: You can configure available models in models_config.py.
- Clone the repository and navigate to the root:
    cd wepr
- Install Python dependencies:
    pip install flask flask-cors requests
- Start the backend server:
    python3 app.py
  The server will start on http://127.0.0.1:5001.
- Navigate to the frontend directory:
    cd frontend
- Install dependencies:
    npm install
- Start the development server:
    npm run dev
  The app will open at http://localhost:5173.
- Open your browser at http://localhost:5173.
- Select a model from the dropdown (top right).
- (Optional) Click the Settings icon to adjust Temperature or Top-K.
- Type a prompt and hit Enter.
- Analyze the results:
  - Green tokens: The model is confident (low entropy).
  - Red tokens: The model is uncertain (high entropy).
- Click on any token to see the Top-20 candidates and their probabilities.
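
The Temperature, Top-K, and Seed settings correspond to standard sampling options of Ollama's /api/generate endpoint. The sketch below builds such a request body. `build_generate_payload` and its default values are hypothetical, and core_epr/client.py may structure its requests differently.

```python
# Ollama's default local address; adjust if your server runs elsewhere.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model, prompt, temperature=0.8, top_k=40, seed=None):
    """Build a JSON-serializable body for Ollama's /api/generate endpoint.

    Hypothetical helper for illustration; not taken from the project code.
    """
    options = {"temperature": temperature, "top_k": top_k}
    if seed is not None:
        # A fixed seed makes sampling repeatable for the same prompt and options.
        options["seed"] = seed
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": options,
    }


payload = build_generate_payload(
    "cas/ministral-8b-instruct-2410_q4km",
    "Who wrote Don Quixote?",
    temperature=0.2,
    top_k=20,
    seed=42,
)
# To send it (requires a running Ollama server):
#   import requests
#   response = requests.post(OLLAMA_GENERATE_URL, json=payload, timeout=120)
```

Lower temperatures and a fixed seed are useful when comparing entropy profiles across runs, because they reduce sampling noise between generations.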
- app.py: The Flask API entry point.
- core_epr/: Contains the core logic.
  - client.py: Handles communication with Ollama.
  - entropy.py: Mathematical functions (Shannon Entropy, Normalization).
  - processing.py: Data pipeline and risk calculation.
- frontend/: The React application.
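
For reference, the two quantities named for entropy.py can be sketched as follows. This is a minimal illustration that assumes the input is a list of candidate-token probabilities (for example, the top candidates for one position). The function names are hypothetical, and the normalization actually used in entropy.py may differ: here the entropy is divided by its maximum possible value, log2 of the number of candidates.

```python
import math


def renormalize(probs):
    """Rescale probabilities to sum to 1 (a truncated candidate list rarely does)."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return [p / total for p in probs]


def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy of the renormalized distribution divided by log2(n), in [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0  # a single candidate carries no uncertainty
    return shannon_entropy(renormalize(probs)) / math.log2(n)


confident = [0.97, 0.01, 0.01, 0.01]  # one dominant candidate
uncertain = [0.25, 0.25, 0.25, 0.25]  # all candidates equally likely
print(normalized_entropy(confident) < normalized_entropy(uncertain))  # prints True
```

A uniform distribution reaches the maximum of 1.0, while a distribution dominated by a single token stays close to 0, which matches the green and red coloring described above.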
Based on the research paper: "Entropy-based Hallucination Detection" (arXiv:2509.04492).
Note: If you are here, Charles, I hope it's clear to you!