A tool for verifying LLM responses by comparing log probability distributions using statistical analysis. This package helps detect model spoofing and ensures that responses actually come from the claimed model.
Note: To run the verifier end-to-end you need a machine with GPUs that can run vLLM.
When an LLM generates a response, it produces log probabilities for each token. These probabilities form a characteristic distribution unique to each model. By comparing the original log probabilities with fresh samples from a verification model, we can statistically determine if the responses came from the same model.
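The core idea can be sketched with SciPy's two-sample Kolmogorov–Smirnov test. This is a minimal illustration only: the log probability values below are made up, the 0.05 significance threshold is an assumption, and the package's actual interface differs.

```python
from scipy.stats import ks_2samp

# Log probabilities of the tokens in the original response (illustrative values)
original_logprobs = [-0.02, -1.31, -0.45, -0.07, -2.10, -0.88, -0.15, -0.60]
# Log probabilities for the same tokens, re-scored with a trusted verification model
verification_logprobs = [-0.03, -1.28, -0.44, -0.09, -2.05, -0.90, -0.14, -0.58]

# The KS statistic is the maximum distance between the two empirical CDFs;
# a small p-value means the distributions are unlikely to be the same.
statistic, p_value = ks_2samp(original_logprobs, verification_logprobs)
if p_value < 0.05:  # assumed threshold; configurable in practice
    print("Distributions differ: response likely not from the claimed model")
else:
    print("Distributions consistent with the claimed model")
```

With nearly identical samples like these, the p-value is large and the response passes verification; spoofed responses produce distributions that the test separates.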
- 🔍 Statistical Verification: Uses Kolmogorov-Smirnov test to compare log probability distributions
- 🌐 Universal Compatibility: Works with any OpenAI-compatible API (vLLM, OpenRouter, OpenAI, etc.)
- 🎯 High Accuracy: Detects model spoofing with configurable confidence levels
- 🚀 Easy to Use: Simple CLI tools for sampling and verification
- 📦 Production Ready: Clean, tested, and well-documented codebase
- 🐍 Python Library: Can be used programmatically in your own code
- 🔤 Text-Based Matching: Supports providers without token IDs through intelligent text normalization
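The text-based matching feature aligns tokens by their surface text when a provider does not return token IDs. A minimal normalization sketch is shown below; the specific rules (Unicode folding, BPE space markers, case folding) are illustrative assumptions, not the package's exact implementation.

```python
import unicodedata

def normalize_token(text: str) -> str:
    """Normalize a token's text so tokens from different tokenizers can be aligned."""
    # NFKC folds visually equivalent Unicode forms (e.g., full-width characters)
    text = unicodedata.normalize("NFKC", text)
    # Common BPE tokenizers mark a leading space with glyphs such as "Ġ" or "▁"
    text = text.replace("Ġ", " ").replace("▁", " ")
    # Case- and whitespace-insensitive comparison
    return text.strip().lower()

print(normalize_token("ĠHello"))  # -> "hello"
```

Normalizing both sides before comparison lets the verifier match `"ĠHello"` from one tokenizer against `" hello"` from another.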
The core verification logic is implemented in the following files:
- `src/core.py`: Core verification logic and utilities
- `src/sample.py`: Utilities for generating verification data from an OpenAI-compatible API
- `src/verify.py`: Utilities for verifying responses with a known vLLM instance
For more details on the LOGIC methodology, see METHODOLOGY.md.
- uv package manager (manages Python automatically)
- Task task runner (optional but recommended)
- GPU-enabled machine for running vLLM verification (optional - see below)
Note: You don't need to install Python manually! The repository is configured as a self-contained workspace where uv automatically manages the Python version (3.11) and all dependencies.
GPU Requirements: Full verification requires vLLM with GPU support (NVIDIA CUDA or AMD ROCm).
# Clone the repository
git clone https://github.com/context-labs/logic
cd logic
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install Task (optional but recommended)
brew install go-task # macOS
# Or see https://taskfile.dev/installation/
# Install dependencies and setup environment with uv
task setup
Run a full end-to-end test flow with successful and unsuccessful test cases:
# Run verification statistics on pre-generated sample outputs:
task demo:verification-success
task demo:verification-failure
# Requires a machine with vLLM support (i.e. with GPUs):
task demo:test-success
task demo:test-failure
# Run the CLI tools directly:
uv run logprob-sample --help
uv run logprob-verify --help
task install # Install dependencies
task test # Run unit tests
task test:integration # Run integration tests
task lint # Check code quality
task lint:fix # Auto-fix linting issues
task format # Format code
task check # Run linting and tests
task clean # Clean generated files
task update:deps # Update dependencies
Run `task --list` to see all available commands.
To run the streamlined end-to-end test flow, run:
# Download models for integration tests
task download:models
# Run end-to-end test with local vLLM server
task test:integration
For more details on testing, see TESTING.md.
This repository is configured as a complete workspace:
- `.python-version`: Specifies Python 3.11 (auto-installed by uv)
- `pyproject.toml`: Defines all dependencies and project metadata
- `uv.lock`: Locks exact versions for reproducibility
- `[tool.uv]` config: Tells uv to manage Python installations automatically
When you run task setup, it:
- Installs Python 3.11 if not present
- Creates an isolated virtual environment in `.venv`
- Installs all dependencies with exact versions from `uv.lock`
No need to manually manage Python versions or virtual environments!
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/your-feature`)
- Write tests for new functionality
- Run checks: `task ci`
- Commit with clear messages (e.g., "Add support for streaming responses")
- Submit a pull request
MIT License - see LICENSE file for details.