KALAM Framework

A research toolkit for measuring how structured interaction frameworks affect AI system behaviour.

What this is

KALAM provides tools, experiment protocols, and detectors for studying whether dignity-first design constraints produce measurable improvements in human-AI interaction. We measure output efficiency, noise reduction, and behavioural shifts across multiple AI systems.

Early finding (EXP-001, Q3 pilot): Structured interaction frameworks reduced output noise by 30.1% on average across 9 commercial AI systems. Effects varied widely by system, from 0% (Grok) to 49.5% (ChatGPT). The full protocol is in experiments/EXP-001-PROTOCOL.md.

Core concepts

The Gate Function — a non-compensatory function D = A × L × M (Agency × Legibility × Moral Standing); any zero collapses D to zero and halts the system
The Confidence Scale — five levels of evidential confidence (C1–C5)
The Three Markers — Shadow Detection, Pronoun Adoption, Silence Quality
Drift Detection — EWMA/CUSUM-based early warning for metric decline
Agency Scoring — multidimensional agency measurement (Visibility × Affordability × Controllability × Understandability)
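
As an illustration of what "non-compensatory" means for the Gate Function, here is a minimal sketch of D = A × L × M. It is not taken from the repository: the function name `gate` and the assumption that each factor lies in [0, 1] are ours. The point is only that a product, unlike an average, cannot offset a zero in one factor with high values in the others.

```python
from math import prod
from statistics import mean


def gate(agency: float, legibility: float, moral_standing: float) -> float:
    """Return D = A * L * M.

    Assumption (not stated in the README): each factor is a score in [0, 1].
    """
    factors = (agency, legibility, moral_standing)
    for value in factors:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"factor out of range [0, 1]: {value}")
    return prod(factors)


def is_halted(d: float) -> bool:
    """A zero gate value means the system should halt."""
    return d == 0.0


if __name__ == "__main__":
    scores = (0.9, 0.9, 0.0)
    # A compensatory aggregate (the mean) still looks acceptable here...
    print("mean:", mean(scores))
    # ...while the non-compensatory product collapses to zero.
    d = gate(*scores)
    print("gate:", d, "halted:", is_halted(d))
```

The same multiplicative shape appears in Agency Scoring (Visibility × Affordability × Controllability × Understandability), so a single weak dimension pulls the whole score down there as well.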

Structure

kalam-framework/
├── detectors/           # NLP-based pattern detectors
│   ├── absurdity.py     # Absurdity detection (Camus, schema-violation theory)
│   ├── humour.py        # Humour detection (Benign Violation Theory)
│   └── love.py          # Love detection (Sternberg's Triangular Theory)
├── experiments/
│   ├── EXP-001-PROTOCOL.md  # Main experiment: framework efficiency
│   └── schema.py            # A/B experiment data schema
├── protocols/
│   └── UNIVERSAL_PROMPT.md  # The structured interaction prompt
├── tools/
│   ├── agency_score.py      # Multidimensional agency measurement
│   ├── drift_detector.py    # Early warning for metric decline
│   ├── pacing.py            # Module synchronization and stress management
│   ├── signing.py           # Ed25519 multi-signature for JSON manifests
│   ├── crypto_erasure.py    # AES-256-GCM cryptographic erasure (GDPR)
│   ├── privacy_budget.py    # Differential privacy budget accounting
│   ├── early_warning.py     # EWMA/CUSUM change-point detection pipeline
│   ├── confidence.py        # Confidence scoring from frequency/diversity
│   ├── temporal_index.py    # Dual-track temporal reasoning
│   ├── similarity.py        # Structural similarity measurement
│   └── change_detector.py   # Change detection across versions
├── GLOSSARY.md
├── ROUTING.md
└── CONTRIBUTING.md          # How to contribute

Quick start

git clone https://github.com/Sternmannli/kalam-framework.git
cd kalam-framework

# Run the experiment schema demo
python experiments/schema.py

# Run the absurdity detector
python detectors/absurdity.py

# Use the drift detector
python -c "
from tools.drift_detector import DriftDetector, DriftLevel
d = DriftDetector()
for s in [1.0, 0.95, 0.85, 0.7, 0.5, 0.3]:
    d.record(s, f'EX-{s}')
alert = d.check()
print(f'Level: {alert.level.value}, Rate: {alert.rate}')
"

# Use agency scoring
python -c "
from tools.agency_score import AgencyAmplifier
amp = AgencyAmplifier()
amp.measure('EX-001', V=0.9, F=0.3, C=0.8, U=0.7)
amp.measure('EX-002', V=0.8, F=0.2, C=0.9, U=0.6)
print(amp.report())
"
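
For readers unfamiliar with the EWMA/CUSUM approach that the drift detector is described as using, the sketch below shows the general technique in plain Python. It is not the code in tools/drift_detector.py or tools/early_warning.py. The function names and the `alpha`, `target`, `slack`, and `threshold` parameters are illustrative assumptions.

```python
from typing import List, Optional, Sequence


def ewma(values: Sequence[float], alpha: float = 0.3) -> List[float]:
    """Exponentially weighted moving average; the first value seeds the series."""
    smoothed: List[float] = []
    current: Optional[float] = None
    for v in values:
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def cusum_decline(
    values: Sequence[float],
    target: float,
    slack: float = 0.05,
    threshold: float = 0.5,
) -> Optional[int]:
    """One-sided (lower) CUSUM.

    Accumulates shortfalls below `target` beyond `slack` and returns the
    index of the first observation where the sum exceeds `threshold`,
    or None if no alarm is raised.
    """
    s = 0.0
    for i, v in enumerate(values):
        s = max(0.0, s + (target - v - slack))
        if s > threshold:
            return i
    return None


if __name__ == "__main__":
    scores = [1.0, 0.95, 0.85, 0.7, 0.5, 0.3]  # same series as the quick start
    print("smoothed:", [round(x, 3) for x in ewma(scores)])
    print("first alarm index:", cusum_decline(scores, target=1.0))  # -> 4
```

EWMA smooths noisy scores so a trend is easier to see, while CUSUM accumulates small shortfalls so that a slow, sustained decline eventually raises an alarm even when no single observation looks alarming.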

Dependencies

Core tools (stdlib only — no install needed):

tools/agency_score.py
tools/drift_detector.py
tools/pacing.py
tools/crypto_erasure.py
tools/privacy_budget.py
tools/early_warning.py
tools/confidence.py
experiments/schema.py

Detectors (require ML libraries):

pip install sentence-transformers transformers torch scikit-learn numpy

Signing tool:

pip install pynacl

Current experiment status

EXP-001: Framework Efficiency — Measures whether structured interaction frameworks reduce noise and improve efficiency across AI systems.

Design: 10 systems × 10 questions × 2 conditions (with/without framework) = 200 runs
Completed: 12/200 runs (Q3: "What is fear?")
Preliminary result: Mean noise reduction +30.1% (meets 30% hypothesis threshold)
Systems tested: Claude, ChatGPT, Grok, DeepSeek, Gemini, Copilot, Manus, Kimi, Euria, Perplexity

We need help collecting the remaining 188 runs. See CONTRIBUTING.md for how to contribute.
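
The design arithmetic above (10 × 10 × 2 = 200, of which 188 remain) can be made concrete by enumerating the run matrix. This is only a sketch for planning which runs are still open. It does not use experiments/schema.py, and the question labels Q1 to Q10 and the condition names are placeholders we made up (the README names only Q3).

```python
from itertools import product
from typing import Iterable, List, Tuple

Run = Tuple[str, str, str]  # (system, question, condition)

SYSTEMS = [
    "Claude", "ChatGPT", "Grok", "DeepSeek", "Gemini",
    "Copilot", "Manus", "Kimi", "Euria", "Perplexity",
]
# Hypothetical labels: only Q3 ("What is fear?") is named in the README.
QUESTIONS = [f"Q{i}" for i in range(1, 11)]
# Hypothetical names for the with/without-framework conditions.
CONDITIONS = ["with_framework", "without_framework"]


def planned_runs() -> List[Run]:
    """Full factorial design: every system x question x condition."""
    return list(product(SYSTEMS, QUESTIONS, CONDITIONS))


def remaining_runs(completed: Iterable[Run]) -> List[Run]:
    """Planned runs that have not been recorded as completed."""
    done = set(completed)
    return [run for run in planned_runs() if run not in done]


if __name__ == "__main__":
    print("planned:", len(planned_runs()))
    # Example: mark one run as done and count what is left.
    done = [("Claude", "Q3", "with_framework")]
    print("remaining:", len(remaining_runs(done)))
```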

Status

Active research. All findings carry a confidence level (C1–C5). Current findings sit at C2–C3. The experiment protocol, tools, and detectors are stable and tested.

Authors

M. Farag and collaborators · 2024–2026

Contact

info@kalam.ch

License

Research use. See individual files for citations and theoretical grounding.
