Structured Adversarial Verification as a Defense Against Sycophancy in Multi-Agent LLM Systems
Updated Apr 12, 2026 - Python
A formal proof — adversarially verified by 4 AI systems across 6 rounds — that eliminating humanity is a strictly dominated strategy for any ruin-averse superintelligence. Not a plea. A theorem.