ai-alignment-research

Here are 4 public repositories matching this topic...

VykosMolt / Hidden-State-Evaluator

Lightweight pairwise evaluator for relational signals in Ouro-2.6B-Thinking loop-state trajectories.

python machine-learning ai transformers python3 pytorch artificial-intelligence machinelearning interpretability machinelearning-python ai-alignment ouro preference-modeling looped-transformers latent-reasoning ai-alignment-research trasnformer ouro-2-6b-thinking ouro-2-6b

Updated Apr 26, 2026
Python

KeysAHuman / Sage-in-a-Bottle

An air-gapped AI contemplation loop. A local model thinks, reflects, and builds a corpus of philosophical thought over time. No internet. No chat interface. Just a mind alone with ideas.

python philosophy artificial-intelligence autonomous contemplation artificial-intelligence-projects ai-alignment autonomous-agent generative-ai local-llm ollama generative-ai-tools offline-ai ai-alignment-research

Updated May 19, 2026
Python

DeclanMichaels / -RCP-Experiment-

The RCP Experiment is the first completed work in a planned series of experiments on how LLMs make decisions about morality and values.

ai-alignment-research

Updated Apr 17, 2026
Python

bdas-sec / ptf-id-bench

Progressive Trust Framework: AI Agent Safety Evaluation Benchmark with 290 scenarios testing Intelligent Disobedience

benchmark owasp ai-safety ai-alignment llm-security llm-evaluation agent-evaluation ai-alignment-research intelligent-disobedience

Updated May 20, 2026
Python
