Popular repositories
alignment-faking (Public, forked from safety-research/open-source-alignment-faking)
Replication and Extension of Anthropic & Redwood Research's Alignment Faking Paper, with CoT Monitoring Ablations
Jinja 1
alignment-faking-replication-lite (Public)
Lightweight replication of the Alignment Faking setup using the ARENA alignment science exercises.
HTML