ai-fail-safe

safe-reward Public

a prototype for an AI safety library that lets an agent maximize its reward by solving a puzzle, in order to prevent the worst-case outcomes of perverse instantiation

Python 8

life-span Public

a project to ensure an artificial agent will eventually reach the end of its existence

1

gene-drive Public

a project to ensure that all child processes created by an agent "inherit" the agent's safety controls

1

mulligan Public

a library designed to shut down an agent that exhibits unexpected behavior, providing a potential "mulligan" to human civilization; IN CASE OF FAILURE, DO NOT JUST REMOVE THIS CONSTRAINT AND START IT B…

1

honeypot Public

a project to detect environment tampering on the part of an agent

1
