If you find yourself confused about mathematics and theorem proving, learn path semantics to get a solid understanding of the mathematical foundations. You can use that understanding to learn the more complex math needed to reason about AI safety faster.

Artificial Intelligence and Safety Research

The Big Picture

Formal conceptualization of super-intelligence, not necessarily safe or desirable:


Game Theory


Language Semantics of Uncertainty and Identity

Common Sense and Reasoning Using Natural Language

The Hard Safety Control Problem

Mathematical approaches to the control problem of super-intelligent AI safety, the hard way:

Aiming for approximate, long-term solutions:

Zen Rationality

Instrumental Rationality is a philosophical framework where one assumes:

  • The agent has a goal it wants to optimize
  • The agent can trust its own decisions in the future in the absence of new evidence

See the entry on Instrumental Rationality in the Stanford Encyclopedia of Philosophy.
Notice! The position that Instrumental Rationality is the whole of practical rationality has been proven wrong by the construction of the LOST (Local Optimal Safety Theory) algorithm. AdvancedResearch's current position is that the whole of practical rationality requires at least some Zen Rationality.

Zen Rationality is an extension of Instrumental Rationality with higher order reasoning about goals.

This means:

  • The agent might have no specific goal, it might have many, or it might have conflicting goals
  • The agent might not trust its own decisions in the future (e.g. believing it might suddenly turn off or fail)
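
The contrast between the two sets of assumptions can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the LOST algorithm and not a definition of Zen Rationality. The `Action` type, the `trust` parameter, the function names, and the choice to handle conflicting goals by maximizing the worst-case score are all hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Action:
    """Hypothetical action with a payoff per named goal."""
    name: str
    immediate: Dict[str, float]  # payoff obtained right away
    deferred: Dict[str, float]   # payoff that depends on the future self following through


def score(action: Action, goal: str, trust: float) -> float:
    """Immediate payoff plus deferred payoff weighted by trust in future decisions."""
    return action.immediate.get(goal, 0.0) + trust * action.deferred.get(goal, 0.0)


def instrumental_choice(actions: List[Action], goal: str) -> Action:
    """Toy Instrumental Rationality: one goal, future decisions fully trusted (trust = 1)."""
    return max(actions, key=lambda a: score(a, goal, trust=1.0))


def zen_style_choice(actions: List[Action], goals: List[str], trust: float) -> Optional[Action]:
    """Toy relaxation of both assumptions (an illustrative assumption, not the source's method).

    - No goals: there is nothing to optimize, so no action is preferred (returns None).
    - Several, possibly conflicting goals: pick the action whose worst score
      across all goals is highest.
    - trust < 1: deferred payoffs are discounted because the agent might
      turn off or fail before collecting them.
    """
    if not goals:
        return None
    return max(actions, key=lambda a: min(score(a, g, trust) for g in goals))


ACTIONS = [
    Action("commit_long_plan",
           immediate={"speed": 0.0, "safety": 0.0},
           deferred={"speed": 10.0, "safety": 2.0}),
    Action("cautious_step",
           immediate={"speed": 3.0, "safety": 4.0},
           deferred={"speed": 1.0, "safety": 1.0}),
]

if __name__ == "__main__":
    print(instrumental_choice(ACTIONS, "speed").name)
    print(zen_style_choice(ACTIONS, ["speed", "safety"], trust=0.5).name)
    print(zen_style_choice(ACTIONS, [], trust=0.5))
```

In this toy setup, the single-goal agent that fully trusts its future self prefers the long plan, while the agent with two goals and partial trust prefers the cautious step. With no goals at all, the sketch returns no preference.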

Before you try to understand Zen Rationality, it is useful to first learn about the kind of "rationality" that is closer to human thinking than Instrumental Rationality, called "Self Rationality":


Papers that do not fall into a specific category: