Artificial Super-Intelligence Master Plan
by Sven Nilsen, 2018
After six months of intensive work, the unsafety problems with First Order Perfect Testable Artificial Intelligence have been overcome in principle, and that design has been replaced by the Polite Zen Robot as the leading candidate for safe artificial super-intelligence.
With the Polite Zen Robot (PZR), we have an abstract semantics that:
- does not lead to existential risks or pose threats to life forms on earth in general
- does not lead to major disruptions of the inner workings of human civilization
- minimizes interference with possible other goals
- can be extended with arbitrary non-conflicting goals
The key insight is that neutral judgements cannot be exploited to reverse the preference order of PZR. Therefore, arbitrary extended goals are permitted as long as PZR makes a neutral judgement.
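One way to picture the key insight is as a tie-breaking rule. The following is a minimal sketch, not the PZR design itself: it assumes a judgement can be modelled as a function that compares two outcomes and returns a positive number, a negative number, or 0 for neutral. The names `extend`, `less_harm`, and `more_paperclips`, and the toy `(harm, paperclips)` outcomes, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical toy outcome: (harm, paperclips).
Outcome = Tuple[int, int]
# A judgement compares two outcomes:
# > 0 prefers the first, < 0 prefers the second, 0 is neutral.
Judgement = Callable[[Outcome, Outcome], int]


def extend(base: Judgement, extra_goals: Sequence[Judgement]) -> Judgement:
    """Combine a base judgement with extra goals that act only on neutral cases."""
    def combined(a: Outcome, b: Outcome) -> int:
        verdict = base(a, b)
        if verdict != 0:
            # The base judgement has a preference, so extra goals are not consulted.
            return verdict
        for goal in extra_goals:
            verdict = goal(a, b)
            if verdict != 0:
                return verdict
        return 0
    return combined


def less_harm(a: Outcome, b: Outcome) -> int:
    """Stand-in for the base judgement: prefer the outcome with less harm."""
    return (b[0] > a[0]) - (b[0] < a[0])


def more_paperclips(a: Outcome, b: Outcome) -> int:
    """Stand-in for an arbitrary extended goal: prefer more paperclips."""
    return (a[1] > b[1]) - (a[1] < b[1])


judge = extend(less_harm, [more_paperclips])

# The base judgement is neutral (equal harm), so the extended goal decides.
tie_broken = judge((0, 1), (0, 5))
# The base judgement prefers the first outcome; the extended goal cannot reverse it.
not_reversed = judge((0, 1), (1, 100))
```

In this sketch the extended goal is only ever consulted when the base judgement returns 0, so no choice of extended goal can flip a preference the base judgement already holds. Making the neutral verdict itself reliable is a separate problem that the sketch does not address.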
To avoid unnecessary complexity, the architecture is split into only 3 major focus areas:
- Zen Rationality (An extension of instrumental rationality that includes the ability of higher order reasoning about goals)
- Rational Natural Morality (An ambient theory of morality that grounds good or bad actions based on how they influence the web of physical processes underlying life on earth)
- Common Sense Politeness (Machine learning of laws and norms that are expected behavior in a human society)
PZR does not solve the human value alignment problem; it only addresses the worst-case scenario of the control problem.
The point of PZR is not to provide a perfect or ideal safety level, but a minimum one for ASIs.
The 3 components (PZR = ZR + RNM + CSP) are designed to be as broad as possible and capture all situations the AI might find itself in.
All other complexity is intentionally left out, including theories of ethics and morality. If we cannot make a simpler design work that allows a lot of choices, then we cannot succeed with a more complex one where specific choices are already made.
A simple overall design also increases the chance of success, which is important to win the AI race.
PZR does not assume a human-like mind for artificial intelligence.
PZR does not assume a special AI architecture and can be used as a guiding principle for other ASI designs.
Any unsafe ASI design might lead to catastrophic outcomes for humanity.
Yet, we are in the middle of an AI race to create human-level artificial intelligence (AGI).
AGI is expected to rapidly lead to some form of ASI.
Therefore, PZR should be pursued as quickly as possible until a better design is found or a flaw is discovered.
This reverses my previous position on not building ASI, because we now have a design that overcomes the previously foreseen problems.
PZR can be used as a Nanny-AI (à la Ben Goertzel) to bootstrap humanity/AI into a state where we can make wiser decisions.
So, basically, the plan is to iterate on PZR to get closer to something practical:
- The main challenge is figuring out how to make neutral judgements reliable
- Each of the 3 focus areas is probably AI-complete (as difficult as creating AGI)
I rank neutral judgement as the main challenge because, if we can figure it out, it can be applied to other architectures as well. This does not have to wait for other work to be completed or for AI technology to advance.
The 3 focus areas might be achieved or approximated by leveraging AGI. This means we should push for AGI using the best tools available. It includes collaborating with other people and institutions.
Notice that having a plan does not mean we will succeed. Our main purpose should still be to make conceptual breakthroughs in the hope that they can be used in other projects.
This plan is not final
Expect changes as we continue working on improving our knowledge.