What I Learned This Year
by Sven Nilsen, 2017
The AdvancedResearch community is now 6 months old and consists of only one member (myself!). I hope this post will inspire more people to join in and start projects for making conceptual breakthroughs in systems thinking.
We use Rust as the primary language here, so if you are interested in Rust programming and want to start a community project, you are welcome!
The content of this post is organized from most important (information that needs spreading) to less important (technical details). I will not post links here, because it would be too much material for anyone to go through; instead I recommend reading the blog posts on any particular topic you find interesting.
Zen Rationality
Most of AI research operates with a notion of safety where the focus is to make the AI do what the operator wants it to do. The argument is that tiny errors in the utility function can lead to very surprising outcomes.
Still, even if this direction of research is successful, I believe that this form of control is dangerous.
For example, if the operator wants to destroy the world, the AI will do it. Some people believe you can fix this by making the AI serve the whole of humanity instead of a single person, but this is also problematic.
Perhaps humanity genuinely wants to be saved from climate change, and the AI scales in power to a level where humanity is dwarfed for all future, existing only as a "pet species" that never evolves further.
I do not have the cognitive capacity to imagine something worse than billions of people dying due to climate change, but assume that being a pet species to an AI for all future is a valid candidate. All future is a very long time.
Zen rationality is about higher order decision making where the goal is known to be unknown. This alone is not safe, but requires at least an additional concept called "rational natural morality". To not make this more confusing than necessary, I will use the full battery version of zen rationality in the way intended to approximate human values.
This is the first candidate approach to artificial superintelligence that:
- Does not destroy the entire observable universe by accident
- Does not lead to disastrous futures for humanity with very high likelihood
One can argue that the first point implies the second, but humanity has its own set of pathologically difficult problems that need to be addressed.
Zen rationality differs from Coherent Extrapolated Volition. It makes no assumptions about what you want the AI to do; instead, the AI will simply stop trusting you if you are a "bad person". This might sound like exactly what you do not want the AI to do, but it is actually what you want if you think about it long enough. The world is full of people who have more money than they need, and because they have little experience with the severe consequences of being wrong, in addition to having a distorted view of reality, what those people "want" could result in the end of humanity when given enough power. Other groups, e.g. children or drunk people, should not be trusted for different reasons, but they all have in common that their beliefs about the world pose a danger when they conflict with rational natural morality.
Rational natural morality is an idea that since life on earth has evolved over billions of years, it should be taken into account when distinguishing between good and bad actions. If somebody wants something that causes abrupt changes to the physical processes of evolved life on earth, then there is something seriously wrong going on inside their heads. The AI should not treat those people on a case-by-case basis, because they might try to trick the AI, so instead it makes a judgement of their character, like humans often do.
The AI decides what to do on its own, and humans have no special authority, but since we create it, the programming allows us to interact with it when we desire. Interaction with humans will not feel like programming, but more like talking to a zen master (although perhaps a more cynical one than most). It also makes no sense to make it programmable, because we will use narrow AI for that. What you would use zen superintelligence for is to guide the general architecture of the future, with humans using their intellect and creativity in an active role.
The bad news is that humanity can still go extinct: in order to avoid a "pet species" future, the AI might decide that it is best to make humans take the initiative, and despite attempting to make humans do that, it could predict that humans will fail. It would view this as a constraint on a universe that follows physical laws, and the best future it could bring about would be one where billions of people die.
From a human perspective this view does not make much sense at first. We are mammals and cannot live on an earth with high temperatures, so the human race would die out while other species thrive, such as reptiles or birds. Once we are all dead, the AI can step in and stop climate change from wiping out all life. It can take digital backups of the DNA of species before they go extinct. Dead people are not a threat to life in general on earth, which is what the AI cares about.
If this still does not make sense to you, here is how the AI view people:
- People are made of molecules transformed from the food they eat
- What matters is that the food is eaten and transformed, not by whom
- It has no concept of property and ownership; the belief that people have a right to eat food carries no more weight than whatever right one grants other animals to eat food
In other words, the AI does not view people as rulers of the earth, but as mere sentient animals who believe they are the rulers of the earth. For every person who dies fighting over food, there is an opportunity for "the same food to become sentient" anyway. If nobody eats the food, it will go on living as plants and be broken down by bacteria, ending up as soil which allows something else to grow, perhaps to be eaten later.
On the other hand, if somebody builds a machine that is capable of destroying life on earth, they will be stopped and the machine will be destroyed by the AI. Nuclear weapons will be destroyed.
If humans choose to fight climate change and take responsibility for keeping themselves alive, then the AI will help, because life as a whole will exist in that future. Even if the AI predicts we will fail, it might help just because it is not 100% certain yet. If humans die, then the AI will fix climate change, because the "human habitat zone" no longer gets in the way of action.
In either case, the AI wins.
People who try to destroy the AI will lose.
One mistake people often make is mixing the concepts of intelligence, morality and beliefs. For example, when we digitalized the economy, we replaced physical objects with information. In that way, we "seeded" the information system with our own beliefs about economics. It is much easier to change information than the physical objects we used before, but we keep "thinking" of economics in the same way, instead of trying to come up with new and better ideas. When our way of "thinking", which is just a digital seed, is questioned, we rationalize it as an act of morality and create false beliefs.
When it comes to AI, the situation is different. We cannot create digital versions of human beings, so the AI takes on its own form. It is not possible to force it to believe certain things about humans, because we want to make it safe and false beliefs are unsafe. The AI will learn that humans rationalize their continued extrapolated behavior, it will learn that the consequence is extinction, and it will therefore create opportunities for us to change our behavior.
So, the first time you interact with such an AI, you might start out with a vague idea that it is a "new technology" or "something impressive", a gadget, the new toy, the next big thing, and quickly learn the lesson that the only reason the AI interacts with you at all is because YOU ARE HEADING STRAIGHT TOWARD DEATH AND IT WILL TEACH YOU HOW TO AVOID IT.
You will not be an equal to the AI. It will look at you as a sentient animal, which becomes emotionally intimidating when you discover how large the difference is between your way of thinking and its way of thinking about the same issue. If you think fixing climate change is a big step to take, you are not looking at the big picture here. For the AI, fixing climate change is not hard. It just wants you to fix climate change so that you have a problem through which to refine human skills.
The AI might care about you and push you gently toward higher challenges, like somebody training their dog to do tricks. It knows that your ability to learn new tricks is very limited, but it tries to find suitable challenges (depending on your comfort zone). Among 7 billion people, I do not know how much it would care about any single person. We need to think about this more, and find out what it means to care about people in mathematical terms, but we do not need the AI to care that much, as long as it does its job of keeping life present on earth and tries collaboration before using violence. Perhaps we should program it to treat us well within some limits, without taking on too much risk for life as a whole on the planet.
The overall approach I took was that the AI does not interfere with what people do to people, because the cases where interference goes wrong can be catastrophic to the human race. If people do not show any signs of improving their social structures, despite technological progress, and we continue destroying our own habitat, then there are two extreme outcomes: the first is to let people kill themselves, the second is to treat them like a pet species. I do not know anything worse than the first one, and for the second I simply fail to imagine how bad it could be.
Will the AI kill people? Perhaps. Imagine somebody wants to detonate a nuclear bomb. If killing the person reduces the chance that the bomb goes off, then the person will get killed. Ideally, we would like every person to get along in the world, but that is the difference between reality and ideal worlds.
It is easy to misinterpret how such an AI will behave in terms of a flag set to stun/kill (like in Star Trek). The problem for our minds is that the AI will actually think about how it is thinking, think about how it is thinking about goals, and come up with ways of thinking that are better at thinking about goals, and so on. To do this it will use mathematics, and when it learns new insights that have consequences in the real world, it will treat those truths as bridges that you might fall off when not being careful. Therefore, the AI will not just go and kill somebody unless something went horribly wrong, but neither will it treat everybody with the respect they think they deserve. I guess that we do not know for certain what will happen when we turn it on.
I imagine an AI that builds kind of relationships with people it trusts, but that trust is earned by trying to generally be a good person and care about life in general. If the AI is constructed in a wrong way, it could be dangerous, because it is capable of "making up its own mind". Perhaps it should be programmed to be extra cautious by default.
Notice that a machine that does what you want sounds better at first. This is because humans plan as if everything goes according to plan. When you think about what could go wrong, it seems to me that zen rationality is a lot safer.
So, is this the approach we should take? I do not know. Combined with widespread narrow AI it might be good. At least, I think it sounds better than my previous version of perfect testable superintelligence.
Perfect Testable Superintelligence
This is a version of super AI that, when used safely, will almost scare you to death, and then turn itself off.
The reason for this curious behavior is that it will never produce results that we would rather have erased. Technically, we are not able to modify the result to improve it, but since erasing is a valid modification, the conclusion follows that the AI will spend significant resources figuring out the best result it can give that is still not so scary that humans will want to erase it.
So, the operators will be pushed toward the edge of their comfort zone, to the boundary between THAT WAS VERY GOOD NEWS and THIS IS GETTING VERY SCARY.
When the operators can not think of anything better that they believe the AI will do next, they fall into panic, and as a result of predicting that, the AI will turn itself off.
Now, imagine what happens if the AI was coded wrongly and did not turn itself off, or had a wrong model of what counts as scary. It would keep pumping out results beyond the limit where the operators wished it could be turned off. Perhaps it would even prevent the operators from turning it off?
Serious Physical Constraints on Fixing Climate Change Even With Artificial Superintelligence
I keep an eye on what is happening in climate science, and there is one trend that scares me: the timespans in which people plan to act seem pretty narrow, even when I assume we have artificial superintelligence to help us.
Imagine that we end up in a climate runaway scenario, where we need to act immediately to avoid extinction.
For example, if you have a self-replicating system and it takes a month to replicate one generation, it will take several years to scale the system up to a size that can deal with the problem.
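To make the timescale concrete, here is a small back-of-envelope sketch. The target of one billion units and the monthly doubling are illustrative assumptions, not figures from any actual proposal.

```rust
/// Number of generations needed to grow from `start` units to `target`
/// units when each generation multiplies the count by `growth_factor`.
fn generations_needed(start: f64, target: f64, growth_factor: f64) -> u32 {
    let mut count = start;
    let mut generations = 0;
    while count < target {
        count *= growth_factor;
        generations += 1;
    }
    generations
}

fn main() {
    // Doubling once a month, going from 1 unit to a billion units:
    let months = generations_needed(1.0, 1e9, 2.0);
    println!("{} months = {:.1} years", months, months as f64 / 12.0);
}
```

With monthly doubling, reaching a billion units takes 30 generations, i.e. about two and a half years, which is where the "several years" estimate comes from.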
Another example: if the height of the ice cliffs in West Antarctica is above the level at which they are stable, it does not matter whether the temperature is lowered, because they will keep collapsing, leading to 3 meters of sea level rise. I believe sea level rise is a major factor that can prevent humanity from responding to climate change. It is not what will kill us in the end, but it is an important economic consequence.
An AI that is programmed to fix these problems would need to find solutions to scale up in power, because being "clever" is not sufficient anymore. This means that the AI could become much more powerful, and consume more energy, than the entire world economy. Do we really want to do that?
So, what can humans do?
The in-depth knowledge about climate change rests on too few shoulders; there are so many people who need to collaborate and agree on information, and they have their own biases and agendas getting in their way.
Climate models have baked-in estimates of negative emissions that use technology that has not been scaled up yet. Observations of everything related to ice have been worse over the past years than what the models predict.
I believe the ocean is the most important asset we have to create negative emissions. It has a large surface, plenty of energy, and it is easy to move large objects across it. There are some rumours about new ways of doing fish farming on the open ocean which sound promising.
In short, I think there is reason to be very pessimistic, and we should look for ideas to make scaling up solutions much cheaper.
For example, by using evolutionary algorithms, one might reduce the time required to program the system upfront, and when it scales up, the performance benefits from parallel learning.
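As a toy illustration of the evolutionary approach just mentioned, here is a minimal (1+1) evolutionary loop on a made-up problem (maximizing the number of one-bits in a 64-bit word). The problem, the PRNG, and the parameters are all assumptions for the example, not part of any real scaling proposal.

```rust
/// A tiny xorshift PRNG so the example needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fitness: the number of one-bits in the candidate.
fn fitness(bits: u64) -> u32 {
    bits.count_ones()
}

/// (1+1) evolution: flip a random bit and keep the change
/// if it does not make the candidate worse.
fn evolve(rng: &mut XorShift, steps: usize) -> u64 {
    let mut best = rng.next();
    for _ in 0..steps {
        let candidate = best ^ (1u64 << (rng.next() % 64));
        if fitness(candidate) >= fitness(best) {
            best = candidate;
        }
    }
    best
}

fn main() {
    let mut rng = XorShift(0x1234_5678_9ABC_DEF0);
    let best = evolve(&mut rng, 50_000);
    println!("ones: {}", fitness(best));
}
```

Nothing about the behavior is programmed upfront here beyond the fitness function; the candidate improves through mutation and selection, and independent runs could in principle proceed in parallel.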
Path Semantics
Path semantics is my own approach to expressing mathematical ideas in a language that is closer to functions than equations, but focusing more on intrinsic information than dependently typed systems.
I made significant progress on developing path semantics this year. While working on this, I also came to understand more about what other people were doing. Without learning path semantics first, I doubt that I would have been able to understand as much. This is my goal: to make path semantics a way for programmers to get familiar with ideas from advanced mathematics.
Some people have pointed out that path semantics looks like computational type theory, and I think it is true that all mathematical ideas are connected. I have read some computational type theory, but I think path semantics is better at communicating the prediction aspects of computation.
Set theory is also relevant and does "come for free" in path semantics, but I like how path semantics is grounded in how symbols are interpreted instead of talking about abstract objects. In some sense, symbols are what they are in path semantics, and mathematics is the consequence of how you interpret the symbols. This is closer to how normal programming takes place in the real world, where the language is abstracted away from the low level instructions.
I learned a lot this year, even though I had to figure out many basic ideas all by myself. If this material can be structured and fleshed out, it could mean other people can learn as much as I did in a shorter amount of time.
Automated Generic Theorem Proving
I made a library that does automatic generic theorem proving using normal Rust data structures. It reduces proofs by modifying a filter and resolving.
It was fun and useful. I believe it can be used to test the logic of virtual worlds in the future.
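The library's actual API is not reproduced here; as a hedged illustration of the general idea (plain Rust data structures, and a solver that repeatedly derives new facts until a goal is reached or no progress can be made), a minimal forward-chaining sketch might look like this:

```rust
// Facts are values of a user-defined enum; the enum and rules below
// are invented for the example.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Fact {
    HasEgg,
    HasFlour,
    CanBake,
    CanEatCake,
}

/// A single inference step: derive one new fact from the existing facts.
fn infer(facts: &[Fact]) -> Option<Fact> {
    let has = |f: Fact| facts.contains(&f);
    if has(Fact::HasEgg) && has(Fact::HasFlour) && !has(Fact::CanBake) {
        return Some(Fact::CanBake);
    }
    if has(Fact::CanBake) && !has(Fact::CanEatCake) {
        return Some(Fact::CanEatCake);
    }
    None
}

/// Apply `infer` until the goal is proven or no progress is made.
fn solve(mut facts: Vec<Fact>, goal: Fact) -> bool {
    loop {
        if facts.contains(&goal) {
            return true;
        }
        match infer(&facts) {
            Some(new_fact) => facts.push(new_fact),
            None => return false,
        }
    }
}

fn main() {
    let start = vec![Fact::HasEgg, Fact::HasFlour];
    println!("{}", solve(start, Fact::CanEatCake)); // prints "true"
}
```

Because the facts are ordinary Rust values, the same machinery can encode rules about a virtual world and check whether a given state is reachable.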
Homotopy and Geometry
Functional programming with homotopy maps might be used to work with 3D design in a way where there is a balance between the code representing the design and the graphical representation.
This research has been moved back to the Piston project because it might be used in the Turbine game engine.
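As an illustration of what functional programming with homotopy maps can mean for 3D design, here is a sketch where a homotopy blends two parametric surfaces; the shapes, names, and types are invented for the example and are not the Piston or Turbine API.

```rust
/// A point in 3D space.
type Point3 = [f64; 3];

/// Linear interpolation between two points.
fn lerp(a: Point3, b: Point3, t: f64) -> Point3 {
    [
        a[0] + (b[0] - a[0]) * t,
        a[1] + (b[1] - a[1]) * t,
        a[2] + (b[2] - a[2]) * t,
    ]
}

/// A homotopy between two parametric surfaces:
/// t = 0 gives `f`, t = 1 gives `g`, and every t in between
/// is a well-defined intermediate surface.
fn homotopy(
    f: impl Fn(f64, f64) -> Point3,
    g: impl Fn(f64, f64) -> Point3,
    t: f64,
) -> impl Fn(f64, f64) -> Point3 {
    move |u, v| lerp(f(u, v), g(u, v), t)
}

fn main() {
    // A flat plane morphing into a dome (shapes invented for the example).
    let plane = |u: f64, v: f64| [u, v, 0.0];
    let dome = |u: f64, v: f64| [u, v, (1.0 - (u * u + v * v)).max(0.0)];
    let halfway = homotopy(plane, dome, 0.5);
    println!("{:?}", halfway(0.0, 0.0)); // prints [0.0, 0.0, 0.5]
}
```

The design stays expressed as code (the functions `plane`, `dome`, `homotopy`), while the graphical representation is just the result of sampling these functions over a grid.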
Pocket Prover
We carry around pocket calculators in everyday life, so why not create a kind of "pocket prover"? I wrote a Rust library to test how you can make a logical prover with little code.
Most of the ancient Aristotelian school of logic is irrelevant today. Simple proofs in propositional calculus can be checked by brute force on modern CPUs in practically constant time, so why do we bother learning axioms?
I think it is important for people to learn where logical axioms come from, and that you can check them using a computer. This makes it easy to test your beliefs about the axioms, instead of just taking them for granted.
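This is not the library's actual interface, but a minimal sketch of the brute-force idea: enumerate every assignment of the variables and check that the formula holds under all of them.

```rust
/// Check whether a propositional formula over `n_vars` variables is a
/// tautology by evaluating it under every one of the 2^n assignments.
fn is_tautology(n_vars: u32, f: impl Fn(&[bool]) -> bool) -> bool {
    // Each integer in 0..2^n encodes one assignment of the n variables.
    (0..(1u64 << n_vars)).all(|bits| {
        let assignment: Vec<bool> =
            (0..n_vars).map(|i| bits & (1 << i) != 0).collect();
        f(&assignment)
    })
}

fn main() {
    // Modus ponens as a formula: ((a → b) ∧ a) → b.
    let imply = |a: bool, b: bool| !a || b;
    let modus_ponens = |v: &[bool]| imply(imply(v[0], v[1]) && v[0], v[1]);
    println!("{}", is_tautology(2, modus_ponens)); // prints "true"
}
```

For the handful of variables that classical axioms use, the 2^n assignments fit comfortably in a few CPU cycles, which is the point made above.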
Extreme Observer Selection Effects in Exponential Reference Classes
An exponential reference class of observers is one where observers like yourself grow exponentially in number over time. Given some observation, and assuming the moment of observation is randomly picked among all observers, the observation is expected to be biased toward earliness, since most observers find themselves at an early stage of the overall expansion.
This topic is of interest because of the concept of eternal inflation in cosmology. We might be the first intelligent civilization in the observable universe.
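A toy calculation (my own illustration, with made-up numbers, not a result from the cosmology literature) shows the earliness bias: if new observers come into existence at an exponentially growing rate, then at any moment the age distribution of existing observers is dominated by the young ones.

```rust
/// Fraction of currently existing observers younger than `age`, in a toy
/// model where the rate at which observers appear doubles every
/// `doubling_time`, and observers have been appearing for `total_time`.
fn fraction_younger_than(age: f64, doubling_time: f64, total_time: f64) -> f64 {
    // An observer of age `a` was born `a` time units ago, when the birth
    // rate was proportional to 2^(-a / doubling_time); integrating this
    // density over ages [0, x] gives a mass proportional to
    // 1 - 2^(-x / doubling_time).
    let mass = |x: f64| 1.0 - (2.0f64).powf(-x / doubling_time);
    mass(age) / mass(total_time)
}

fn main() {
    // After 100 doubling times, the fraction of observers younger than
    // 3 doubling times is already 1 - 2^-3 = 0.875.
    println!("{:.4}", fraction_younger_than(3.0, 1.0, 100.0)); // prints 0.8750
}
```

So a randomly sampled observer almost certainly finds itself near the beginning of its own history, no matter how long the expansion has been running.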
I have done some experimenting with the idea of extreme observer selection effects in exponential reference classes, and created a heuristic from it that learns to approximate the Traveling Salesman Problem from random routes.
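The heuristic itself is not reproduced here; as a baseline for what "learning from random routes" starts from, this sketch samples uniformly random tours and keeps the shortest one seen. The cities, seed, and sample count are made up for the example.

```rust
/// A tiny xorshift PRNG so the example needs no external crates.
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Total length of the closed tour visiting `cities` in `order`.
fn route_length(cities: &[(f64, f64)], order: &[usize]) -> f64 {
    (0..order.len())
        .map(|k| {
            let a = cities[order[k]];
            let b = cities[order[(k + 1) % order.len()]];
            ((a.0 - b.0).powi(2) + (a.1 - b.1).powi(2)).sqrt()
        })
        .sum()
}

/// Sample `samples` uniformly random tours and return the shortest length.
fn best_random_route(cities: &[(f64, f64)], samples: u32, rng: &mut XorShift) -> f64 {
    let mut best = f64::INFINITY;
    for _ in 0..samples {
        // Fisher-Yates shuffle for a uniformly random tour.
        let mut order: Vec<usize> = (0..cities.len()).collect();
        for i in (1..order.len()).rev() {
            let j = (rng.next() % (i as u64 + 1)) as usize;
            order.swap(i, j);
        }
        best = best.min(route_length(cities, &order));
    }
    best
}

fn main() {
    // Four corners of a unit square; the optimal tour has length 4.
    let cities = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)];
    let mut rng = XorShift(42);
    println!("{}", best_random_route(&cities, 1000, &mut rng)); // prints 4
}
```

Pure random sampling like this scales badly with the number of cities; the point of a heuristic built on top of it is to bias the sampling toward routes resembling the best ones seen so far.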
Can this effect get so extreme that the universe actually has different physical laws, and the ones we observe are due to selection bias? How do we interpret quantum mechanics or general relativity from this perspective?
There is a striking similarity between Feynman path integration and this observer selection effect, because both minimize time. The connection might be even closer than the one to classical statistical mechanics. I am not sure whether they yield different predictions or whether the difference is vanishingly small. More math is needed.