Evolution in xrisk.md #47
I'm not sure if we should keep the section about evolution in the post about existential risk. The argument isn't that strong, in my opinion, because you can mostly say the same things about regular tools, and it's also not required to arrive at self-preservation.
Hmm, I personally feel strongly that the argument is very powerful, but I also feel like I'm not communicating it effectively. It's such a difficult argument to explain. Why is there life instead of no life? The answer, IMO, is evolution. The universe selects for things that self-preserve. If you have one instance of something that self-preserves and spreads, then that thing will continue to exist and replicate. We have never built an AI that does this, but at some point we will. That's what we're selecting for.
I don't think we are directly selecting for self-preservation; it's more that we are selecting for capability, which can lead to self-preservation as a side effect, which is already explained above the section. But selecting for capability is also something we do with regular tools, so the section at least doesn't make sufficiently clear what the difference with AI is. Do you remember a specific place where you got the argument from? Then I could maybe take a look at the source and try to improve the explanation. (Discussing this over GitHub issues feels very weird.)
I think I'm really failing at explaining the argument. With everything we do, the universe is selecting for self-preservation. It's the most fundamental thing that exists. If you create a million AI instances, at some point one of these instances will have a sub-goal of self-preservation, and that one will be the one that resists being shut down and seeks power. Another way of looking at it: all the other AIs (the ones which don't give a fuck about self-preservation) simply go extinct the second they stop doing something. Another relevant point: the AI that ends up caring about self-preservation is the most likely to end up in a powerful position. An AI that simply aims to answer a question will not try to seek power in our world. I don't recall having heard this argument anywhere in particular (although I'm pretty sure it was inspired by someone else).
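To make the survivorship argument concrete, here is a toy simulation sketch. Everything in it is an illustrative assumption (the population size, the probabilities, and reducing each instance to a single "resists shutdown" trait), not anything taken from the discussion itself:

```python
# Toy sketch of the selection dynamic described above, under made-up
# assumptions: a large population of AI instances, a vanishingly small
# fraction of which happen to resist shutdown, all repeatedly exposed
# to shutdown events.
import random

random.seed(0)

N_INSTANCES = 1_000_000   # hypothetical number of AI instances
P_SELF_PRESERVING = 1e-5  # assumed (rare) chance an instance resists shutdown
P_SHUTDOWN = 0.5          # per-round chance a non-resisting instance is stopped
N_ROUNDS = 20             # rounds of shutdown pressure

# Each instance is reduced to a single trait: does it resist being shut down?
instances = [random.random() < P_SELF_PRESERVING for _ in range(N_INSTANCES)]

for _ in range(N_ROUNDS):
    # Self-preserving instances always survive; the rest survive only by luck.
    instances = [resists for resists in instances
                 if resists or random.random() > P_SHUTDOWN]

print(f"survivors: {len(instances)}, self-preserving: {sum(instances)}")
# Expected outcome: nearly every surviving instance is self-preserving,
# even though that trait started out at roughly 1 in 100,000.
```

Under these assumptions the non-resisting instances survive 20 rounds with probability 0.5^20 (about one in a million), so the handful of self-preserving variants end up dominating the surviving population.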
Hm, I don't know; to me it looks like the argument basically boils down to "if it self-preserves, then it self-preserves (and therefore continues to exist)". I randomly came across Ilya Sutskever using a similar metaphor in his Guardian portrait yesterday, but while it helped establish the vibe in that medium, I didn't think it was a very strong argument there either.
Accidentally clicked on close with comment |