Evolution in xrisk.md #47

Open
Wituareard opened this issue Jan 11, 2024 · 5 comments

@Wituareard (Collaborator)

I'm not sure whether we should keep the section about evolution in the post about existential risk. In my opinion the argument isn't that strong: you can say mostly the same things about regular tools, and it isn't required to arrive at self-preservation.

@joepio (Collaborator) commented Jan 12, 2024

Hmm, I personally feel strongly that the argument is very powerful, but I also feel like I'm not communicating it effectively. It's such a difficult argument to explain.

Why is there life instead of no life? The answer, IMO, is evolution. The universe selects for things that self-preserve. If you have one instance of something that self-preserves and spreads, then that thing will continue to exist and replicate. We have never built an AI that does this, but at some point we will. That's what we're selecting for.

@Wituareard (Collaborator, Author)

I don't think we are directly selecting for self-preservation; rather, we are selecting for capability, which can lead to self-preservation as a side effect, and that is already explained above the section. But selecting for capability is also something we do with regular tools, so at the very least the section doesn't make sufficiently clear how AI is different.

Do you remember a specific place you got the argument from? Then I could take a look at the source and try to improve the explanation.

(Discussing this over GitHub issues feels very weird.)

@joepio (Collaborator) commented Jan 15, 2024

I think I'm really failing at explaining the argument.

With everything we do, the universe is selecting for self-preservation. It's the most fundamental selection pressure there is. If you create a million AI instances, at some point one of those instances will have a sub-goal of self-preservation, and that one will be the one that resists being shut down and seeks power.

Another way of looking at it: all the other AIs (the ones which don't give a fuck about self-preservation) simply go extinct the second they stop doing something.

Another relevant point: The AI that ends up caring about self-preservation is most likely to end up in a powerful position. An AI that simply aims to answer a question will not try to seek power in our world.

I don't recall hearing this argument anywhere in particular (although I'm pretty sure it was inspired by someone else).
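
Purely as an illustration, here's a toy simulation of the dynamic (a minimal sketch; the population sizes, shutdown rates, and other numbers are made up, not anything from xrisk.md): no one selects for self-preservation directly, but the instances that happen to resist shutdown are the ones still around to be copied, so the trait takes over.

```python
import random

# Toy model of the selection dynamic described above: instances get shut
# down at random, the ones that happen to resist shutdown survive, and
# survivors are copied back up to population size. Nobody selects for
# self-preservation directly, yet it comes to dominate.

def simulate(rounds=50, pop_size=1000, initial_resist_rate=0.01,
             shutdown_chance=0.2, resist_success=0.9, mutation_rate=0.001):
    # Each instance is a single boolean trait: does it resist shutdown?
    population = [random.random() < initial_resist_rate
                  for _ in range(pop_size)]
    for _ in range(rounds):
        survivors = []
        for resists in population:
            faces_shutdown = random.random() < shutdown_chance
            if faces_shutdown and not (resists and
                                       random.random() < resist_success):
                continue  # shut down: out of the population
            survivors.append(resists)
        if not survivors:  # vanishingly unlikely, but be safe
            return 0.0
        # Copy survivors (with rare mutation) back up to full size.
        while len(survivors) < pop_size:
            child = random.choice(survivors)
            if random.random() < mutation_rate:
                child = not child
            survivors.append(child)
        population = survivors
    return sum(population) / len(population)

print(f"self-preserving fraction after selection: {simulate():.2f}")
```

Running it, the self-preserving fraction typically climbs from ~1% to nearly the whole population within a few dozen rounds, even though the shutdown process never targets the trait itself.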

@Wituareard (Collaborator, Author) commented Jan 15, 2024

Hm, I don't know; to me the argument basically boils down to "if it self-preserves, then it self-preserves (and therefore continues to exist)". I randomly came across Ilya Sutskever using a similar metaphor in his Guardian portrait yesterday, but while it helped establish the vibe in that medium, I didn't think it was a very strong argument there either.

@Wituareard (Collaborator, Author)

Accidentally clicked "close with comment".
