Evolution in xrisk.md #47

Open
Wituareard opened this issue Jan 11, 2024 · 5 comments

@Wituareard (Collaborator)

I'm not sure whether we should keep the section about evolution in the post about existential risk. In my opinion the argument isn't that strong: you can say mostly the same things about regular tools, and it isn't required to arrive at self-preservation.

@joepio (Collaborator) commented Jan 12, 2024

Hmm, I personally feel strongly that the argument is very powerful, but I also feel like I'm not communicating it effectively. It's such a difficult argument to explain.

Why is there life instead of no life? The answer, IMO, is evolution. The universe selects for things that self-preserve. If you have one instance of something that self-preserves and spreads, then that thing will continue to exist and replicate. We have never built an AI that does this, but at some point we will. That's what we're selecting for.

@Wituareard (Collaborator, Author)

I don't think we are directly selecting for self-preservation; rather, we are selecting for capability, which can lead to self-preservation as a side effect, and that is already explained above the section. But selecting for capability is also something we do with regular tools, so at the very least the section doesn't make sufficiently clear how AI is different.

Do you remember a specific place you got the argument from? Then I could take a look at the source and try to improve the explanation.

(Discussing this over GitHub issues feels very weird.)

@joepio (Collaborator) commented Jan 15, 2024

I think I'm really failing at explaining the argument.

With everything we do, the universe is selecting for self-preservation. It's the most fundamental selection pressure there is. If you create a million AI instances, at some point one of those instances will have a sub-goal of self-preservation, and that one will be the one that resists being shut down and seeks power.

Another way of looking at it: all the other AIs (the ones which don't give a fuck about self-preservation) simply go extinct the second they stop doing something.

Another relevant point: The AI that ends up caring about self-preservation is most likely to end up in a powerful position. An AI that simply aims to answer a question will not try to seek power in our world.

I don't recall hearing this argument anywhere in particular (although I'm pretty sure it was inspired by someone else).
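
Purely as an illustration, here's a toy simulation of the dynamic (a minimal sketch; the population sizes, shutdown rates, and other numbers are made up, not anything from xrisk.md): no one selects for self-preservation directly, but the instances that happen to resist shutdown are the ones still around to be copied, so the trait takes over.

```python
import random

# Toy model of the selection dynamic described above: instances get shut
# down at random, the ones that happen to resist shutdown survive, and
# survivors are copied back up to population size. Nobody selects for
# self-preservation directly, yet it comes to dominate.

def simulate(rounds=50, pop_size=1000, initial_resist_rate=0.01,
             shutdown_chance=0.2, resist_success=0.9, mutation_rate=0.001):
    # Each instance is a single boolean trait: does it resist shutdown?
    population = [random.random() < initial_resist_rate
                  for _ in range(pop_size)]
    for _ in range(rounds):
        survivors = []
        for resists in population:
            faces_shutdown = random.random() < shutdown_chance
            if faces_shutdown and not (resists and
                                       random.random() < resist_success):
                continue  # shut down: out of the population
            survivors.append(resists)
        if not survivors:  # vanishingly unlikely, but be safe
            return 0.0
        # Copy survivors (with rare mutation) back up to full size.
        while len(survivors) < pop_size:
            child = random.choice(survivors)
            if random.random() < mutation_rate:
                child = not child
            survivors.append(child)
        population = survivors
    return sum(population) / len(population)

print(f"self-preserving fraction after selection: {simulate():.2f}")
```

Running it, the self-preserving fraction typically climbs from ~1% to nearly the whole population within a few dozen rounds, even though the shutdown process never targets the trait itself.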

@Wituareard (Collaborator, Author) commented Jan 15, 2024

Hm, I don't know; to me the argument basically boils down to "if it self-preserves, then it self-preserves (and therefore continues to exist)". I randomly came across Ilya Sutskever using a similar metaphor in his Guardian portrait yesterday, but while it helped establish the vibe in that medium, I didn't think it was a very strong argument there either.

@Wituareard (Collaborator, Author)

Accidentally clicked "close with comment".
