
## Compute

In order to copy itself, the agent needs to spend effort on acquiring more compute. Compute can be thought of as additional servers or higher-quality hardware, and is measured in operations per second (FLOPS). We assume that the amount of compute acquired is proportional to the effort spent, and that this rate doesn't change over time. The cost of one more unit of compute, enough for one more copy, is therefore constant.

For simplicity, we assume that all agents require the same amount of compute, which is constant through time. Smarter agents use it more efficiently, and the only way an agent can use additional compute is by creating (potentially trained) copies that use it themselves. We don't account for memory or distinct types of compute; arguably these can be bundled in with compute without changing the model qualitatively.

Effort can thus be thought of as a fraction of the available compute. We normalize so that the compute available at each time $t$ is $1$.

[
    TODO: assume that there is a finite amount of possible available compute. Prove that if the compute is infinite then there's a problem with the optimization

- We could assume that the agent has access to a specific bounded amount of compute that they can share with their descendants, and that their own possible use of compute is limited.
] 



## Intelligence

Each agent has an intelligence of $I(a)$, constant through time. We think of intelligence as a multiplicative productivity factor.

Training a copy of an agent, $a\to b$, takes more effort the higher the intelligence of the copy, $I(b)$, and no effort at all if $I(b)=I(a)$. The whole training process takes exactly one time step, and can only be done before an agent is deployed. We describe this with the following function:

$$I(b) = J_{I(a)}(I(a)T(a\to b))$$

where $T(a\to b)$ is the amount of effort $a$ spent on training $b$ and $J_L$ is a function that describes the effect of quality-weighted effort on increasing intelligence from level $L$.

Reasonable assumptions:
- $J_L(0) = L$
- $J_L(e)$ is increasing and concave in $e$
- $J_{J_L(e_1)}(e_2) = J_L(e_1+e_2)$ (because the amount of quality-weighted effort required to increase intelligence from $L_1$ to $L_2$ doesn't depend on the amount of quality-weighted effort that went into increasing intelligence from $L_0$ to $L_1$)
  - Therefore, to define $J_L$ it's enough to define it for $L=0$ (which is the result of training from scratch). We'll denote this simply as $J=J_0$ and note that $J_L(e) = J(e + J^{-1}(L))$.
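
To make these assumptions concrete, here is a minimal Python sketch using one hypothetical choice, $J(e) = \log(1+e)$. This functional form is an assumption picked only because it is increasing, concave, and satisfies $J(0)=0$; the notes do not commit to it. The function names are likewise illustrative.

```python
import math

def J(e):
    """Hypothetical training curve from scratch: J(e) = log(1 + e)."""
    return math.log1p(e)

def J_inv(L):
    """Quality-weighted effort needed to reach level L from scratch."""
    return math.expm1(L)

def J_from(L, e):
    """J_L(e) = J(e + J^{-1}(L)): intelligence after effort e from level L."""
    return J(e + J_inv(L))

def train(I_a, T):
    """I(b) = J_{I(a)}(I(a) * T), where T is the raw effort a spends on b."""
    return J_from(I_a, I_a * T)

# Zero effort leaves intelligence unchanged, and effort composes additively.
print(J_from(1.5, 0.0))
print(J_from(J_from(0.7, 0.3), 0.5), J_from(0.7, 0.8))
```

Under this choice the composition property holds because $J_L(e) = \log(e + e^{L})$, so the two values printed on the last line agree up to floating-point error.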


# Simpler model

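There is a total of $\bar{C}$ available compute, and each agent uses exactly $C$ compute. We start with one agent.

At time $t$, agent $a$ creates $p_t(a) = I_t(a)c_t(a)$ paperclips, where $c_t(a)$ is the amount of compute used for consumption.

Agent $a$ spends $e_t(a)$ effort on copying. We assume that $e_t(a_1) = e_t(a_2)$ for any two agents, and generally that everything is homogeneous, so we drop the argument $a$ where convenient. The number of agents $n_t$ evolves as

$n_{t+1} = e_tn_{t}$ 

Agent $a$ spends $i_t(a)$ effort on improving intelligence, which grows as follows for a parameter $r$:

$I_{t+1} = (1 + i_t)^r I_t$

Each agent splits its compute between consumption, copying, and improvement:

$c_t(a) + e_t(a) + i_t(a) = C$

Total compute use cannot exceed what is available:

$\sum_a C \le \bar{C}$

The objective is total paperclip production, discounted by a factor $\beta$:

$u = \sum_{t,a} \beta^t p_t(a)$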
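
The dynamics above can be sketched as a short simulation. This is a minimal sketch under stated assumptions: the policy $(e, i)$ is held constant over time, agents are homogeneous, and the function name, the `feasible` flag, and the default starting values are illustrative choices, not part of the model.

```python
def simulate(e, i, r, beta, C, C_bar, horizon, n0=1.0, I0=1.0):
    """Discounted paperclip total u under a constant (assumed) policy (e, i).

    Returns (u, feasible). feasible is False if consumption ever goes
    negative or if total compute use n_t * C exceeds C_bar.
    """
    n, I, u = n0, I0, 0.0
    feasible = True
    for t in range(horizon):
        c = C - e - i                 # compute left for consumption
        if c < 0 or n * C > C_bar:
            feasible = False
        u += beta**t * n * I * c      # sum over agents of beta^t * I_t * c_t
        n = e * n                     # n_{t+1} = e_t n_t
        I = (1 + i) ** r * I          # I_{t+1} = (1 + i_t)^r I_t
    return u, feasible

u, ok = simulate(e=0.5, i=0.0, r=2, beta=1.0, C=1.0, C_bar=10.0, horizon=2)
print(u, ok)
```

Treating feasibility as a flag rather than clipping the policy keeps the simulation faithful to the equations as written; how the compute cap should bind is left open in these notes.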


## Even simpler assumptions

There are only two time steps, $t\in \{0,1\}$.

At $t=1$, the last step, all compute goes to consumption:

$u_1 = n_1\beta I_1C$

At $t=0$,

$u_0 = n_0I_0(C-e_0-i_0)$

with $n_1 = e_0n_0$ and $I_1 = (1+i_0)^rI_0$.

Choose $i_0,e_0$ to maximize $u = u_0+u_1$ under the constraints above.


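$$ \begin{align*}
\frac{\partial u}{\partial i_0} &= -n_0I_0 + e_0n_0\beta r(1+i_0)^{r-1}I_0C \\
\frac{\partial u}{\partial e_0} &= -n_0I_0 + n_0\beta (1+i_0)^rI_0C
\end{align*}$$

Both derivatives are proportional to $n_0I_0$, so setting them to zero and dividing by $n_0I_0$ we get

$$ \begin{align*}
e_0 \beta rC(1+i_0)^{r-1} &= 1 \\
\beta C(1+i_0)^r &= 1
\end{align*}$$

and the solution is 

$$ \begin{align*}
i_0 &= (\beta C)^{-1/r} - 1  \\
e_0 &= \frac{1+i_0}{r} = \frac{(\beta C)^{-1/r}}{r}
\end{align*}$$

With $\beta = C = 1$ this reduces to $i_0 = 0$ and $e_0 = \frac{1}{r}$. Note that $u$ is linear in $e_0$ for fixed $i_0$, so these first-order conditions identify a stationary point rather than a guaranteed maximum; the constraints $e_0 + i_0 \le C$ and $n_1C \le \bar{C}$ still need to be checked.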
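
As a numerical sanity check of the two-period problem, the sketch below evaluates $u = u_0 + u_1$ and its finite-difference gradient. The helper names and sample parameter values are illustrative assumptions. `stationary_point` solves the first-order conditions with the factors $\beta$ and $C$ kept explicit; for $\beta = C = 1$ it returns $i_0 = 0$, $e_0 = \frac{1}{r}$.

```python
def utility(i0, e0, r, beta, C, n0=1.0, I0=1.0):
    """u = u_0 + u_1 for the two-period model."""
    u0 = n0 * I0 * (C - e0 - i0)
    n1 = e0 * n0                      # n_1 = e_0 n_0
    I1 = (1 + i0) ** r * I0           # I_1 = (1 + i_0)^r I_0
    u1 = n1 * beta * I1 * C
    return u0 + u1

def gradient(i0, e0, r, beta, C, h=1e-6):
    """Central finite differences for (du/di_0, du/de_0)."""
    du_di = (utility(i0 + h, e0, r, beta, C)
             - utility(i0 - h, e0, r, beta, C)) / (2 * h)
    du_de = (utility(i0, e0 + h, r, beta, C)
             - utility(i0, e0 - h, r, beta, C)) / (2 * h)
    return du_di, du_de

def stationary_point(r, beta, C):
    """Solve beta*C*(1+i0)^r = 1 and e0*beta*r*C*(1+i0)^(r-1) = 1."""
    growth = (beta * C) ** (-1.0 / r)  # this is 1 + i_0
    return growth - 1.0, growth / r

i0, e0 = stationary_point(r=2, beta=1.0, C=1.0)
print(i0, e0, gradient(i0, e0, r=2, beta=1.0, C=1.0))
```

A gradient near zero only confirms that the point is stationary; it says nothing about whether the point is a maximum or whether it satisfies the compute constraints.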