# Analytic Solution to Single-Hop Arbitrage

This notebook derives a closed-form formula for the optimal input amount in a single-hop arbitrage trade between two Uniswap V2-style liquidity pools. Rather than relying on numerical methods like binary search, I solve the optimization problem analytically, yielding an exact result in O(1) time. The analysis was done in the context of an MEV bot that uses flash loans for all trades.

In [22]:
# Import the root finder (fsolve), sqrt (numpy), and time
from scipy.optimize import fsolve
from numpy import sqrt
import time


In [23]:
# Example reserve values from Uniswap V2-style liquidity pools.
# As a requirement for this formula to work, liquidity pool A
# must have a lower price than liquidity pool B.
#
# The price of pool A = a/b, price of pool B = c/d.
# These reserve amounts came from SHIB/ETH liquidity pools on Ethereum.

# Token0 in liquidity pool A
a=15800.025178893529930149
# Token1 in liquidity pool A
b=30348149.556699
# Token0 in liquidity pool B
c=5251.705779226172996106
# Token1 in liquidity pool B
d=9986593.845926
# Uniswap fee @ 0.3%
k=0.997
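The ordering requirement stated in the comments above can be checked before any formula is applied. The following is a minimal sketch using only the reserves from this cell; the names `price_a`, `price_b`, and `spread` are introduced here for illustration and do not appear elsewhere in the notebook.

```python
# Reserves copied from the cell above.
a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1

price_a = a / b  # token0 per token1 in pool A
price_b = c / d  # token0 per token1 in pool B

# The derivation assumes pool A is the cheaper pool.
assert price_a < price_b, "relabel the pools so that a/b < c/d"

# Relative price gap between the two pools (illustrative helper value).
spread = price_b / price_a - 1
print(f"price A = {price_a:.6e}, price B = {price_b:.6e}, spread = {spread:.4%}")
```

If the assertion fails, swapping the roles of the two pools restores the assumed ordering.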

My initial thought was to impose a constraint that requires both pools to end the trade at the same price, on the assumption that the optimal arbitrage amount equalizes the liquidity pool prices.

---
$$f = \frac{a+x}{b-y}- \frac{c-z}{d+y} = 0$$

$x$ is the amount of $token_0$ added to pool A from the flash loan or other buy mechanism.
$y$ is the amount of $token_1$ taken out of pool A and then added to pool B on the other AMM.
$z$ is the amount of $token_0$ taken out of pool B and then used to pay back the loan (if used) and take profit.

Using the AMM's constant product formula and solving for $y$.
The initial formula is $(a+x)(b-y) = ab$

$y = b - \frac{ab}{a+x}$

Using the AMM's constant product formula and solving for $z$.
The initial formula is $(c-z)(d+y) = cd$

$z = c - \frac{c*d}{d+y}$
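The two expressions for $y$ and $z$ are the same fee-free swap rule applied to different pools, which can be sketched as a single helper. `swap_out` is a hypothetical name, and the input `x = 10.0` is an arbitrary illustrative value rather than anything from the analysis.

```python
def swap_out(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Fee-free constant-product output.

    Solves (reserve_in + amount_in) * (reserve_out - out) = reserve_in * reserve_out
    for `out`.
    """
    return reserve_out - (reserve_in * reserve_out) / (reserve_in + amount_in)


a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1

x = 10.0               # hypothetical token0 input to pool A
y = swap_out(a, b, x)  # token1 received from pool A
z = swap_out(d, c, y)  # token0 received from pool B (token1 is the input side)

print(f"x = {x}, y = {y:.4f}, z = {z:.6f}")
```

Note the argument order in the second call: pool B receives $token_1$, so `d` is the input-side reserve and `c` is the output-side reserve.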

Substituting the expressions for $z$ and $y$ into $f$, we get an expression with the single variable $x$.

$f = \frac{(a+x)^2}{(a*b)}-\frac{c*d}{(d+b - \frac{a*b}{a+x})^2}$

First, we can solve $f = 0$ for $x$ numerically using `fsolve`.

In [24]:
# Define the function f for fsolve to use.

def func(x):
    return (a+x)/(a*b/(a+x))-c*d/(d+(b - a*b/(a+x)))**2

In [25]:
# Solve numerically for x with an initial guess of 1.
# Using time.time() to keep track of function speed.

start = time.time()
x1 = fsolve(func,1)
t1 = time.time() - start
x1

array([19.67444199])

The closed-form solution of $f = \frac{(a+x)^2}{(a*b)}-\frac{c*d}{(d+b - \frac{a*b}{a+x})^2} = 0$


is $x = \frac{\sqrt{ab^3cd+2ab^2cd^2+abcd^3}-abd-ad^2}{b^2+2bd+d^2}$


which can then be used directly.
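The expression above can be shortened. The term under the square root factors as $abcd\,(b+d)^2$ and the denominator is $(b+d)^2$, so one factor of $(b+d)$ cancels and $x = \frac{\sqrt{abcd} - ad}{b+d}$. The sketch below checks that both forms agree and that the result is a root of $f$; the names `x_long` and `x_short` are illustrative only.

```python
from math import sqrt

a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1


def f(x: float) -> float:
    """Price-equalization condition without fees (same f as in the text)."""
    return (a + x) ** 2 / (a * b) - c * d / (d + b - a * b / (a + x)) ** 2


# Expanded form, as written in the text.
x_long = (sqrt(a * b**3 * c * d + 2 * a * b**2 * c * d**2 + a * b * c * d**3)
          - a * b * d - a * d**2) / (b**2 + 2 * b * d + d**2)

# Factored form after cancelling one factor of (b + d).
x_short = (sqrt(a * b * c * d) - a * d) / (b + d)

print(x_long, x_short, f(x_short))
```

The shorter form needs fewer multiplications and keeps the intermediate values smaller, which may matter when the same formula is ported to fixed-width arithmetic.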


In [26]:
# Closed-form solution for f = 0
# Also keeping track of function speed.

start = time.time()
x = (sqrt(a *b**3* c* d + 2* a* b**2* c* d**2 + a* b* c* d**3) - a* b* d - a* d**2)/(b**2 + 2* b *d + d**2)
t2 = time.time() - start

In [27]:
x


np.float64(19.6744419902239)

Both methods agree to the 8 decimal places shown.

In [28]:
# Check to see which method is faster.

t1 - t2

0.0007417201995849609

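The closed-form solution above is slightly faster than the numerical method, but it solves the wrong problem. It finds the input that *equalizes* the two pool prices while ignoring swap fees, not the input that *maximizes profit*. Without fees the two conditions coincide (setting $k=1$ in the profit-maximizing formula derived below reproduces the expression above), but once each swap pays a fee, the profit-maximizing trade stops short of equalizing prices. Compared with a spreadsheet model using a typical "goal seek" algorithm, the solution differs, and once gas costs are accounted for, this distinction matters significantly. After more thought, there is no reason to require my original constraint.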

The correct approach is to maximize profit directly, which I derive next.

The next method is to look at the equation $f(x, z(x)) = z(x) - x$. This is our profit function.

The maximum of this formula should give us the theoretical maximum profit possible.

First, let's re-solve the initial equation above making sure to include fees.  We will introduce a constant $k$, which can be set to $k=0.997$ to represent a fee of 0.3%, the standard on Uniswap V2.

The new equation is

$$
\frac{a+kx}{b-y} - \frac{c-z}{d+ky} = 0
$$

As before, $x$ represents the amount of the flashloan and $kx$ is the effective input after the fee is taken out.  $y$ is the amount we get out of the initial liquidity pool A, then $ky$ is used as the input amount for liquidity pool B. $z$ is then the final amount we get out of liquidity pool B, and if the trade was successful, should be larger than $x$.

In the new approach, we only need $z(x)$ and $x$, so we can solve for $z(x)$ directly. Using the constant product formula we get


$$(a+kx)(b-y)=ab, ~~~ (c-z)(d+ky)=cd$$

Solving for z:

$$z = c-\frac{cd}{d+ky}$$

and solving for y

$$y= b-\frac{ab}{a+kx}$$

Now substituting y into the equation for z we get

$$z= c-\frac{cd}{d+k(b-\frac{ab}{a+kx})}$$

To get the maximum value of $f(x,z(x))$, we  set $\frac{\partial }{\partial x}f(x,z(x)) = 0~$ and solve for $x$.

We can take the derivative of each term separately.  First for z

 $$\frac{\partial}{\partial x}z(x) = \frac{a b c dk^2}{(a d + k(bk+d) x)^2}$$

 The derivative of $x$ is just $1$, so the equation we have to solve is

 $$\frac{\partial}{\partial x}f(x,z(x)) = \frac{a b c dk^2}{(a d + k(bk+d) x)^2} - 1 =0$$

 Solving for $x$ we get

$$x_{ideal} = x = \frac{\sqrt{a b c k^4 d (b k + d)^2} - a k d (b k + d)}{(k^2 (b k + d)^2)}$$

and

$$x_{ideal} = -\frac{\sqrt{a b c k^4 d (b k + d)^2} + a k d (b k + d)}{(k^2 (b k + d)^2)}$$

The second of the two solutions is negative since our constants $a,b,c,d,k$ are always positive, so we can disregard it.  If we simplify the expression we end up with

$$x_{ideal} = \frac{k \sqrt{a b c d} - a d}{(k (b k + d))}$$
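The simplified formula is only positive when $k\sqrt{abcd} > ad$, which is equivalent to $k^2 bc > ad$; otherwise the fees exceed the price gap and the best trade is no trade. The sketch below wraps the formula with that guard and checks the stationarity condition $\frac{\partial z}{\partial x} = 1$ both analytically and with a central difference. The function names and the step size `h` are assumptions made for this illustration.

```python
from math import sqrt


def z_of_x(x, a, b, c, d, k):
    """Token0 returned by pool B after routing x through pool A, fees included."""
    y = b - (a * b) / (a + k * x)
    return c - (c * d) / (d + k * y)


def dz_dx(x, a, b, c, d, k):
    """Analytic derivative of z(x), as derived in the text."""
    return (a * b * c * d * k**2) / (a * d + k * (b * k + d) * x) ** 2


def optimal_input(a, b, c, d, k=0.997):
    """Profit-maximizing input, or 0.0 when no profitable trade exists."""
    root = k * sqrt(a * b * c * d)
    if root <= a * d:  # equivalent to k**2 * b * c <= a * d
        return 0.0
    return (root - a * d) / (k * (b * k + d))


a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1
k = 0.997

x_star = optimal_input(a, b, c, d, k)

# Independent check of the slope with a central difference (h is illustrative).
h = 1e-3
fd_slope = (z_of_x(x_star + h, a, b, c, d, k) - z_of_x(x_star - h, a, b, c, d, k)) / (2 * h)

print(x_star, dz_dx(x_star, a, b, c, d, k), fd_slope)
```

Calling `optimal_input(a, b, c, d, k=1.0)` recovers the fee-free, price-equalizing input from the first half of the notebook, and passing two identical pools returns `0.0`.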

Using this new formula for $x$, we can calculate the ideal input in the following code cell.

In [33]:
x_ideal = (k*sqrt(a* b *c* d) - a * d)/(k *(b *k + d))
x_ideal

np.float64(7.921088415958059)

In [34]:
# subtract x from z to get potential profit
z = c - (c*d)/(d+k*(b-(a*b)/(a+k*x_ideal)))
profit = z - x_ideal
profit

np.float64(0.015954661894590494)

This result is consistent with the spreadsheet model and represents the profit-maximizing trade size. The closed-form solution can now be applied directly to any pair of Uniswap V2-style pools without numerical optimization.
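The maximum profit itself also appears to have a compact closed form. Substituting $x_{ideal}$ into $z(x) - x$ and simplifying gives $\frac{(k\sqrt{bc} - \sqrt{ad})^2}{k(bk + d)}$. This expression is not part of the original derivation, so treat it as a sketch worth re-deriving before relying on it. The code compares it against the profit evaluated directly at the optimum; `max_profit` is a hypothetical helper name.

```python
from math import sqrt


def arb_profit(x, a, b, c, d, k):
    """Profit z(x) - x for input x, with fee factor k applied on both swaps."""
    y = b - (a * b) / (a + k * x)
    z = c - (c * d) / (d + k * y)
    return z - x


def max_profit(a, b, c, d, k=0.997):
    """Closed-form profit at the optimum, or 0.0 when no trade is profitable."""
    gap = k * sqrt(b * c) - sqrt(a * d)
    if gap <= 0:
        return 0.0
    return gap**2 / (k * (b * k + d))


a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1
k = 0.997

x_star = (k * sqrt(a * b * c * d) - a * d) / (k * (b * k + d))
direct = arb_profit(x_star, a, b, c, d, k)   # profit evaluated at the optimum
closed = max_profit(a, b, c, d, k)           # profit from the compact formula

print(direct, closed)
```

A form like this lets a bot decide whether a pair is worth trading, for example by comparing the profit against an estimated gas cost, without computing the input amount first.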

## Comparison: Binary Search vs. Closed-Form Solution

The standard approach to finding the optimal arbitrage input is binary search: the algorithm iteratively narrows the bounds on the input amount until it brackets the point where the profit derivative crosses zero. Below I implement this and compare it against the closed-form solution derived above.

In [37]:
def arb_profit(x, a, b, c, d, k):
    """Calculate arbitrage profit for a given input amount x."""
    y = b - (a * b) / (a + k * x)
    z = c - (c * d) / (d + k * y)
    return z - x

def binary_search_optimal(a, b, c, d, k, precision=1e-12, max_iter=1000):
    """Find optimal input via binary search on the sign of a central-difference profit derivative."""
    left, right = 0.0, min(a, c)
    iterations = 0
    for _ in range(max_iter):
        mid = (left + right) / 2
        eps = max(abs(mid) * 1e-5, 1e-10)
        dp = (arb_profit(mid + eps, a, b, c, d, k) - arb_profit(mid - eps, a, b, c, d, k)) / (2 * eps)
        if (right - left) < precision:
            break
        if dp > 0:
            left = mid
        else:
            right = mid
        iterations += 1
    return mid, iterations
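The search above estimates the derivative with a finite difference, which can limit how precisely the zero crossing is located because the profit is a small difference of much larger numbers. As an alternative sketch, the bisection can use the analytic derivative from the derivation instead. `marginal_profit` and `bisect_optimal` are hypothetical names, and the upper bound `min(a, c)` is kept from the implementation above as a heuristic bracket.

```python
def marginal_profit(x, a, b, c, d, k):
    """Analytic d(profit)/dx = dz/dx - 1, using the derivative derived earlier."""
    return (a * b * c * d * k**2) / (a * d + k * (b * k + d) * x) ** 2 - 1.0


def bisect_optimal(a, b, c, d, k, precision=1e-12, max_iter=200):
    """Bisection on the sign of the analytic marginal profit."""
    left, right = 0.0, min(a, c)
    if marginal_profit(left, a, b, c, d, k) <= 0:
        return 0.0  # the first unit traded already loses money
    for _ in range(max_iter):
        if right - left < precision:
            break
        mid = 0.5 * (left + right)
        if marginal_profit(mid, a, b, c, d, k) > 0:
            left = mid   # profit still rising: optimum lies to the right
        else:
            right = mid  # profit falling: optimum lies to the left
    return 0.5 * (left + right)


a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1
k = 0.997

x_bisect = bisect_optimal(a, b, c, d, k)
print(x_bisect)
```

Because the marginal profit is strictly decreasing in $x$ for positive reserves, a single sign change exists whenever the trade is profitable at all, so the bracket is well defined.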

In [38]:
# Binary search
start = time.time()
x_bs, iters = binary_search_optimal(a, b, c, d, k)
t_bs = time.time() - start
profit_bs = arb_profit(x_bs, a, b, c, d, k)

# Closed-form (re-run for fair timing)
start = time.time()
x_cf = (k*sqrt(a* b *c* d) - a * d)/(k *(b *k + d))
t_cf = time.time() - start
profit_cf = arb_profit(x_cf, a, b, c, d, k)

print(f"{'Method':<20} {'Optimal x':>12} {'Profit':>12} {'Time (μs)':>12} {'Iterations':>12}")
print("-" * 70)
print(f"{'Binary Search':<20} {x_bs:>12.6f} {profit_bs:>12.6f} {t_bs*1e6:>12.1f} {iters:>12}")
print(f"{'Closed-Form':<20} {x_cf:>12.6f} {profit_cf:>12.6f} {t_cf*1e6:>12.1f} {'0':>12}")

Method                  Optimal x       Profit    Time (μs)   Iterations
----------------------------------------------------------------------
Binary Search            7.921090     0.015955        190.3           53
Closed-Form              7.921088     0.015955        145.2            0
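A single `time.time()` measurement at the microsecond scale is likely dominated by timer resolution and call overhead, so the timings in the table should be read as rough. A more stable estimate can be sketched with the standard-library `timeit` module, which repeats the call many times and reports the best run. The repetition counts below are arbitrary choices.

```python
import timeit
from math import sqrt

a, b = 15800.025178893529930149, 30348149.556699   # pool A: token0, token1
c, d = 5251.705779226172996106, 9986593.845926     # pool B: token0, token1
k = 0.997


def closed_form():
    """Closed-form optimal input, evaluated with the module-level reserves."""
    return (k * sqrt(a * b * c * d) - a * d) / (k * (b * k + d))


n = 100_000  # calls per timing run (illustrative)
runs = timeit.repeat(closed_form, number=n, repeat=5)
per_call = min(runs) / n  # best-of-5 average time per call, in seconds

print(f"closed form: {per_call * 1e9:.0f} ns per call")
```

The same pattern can be applied to the binary search by wrapping it in a zero-argument function, which gives a like-for-like comparison.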


Both methods agree on the optimal input to about five decimal places and on the profit to all six displayed decimals. The closed-form solution requires zero iterations and executes in deterministic O(1) time, making it well suited to MEV strategies and gas-sensitive on-chain execution, where computational cost directly impacts profitability.
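On-chain code works with integer token amounts rather than floats, so a port of the formula would need an integer square root. The following is a minimal sketch in Python, assuming the fee is expressed as the fraction 997/1000 and that reserves are given in integer base units. The function name, the fee parameters, and the scaled reserves in the usage lines are all hypothetical. Python integers are unbounded; a fixed-width implementation would have to handle the size of the product `a * b * c * d` separately.

```python
from math import isqrt


def optimal_input_int(a: int, b: int, c: int, d: int,
                      fee_num: int = 997, fee_den: int = 1000) -> int:
    """Integer form of x = (k*sqrt(abcd) - a*d) / (k*(b*k + d)), k = fee_num/fee_den.

    Multiplying numerator and denominator by fee_den**2 removes the fractions.
    Results are rounded down.
    """
    root = isqrt(a * b * c * d)  # floor of the exact square root
    numerator = fee_den * (fee_num * root - fee_den * a * d)
    if numerator <= 0:
        return 0  # fees exceed the price gap: no profitable trade
    return numerator // (fee_num * (fee_num * b + fee_den * d))


# Hypothetical integer reserves: the notebook's values scaled by 10**6 and rounded.
a_i, b_i = 15_800_025_179, 30_348_149_556_699
c_i, d_i = 5_251_705_779, 9_986_593_845_926

x_int = optimal_input_int(a_i, b_i, c_i, d_i)
print(x_int)  # in the same scaled units as the reserves
```

Dividing `x_int` by the scaling factor should land close to the floating-point `x_ideal` computed earlier, up to the rounding introduced by the integer reserves.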