This is a minimal demonstration of how to transform the [minimum convex cost flow problem of finding the most likely payment flow on the Lightning Network](https://arxiv.org/abs/2107.05322) into a linearized problem that can be solved in sub-second time on the current channel graph with the help of a [linear min cost flow solver](https://developers.google.com/optimization/reference/graph/min_cost_flow).

## Idea
Using the ideas of [probabilistic payment delivery](https://arxiv.org/abs/2103.08576) (sometimes known as probabilistic path finding), which have already been [implemented](https://github.com/ElementsProject/lightning/pull/4771) and [tested by c-lightning](https://medium.com/blockstream/c-lightning-v0-10-2-bitcoin-dust-consensus-rule-33e777d58657) as well as [implemented by LDK](https://github.com/lightningdevkit/rust-lightning/pull/1227) (aka rust-lightning), we know that the cost function for assigning the amount $a$ to a channel of capacity $c$ when selecting channels should be

$f_c(a) = -\log\left(\frac{c+1-a}{c+1}\right)$ 

The linear approximation of this cost function is given by the first term of its Taylor series, $\frac{a}{c+1}$, which for realistic channel capacities is practically:

$l_c(a) = \frac{a}{c}$
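To get a feeling for how good the linearization is, the following minimal sketch compares the exact cost $f_c(a)$ with the linear term $l_c(a)$ for one channel. The capacity and the amounts are made-up illustration values, not taken from the channel graph.

```python
import math

def neg_log_prob(a, c):
    """Exact cost f_c(a) = -log((c + 1 - a) / (c + 1))."""
    return -math.log((c + 1 - a) / (c + 1))

def linear_cost(a, c):
    """Linearized cost l_c(a) = a / c."""
    return a / c

c = 1_000_000  # hypothetical channel capacity in sats
for a in (1_000, 100_000, 500_000, 900_000):
    # print the amount, the exact cost and the linear approximation
    print(a, round(neg_log_prob(a, c), 4), round(linear_cost(a, c), 4))
```

For small amounts both values almost coincide, while close to saturation the exact cost grows much faster than the linear one.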

The linearized term is easy to compute and interpret: The cost is 0 if the channel is not used, 1 if it is fully saturated, and otherwise just proportional to the fraction of saturation. However, just using the linearized version yields two problems:

1. The unit cost $\frac{1}{c}$ is a float and not an integer (**making it hard for many mcf solving algorithms**!)
2. The linear nature of the problem (like the linear fee rate) tends to fully saturate cheap paths, which from a reliability perspective is a very poor choice, as fully saturated channels have the lowest probability of being successful.
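The second problem can be illustrated with a tiny sketch. It assumes two hypothetical single-channel paths of 100 sats capacity each and uses the uniform success probability $\frac{c+1-a}{c+1}$ from above. Both allocations have the same linearized cost, but very different success probabilities.

```python
def success_probability(allocations):
    """Product of (c + 1 - a) / (c + 1) over (amount, capacity) pairs."""
    p = 1.0
    for a, c in allocations:
        p *= (c + 1 - a) / (c + 1)
    return p

def linear_cost(allocations):
    """Sum of the linearized costs a / c over (amount, capacity) pairs."""
    return sum(a / c for a, c in allocations)

# hypothetical example: send 100 sats over two channels of capacity 100 each
saturated = [(100, 100), (0, 100)]  # everything on one channel
split = [(50, 100), (50, 100)]      # evenly split

print(linear_cost(saturated), linear_cost(split))
print(success_probability(saturated), success_probability(split))
```

A purely linear solver has no reason to prefer the split, even though the split is far more likely to succeed.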

To mitigate the first problem of floating point unit costs we multiply all unit costs by the maximum capacity $C_{max}$ of the network. This is just a linear scaling of the global cost function and thus does not change the solution that minimizes the cost (apart from the rounding introduced by the floor). The function looks like:

$L_c(a) = a\cdot\lfloor\frac{C_{max}}{c}\rfloor$
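As a small sketch of this scaling, the snippet below computes the integer unit cost $\lfloor\frac{C_{max}}{c}\rfloor$ next to the unrounded ratio. The helper name `integer_unit_cost` and the list of capacities are illustrative assumptions.

```python
def integer_unit_cost(max_cap, cap):
    """Integer unit cost floor(max_cap / cap)."""
    return max_cap // cap

# hypothetical channel capacities in sats
capacities = [16_777_215, 100_000_000, 1_400_000_000]
max_cap = max(capacities)

for cap in capacities:
    # integer unit cost next to the exact (float) ratio
    print(cap, integer_unit_cost(max_cap, cap), max_cap / cap)
```

The largest channel always gets unit cost 1, and smaller channels get proportionally larger integer costs.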


To mitigate the second problem, instead of using the same cost function on the entire channel we split the channel into $N$ segments (in our case of equal size, to prove a point about runtime). From an approximation perspective one might want to use the optimal piecewise linear approximation, which can be found via: http://www.iaeng.org/publication/WCECS2008/WCECS2008_pp1191-1194.pdf. So when building the linear approximation of the **uncertainty network**, instead of adding one channel with capacity $c$ for each channel we add $N$ channels, each of capacity $\frac{c}{N}$. The unit cost of the $i$-th channel (counting from $i = 0$) increases via the following formula:

$L_{c,i}(a) = L_{c}(a)\cdot(i+1)$
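The segment construction can be sketched for a single channel as follows. The helper names `segment_arcs` and `piecewise_cost` and the toy numbers are assumptions for illustration. The sketch fills the cheapest segments first, which is also what a min cost flow solver ends up doing on parallel arcs with increasing unit costs.

```python
def segment_arcs(cap, max_cap, n):
    """Return n (capacity, unit_cost) pairs for one channel."""
    base = max_cap // cap
    return [(cap // n, (i + 1) * base) for i in range(n)]

def piecewise_cost(amount, arcs):
    """Cost of sending `amount` by filling the cheapest segments first."""
    cost = 0
    remaining = amount
    for seg_cap, unit_cost in arcs:
        used = min(remaining, seg_cap)
        cost += used * unit_cost
        remaining -= used
    if remaining > 0:
        raise ValueError("amount exceeds the modelled capacity")
    return cost

# hypothetical channel of 1000 sats in a network with max capacity 10000 sats
arcs = segment_arcs(cap=1_000, max_cap=10_000, n=5)
print(arcs)
print(piecewise_cost(500, arcs), piecewise_cost(1_000, arcs))
```

Sending twice the amount costs more than twice as much, which is exactly the convex behavior that discourages saturating a channel.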

### Motivation of this choice for the cost function on the piecewise linear segments
When using a linear min cost flow solver the unit cost can be seen as the derivative of the cost function. With the formula $L_{c,i}(a) = L_{c}(a)\cdot(i+1)$ the unit cost increases linearly from one interval of the channel to the next. This effectively behaves like a piecewise linear approximation of a quadratic cost function.

Of course in practice one would approximate the derivative of the negative log probabilities at the positions where the intervals are created, and would also not use intervals of equal size.
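A rough sketch of what such derivative-based unit costs could look like is given below. It uses $f_c'(a) = \frac{1}{c+1-a}$, evaluates it at the midpoint of each segment and scales the result to an integer. The choice of midpoints, the `scale` factor and the function name are assumptions for illustration, and the sketch still uses segments of equal size.

```python
def derivative_unit_costs(cap, n, scale):
    """Integer unit costs from f_c'(a) = 1 / (c + 1 - a) at segment midpoints."""
    seg = cap / n
    costs = []
    for i in range(n):
        midpoint = (i + 0.5) * seg
        # scale the float derivative so that rounding keeps enough resolution
        costs.append(round(scale / (cap + 1 - midpoint)))
    return costs

# hypothetical channel of 1000 sats split into 5 segments
costs = derivative_unit_costs(cap=1_000, n=5, scale=1_000_000)
print(costs)
```

Unlike the linearly increasing costs used in this notebook, these unit costs grow sharply in the last segments, mirroring how the negative log probability blows up near saturation.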

As this code is meant to demonstrate the feasibility of the runtime (the linearized model of the **uncertainty network** on which we compute has $N$ times as many edges as the convex problem), this very pragmatic and easy to implement choice is good enough.


## Warning: This code DOES NOT 
* include optimization for routing fees (in the case of parallel channels it does not even account for the paid fees properly)
* include the round based algorithm on the uncertainty network which learns conditional probabilities from attempted onions
* include the dissection of the flow into paths (which is conceptually straightforward)
* care for HTLC limits, channel reserves and the like (as all of that is more engineering level)
* use the optimal piecewise linear approximation for the convex cost function (as described here: http://www.iaeng.org/publication/WCECS2008/WCECS2008_pp1191-1194.pdf)
* properly handle parallel public channels (actually it just virtually combines the capacity, which from a probabilistic perspective makes a hell of a lot of sense)
* include a simulation of the round based algorithm
* make any mainnet test payments
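To give an idea of the missing dissection of a flow into paths, here is a minimal sketch. It is not part of this notebook's code. It assumes the flow is given as a hypothetical dictionary `{(u, v): amount}`, that flow is conserved at every intermediate node, and that the flow contains no cycles (which holds for an optimal solution when all unit costs are positive).

```python
def decompose_flow(flow, source, sink):
    """Split a flow {(u, v): amount} into a list of (path, amount) pairs.

    Assumes flow conservation and an acyclic flow; otherwise the walk
    below may stop early or not terminate.
    """
    residual = {edge: amt for edge, amt in flow.items() if amt > 0}
    paths = []
    while True:
        # walk from source towards sink along edges that still carry flow
        path = [source]
        node = source
        while node != sink:
            nxt = next((v for (u, v) in residual if u == node), None)
            if nxt is None:
                break
            path.append(nxt)
            node = nxt
        if node != sink:
            break  # no more flow leaves the source
        edges = list(zip(path, path[1:]))
        bottleneck = min(residual[e] for e in edges)
        # remove the path's amount from every edge on the path
        for e in edges:
            residual[e] -= bottleneck
            if residual[e] == 0:
                del residual[e]
        paths.append((path, bottleneck))
    return paths

# hypothetical flow of 5 units from "s" to "t" over two routes
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 3, ("b", "t"): 2}
print(decompose_flow(flow, "s", "t"))
```

Each resulting pair would correspond to one onion to send, with the amount given by the bottleneck along its path.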

In [1]:
import json
import time

#using Google's linear min cost flow solver as an external dependency for convenience.
# it seems to use a cost scaling algorithm internally. Find more information
# in their API doc at:  https://developers.google.com/optimization/reference/graph/min_cost_flow
from ortools.graph import pywrapgraph


Next we set a few global variables. Global because our entire code is basically one script with not even 100 lines

In [2]:
#to map node_ids to the range [0,...,#number of nodes] and vice versa
node_key_to_id = {}
id_to_node_key = {}


#this will become the list of arcs (src, dest, capacity, unit_cost) of the linearized network
arcs = []

#used to store the capacity of channels
channel_graph = {}

#used to store fees as a tuple (base_fee_msat,ppm)
fee_graph = {}

Set a few parameters for the experiment. Some of these numbers might heavily impact runtime.

In [3]:
#quantizing payments sets a lower bound on sent HTLCs and speeds up the computation a bit
#set this to 1 if it is to be turned off
QUANTIZATION = 10000

#Rene's node
SRC = "03efccf2c383d7bf340da9a3f02e2c23104a0e4fe8ac1a880c8e2dc92fbdacd9df"
#loop node
DEST = "021c97a90a411ff2b10dc2a8e32de2f29d2fa49d41bfbb52bd416e460db0747d0d"
AMT = 10*1000*1000 #0.1 Bitcoin (in sats)
#number of piecewise linear segments per channel. Increasing this directly increases runtime but also improves accuracy
N = 5
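The effect of these parameters on the size of the problem can be sketched as follows. The helper `arc_capacity` is an illustrative assumption that mirrors how capacities are divided by `N*QUANTIZATION` when the arcs are built further down.

```python
QUANTIZATION = 10_000
AMT = 10 * 1000 * 1000  # sats
N = 5

def arc_capacity(cap_sat, n=N, quantization=QUANTIZATION):
    """Capacity of one segment arc, in units of `quantization` sats."""
    return cap_sat // (n * quantization)

# the solver only sees quantized units instead of single sats
units_to_send = AMT // QUANTIZATION
print(units_to_send)

# a 16777215 sat channel yields 5 arcs of 335 units each
print(arc_capacity(16_777_215))

# channels smaller than N*QUANTIZATION sats get arcs of capacity 0
print(arc_capacity(49_999))
```

So with these settings the solver routes 1000 units, and channels below 50000 sats effectively drop out of the model.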

Define two helper functions for depicting results. We want to be able to compute the actual probability of the flow, and we also want to know what the flow (if fully successful) would cost.

In [4]:
def uniform_probability(a,s,d):
    """
    Computes the uniform probability of a payment of amount `a` on a channel s-->d
    """
    c = channel_graph[s][d]
    return float(c+1-a)/(c+1)

def fee_msat(a,s,d):
    """
    Computes the fees (in msat) of a payment of amount `a` (in sats) on a channel s-->d
    """
    base, rate = fee_graph[s][d]
    # note we divide ppm by 1000 to be compatible with base_fee which is measured in msat and not sats
    return base + a*rate/1000
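The unit handling in the fee computation is easy to get wrong, so here is a standalone sketch of the same formula that does not depend on the global `fee_graph`. The function name `fee_msat_for` and the fee parameters are hypothetical.

```python
def fee_msat_for(amount_sat, base_fee_msat, ppm):
    """Fee in msat for `amount_sat` sats: base fee plus proportional part.

    amount_sat * 1000 converts to msat and ppm / 1_000_000 is the rate,
    which together simplifies to amount_sat * ppm / 1000.
    """
    return base_fee_msat + amount_sat * ppm / 1000

# hypothetical channel policy: 1000 msat base fee and 5000 ppm
fee = fee_msat_for(3_300_000, base_fee_msat=1_000, ppm=5_000)
print(fee, "msat =", fee / 1000, "sats")
```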

In [5]:
def import_channels():
    """
    this does all the magic! it imports the channel_graph from c-lightning's listchannels command
    
    it first passes through the channels to find all node ids and the max capacity
    in a second pass it goes over all channels and adds arcs to the modelled linearized network
    for each channel N arcs are added with increasing unit costs to mimic convex behavior
    the piecewise dissection is not optimal nor is the linear approximation of negative log probs exact
    however this does not matter for the sake of argument: the runtime will not change if the costs
    are chosen a bit more optimally, but the code would blow up, hence these simplifications
    """
    # use a context manager so that the file is closed after loading
    with open("listchannels20211028.json") as f:
        channels = json.load(f)["channels"]

    #let's first find the max channel capacity and all node_ids 
    # so that we can build the look up table and use integer unit costs
    max_cap = 0
    node_ids = set()
    for c in channels:
        src = c["source"]
        dest = c["destination"]
        node_ids.add(src)
        node_ids.add(dest)
        cap = c["satoshis"]
        if cap>max_cap:
            max_cap = cap
    
    print("Max capacity is: ", max_cap)
    
    # let's initialize the look up tables for node_ids to integers from [0,...,#number of nodes]
    for k, node_id in enumerate(node_ids):
        node_key_to_id[node_id]=k
        id_to_node_key[k]=node_id

    
    # initialize global channel_graph and fee_graph data structures 
    global channel_graph
    channel_graph={node_key_to_id[n]:{} for n in node_ids}
    global fee_graph
    fee_graph={node_key_to_id[n]:{} for n in node_ids}

    #max_cap = 100*max_cap
    global arcs
    arcs = []
    for c in channels:
        src = node_key_to_id[c["source"]]
        dest = node_key_to_id[c["destination"]]
        cap = c["satoshis"]
        
        # we put channels into the channel_graph data structure
        # in case of parallel channels we combine capacity into 1 channel
        # from a probabilistic point of view (which we are interested in) this is correct
        if dest in channel_graph[src]:
            channel_graph[src][dest]+=cap
        else:
            channel_graph[src][dest]=cap
            
        # FIXME: this ignores fees of parallel channels. Ok for us as this is not our main concern
        fee_graph[src][dest] = (c["base_fee_millisatoshi"],c["fee_per_millionth"])

        # integer unit cost floor(max_cap/cap), see the formula for L_c(a) above
        unit_cost = max_cap // cap
        #recall: N is the number of piecewise linear segments of our cost function
        #FIXME: use optimal linear approximation e.g.: http://www.iaeng.org/publication/WCECS2008/WCECS2008_pp1191-1194.pdf
        # so for each channel we add N arcs with c/N capacity and increasing unit cost to mimic convex nature
        for i in range(N):
            #arc format is src, dest, capacity, unit_cost
            # THIS IS THE IMPORTANT LINE OF CODE WHERE THE MAGIC HAPPENS
            # capacities are measured in units of QUANTIZATION sats
            arcs.append((src,dest,cap // (N*QUANTIZATION),(i+1)*unit_cost))

import_channels()

Max capacity is:  1400000000


## Invoking the min cost flow solver
Now that we have created the model of the linearized uncertainty network we have to plug it into a linear min cost flow solver. The following code is basically an adaptation of the example in the Google Operations Research API doc, which can be found at https://developers.google.com/optimization/flow/mincostflow

In [6]:
def main():
    # Instantiate a SimpleMinCostFlow solver.
    min_cost_flow = pywrapgraph.SimpleMinCostFlow()
    
    # Add each arc.
    for arc in arcs:
        min_cost_flow.AddArcWithCapacityAndUnitCost(arc[0], arc[1], arc[2],
                                                    arc[3])

    # Set node supply to 0 for all nodes
    for i in id_to_node_key.keys():
        min_cost_flow.SetNodeSupply(i, 0)
    #add amount to sending node
    min_cost_flow.SetNodeSupply(node_key_to_id[SRC],int(AMT/QUANTIZATION))
    #add -amount to recipient node
    min_cost_flow.SetNodeSupply(node_key_to_id[DEST],-int(AMT/QUANTIZATION))
    
    
    # Find the min cost flow.
    print("Deliver {:4.2f} BTC from".format(AMT/100./1000/1000), node_key_to_id[SRC],
          "to", node_key_to_id[DEST])

    #only put the solver in the time computation.
    #building the network can be done while channels are announced on gossip
    #the arcs do in practice not change (unless one uses the uncertainty network but even that is cheap)
    start = time.time()
    status = min_cost_flow.Solve()
    end = time.time()
    print("Runtime of flow computation: {:4.2f} sec ".format(end-start))
    
    if status != min_cost_flow.OPTIMAL:
        print('There was an issue with the min cost flow input.')
        print(f'Status: {status}')
        exit(1)
    
    
    ##
    # From here just printing of results
    ##
    
    print('Minimum approximated quadratic cost: ', min_cost_flow.OptimalCost())
    #print('')
    print(' Arc \t\t\t      Flow / Capacity \tprobability \tFee (sats)')
    probability = 1
    total_flow = {}
    for i in range(min_cost_flow.NumArcs()):
        if min_cost_flow.Flow(i) == 0:
            continue
        src = min_cost_flow.Tail(i)
        dest = min_cost_flow.Head(i)
        # convert the flow from quantized units back to sats
        flow = min_cost_flow.Flow(i)*QUANTIZATION
        
        key = str(src)+":"+str(dest)
        if key in total_flow:
            total_flow[key]=(src,dest,total_flow[key][2]+flow)
        else:
            total_flow[key]=(src,dest,flow)
    
    total_fee = 0
    for k,value in total_flow.items():
        src,dest,flow = value
        u_prob=uniform_probability(flow,src,dest)
        fee = fee_msat(flow,src,dest)
        total_fee += fee/1000
        print('%1s -> %1s     \t  %3s / %3s \t%3f\t%3f' %
              (src, dest,flow, channel_graph[src][dest], u_prob,fee / 1000))
        #break
        probability *= u_prob
    print("Probability of Flow: ", probability)
    print("Probability of Flow: ", probability)
    # total_fee and AMT are both in sats, so the rate in percent is 100*total_fee/AMT
    print("Total fee: {}, rate: {:5.3f} %".format(total_fee,100*total_fee/AMT))
    print("arcs included: ", len(total_flow))
main()

Deliver 0.10 BTC from 7899 to 952
Runtime of flow computation: 0.75 sec 
Minimum approximated quadratic cost:  99330
 Arc 			      Flow / Capacity 	probability 	Fee (sats)
7899 -> 9097     	  3350000 / 16777215 	0.800324	4478.950000
5644 -> 952     	  3300000 / 200000000 	0.983500	16501.000000
388 -> 7647     	  3300000 / 1000000000 	0.996700	990.000000
7899 -> 388     	  3300000 / 16777215 	0.803305	4412.100000
7899 -> 6390     	  3350000 / 16777215 	0.800324	4478.950000
4687 -> 1662     	  3350000 / 3900000000 	0.999141	4.350000
13131 -> 5644     	  3300000 / 200000000 	0.983500	412.501000
9097 -> 10600     	  3350000 / 500000000 	0.993300	418.751000
7647 -> 13131     	  3300000 / 500000000 	0.993400	8250.000000
6390 -> 4687     	  3350000 / 300000000 	0.988833	837.500000
1662 -> 952     	  3350000 / 200000000 	0.983250	10047.650000
10600 -> 952     	  3350000 / 100000000 	0.966500	7608.850000
Probability of Flow:  0.4595639284629304
Total fee: 58440.602, rate: 0.584 %
arcs included: