path-finding.asciidoc

newly suggested structure

Many paths lead to Rome. Which one to choose?

Chapter overview:
  • some with enough liquidity

  • some that are cheap

  • some that don’t lock liquidity for too long → an optimization problem (these tend to be hard in general)

What are good features

  • fees should be cheap

  • success probabilities

    • Why is liquidity a problem? Why do channel balances have to be private and what is the problem with that?

      • if balances were not private, nodes would have to tell everyone else about every channel update (which happens with every payment) → scales as poorly as Bitcoin

      • it is also nice for privacy

    • How to obtain them?

    • without quantifying the uncertainty this is like being blind without help and trying to walk around.

  • other features

    • CLTV to minimize locked capital in bad cases

    • provenance scores

Generating candidate paths by solving / estimating the optimization problem

  • Dijkstra for single paths

  • Backward computation because of fees

Trial and Error Loop

  • the loop

    • Generate candidate paths

    • try to send

    • update knowledge of the uncertainty network

  • question: how long to remember the knowledge

  • predict expected number of attempts

  • issues: success rates drop with larger amounts → expected number of attempts rises

Can we do better than single paths

  • Pros and cons: (+) smaller payments, (-) more channels involved

  • math says it is still an optimization problem or a min cost flow problem.

  • depending on the goal algorithms do exist.

  • Well-known analogy from logistics: How do you transport x goods from A to B with the least cost of transport?

Perspective of Routing nodes

  • provide a service to earn a fee

  • how to improve the service?

    • increase liquidity / provide more liquidity

      • more channels → better liquidity

      • larger channels → higher reliability

      • the above goals (more channels, larger channels) are in conflict for a fixed amount of liquidity, how to solve? → Depends on goals.

    • increase uptime

    • increase reliability via rebalancing

      • proactively

      • lazy (JIT - Routing)

    • offer outsourcing of routing for light clients via trampoline payments

old stuff

Relevant questions to answer:

  • What is packet switching? What is circuit switching? Which one does LN use today?

  • In the abstract what is path finding?

  • What is Dijkstra’s algorithm? What modifications need to be made to apply it to this domain?

  • Why must path finding happen backwards (receiver to sender)?

  • How is the information contained in a channel update used in path finding?

  • How can errors sent during payment routing help the sender to narrow their search space?

  • What is payment splitting? How does it work? What alternatives exist?

  • What information can be sent to intermediate and the final node aside from the critical routing data?

  • What are multi-hop locks? What additional privacy and security guarantees do they offer?

  • How can the flexible onion space be used to enable packet switching in the network?

Finding a path

Payments on the Lightning Network are forwarded along a path of channels from one participant to another. Thus, a path of payment channels has to be selected. If we knew the exact balance of every channel, we could easily compute one or more payment paths using any of the standard path finding algorithms taught in computer science programs. Strictly speaking, once we consider multipath payments, this is a flow problem rather than a path finding problem. Since flows consist of several paths, we nevertheless talk about path finding for convenience. If exact information about channel balances were available, we could solve those problems in a way that minimizes the fees the payer has to pay to the nodes forwarding the payment.

However, as discussed, the balance information of all channels cannot be available to all participants of the network. Thus, we need one or more innovative path finding strategies. These strategies must relate closely to the routing algorithm that is used. As we will see in the next section, the Lightning Network uses a source-based onion-routing protocol for routing payments. In such a protocol it is the responsibility of the sender, i.e. the payer, to find a path through the network. With only partial information about the network topology available this is a real challenge, and active research is still being conducted into optimizing this part of the Lightning Network implementations. The fact that the path finding problem in the Lightning Network is not fully solved is a major point of criticism of the technology.

The path finding strategy currently implemented in Lightning nodes is to probe paths until one is found that has enough liquidity to forward the payment. While this is not optimal and leaves ample room for improvement, it should be noted that even this simplistic strategy works well. This probing is done by the Lightning node or wallet and is not directly seen by the user of the software. The user might suspect that probing is taking place if the payment is not going through instantly. The current algorithm also does not necessarily result in the path with the lowest fees.

What is "Source-Based" routing and why does the Lightning Network use it?

Source-based routing is a method of path-finding where the sender, i.e. the source, plans the path from itself, through the intermediary nodes, to the final destination. Once a path has been found and selected, the sender sends the payment to the first intermediary node, who sends it to the second intermediary node, and so on until it reaches the destination. While a payment is traveling along a path, the path typically does not get changed by any of the intermediary nodes, even if a shorter path or a cheaper path (in terms of routing fees) exists.

One of the reasons the Lightning Network uses source-based routing is to protect user privacy. As discussed in the chapter on Onion Routing, the intermediary nodes transmitting the payment are not aware of the full path of the payment. They only know the node they received it from and the node they are sending it to.

The destination, i.e. the payment recipient, is less able to find a good path. Even if it specifies a path in the invoice, that path may no longer be viable by the time the invoice is paid, which could be several minutes or several days later. The recipient can, however, specify "routing hints" in the invoice to assist the sender in finding a possible path.

On the other hand, source-based routing comes with some inherent drawbacks. The sender chooses the path based on its current understanding of the topological map of the Lightning network. As discussed in previous chapters, this map is necessarily incomplete. The sender cannot be aware of all the channels. And even if it is aware of them, it will not always know their latest balances. The balances of channels change with every payment. Consequently, any topological knowledge becomes obsolete in a short space of time. The standard path finding mechanism in source-based onion-routing that is implemented in all Lightning Network implementations is the following:

  1. Given the limited local topological knowledge the sender tries to find one or more routing paths.

  2. Select an arbitrary path of payment channels which satisfies 3 conditions:

    • path connects sender and receiver of the payment,

    • all channels on path have a presumed capacity of at least the payment amount,

    • all channels on path accept HTLCs of the payment amount.

  3. Construct the "onion" from destination to sender according to the metadata of the channels (base fee, fee rate, CLTV delta).

  4. Send out the "onion" and expect one of two possible results returned:

    • Preimages are returned by nodes if the payment settles successfully

    • Error is returned if the payment fails.

  5. If the payment settles, the sender updates its topological knowledge based on this new information for future payments. The algorithm terminates.

  6. If the payment fails, the sender updates its topological knowledge based on this new information. It then selects a different path and starts the process again from the beginning.

This means that with every attempted payment nodes actually probe the network and also learn some information about how balances are distributed. Implementations will usually prioritise cheaper paths or exclude channels which have recently failed. In that sense the selection is not completely arbitrary. Even with such primitive heuristics in place it could still be considered a random process or a random walk through the channel graph. There can be several reasons why a payment may fail along the way. Reasons for failure include: a routing node became unreachable, a routing channel no longer has the required balance, a routing node doesn’t accept new HTLCs, the owner of a channel increased the channel fees, or the channel was closed in the interim. Furthermore, there is no guarantee that the route chosen was the cheapest in terms of fees or the shortest in terms of channels involved. At the time of writing this book, this is a design trade-off made to protect user privacy.
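The trial-and-error loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `pay_by_trial_and_error`, `try_send` and the toy channel names are hypothetical and do not correspond to any implementation's API, and the only knowledge the sender keeps is a set of channels that have already failed.

```python
def pay_by_trial_and_error(candidate_paths, try_send, max_attempts=10):
    """Try candidate paths in order until one settles.

    try_send(path) is a hypothetical callback returning (True, None) on
    success or (False, failed_channel) when an error comes back.
    """
    failed_channels = set()  # knowledge gathered from earlier attempts
    attempts = 0
    for path in candidate_paths:
        if attempts >= max_attempts:
            break
        # Skip paths that reuse a channel which already failed.
        if any(channel in failed_channels for channel in path):
            continue
        attempts += 1
        settled, failed_channel = try_send(path)
        if settled:
            return path, attempts
        failed_channels.add(failed_channel)
    return None, attempts


# Toy network (assumption): which channels can currently forward the amount.
can_forward = {"A-B": True, "B-D": False, "A-C": True, "C-D": True, "B-C": True}


def simulated_send(path):
    """Stand-in for sending an onion: fails at the first depleted channel."""
    for channel in path:
        if not can_forward[channel]:
            return False, channel
    return True, None


paths = [["A-B", "B-D"], ["A-C", "B-C", "B-D"], ["A-C", "C-D"]]
result_path, attempts = pay_by_trial_and_error(paths, simulated_send)
```

In this toy run the first path fails at channel B-D, the second path is skipped because it reuses that channel, and the third path settles on the second actual attempt.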

Paths are constructed from destination to source

Let us use our standard example in which Alice wants to send a payment of 100k satoshi on a path via Bob and Chan to Dina. The path looks like (Alice)→(Bob)→(Chan)→(Dina). Bob and Chan will charge routing fees to forward the onion. As you already know, nodes can charge two types of fees. First, the base fee will be charged for any successful forwarding and settlement of an HTLC. This fee is constant and independent of the amount that the node is forwarding. Secondly, nodes might charge a fee rate which is proportional to the forwarded amount. For simplicity, assume that Bob and Chan charge rather expensive fee rates of 1% and 2%, respectively, and that neither of them takes a base fee. When Alice constructs the onion she has to include the routing fees as the difference between the incoming HTLC and the outgoing HTLC. Let us assume she computes the routing fees for the onion incorrectly. Alice knows that 1% of 100k satoshi is 1k satoshi, which she believes she should include in Bob’s onion. Similarly, she knows that 2% of 100k satoshi is 2k satoshi, which she believes she should include in Chan’s onion. An inexperienced Alice would therefore believe her total fee to be 3k satoshi. But she is wrong. Look at the incorrect onion from our naive Alice. Bob would reject this onion.

{
   "route": [
      {
         "id": "Bob",
         "channel": "357",
         "direction": 1,
         "satoshi": 103000,
         "forward": 102000,
         "delay": 187
      },
      {
         "id": "Chan",
         "channel": "74",
         "direction": 1,
         "satoshi": 102000,
         "forward": 100000,
         "delay": 183
      },
      {
         "id": "Dina",
         "channel": "452",
         "direction": 0,
         "satoshi": 100000,
         "delay": 153
      }
   ]
}

The reason Bob does not forward the onion is that he expects the incoming amount to be 1% larger than the amount he is supposed to forward. Thus he would like to receive an incoming amount of 103,020 satoshi (102,000 + 1%), which is 20 satoshi more than our uninformed Alice actually sent him. According to his fee schedule, Bob will reject this onion. If Alice had constructed the onion from the destination towards the source, she would have started with 100k satoshi for Dina. In the next step she would have added Chan’s 2% fee to compute 102k for Chan’s input. In the last step she would have applied Bob’s fee (1%) to 102k to derive 102k + 1,020 satoshi. That makes a total of 103,020 satoshi that she needs to send to Bob. As the routing fees can increase the amount that is being forwarded even beyond the capacity of small channels, it makes sense to start the construction of the onion and the path finding at the destination and work from the destination back towards the sender.
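The backward computation can be written down as a short Python sketch. The function name `amounts_backward` and the tuple layout are assumptions made for this illustration, and fee rates are expressed as integer parts per million (1% = 10,000 ppm) to avoid floating point rounding.

```python
def amounts_backward(final_amount, hops):
    """Compute the amount each node on the path must receive.

    hops lists the forwarding nodes from the sender's side to the
    destination's side as (name, base_fee_sat, fee_rate_ppm).
    The result is ordered the same way and ends with the final amount.
    """
    amounts = [final_amount]
    for _name, base_fee, fee_rate_ppm in reversed(hops):
        outgoing = amounts[0]  # what this node has to forward
        fee = base_fee + outgoing * fee_rate_ppm // 1_000_000
        amounts.insert(0, outgoing + fee)  # what this node has to receive
    return amounts


hops = [("Bob", 0, 10_000), ("Chan", 0, 20_000)]  # 1% and 2%, no base fee
amounts = amounts_backward(100_000, hops)

# Naive Alice applies both percentages to the final amount instead.
naive_total = 100_000 + 100_000 * 10_000 // 1_000_000 + 100_000 * 20_000 // 1_000_000
shortfall = amounts[0] - naive_total
```

Here `amounts` is `[103020, 102000, 100000]`, and `shortfall` is the 20 satoshi that are missing from naive Alice's onion.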

Note

Onions are constructed from the inside to the outside. Hence, onions are built starting with the destination. However, this is not the reason why path finding has to start with the destination node.

Fundamentals about path finding

Finding a path through a graph is a problem modern computers can solve rather efficiently. Developers mainly choose breadth-first search if the edges are all of equal weight. In cases where the edges are not of equal weight, Dijkstra’s algorithm is used. In our case the weights of the edges could represent the routing fees. Only edges with a capacity larger than the amount to be sent will be included in the search. In this basic form, pathfinding in the Lightning Network is very simple and straightforward. However, as we have already discussed in the introduction, channel balances cannot be shared with every participant every time a payment takes place, as this would prevent scaling the network. This turns our easy theoretical computer science problem into a rather complex real-world problem. We now have to solve a pathfinding problem with only partial knowledge. For example, we suspect which edges might be able to forward a payment because their capacity seems big enough. But we can’t be certain unless we try it out or ask the channel owners directly. Even if we were able to ask the channel owners directly, their balance might change by the time we have asked others, computed a path, constructed an onion and sent it along. Not only do we have limited information, but the information we have is highly dynamic and might change at any point in time without our knowledge.

One general observation that everyone can easily make is that if every node along a path is able to forward a certain amount of satoshis, these nodes will also be able to forward a lower amount of satoshis. This is why many people intuitively believe that multipath payments might be a good strategy. Instead of finding one path where every node has a large amount of liquidity, the task is split into smaller ones. Another reason is of course that the sender of a payment might not have the amount they wish to send available in one single channel but distributed over several of their channels. We leave it to later sections of this chapter to discuss the strengths and weaknesses of multipath payments. We simply note that multipath payments are equivalent to finding a flow between the source and the destination. Finding flows in a static graph with full knowledge is computationally marginally more expensive than computing a shortest path. On the other hand, given the dynamic reality of the Lightning Network and the fact that we do not need to compute a maximum flow, it is currently not known whether the flow problem is more or less difficult than finding a path. Both problems seem to have about the same difficulty, and the problems are partially related, as we will see in the following sections.
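As a minimal sketch of this basic form, the following Python code runs Dijkstra's algorithm backward from the destination, uses the accumulated amount (payment plus fees) as the distance, and ignores channels whose capacity is too small. The tuple layout, the function name `cheapest_path` and the example network (including the node Erica and all capacities) are assumptions for illustration. The sketch treats capacity as if it were usable balance, which is exactly the simplification that does not hold on the real network.

```python
import heapq


def cheapest_path(channels, source, destination, amount):
    """Search backward from the destination for the cheapest path.

    channels: list of (from_node, to_node, capacity_sat, base_fee_sat,
    fee_rate_ppm); the fee is charged by from_node for forwarding.
    Returns (amount_the_source_must_send, path) or None.
    """
    incoming = {}
    for frm, to, cap, base, ppm in channels:
        incoming.setdefault(to, []).append((frm, cap, base, ppm))

    best = {destination: amount}  # amount that must arrive at each node
    next_hop = {}
    queue = [(amount, destination)]
    while queue:
        needed, node = heapq.heappop(queue)
        if needed > best.get(node, float("inf")):
            continue  # outdated queue entry
        if node == source:
            break
        for frm, cap, base, ppm in incoming.get(node, []):
            if cap < needed:
                continue  # channel cannot carry the amount
            # The sender does not pay a fee to itself.
            fee = 0 if frm == source else base + needed * ppm // 1_000_000
            total = needed + fee
            if total < best.get(frm, float("inf")):
                best[frm] = total
                next_hop[frm] = node
                heapq.heappush(queue, (total, frm))

    if source not in best:
        return None
    path, node = [source], source
    while node != destination:
        node = next_hop[node]
        path.append(node)
    return best[source], path


channels = [
    ("Alice", "Bob", 500_000, 0, 0),
    ("Bob", "Chan", 500_000, 0, 10_000),
    ("Chan", "Dina", 500_000, 0, 20_000),
    ("Alice", "Erica", 500_000, 0, 0),
    ("Erica", "Dina", 50_000, 0, 1_000),  # cheaper, but too small for 100k
]
result = cheapest_path(channels, "Alice", "Dina", 100_000)
```

Searching backward means the amount at every node already includes the fees of all later hops, so the capacity check is made against the amount the channel would really have to carry.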

Probing-based pathfinding algorithm on the Lightning Network

In order to deterministically find a path nodes would need to know the balances of remote payment channels and these balances would have to be static. As this is not the case in the Lightning Network, nodes use a probing-based algorithm. In its most basic form the algorithm works as follows:

  1. Select a random path to the destination node

  2. Construct and send the onion

  3. wait for the response of the onion

  4. If response is a valid preimage, then routing was successful and the algorithm terminates.

  5. If response is a failure notification, then start over from step 1.

Nodes will use various sources of information to improve the selection of a random path. The main source of information is the gossip protocol. From the gossip protocol a node learns which other nodes exist and which channels have been opened. This provides a view of the network that can be used to run graph algorithms that generate plausible paths. One fitting algorithm is breadth-first search. The graph algorithm will usually be constrained to channels whose capacity exceeds the payment amount. In practice, due to the channel reserve and the assumption that the capacity in the channel will not be sitting completely on one side, it is smarter to prefer larger channels.

The second source of information is the blockchain itself. Channel closings are not announced via the gossip protocol. However, as the funding transaction is encoded by the short channel id of the channel and as it will be spent on closing the channel, nodes can use this on-chain information to update their knowledge about the network of channels.

Past payments form a third source of information. Onions can return with errors. Knowing, for example, that the third hop along a path returns an error of insufficient balance means that the first two channels had enough balance and that the third channel did not. In general, edges with errors can be removed from the set of edges, similarly to the edges with insufficient capacity. Nodes can accumulate knowledge and update it with every failed or successful payment attempt. It is important that nodes are careful with this data. While the capacity information of channels from the gossip protocol and the blockchain data is verifiably correct, the data returned in failed onions can be incorrect. Nodes might simply send an error back because they do not want to reveal balance information. Besides, channel data continuously changes over time as the Lightning Network is very dynamic. This implies that nodes should only use such data if it is not too old, or use it only with limited confidence. As time advances this information becomes stale and the confidence in it diminishes.

The fourth source of information that the node can use is the routing hints in BOLT 11 invoices. Remember that a regular payment process starts with the person who wants to receive money producing a random secret and hashing it to derive the payment hash. This hash is usually transported to the sender via an invoice. Invoices typically contain some metadata including some routing hints. This is imperative if the person who wants to be paid does not have announced channels. In that case some unannounced channels will be specified within the invoice. Otherwise the payer would not even be able to find a path to the "hidden" destination node. Routing hints might also be used by the receiving node to indicate which public channels have enough inbound capacity to forward the payment. In general, the longer a payment path is, the more likely it becomes that a channel with insufficient balance is selected. Thus, receiving hints from the receiver indicating on which channels it wishes to receive funds is definitely helpful for the sender.
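One way to keep such knowledge with limited confidence is to store each observation together with a timestamp and to ignore it once it is too old. The following sketch is an assumption for illustration only: the class `BalanceKnowledge`, its method names and the one-hour default are hypothetical, and real implementations use more elaborate scoring.

```python
import time


class BalanceKnowledge:
    """Remembers which channels recently failed for which amount."""

    def __init__(self, max_age_seconds=3600):
        self.max_age = max_age_seconds  # arbitrary default for this sketch
        self.too_small = {}  # channel -> (failed amount, time observed)

    def record_failure(self, path, failed_index, amount, now=None):
        """Store that the channel at failed_index could not forward amount."""
        now = time.time() if now is None else now
        self.too_small[path[failed_index]] = (amount, now)

    def is_usable(self, channel, amount, now=None):
        """Return False only if a fresh observation rules the channel out."""
        now = time.time() if now is None else now
        entry = self.too_small.get(channel)
        if entry is None:
            return True
        failed_amount, seen_at = entry
        if now - seen_at > self.max_age:
            return True  # observation is stale, so it is ignored
        return amount < failed_amount


knowledge = BalanceKnowledge(max_age_seconds=600)
# The third channel on the path reported insufficient balance for 50k sat.
knowledge.record_failure(["c1", "c2", "c3"], 2, 50_000, now=1_000)
```

A smaller amount may still pass through a channel that failed for a larger one, which is why the check compares amounts instead of banning the channel outright.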

Improvements on source-based onion-routing

The probing-based approach that is used in the Lightning Network has several shortcomings. Sending out an onion takes a certain amount of time. The time depends on how many hops the onion is supposed to be forwarded, on the speed of nodes processing the onion, and on the topology of the network. In the following diagram you can see how the round-trip time for onions in general increases with the number of hops that the onion has encoded.

Research shows that the onion round-trip time depends on the distance (CC-BY-SA Tikhomirov, Sergei & Pickhardt, Rene & Biryukov, Alex & Nowostawski, Mariusz. (2020). Probing Channel Balances in the Lightning Network.)

probingtimes

This diagram is just a snapshot from an experiment in early 2020 and results might change. We learn from the diagram that payments can take several seconds while the node probes several paths. This is due to the fact that a single onion can easily take a few seconds to return and a sender might have to send several onions sequentially while probing for a successful path. This is still much faster than waiting for confirmations on the Bitcoin blockchain, but it is not performant enough in an environment where payments need to settle fast. People standing in a line at the grocery store cash register prefer not to wait several seconds. Thus, Lightning developers have come up with and implemented the following improvements to the probing algorithms. We are also hopeful that additional improvements and optimizations can be discovered in the future.

Improvements to probing

Nodes ordinarily probe the network when making a payment. But nothing prevents them from probing the network periodically. Instead of making a real payment, nodes could send out one or multiple fake payments. A fake payment is nothing but an onion with a random payment hash. Given the properties of the hash function, it is safe to assume that nobody knows the preimage. If the payment amount is small enough, a fake payment will fail at the destination, and this allows the sending node to learn about the balances on the path. There are clear downsides to this approach. It produces spam and heavy network load, and therefore this behaviour is discouraged. However, participants cannot easily be stopped from doing this. Channel partners can detect this type of abuse by observing frequent payments that always fail. As punishment, channel partners can decide to produce errors right away without providing balance information, or they can decide to close the abused channel.
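To make the difference between a real and a fake payment hash concrete, here is a small Python sketch. A real payment hash is the SHA-256 hash of a secret preimage chosen by the recipient, whereas a fake one is simply 32 random bytes for which nobody is expected to know a preimage. The function names are hypothetical.

```python
import hashlib
import os


def real_payment_hash(preimage):
    """Payment hash as the recipient derives it from its secret preimage."""
    return hashlib.sha256(preimage).digest()


def fake_payment_hash():
    """32 random bytes; finding a matching preimage is assumed infeasible."""
    return os.urandom(32)


preimage = os.urandom(32)  # normally generated and kept by the recipient
payment_hash = real_payment_hash(preimage)
probe_hash = fake_payment_hash()
```

An onion carrying `probe_hash` can never be settled, so it can only come back with an error.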

We want you to understand that Lightning Network by design does not have perfect privacy. While a lot of information is not easily accessible, every time a path is probed the node learns something about the state of the network at that point in time.

Please note that one should never send two onions at the same time with the same payment hash for which the recipient knows the preimage. As long as the onion is being processed and routed, the payment is out of the control of the sender. If two onions are sent at the same time, the recipient could very well release the preimage twice and get paid twice. This is the reason why arbitrary probing should be conducted with a fake, i.e. purely random, payment hash. With fake payment hashes the sender can probe concurrently as long as the sender has enough funds to pay for all the HTLCs. Successful probing does not guarantee that the following payment succeeds. Assume a fake onion returns indicating that the payment hash was unknown to the recipient but that the path was otherwise viable. The sender now uses the same path to send the payment with the correct payment hash. In the interim, the balance of a channel along the path changes, rendering the path unworkable. In this case the sender has to start all over again. Admittedly the risk of this happening is rather small, but the possibility exists.

A potential improvement has been outlined in a proposed mechanism called stuckless payments. The proposal received positive feedback from developers. It is unlikely that the mechanism will be implemented before the Lightning Network switches from Hashed Timelock Contracts (HTLCs) to Point Timelock Contracts (PTLCs). PTLCs in turn will only be implemented after Schnorr signatures are activated on the Bitcoin network. Stuckless payments give control back to the sender of an onion. We don’t explain the details here, but stuckless payments empower the sender to cancel an onion. This is great for redundant and concurrent pathfinding. The sender can now send out several real onions without fear of being charged multiple times. The first onion that arrives at the recipient will be settled. All others will be cancelled.

This increases the usability of the Lightning Network on several levels. One advantage is that the sender can try several paths at the same time. The second advantage is that the path is locked, i.e. reserved, after it is found until it is settled. This means that the sender can either cancel the onion or bring the onion to a successful conclusion. In particular, the probed path, once locked, cannot change or be used by other routing requests in the interim between probing and setting up the HTLCs that are used to fulfill the request. The found path remains reserved until it is cancelled or the payment is successfully completed. Using stuckless payments, the time for a successful payment will be reduced drastically.

The disadvantage is that the sender has to lock more bitcoin during the pathfinding process. Due to timeouts these bitcoin can remain locked for several days before being released again, although this should not happen too frequently. Another drawback is that the execution of this mechanism uses more resources of routing nodes.

Multipath payments

Everyone can easily make the following observation:

Let's say your node has discovered a path along which a certain amount of Satoshis can be routed.
If so, then any onion with a smaller amount of Satoshis can also be routed successfully along that path at the given time.
One can conclude that a smaller amount has a higher likelihood to be routed successfully to the destination than a larger amount.

This supposition sets aside some edge cases. In particular, the above observation might not hold true for very small amounts of Satoshis. Certain node operators might not be interested in routing small amounts because they consider them "not profitable enough". Node operators might weigh other node resources against the tiny profit of a small payment and simply reject payments below a given threshold or minimum. What counts as "small" and what to reject is defined by each operator according to their own preferences.

But for the general case, researchers and developers have already tested this postulate and confirmed it empirically multiple times.

With this assumption in mind it seems natural to split a payment amount and send several smaller payments along various paths. When one of the smaller payments fails it will be retried and probed just as one would do with a single larger payment. While the main idea is easy to understand, we want to discuss the details, advantages, and disadvantages of this mechanism further.

A receiving node will see an incoming HTLC for a certain payment hash. If the onion signals that the node is the final recipient and if the amount of the HTLC is less than the one specified in the invoice, the node would normally not accept the HTLC and send back an error notification. However, using the Type-Length-Value (TLV) format of onions, a sender can specify a total amount of the payment which is bigger than the HTLC. In the TLV case, the recipient can safely accept the HTLC and wait for more HTLCs to arrive. All parts of the payment will use the same payment hash. The recipient will only release the preimage if the sum of all incoming HTLCs is at least the specified payment amount.
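The recipient's bookkeeping can be sketched as follows. This is a simplified model under stated assumptions: the class `MultipartReceiver` and its return values are hypothetical, and the sketch leaves out timeouts, per-part validation and everything else a real node has to check.

```python
class MultipartReceiver:
    """Holds incoming parts until they add up to the announced total."""

    def __init__(self, invoices):
        self.invoices = invoices  # payment_hash -> invoiced amount in satoshi
        self.held = {}            # payment_hash -> amounts of the parts held

    def on_htlc(self, payment_hash, part_amount, total_amount):
        invoiced = self.invoices.get(payment_hash)
        if invoiced is None or total_amount < invoiced:
            return "reject"  # unknown hash or announced total too small
        parts = self.held.setdefault(payment_hash, [])
        parts.append(part_amount)
        if sum(parts) >= total_amount:
            return "release_preimage"  # settles all held parts at once
        return "hold"  # wait for further parts with the same payment hash


receiver = MultipartReceiver({b"hash-1": 100_000})
first = receiver.on_htlc(b"hash-1", 40_000, 100_000)
second = receiver.on_htlc(b"hash-1", 60_000, 100_000)
```

The first part is held and the second one completes the payment, so only then would the preimage be released.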

Multipath or multipart payments?

You might have noticed that we named this section "multipath" payments but mentioned in the last paragraph that such a payment consists of several parts. The protocol specification uses the abbreviation MPP for multipart payments. Multipath is just a special case of multipart. Multipart covers all the cases of multipath plus the unusual case where multiple parts use the same path. For simplicity we take the liberty of also abbreviating multipath payments as MPP.

It is important to recognize that a node that forwards HTLCs does not have to distinguish a single full payment from a part of a multipart payment. Only the receiving node needs to distinguish the two cases, and only the receiver needs to be ready to accept multipart payments. In the BOLT 11 invoice specification there is a field for feature bits. If a node wishes to accept multipart payments it must signal this by setting the corresponding feature bit (bit 16 or 17). If a node wishes to send a multipart payment it can do so if the receiving node has signaled its willingness to accept such payments. Currently there is no mechanism for routing nodes to split the payment amount and onion into several parts or to merge several incoming HTLCs into a single onion.

Besides the potentially better chances of finding routes for smaller amounts, the sender might want to use a multipart payment because it does not have enough balance in a single payment channel. If the channel had enough capacity, this could be resolved with a circular rebalancing, which we will discuss in the next section. However, if the payment amount is bigger than the capacity of the sender’s largest channel, the sender can only pay the invoice if the recipient allows and supports multipart payments. Similarly, a recipient might not be able to receive a single payment of the requested amount and would have an interest in signaling multipart payments. Luckily, nodes will do this automatically and practically always signal support for multipart payments if the implementation supports this feature. The standard Lightning Network implementations which follow BOLT 1.1 all support this feature.

Multipart payments will almost always be more expensive than a single payment. You will remember that the fees that routing nodes charge consist of a fee rate and a base fee. The total proportional fee of a multipart payment stays roughly the same as for a single payment. However, the base fee is charged for every part independent of its amount, making multipart payments in most cases more expensive. As the sender pays the fees, the sender will not necessarily have an interest in splitting the payment into too many parts. Thus implementations usually integrate multipart payments into the probing-based approach. For example, after a single payment does not go through, the node might split the amount into two payments and try a multipart payment with smaller amounts. Those parts could again be split further if they are not successful along a route.
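A small worked example shows where the extra cost comes from. The numbers are assumptions chosen for illustration: every part travels over a path with two forwarding nodes, each charging a base fee of 1 satoshi and a fee rate of 1,000 parts per million (0.1%). The helper `amount_to_send` is hypothetical and computes fees backward from the destination as described earlier in this chapter.

```python
def amount_to_send(final_amount, hops):
    """Add the fees of all forwarding nodes, starting at the destination.

    hops: list of (base_fee_sat, fee_rate_ppm), sender's side first.
    """
    amount = final_amount
    for base_fee, fee_rate_ppm in reversed(hops):
        amount += base_fee + amount * fee_rate_ppm // 1_000_000
    return amount


hops = [(1, 1_000), (1, 1_000)]  # two forwarding nodes with identical fees

# One payment of 100,000 satoshi.
single_fee = amount_to_send(100_000, hops) - 100_000

# The same amount split into four parts of 25,000 satoshi each.
split_fee = 4 * (amount_to_send(25_000, hops) - 25_000)

extra_cost = split_fee - single_fee
```

Under these assumptions the single payment costs 202 satoshi in fees and the four parts cost 208 satoshi in total. The proportional share is unchanged, and the difference of 6 satoshi is exactly the base fee of the two nodes paid for each of the three additional parts.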

The advantages of multipart payments are quite obvious:

  1. bigger payment sizes

  2. higher success rates

On the other side we have a couple of downsides:

  1. Higher fees

  2. More HTLCs locked / more load on the network

  3. Potentially longer times. If even a single part gets stuck, all the other HTLCs in flight have to wait, locking the liquidity of many nodes for a potentially longer time

  4. Leaks more information as the network is practically probed more heavily.

Rebalancing

In this chapter you have already learnt that the path finding problem on the lightning network is actually rather a problem of finding a flow - which consists of several paths. Very early research about pathfinding in payment channel networks suggests \footnote{FIND LINK} that rebalancing channels does not change the flow properties between nodes. With rebalancing we mean shifting liquidity from one channel to another channel for example via a circular payment. There is also the notion of offchain / onchain swaps with swapping services. This form of rebalancing certainly changes also the topological properties like the flow of the network. As rebalancing via circular self payments would not change the overall amount that an arbitrary node can send to any other node people thought that rebalancing is not very useful. However in practice a node hardly wants to find the perfect flow or multipath to be able to send the absolute maximum amount to another node. Nodes are rather interested in quickly finding a sufficient large flow so that they can make a reasonable payment. Research conducted by Rene Pickhardt (one of the authors of this book) indicated that circular rebalancing operations improve the overall successrate in the network for arbitrary payments. It turns out that there is various ways how rebalancing can be used and in some form it even resembles the functionality of a multi path payment. Thus we decided to devote a section here on basics about rebalancing and how it can be used to improve the pathfinding abilities of the network.

In our experience most people call a payment channel balanced if they own the same amount of bitcoin in it as their channel partner. While this seems intuitive, we want to show that it is not the best notion for our goals. To see this, let us assume that at some point in time the Lightning Network looks exactly like that: every channel splits its capacity 50/50 between the two channel partners.

A part of the Lightning Network where all the channel balances are distributed 50/50.

rebalancing 1

It is quite clear that a single payment already destroys such a 50/50 state. You can see this in the following graph.

The Bob-Chan channel is now imbalanced

rebalancing 2

You can see that after Bob made a payment of 1 million satoshi to Chan the channel balance shifted. Bob now has 1.5 million satoshi in the channel and Chan has 3.5 million satoshi. The balance ratio went from 50/50 to 30/70. The other two channels, however, stayed at 50/50.

Chan decides that he wants to have a 50/50 channel with Bob again. There are three ways he can achieve this:

  1. He can send 1 million satoshi back to Bob

  2. He can use an onchain swapping service

  3. He can send a circular onion

Sending the money back would be quite expensive and does not seem to be a realistic option. Using an on-chain swapping service after every payment to rebalance channels also seems problematic: the entire idea of the Lightning Network is to need fewer on-chain transactions and to be able to send money between people without them. That leaves only the last option, in which Chan moves the money from the Bob-Chan channel via the Bob-Erica channel to his Erica-Chan channel.

Chan tries to rebalance the Bob-Chan channel in the unbalanced network via a circular onion of 1 million satoshi.

rebalancing 4

The problem in the new network can easily be seen in the next picture. While the Bob-Chan channel is now 50/50 again, the other two channels have turned into a 30/70 split.

Rebalancing one channel produces imbalanced other channels

rebalancing 5

An interesting observation can be made about this rebalancing, though! After the payment and the rebalancing, the network looks as if Bob had initially sent the money not via the Bob-Chan channel but via the path through Erica.

Rebalancing is equivalent to having selected a different payment path to begin with.

rebalancing 6
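The equivalence can be checked with a small toy model. The following is only a sketch under simplifying assumptions: routing fees are ignored, every channel starts with 2.5 million satoshi on each side as in the example, and the helper `send` is a hypothetical name introduced here for illustration.

```python
# Toy model of the three-node example (amounts in satoshi).
# balances[(x, y)] is the amount x owns in the channel between x and y.
balances = {
    ("Bob", "Chan"): 2_500_000, ("Chan", "Bob"): 2_500_000,
    ("Bob", "Erica"): 2_500_000, ("Erica", "Bob"): 2_500_000,
    ("Erica", "Chan"): 2_500_000, ("Chan", "Erica"): 2_500_000,
}


def send(state, path, amount):
    """Return a new state after moving `amount` along `path` (fees ignored)."""
    new_state = dict(state)
    for src, dst in zip(path, path[1:]):
        if new_state[(src, dst)] < amount:
            raise ValueError(f"{src} lacks liquidity towards {dst}")
        new_state[(src, dst)] -= amount
        new_state[(dst, src)] += amount
    return new_state


# Bob pays Chan directly, then Chan rebalances with a circular payment.
paid = send(balances, ["Bob", "Chan"], 1_000_000)
rebalanced = send(paid, ["Chan", "Bob", "Erica", "Chan"], 1_000_000)

# Alternative history: Bob pays Chan via Erica and nobody rebalances.
via_erica = send(balances, ["Bob", "Erica", "Chan"], 1_000_000)

print(rebalanced == via_erica)
```

In this model both histories end in exactly the same channel balances, which is the observation made above.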

This observation is actually quite interesting. While the mathematical theory tells us that rebalancing channels does not change the max flow between two nodes, we see that it has changed the effective path of a payment. Because of onion routing and its privacy goals, the Lightning Network uses source-based routing, so we usually assume that the sender always has to find and select the path. However, this is not entirely true! Once rebalancing comes into play, nodes can use their local knowledge of the distribution of balances to help select paths and to find a complete payment path, multipath, or flow. We will explore this idea a little further in the upcoming section about JIT routing.

Remember that in our example, after Bob paid Chan, Bob had a total of 4 million satoshi, Chan had a total of 6 million satoshi, and Erica still had 5 million satoshi as before. Of course it would be possible to have payment channels between these three people with that distribution of funds such that everyone has 50% of the capacity on their side of each payment channel.

50/50 balances with updated capacities.

rebalancing 7

While the above picture shows that 50/50 channels are possible after the payment, this can only be achieved by changing the capacities. Changing the capacity of a channel is only possible by closing and reopening it or with the help of a technique called splicing. The latter is not widely deployed yet and would also depend on on-chain transactions.
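The capacities needed for such a 50/50 state follow directly from the totals stated above. The short calculation below is a sketch that assumes only the three channels of the example exist and that all funds sit in those channels; the variable names are chosen here for illustration.

```python
# Total funds each node owns after Bob's payment (in million satoshi).
funds = {"Bob": 4, "Chan": 6, "Erica": 5}

# In a 50/50 network each node owns half of every channel it takes part in,
# so the capacities of a node's two channels sum to twice its funds.
# The capacity of all three channels together equals the total funds.
total = sum(funds.values())

# The channel between two nodes is what remains of the total capacity
# after removing the two channels of the third node.
capacity = {
    ("Bob", "Chan"): total - 2 * funds["Erica"],
    ("Bob", "Erica"): total - 2 * funds["Chan"],
    ("Erica", "Chan"): total - 2 * funds["Bob"],
}
print(capacity)
```

Under these assumptions the Bob-Chan channel keeps its capacity of 5 million satoshi, while the Bob-Erica channel would have to shrink to 3 million and the Erica-Chan channel would have to grow to 7 million.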

We hope that you have seen from this example a few things:

  1. Off-chain rebalancing does not change how much money can flow from sender to receiver.

  2. Making payments changes how much money sender and receiver can send or receive. This is similar to the physical world where you also can only spend the cash that you have received first.

  3. The goal of having channels in a 50/50 state is not achievable for all nodes all the time and is thus probably not a good one.

  4. Rebalancing in combination with payments changes the way money flows from the sender to the recipient. In particular, it can shift the responsibility for finding a path from the sender to several nodes on the network, even if they don’t know which path they are helping to find.

  5. Thus rebalancing can be a nice tool to support path finding.
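The first point can be verified numerically for the three-node example. The following sketch computes the maximum flow with a plain Edmonds-Karp search; the function `max_flow` and the two balance dictionaries are written here for illustration, with fees ignored and balances taken from the example (after Bob's payment, and after Chan's circular rebalancing).

```python
from collections import deque


def max_flow(liquidity, source, sink):
    """Edmonds-Karp max flow on a dict {(u, v): amount u can push to v}."""
    residual = dict(liquidity)
    nodes = {n for edge in residual for n in edge}
    flow = 0
    while True:
        # Breadth-first search for a path that still has spare liquidity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and residual.get((u, v), 0) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Walk back from the sink to collect the path and its bottleneck.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[edge] for edge in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] = residual.get((v, u), 0) + bottleneck
        flow += bottleneck


after_payment = {
    ("Bob", "Chan"): 1_500_000, ("Chan", "Bob"): 3_500_000,
    ("Bob", "Erica"): 2_500_000, ("Erica", "Bob"): 2_500_000,
    ("Erica", "Chan"): 2_500_000, ("Chan", "Erica"): 2_500_000,
}
after_rebalancing = {
    ("Bob", "Chan"): 2_500_000, ("Chan", "Bob"): 2_500_000,
    ("Bob", "Erica"): 1_500_000, ("Erica", "Bob"): 3_500_000,
    ("Erica", "Chan"): 1_500_000, ("Chan", "Erica"): 3_500_000,
}

print(max_flow(after_payment, "Bob", "Chan"))
print(max_flow(after_rebalancing, "Bob", "Chan"))
```

In both states Bob can send at most 4 million satoshi to Chan, which is exactly the total he owns. The rebalancing changed which channels carry that amount, not the amount itself.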

With these conclusions in mind, let us look more closely at what good rebalancing strategies for nodes would be.

The main problem with Lightning Network channels from a routing and pathfinding perspective is that the liquidity is not known. From that perspective the 50/50 approach, although not achievable, makes sense: if nodes could assume that other nodes always have a certain fraction of the capacity on their side, they could use that fraction to make pathfinding decisions. Initially, however, the entire balance of a newly opened channel is on one side. Thus if a new node has opened some channels and received some channels, all of its channels are unbalanced and routing over each of them is possible in only one direction.

Nodes and node operators can look at the channel balance coefficient, which is defined as the balance they hold on a channel divided by the capacity of that channel. As the balance can never be below zero and can never exceed the capacity, the channel balance coefficient always lies between 0 and 1. A node can easily compute the channel balance coefficient for all of its channels. In the case of 50/50 rebalancing, all coefficients would have the value 0.5.

Researchers demonstrated that the overall likelihood of finding a path increases if nodes rebalance their channels so that their local channel balance coefficients all take the same value. This target value is easily computed as the total funds that a node owns on the network divided by the sum of the capacities of all channels that the node maintains. We call this target value the node balance coefficient \nu. Nodes can check which channels have a channel balance coefficient that is bigger than \nu and which have one that is smaller than \nu. After identifying such channels, it makes sense to make circular self-payments from the channels with too much liquidity to the channels with too little liquidity.
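The two coefficients can be computed in a few lines. The sketch below uses a hypothetical node with three made-up channels; the channel names, balances, and function names are assumptions for illustration and not part of any Lightning implementation.

```python
def channel_balance_coefficients(channels):
    """channels maps a channel name to (local_balance, capacity)."""
    return {name: bal / cap for name, (bal, cap) in channels.items()}


def node_balance_coefficient(channels):
    """Total funds of the node divided by the total capacity of its channels."""
    total_funds = sum(bal for bal, _ in channels.values())
    total_capacity = sum(cap for _, cap in channels.values())
    return total_funds / total_capacity


# Hypothetical node with three channels (amounts in satoshi).
channels = {
    "chan_a": (4_000_000, 5_000_000),
    "chan_b": (1_000_000, 5_000_000),
    "chan_c": (2_500_000, 10_000_000),
}

nu = node_balance_coefficient(channels)
coefficients = channel_balance_coefficients(channels)

# Channels above nu are sources, channels below nu are targets of rebalancing.
too_much = [name for name, z in coefficients.items() if z > nu]
too_little = [name for name, z in coefficients.items() if z < nu]

# Amount each channel is above (+) or below (-) its target balance nu * capacity.
surplus = {name: bal - nu * cap for name, (bal, cap) in channels.items()}
print(nu, too_much, too_little, surplus)
```

The surpluses sum to zero, so in this model the liquidity that has to leave the channels above \nu is exactly what the channels below \nu are missing.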

This approach has an economic drawback. A circular self-payment is not free. The nodes along the circular path charge routing fees, which always have to be paid by the initiator of the payment. This would be your node if you wanted to rebalance your channels. Paying those fees up front might be justified because you might earn them back through the routing fees you charge when you successfully forward payments. However, you do not really know in which direction you will have to route payments later. In the worst case you moved liquidity away from a channel on which you could have used it perfectly well to fulfill routing requests in that direction. Not only would you have paid routing fees for the rebalancing operation, you would also have depleted your channel more quickly and might face the need to rebalance again.
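To get a feeling for the cost, the fees of a circular payment can be estimated from the base fee and the proportional fee (in millionths) that routing nodes advertise. The sketch below is an illustration only: the function names are made up here and the fee policies in the example are hypothetical values, not measurements from the network.

```python
def hop_fee_msat(amount_msat, base_msat, proportional_millionths):
    """Fee a routing node charges for forwarding amount_msat."""
    return base_msat + amount_msat * proportional_millionths // 1_000_000


def circular_payment_cost_msat(amount_msat, hop_policies):
    """Total fees for delivering amount_msat back to ourselves.

    hop_policies lists (base_msat, proportional_millionths) of the
    intermediate nodes in path order. Fees are accumulated backwards
    because each node charges on the amount it has to forward.
    """
    to_forward = amount_msat
    for base, ppm in reversed(hop_policies):
        to_forward += hop_fee_msat(to_forward, base, ppm)
    return to_forward - amount_msat


# Hypothetical policies: two intermediate nodes, 1 sat base fee and 100 ppm each.
cost = circular_payment_cost_msat(1_000_000_000, [(1_000, 100), (1_000, 100)])
print(cost)
```

A node operator could compare such an estimate with the routing fees it expects to earn on the refilled channel before deciding whether a proactive rebalancing is worth it.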

We hope that you are not discouraged at this point. Rebalancing is still a useful tool. While proactive rebalancing increases the reliability of the network, it is currently not economically viable. However, you could rebalance reactively, or just in time, at the moment when it becomes necessary. Imagine you have an incoming HTLC and the onion says you are supposed to forward the payment along a channel on which you lack sufficient balance. The standard behavior of the protocol would be to return an error onion and remove the incoming HTLC. However, no one stops your node from briefly interrupting the routing process and conducting a rebalancing operation to provide itself with sufficient liquidity on the channel in question. This method is called JIT-Routing, as it helps nodes to reactively provide themselves with enough liquidity just in time.

The just-in-time routing scheme has two major advantages over plain source-based routing.

  1. It increases the privacy of channels. If nodes that do not have sufficient liquidity return the onion with an error, an attacker can use that behavior to probe for the channel balance. However, if nodes rebalance their channels first, they can forward the payment whenever the rebalancing succeeds and thereby protect themselves from probing attacks.

  2. More importantly, it resembles a multipart payment in which the split is not decided by the sender, who does not know how remote balances are distributed, but by the routing node, which knows its local topology.

Let us elaborate on the second point with the example in which Bob is supposed to forward the onion from Alice to Chan but does not have enough liquidity on the channel with Chan. If Bob now performs a rebalancing operation through Erica and is afterwards able to forward the payment to Chan, he has effectively split the payment at his node so that it flows along two paths. One part flows directly to Chan and the other part takes the path via Erica to Chan. Because Bob knows his local balances, splitting a payment at the node that cannot forward the entire payment tends to be more reliable and effective than letting the sender decide how to split a payment and into which amounts.
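The decision Bob makes can be sketched as follows. This is a toy model and not how any implementation works: `forward_with_jit` is a hypothetical function, rebalancing is modelled as simply shifting Bob's local balance from one channel to another, and fees, the balances of Erica and Chan, and failing circular payments are all ignored.

```python
def forward_with_jit(local_balances, out_channel, amount):
    """Forward `amount` over `out_channel`, rebalancing first if needed.

    local_balances maps a channel name to the balance this node owns on it.
    Returns the updated balances, or None if the payment has to be failed.
    """
    balances = dict(local_balances)
    missing = amount - balances[out_channel]
    if missing > 0:
        # Use the other channel with the most local liquidity as the source.
        donors = [c for c in balances if c != out_channel]
        donor = max(donors, key=balances.get, default=None)
        if donor is None or balances[donor] < missing:
            return None  # standard behaviour: return an error to the sender
        balances[donor] -= missing        # circular payment leaves via donor
        balances[out_channel] += missing  # and arrives on the outgoing channel
    balances[out_channel] -= amount
    return balances


# Bob after his earlier payment: 1.5M on Bob-Chan, 2.5M on Bob-Erica.
bob = {"Bob-Chan": 1_500_000, "Bob-Erica": 2_500_000}
result = forward_with_jit(bob, "Bob-Chan", 2_000_000)
print(result)
```

In this example 1.5 million satoshi effectively flow directly to Chan and the missing 0.5 million take the detour via Erica, which is the split described above.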

We can thus see that, with the help of JIT-Routing, rebalancing and multipart payments are actually not such different concepts. There is another way in which multipart payments and rebalancing can be combined. Recall that nodes should always aim for similar channel balance coefficients. So if a node wants to make a multipart payment, it could split the payment in such a way that it rebalances its channels. This means it would only pay from channels on which it currently has too much liquidity. It would also use larger parts on the channels that have far too much liquidity and smaller parts on the channels that have only slightly too much. The optimal amounts can easily be computed with the following formulas.

TODO: somehow describe this better without being too scientific. Tool and code can be found at: lightningd/plugins#83

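# example inputs (hypothetical): local channel balances in BTC and amount to pay
b = [0.8, 0.5, 0.2]
a = 0.3

# funds the node still owns after the payment
new_funds = sum(b) - a

# assuming all channels have capacity of 1 btc
cap = len(b)
# target node balance coefficient after the payment
nu = float(new_funds) / cap
# surplus of each channel relative to the target (capacity 1 btc)
ris = [1*(float(x)/1 - nu) for x in b]

# pay only from channels with a surplus, in proportion to that surplus
real_ris = [x for x in ris if x > 0]
s = sum(real_ris)
payments = [a*x/s for x in real_ris]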
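The listing above assumes that every channel has a capacity of 1 BTC. A possible generalization to channels with different capacities is sketched below. The function `rebalancing_split` and the example numbers are assumptions made here for illustration; they are not taken from the plugin mentioned in the TODO note.

```python
def rebalancing_split(balances, capacities, amount):
    """Split `amount` across channels so that balance coefficients converge.

    balances and capacities are lists in the same channel order.
    Returns one payment part per channel (0.0 for channels without surplus).
    """
    if amount > sum(balances):
        raise ValueError("amount exceeds the funds of the node")
    # Node balance coefficient the node will have after the payment.
    nu = (sum(balances) - amount) / sum(capacities)
    # Amount by which each channel exceeds its target balance nu * capacity.
    surplus = [b - nu * c for b, c in zip(balances, capacities)]
    positive = sum(x for x in surplus if x > 0)
    # Pay only from channels with a surplus, in proportion to that surplus.
    return [amount * x / positive if x > 0 else 0.0 for x in surplus]


# Hypothetical node (amounts in satoshi).
balances = [80_000_000, 40_000_000, 30_000_000]
capacities = [100_000_000, 50_000_000, 50_000_000]
parts = rebalancing_split(balances, capacities, 50_000_000)

after = [b - p for b, p in zip(balances, parts)]
print(parts, [x / c for x, c in zip(after, capacities)])
```

In this example every channel has a surplus, so after the payment all three channel balance coefficients equal the new node balance coefficient of 0.5. If some channels are below the target, the parts only move the remaining channels towards it.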

In fact, this multipath rebalancing could also be used in JIT routing. Instead of shifting all the funds from a single channel to the destination channel, a node could use a circular multipart payment.

  • (proactive / reactive) Rebalancing

  • Imbalance measures

  • goals for rebalancing (low Gini coefficient and not 50 / 50)

  • optimization problem / game theory

  • JIT Routing

Optimizations for Multi path payments

The rebalancing goal based on local channel balance coefficients could actually be integrated into multipath payments. If a node decides to send a payment along several paths, it could use this opportunity to split the payment in a way that reduces the imbalance of its own channels. So instead of splitting payments in half in a divide-and-conquer strategy, the node could use the following formula …