routing: probability based path finding #2802

Merged
merged 10 commits into from Jun 5, 2019

Conversation

joostjager
Collaborator

@joostjager joostjager commented Mar 19, 2019

This PR makes several improvements to path finding and mission control, with the goal of improving the reliability of Lightning payments.

Changes

Probability based routing

The previously used edge and node ignore lists in path finding are replaced by a probability based system. It modifies path finding so that it not only compares routes on fee and time lock, but also takes route success probability into account.

Allowing routes to be compared based on success probability is achieved by introducing a 'virtual' cost of a payment attempt and using that to translate probability into an extra cost factor.
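As a rough illustration of how a virtual attempt cost can turn probability into a cost term, here is a minimal sketch. It is not taken from the PR's code: the function name `routeCost`, the choice to add `attemptCost / probability` to the fee, and the omission of the time lock component are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// routeCost folds a route's success probability into one comparable cost.
// Hypothetical helper, not lnd's actual implementation.
//
//   feeSat:      total routing fee of the candidate route, in sats
//   probability: estimated success probability of the route, in (0, 1]
//   attemptCost: virtual cost of a failed payment attempt, in sats
//
// Assumption: the probability term is attemptCost / probability, so an
// unlikely route pays for the attempts it is expected to waste.
func routeCost(feeSat, probability, attemptCost float64) float64 {
	if probability <= 0 {
		// A route that can never succeed is infinitely expensive.
		return math.Inf(1)
	}
	return feeSat + attemptCost/probability
}

func main() {
	// A cheap route that rarely works versus a pricier, reliable one.
	cheapButRisky := routeCost(10, 0.2, 100)
	pricierButSafe := routeCost(50, 0.9, 100)

	fmt.Printf("cheap but risky: %.1f\n", cheapButRisky)
	fmt.Printf("pricier but safe: %.1f\n", pricierButSafe)
}
```

Under these assumptions the reliable route ends up with the lower total cost even though its fee is higher, which is the trade-off the virtual attempt cost is meant to express.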

Failed route decay time

This PR also extends the decay time of mission control data. It used to be very short, which hardly allowed a payment to benefit from the outcomes of previous payment attempts. With this PR, the decay time becomes configurable.

Minimum route probability

The payment process stop condition has been adapted to probability based routing. Previously the payment process stopped when no more routes could be found. This typically happened when all tried edges had been added to the ignore list. With probabilities, every route will always have a non-zero success probability. That probability can be small, but the payment may still succeed via this route.

To prevent the payment process from ending up in a loop, trying routes that are highly unlikely to succeed until it times out, a minimum required success probability is introduced. Only routes with a probability above this threshold are considered, and if there are none, the payment process is stopped.
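The stop condition can be sketched as follows. This is an illustration only: the names `routeProbability` and `meetsThreshold` are hypothetical, and treating a route's probability as the product of independent per-hop probabilities is an assumption of the example, not something this description states.

```go
package main

import "fmt"

// routeProbability multiplies per-hop success estimates.
// Assumption: hops succeed or fail independently of each other.
func routeProbability(hopProbs []float64) float64 {
	p := 1.0
	for _, hp := range hopProbs {
		p *= hp
	}
	return p
}

// meetsThreshold reports whether a route is still worth attempting.
// Treating a probability exactly equal to the threshold as acceptable
// is an assumption of this sketch.
func meetsThreshold(hopProbs []float64, minRouteProb float64) bool {
	return routeProbability(hopProbs) >= minRouteProb
}

func main() {
	const minRouteProb = 0.01 // default of --routerrpc.minrtprob

	fresh := []float64{0.95, 0.95, 0.95}          // no prior knowledge
	recentlyFailed := []float64{0.95, 0.005, 0.95} // one hop just failed

	fmt.Println("fresh route attempted:", meetsThreshold(fresh, minRouteProb))
	fmt.Println("failed route attempted:", meetsThreshold(recentlyFailed, minRouteProb))
}
```

When no candidate route passes the check, the payment process stops instead of cycling through unlikely routes until the timeout.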

Query mission control state

A new RPC call has been added to query the internal mission control state. This should give advanced users a tool to better understand why certain routes are chosen.

Usage

Config

The main parameters influencing the payment process within lnd are exposed as command line or config flags. For these flags to be available, lnd needs to be built with tags="routerrpc".

--routerrpc.minrtprob=

Minimum required route success probability to attempt the payment (default: 0.01)

--routerrpc.apriorihopprob=

Assumed success probability of a hop in a route when no other information is available. (default: 0.95)

--routerrpc.penaltyhalflife=

Defines the duration in minutes after which a penalized node or channel is back at 50% probability (default: 60)

--routerrpc.attemptcost=

The (virtual) cost in sats of a failed payment attempt (default: 100)
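To make the interplay of `apriorihopprob` and `penaltyhalflife` more concrete, here is a hedged sketch of a gradual recovery curve. The exact curve is an assumption: this example uses `apriori * (1 - 2^(-t/halfLife))`, which reads "back at 50% probability" as "halfway back to the a priori value". The function name `hopProbability` is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// hopProbability estimates the success probability of a hop that failed
// minutesSinceFailure minutes ago. Hypothetical helper for illustration.
//
// Assumed curve: the estimate starts at 0 right after a failure, reaches
// half of the a priori value after one half-life, and approaches the
// a priori value as time goes on.
func hopProbability(minutesSinceFailure, halfLifeMinutes, apriori float64) float64 {
	if halfLifeMinutes <= 0 {
		// No decay configured: fall back to the a priori estimate.
		return apriori
	}
	exp := -minutesSinceFailure / halfLifeMinutes
	return apriori * (1 - math.Pow(2, exp))
}

func main() {
	const (
		apriori  = 0.95 // default of --routerrpc.apriorihopprob
		halfLife = 60.0 // default of --routerrpc.penaltyhalflife, in minutes
	)

	for _, minutes := range []float64{0, 30, 60, 120, 240} {
		p := hopProbability(minutes, halfLife, apriori)
		fmt.Printf("%5.0f min after failure: %.3f\n", minutes, p)
	}
}
```

A longer half-life keeps a failed hop penalized for longer, so later payments benefit more from earlier observations, at the price of retrying recovered channels later.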

lncli

Querying mission control is available through lncli (also requiring the routerrpc tag):

lncli querymc

@joostjager joostjager changed the title Probability based path finding [wip] probability based path finding [wip] Mar 19, 2019
@joostjager joostjager force-pushed the probability branch 3 times, most recently from 959d3c9 to 993fc3e Compare March 28, 2019 15:38
@joostjager joostjager changed the title probability based path finding [wip] routing: probability based path finding [wip] Apr 1, 2019
@joostjager joostjager force-pushed the probability branch 4 times, most recently from 01ca7e2 to 7fec889 Compare April 9, 2019 11:42
@Roasbeef Roasbeef added P2 should be fixed if one has time payments Related to invoices/payments performance routing routing nodes labels Apr 20, 2019
@Roasbeef Roasbeef added this to the 0.7 milestone Apr 20, 2019
@joostjager joostjager requested a review from halseth May 10, 2019 09:50
@joostjager joostjager changed the title routing: probability based path finding [wip] routing: probability based path finding May 10, 2019
@joostjager joostjager force-pushed the probability branch 2 times, most recently from 3caaec0 to ae2bdfd Compare May 10, 2019 11:00
routing/pathfind.go
routing/payment_session.go Outdated
}
}

// If we are still in the hard prune window, return probability 0.
Contributor

Could we consider letting this happen and rely on the payment timing out instead? That would allow long-lived path finding.

Collaborator Author

Changed this part of the PR. Currently mission control isn't concerned with 'hard pruning' anymore. Instead, a minimum probability restriction is added to path finding. Setting this to a low value should help prevent trying unlikely routes in a short loop.

Collaborator Author

For long payment timeouts it may actually be desired that we go over previously failed routes again to see if conditions improved.

@joostjager joostjager force-pushed the probability branch 3 times, most recently from f483fe2 to 6dea767 Compare May 13, 2019 15:04
@joostjager
Collaborator Author

ptal @halseth

Contributor

@halseth halseth left a comment

This is getting really close I think! 💯

routing/pathfind.go Outdated
lntest/node.go
routing/missioncontrol.go Outdated
routing/missioncontrol.go
routing/payment_session.go
routing/payment_session.go
lnrpc/routerrpc/router_server.go Outdated
cmd/lncli/cmd_query_mission_control.go
server.go Outdated
This commit ensures that channel endpoints in EdgeInfo are ordered
node1 < node2 to not violate assumptions being made by dependent code.

This function is only ever called for channels connected to self.

This PR replaces the previously used edge and node ignore lists in path
finding by a probability based system. It modifies path finding so that
it not only compares routes on fee and time lock, but also takes route
success probability into account.

Allowing routes to be compared based on success probability is achieved
by introducing a 'virtual' cost of a payment attempt and using that to
translate probability into another cost factor.

This commit adds a new restriction to pathfinding that allows returning
only routes with a minimum success probability.

Previously every payment had its own local mission control state which
was in effect only for that payment. In this commit most of the local
state is removed and payments all tap into the global mission control
probability estimator.

Furthermore the decay time of pruned edges and nodes is extended, so
that observations about the network can better benefit future payment
processes.

Last, the probability function is transformed from a binary output to a
gradual curve, allowing for a better trade off between candidate routes.

This commit exposes mission control state for rpc for development
purposes.

Adds querymc command to lncli to dump mission control state.

@joostjager
Collaborator Author

@halseth ptal

This commit exposes the three main parameters that influence mission
control and path finding to the user as command line or config file
flags. It allows for fine-tuning for optimal results.

Contributor

@halseth halseth left a comment

LGTM, this is yuuuuge! 🔥 🔥 🔥
