Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lnrpc: deprecate QueryRoutes with more than one route #2497

Merged
merged 10 commits into from Mar 13, 2019

Conversation

Projects
None yet
6 participants
@joostjager
Copy link
Collaborator

joostjager commented Jan 17, 2019

This PR marks the num_routes argument of the query routes rpc as deprecated.

The use case for returning multiple routes is limited at best, but it does require maintaining a dedicated algorithm (k-shortest) in lnd.

It also encourages blindly attempting the second best route without interpreting the outcome of the best route, which is likely to result in less efficient payments.

The recommended approach is to request a single route, attempt payment, process the outcome if it failed (eg black list edges) and request another route, taking into account the outcome of the first attempt.

Currently QueryRoutes with multiple routes may have a use in special scripts (probing, rebalancing, ..). To keep facilitating those, a follow up PR will add any source to destination queries to QueryRoutes.

@halseth

This comment has been minimized.

Copy link
Collaborator

halseth commented Jan 17, 2019

Is the workaround you mention (blacklisting edges) currently possible? Maybe we should provide that before deprecating this.

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Jan 17, 2019

No, not possible currently.

But if we want to go this way, I thought it is better to signal it sooner rather than later.

@Roasbeef

This comment has been minimized.

Copy link
Member

Roasbeef commented Jan 18, 2019

People can still use it while we mark it deprecated, before we remove it all together, we should have the replacement in place though.

@joostjager joostjager requested a review from halseth Jan 21, 2019

Show resolved Hide resolved lnrpc/rpc.proto Outdated
@halseth

This comment has been minimized.

Copy link
Collaborator

halseth commented Jan 22, 2019

Change LGTM.

Question is still whether we should wait to have a replacement before deprecating, but as you say maybe best to just start signaling right away.

EDIT: Would be nice to be able to point to the "new way" of doing it when deprecating.

@treygriffith

This comment has been minimized.

Copy link

treygriffith commented Jan 22, 2019

As a user of the QueryRoutes rpc, our use case for multiple routes is to implement our own maximum time lock similar to the maximum fee that is already supported by the rpc. I suppose you could do this with edge blacklisting as well, but I'm not sure that's the right tool for the job.

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Jan 23, 2019

@treygriffith thanks for mentioning your use case.

I don't think it works very well with black listing, but #1941 should do it.

@joostjager joostjager requested a review from Roasbeef Jan 24, 2019

@joostjager joostjager force-pushed the joostjager:querysingleroute branch from 53d1b3f to cbb4344 Feb 1, 2019

@Roasbeef Roasbeef added this to the 0.6 milestone Feb 5, 2019

@joostjager joostjager force-pushed the joostjager:querysingleroute branch 3 times, most recently from 070e5ad to 9bca98a Feb 5, 2019

@ZapUser77

This comment has been minimized.

Copy link

ZapUser77 commented Feb 7, 2019

This seems like a bad idea.
Scenario:
"best" path yielded is A-B-C-D-E-F
However, B-C 100% of the balance is on C side, so this can't be used. With only 1 routed returned, you have no alternative, and no way of knowing which channel is the problem. And "blacklisting" a channel isn't a solution unless it's a single-use blacklist and you have some way to determine exactly which edge is the problem.

"To keep facilitating those, a follow up PR will add any source to destination queries to QueryRoutes."
This will definitely be beneficial.

@joostjager joostjager force-pushed the joostjager:querysingleroute branch from 9bca98a to bbf8f24 Feb 7, 2019

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Feb 7, 2019

First thing to note is that returning multiple routes alleviates the problem you describe above somewhat, but it is still not a real solution. It could still be that all returned routes fail and that the route that would work is not in the set. You could ask for all existing routes, but that quickly becomes infeasible.

The plan is that before we actually remove num_routes, we would have any-to-any route queries with black list implemented. Then it is always possible to implement the k-shortest algorithm client side.

In addition to that, we are working towards returning structured errors from SendToRoute (#1662), which will allow a client to pinpoint the failing edge, update a (single use) black list and pass that into query routes.

@ZapUser77

This comment has been minimized.

Copy link

ZapUser77 commented Feb 7, 2019

Thanks.
Would the sender know exactly which channel failed in the new errors?
Is "insufficient local balance" covered under "temporary channel failure"?
This is something that perplexes me about AMP as well... how will one know the available channel balance of a non-directly connected channel? [All falls under the "how do we know which channels to avoid" due to insufficient local balance.]

@joostjager joostjager force-pushed the joostjager:querysingleroute branch from bbf8f24 to e9e83eb Feb 14, 2019

@joostjager joostjager force-pushed the joostjager:querysingleroute branch 6 times, most recently from 0e723a6 to 643a7fd Mar 1, 2019

@joostjager joostjager requested a review from halseth Mar 5, 2019

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Mar 5, 2019

Added replacement functionality so that k-shortest can really be removed in the first major release after 0.6.

@halseth ptal

@joostjager joostjager force-pushed the joostjager:querysingleroute branch 2 times, most recently from 7e1fbaa to d062139 Mar 5, 2019

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Mar 5, 2019

@Roasbeef route cache restored

@joostjager joostjager force-pushed the joostjager:querysingleroute branch from 3c7c456 to 44ffcd8 Mar 5, 2019

joostjager added some commits Mar 5, 2019

routing: remove redundant fee limit check in newRoute
This check was a left over from when the fee limit wasn't checked yet in
the path finding algorithm.
routing: allow nil maps for ignored edges and nodes
This allows removing a lot of empty map initialization code and makes
the code more readable.
lnrpc+routing: add edges and nodes restrictions to query routes
This commit allows the execution of QueryRoutes to be controlled using
lists of black-listed edges and nodes. Any path returned will not pass
through the edges and/or nodes on the list.
routing: take Vertex types as path finding source and target nodes
Currently public keys are represented either as a 33-byte array (Vertex) or as a
btcec.PublicKey struct. The latter isn't useable as index into maps and
cannot be used easily in compares. Therefore the 33-byte array
representation is used predominantly throughout the code base.

This commit converts the argument types of source and target nodes for
path finding to Vertex. Path finding executes no crypto operations and
using Vertex simplifies the code.

Additionally, it prepares for the path finding source parameter to be
exposed over rpc in a follow up commit without requiring conversion back
and forth between Vertex and btcec.PublicKey.
routing: add source parameter to query routes
This commit allows execution of QueryRoutes from any source node.
Previously this was restricted to only the self node.
lnrpc: deprecate QueryRoutes with more than one route
Now that QueryRoutes gained the ability to route from any source node
and takes in edge and node black lists, all pieces are in place to have
users implemented their own k-shortests path algorithm. Or any other algorithm
they might wish to use and currently can't.

This commit marks the num_routes field as deprecated as a preparation
for removing k-shortest for lnd.

@joostjager joostjager force-pushed the joostjager:querysingleroute branch from 44ffcd8 to 6cc82b4 Mar 6, 2019

@Roasbeef
Copy link
Member

Roasbeef left a comment

The next chapter on the path to removing ksp! First pass review completed, pretty straight forward diffs with no major comments. The only thing I think we should do before merging this is to update the QueryRoutes integration test to exercise this new behavior, and the new parsing logic along the way.

Show resolved Hide resolved routing/pathfind.go
routerrpc: move query routes into sub server
This commit moves the query routes backend logic from the main
rpc server into the sub server. It is another step towards splitting up
the main rpc server code.

In addition to this, a unit test is added to verify rpc parameter
parsing.
@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Mar 11, 2019

The only thing I think we should do before merging this is to update the QueryRoutes integration test to exercise this new behavior, and the new parsing logic along the way.

I think we should be careful with integration tests and only use them to test code that isn't testable otherwise. The queryroutes rpc behaviour can be tested in a unit test as well. I added this test in the last commit.

@Roasbeef
Copy link
Member

Roasbeef left a comment

LGTM 💎

@Roasbeef Roasbeef merged commit ad88490 into lightningnetwork:master Mar 13, 2019

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.07%) to 60.257%
Details
@wamde

This comment has been minimized.

Copy link

wamde commented Apr 12, 2019

Does this mean that moving forward advanced routing functionality will be seen as something which should live outside lnd? Will this be extendable via plugins, or are we encouraged to build on top of the gRPC API?
I am asking because I am re-implementing KSP for rebalance-lnd, but until now it was not clear to me whether it would be a short term fix or something we should invest more into.

@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Apr 16, 2019

@wamde What do you mean exactly with 'advanced routing functionality'?

KSP specifically has in our opinion not many good use cases, so that is why we decided to move it out of lnd. All code has a recurring maintenance cost that must weigh up against the benefits.

Can you tell me why you need ksp for rebalancing rather than trying the best route first and after processing the outcome of that, query again for the (then) best route?

@wamde

This comment has been minimized.

Copy link

wamde commented Apr 19, 2019

Let me explain (re-stating some obvious stuff for context):

  • Rebalancing tries to find loops in the graph, with some constraints. We may want to force an exit channel (to drain that one specifically), a re-entry channel (to top up that one specifically), or both.
  • Due to many factors (let's say the network is still in its infancy), the shortest path may not work (unbalanced channel, offline node, you name it). So we may want to try the 2nd best route, 3rd and so on. My personal experience with rebalance-lnd is that it takes 1-10 paths to complete a rebalance, and for poorly connected channels it may not succeed at all.
  • QueryRoutes currently gives k routes sorted by fee_milli_msat, but does not seem to try and maximise the chance of these paths working (yet -- I know there is more to come but for now that's what we get), so we typically take n routes and try them all sequentially. We may want to finetune this by weighing certain nodes specifically, not just looking at the total fee in msats but in actual sats, by also optimising for short routes (in terms of # of hops and not just fees) etc.
  • Finding routes is slow, so if we can compute n routes and try them all without having to wait for another re-computation after a failure, it helps. Even better, if there was a way to thread a pathfinding service and have another thread probing/trying such routes, the speediness of rebalances would greatly improve.

All of this to say that I think that we are fine re-implementing all of that and adding flexibility to it on top of lnd on our own, but:

  • I would like not to re-invent the wheel, so I was seeking clarification on the direction being taken at the lnd level
  • it would be great, if generic pathfinding stayed in lnd, to have a better framework to plug new optimisers or whatever as plugins as opposed to rebuilding the whole thing on top of graph RPCs
@joostjager

This comment has been minimized.

Copy link
Collaborator Author

joostjager commented Apr 19, 2019

Thanks for the clarification @wamde

Finding routes is slow, so if we can compute n routes and try them all without having to wait for another re-computation after a failure, it helps

In your experience, is calling QueryRoutes for 20 routes at once significantly faster than calling it 20 times and requesting just a single route in every call?

I am sensitive to the performance argument in general. There are probably many ways to speed up path finding. One good candidate is keeping the graph in memory, but this is quite a refactor which we started but haven't completed yet.

if there was a way to thread a pathfinding service and have another thread probing/trying such routes

These calls can be made concurrently, but ideally you only want to start the next path finding round when the previous probe has completed. To take into account the outcome of that probe.

I would like not to re-invent the wheel, so I was seeking clarification on the direction being taken at the lnd level

Yes understood. On the lnd level we want to offer the right tools. If ksp is a way around slow path finding, the right thing to do is to optimize path finding. Ksp started to take more maintenance effort than we'd like, so we decided to move it out. This loads some of the work onto the user, but it also creates an opportunity to implement it in a better, more tailored way perhaps.

would be great, if generic pathfinding stayed in lnd, to have a better framework to plug new optimisers or whatever as plugins as opposed to rebuilding the whole thing on top of graph RPCs

What do you mean with generic pathfinding exactly? We want to move the RPCs in a direction where they remain relatively simple, but still powerful enough to solve real world path finding problems. Any feedback on that is always welcome. A caller defined hop limit could also be an extension of the current api.

Plugins are not on the roadmap atm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.