New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

routing: use correct fees by searching backwards #1321

Closed
wants to merge 5 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@joostjager
Collaborator

joostjager commented Jun 4, 2018

The original findPath implementation can come up with a path that is rejected in newRoute because of "channel graph has insufficient capacity". The reason is that findPath calculates with the same amt all the way, while newRoute takes into account that amt needs to increase with every hop (looking from target back to source).

This change makes findPath use the correct fees and eliminates the possibility that newRoute returns the insufficient capacity error.

In this PR:

  • Refactor of the path finding unit test to be able add more cases without code duplication (table driven).
  • Unit test for invalid "channel graph has insufficient capacity" result
  • Reworked search algorithm to already take into account increasing amounts that need to be send to next hops
@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 4, 2018

@Roasbeef This is work in progress. Can you do an early check and tell me what you think of the general idea?

@joostjager joostjager force-pushed the joostjager:routing branch from dec110e to 9fd1b88 Jun 4, 2018

@Roasbeef

This comment has been minimized.

Member

Roasbeef commented Jun 4, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 4, 2018

The original findPath implementation can come up with a path that is rejected in newRoute because of "channel graph has insufficient capacity". The reason is that findPath calculates with the same amt all the way, while newRoute takes into account that amt needs to increase with every hop (looking from target back to source).

This change make findPath use the correct fees and eliminates the possibility that newRoute returns the insufficient capacity error.

@Roasbeef

This comment has been minimized.

Member

Roasbeef commented Jun 4, 2018

Gotcha, can you update the PR description, and also the commit messages with this information? Thanks!

In the past this wasn't an issue as each instance kicked off a k-shortest paths routine, so the failing paths would be excluded with a continue. We'll want to first get in the PR which adds fee limits to the path finding routine (say you don't want to pay more than 2% or w/e), as that will affect this implementation.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 5, 2018

The fee limit PR as it is now will fail finding a route if the lowest weight route exceeds the max fee. There may be a possible route, but it isn't found because the time/fee weight balance doesn't favor that route.

When the fee limits are merged, I can update this PR to also take the fee limits into account already during findPath. This will lead to less failed route calculations, because the algorithm will continue exploring the graph for higher weight paths that do satisfy the fee limit (which means a higher total time lock).

@halseth

This comment has been minimized.

Collaborator

halseth commented Jun 6, 2018

@joostjager can you add a test case that exercise the failure scenario you are trying to fix? Will help understanding the fix :)

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 6, 2018

Yes, I am working on that.

@joostjager joostjager force-pushed the joostjager:routing branch 2 times, most recently from ecbbfd0 to 77f11f5 Jun 7, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 7, 2018

Quite some changes were needed to add this unit test in a maintainable way. I think trying to work towards a single test graph where many scenarios can be tested is better than setting up different graphs for different test cases.

Also, better visualization of the test graph would be useful. Maybe replace the ascii art with a graphviz generation script.

I see more potential work in the routing area, but will leave the scope of this PR as it is now to keep it manageable. Waiting for #1113

@joostjager joostjager force-pushed the joostjager:routing branch 2 times, most recently from 77f11f5 to 8b8a7eb Jun 7, 2018

@Roasbeef

This comment has been minimized.

Member

Roasbeef commented Jun 8, 2018

Also, better visualization of the test graph would be useful. Maybe replace the ascii art with a graphviz generation script.

The first graph there is just the most basic one, the other graphs themselves are a bit more complex, and agreed that having a rendering of them would make it easier to set up tests, however, it's also pretty straight forward to add additional vertexes/edges to the basic graph at runtime.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 8, 2018

Yes indeed, runtime modifications is another option. Only makes it slightly more difficult to reason about what is happening.

The graph jsons contain info that could also be generated runtime. Another idea would be to describe the graph with only the essential values. For example, only aliases and generate the pubkeys (deterministically) in memory when the test runs. It makes the graph description short and can probably be described in go code instead of a separate json. Then, for a special test case like the one described in this issue, it is not much effort to bring up a custom graph.

This contradicts my comment above about working towards a single graph. It depends on the amount of effort required.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 10, 2018

Implemented alternative way to describe graph in #1358

@halseth

This is an excellent observation! We might not see this routing failures that often, as long as fees are small compared to payment amounts, but totally agree that this is the way we should find paths correctly.

My main concern atm is that the path finding procedure does read a bit convoluted after these changes. Since we are shoehorning the target->source search into the existing block of code, variable names, map indexes, method names, comments and the steps taken now looks a bit off and is hard to follow. Instead we should try to do a bigger rewrite, where it is turned into a proper target->source search, where running amounts are kept as "distances". I don't think the changes should be that large.

You also mentioned that edge weights are a bit off since we are rounding to sat some places instead of using msat. I think it would be worthwhile to fix this, and make sure we are using msat everywhere. How big of a change would this be?

// the route search was executed backwards.
{ target: "elst", paymentAmt: 100000, totalTimeLock: 103,
expectedHops: []expectedHop{
{alias: "phamnuwen", fwdAmount: 100000200, fee: 100010000, timeLock: 102},

This comment has been minimized.

@halseth

halseth Jun 12, 2018

Collaborator

I don't get this to add up, maybe I'm missing something.

For a payment of 100,000,000 msat, the expected fees should be:
sophon->elst: 200 msat + 100,000,000 msat * 0/1,000,000 msat = 200 msat
pham->sophon: 10,000 msat + 100,000,200 msat * 1,000,000/1,000,000 msat = 100,010,200 msat

We must send paymentAmt+fees = 100,000,000 msat + 200 msat + 100,010,200 msat = 200,010,400 msat. This should not be able to go through the 120,000,000 msat channel?

This comment has been minimized.

@joostjager

joostjager Jun 12, 2018

Collaborator

Great catch! This is indeed not correct. I hope that conversion to table-driven made this easier to spot. And I think it actually uncovers two existing bugs in newRoute on master:

  • Fee calculation for a hop does not include the fee that needs to be paid to the next hop
  • The incoming channel capacity "sanity" check does not include the fee to be paid to the current hop

I also made the expected total amount for a route explicit in the test table. Need to be careful with calculating expected values in unit tests, to prevent duplication of the same (incorrect logic). In this case, the calculation of the expected value was ok, but still.

See c21c980 for fixes.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jun 12, 2018

With regards to you other comments:

  • Naming and structure: Now that we are on the same page about the general idea of this PR, I will start that kind of finishing work.

    One problem that I encountered in making this as simple as possible, is that ChannelEdgePolicy does not contain the "from" node. Therefore I needed to create a reverse lookup table and pass an extra argument to processNode. Do you think it would be good to add this property to ChannelEdgePolicy? Or, because it is currently specific to the backwards search (I think), deal with it in findPath and accept the extra complexity?

  • I will look at msat vs sat. But channel capacities are an on-chain thing expressed in whole satoshis and I think it is good to keep that explicit by using btcutil.amount. Where did I say that edge weights are a bit off because of rounding? I know that the channel capacity check was not correct because of rounding, but don't think it affects the weights. Edge weights (on master) are indeed a bit off, but for another reason. And that is again that there is no running amount and fees (and therefore weights) are only based on payment amount.

@joostjager joostjager force-pushed the joostjager:routing branch 3 times, most recently from 39f1d23 to eee132a Jun 12, 2018

name: "three hop with fee carry over",
paymentAmount: 100000,
feeRates: []lnwire.MilliSatoshi{10000, 10000, 10000},
expectedTotalAmount: 102010,

This comment has been minimized.

@joostjager

joostjager Jun 12, 2018

Collaborator

@halseth This test on master fails. newRoute on master returns 102000 for total amount.

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

Good catch! Does it make sense to add this test + fix to a separate PR?

This comment has been minimized.

@joostjager

joostjager Jun 13, 2018

Collaborator

Good idea. The fee limit changes are dependent on the previous (backward search) work, but this one is independent.

@joostjager joostjager force-pushed the joostjager:routing branch 5 times, most recently from 285f3f9 to 76098dd Jun 13, 2018

@halseth

I'm really starting to like these changes! 😀

Table driven tests removes a lot of code duplication, and makes it easier to spot errors in the tests. 👏

Most of my comments are about code cleanup and style to make the routing algorithm easier to follow.

for _, testCase := range basicGraphPathFindingTests {
t.Run(testCase.target, func(subT *testing.T) {
testBasicGraphPathFindingCase(subT, graph, aliases, &testCase)

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

Awesome refactor! 💯

This comment has been minimized.

@Roasbeef
}
var basicGraphPathFindingTests = []basicGraphPathFindingTestCase{
// Basic route with one intermediate hop

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

nit: period at end of sentences.

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed

{alias: "songoku", fwdAmount: 100000, fee: 110, timeLock: 101},
{alias: "sophon", fwdAmount: 100000, fee: 0, timeLock: 101},
}},
// Basic direct (one hop) route

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

nit: add a newline between test cases.

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed

" │ │ └──────┘ ",
" │ │ ",
" │ │ | 100k sat ",
" │ │ | ",

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

nit: you forgot to add "▼" to the new channels 😛

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed

@@ -3188,7 +3188,8 @@ func unmarshallRoute(rpcroute *lnrpc.Route,
routingHop := &routing.ChannelHop{
ChannelEdgePolicy: channelEdgePolicy,
Capacity: btcutil.Amount(hop.ChanCapacity),
Bandwidth: lnwire.NewMSatFromSatoshis(
btcutil.Amount(hop.ChanCapacity)),

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

hm, something weird about the formatting here?

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed

if tempDist < distance[fromVertex].dist && bandwidth >= amountToSend &&
amountToSend >= edge.MinHTLC && edge.TimeLockDelta != 0 {
amountToReceive := amountToSend + fee

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

Add a comment explaining this calculation :)

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Comment added.

// distance map with a distance of 0. This indicates our starting
// point in the graph traversal.
sourceVertex := Vertex(sourceNode.PubKeyBytes)
distance[sourceVertex] = nodeWithDist{

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

we must still make sure the source is added to the map with dist infinity

This comment has been minimized.

@joostjager

joostjager Jun 13, 2018

Collaborator

The source node is already added in the graph.ForEachNode loop further up. The reason it was explictly added here previously, was to set (overwrite) the distance to 0. But because of backwards searching, this is no longer necessary.

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

Hmm, we are then assuming that the source node will always be returned from graph.ForEachNode. That's a reasonable assumption to make, but I think it won't hurt to also handle the case where it's not.

Maybe that's fine anyway, because we will just finish our path finding, and then discover no path was found to the source node? Or do we depend on it being in the map?

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

It is always returned. If you look at

node, err := fetchLightningNode(nodes, selfPub)

it can be seen that even channeldb itself relies on the source pubkey being present in the node bucket.

If these semantics would change at some point, path finding would never find a path anymore. So the bug that would be created, will quickly surface.

I'd prefer not to add a check anyway, because it can only elicit questions from developer about the reason there might be for this to happen.

Comment added.

This comment has been minimized.

@halseth

halseth Jul 12, 2018

Collaborator

Reasonable, agreed!

// further our graph traversal.
pivot := Vertex(bestNode.PubKeyBytes)
err := bestNode.ForEachChannel(tx, func(tx *bolt.Tx,
edgeInfo *channeldb.ChannelEdgeInfo,
outEdge, _ *channeldb.ChannelEdgePolicy) error {
outEdge *channeldb.ChannelEdgePolicy, inEdge *channeldb.ChannelEdgePolicy) error {

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

can be written outEdge, inEdge *channeldb.ChannelEdgePolicy

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed

pathEdges = append(pathEdges, prev[prevNode].edge)
prevNode = Vertex(prev[prevNode].prevNode)
// If the potential route is below the max hop limit, then we'll use

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

nit: This comment should not mention the hop limit, as that is first checked below.

This comment has been minimized.

@joostjager

joostjager Jun 14, 2018

Collaborator

Fixed. Probably something that was moved earlier.

@@ -616,6 +638,163 @@ func TestKShortestPathFinding(t *testing.T) {
assertExpectedPath(t, paths[1], "roasbeef", "satoshi", "luoji")
}
func TestNewRoute(t *testing.T) {

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

the newRoute changes should be removed, since they were separated into a new PR.

This comment has been minimized.

@joostjager

joostjager Jun 13, 2018

Collaborator

This PR is based upon the other PR, because I need the TestNewRoute function. Those changes are shown here to because the base of this PR is master. But really, all commits in this PR are stacked on top of the newRoute bugfix. Please let me know how to do this better.

This comment has been minimized.

@halseth

halseth Jun 13, 2018

Collaborator

👍

@joostjager joostjager force-pushed the joostjager:routing branch from 76098dd to fde56a7 Jun 14, 2018

@joostjager joostjager force-pushed the joostjager:routing branch 3 times, most recently from d60dff0 to 7e384c6 Jul 28, 2018

@halseth

This is getting reallly close now, most comments about style, and removal of the (now) unnecessary DB migration.

Very excited to get this in!

@@ -85,6 +85,11 @@ func (d dbNode) ForEachChannel(cb func(ChannelEdge) error) error {
return d.node.ForEachChannel(d.tx, func(tx *bolt.Tx,
ei *channeldb.ChannelEdgeInfo, ep, _ *channeldb.ChannelEdgePolicy) error {
// Skip channels for which no outgoing edge policy is available.
if ep == nil {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Looking at where it is used, I think it should only need to know whether there's a channel or not between the nodes, not if the policy is known or not.

So in this case, we could extract both pubkeys from the ei, and compare them to the current dbNode to figure out the key of the peer.

if err := putChanEdgePolicyUnknown(edges, edge.ChannelID,
key[:]); err != nil {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

To separate long lines like this, the first thing would be

err := putCha...
if err != nil {
    return err 
}
// connecting node. If the callback returns an error, then the iteration is
// halted with the error propagated back up to the caller.
//
// Unknown policies are passed into the callback as nil values.

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

👍

incomingNode := toEdgePolicy.Node.PubKeyBytes[:]
fromEdgePolicy, err := fetchChanEdgePolicy(
edges, chanID, incomingNode, nodes,
// Using the incoming node, the incoming edge policy

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Excellent! 😀

@@ -2014,6 +2041,16 @@ func (c *ChannelEdgeInfo) BitcoinKey2() (*btcec.PublicKey, error) {
return key, nil
}
// OtherNodeKeyBytes returns the node key bytes of the other end of
// the channel.
func (c *ChannelEdgeInfo) OtherNodeKeyBytes(thisNodeKey []byte) [33]byte {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

👍

return pureFee + timeLockPenalty
return int64(fee) + timeLockPenalty
}
// findPath attempts to find a path from the source node within the

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Could add to the method description here that we search backwards now, and why 😄

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Added

// that are not known to us yet in the distance map.
for vertex := range additionalEdges {
additionalEdgesWithSrc := make(map[Vertex][]*edgePolicyWithSource)
for vertex, outgoingEdgePolicies := range additionalEdges {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Yeah, we probably shouldn't change ChannelEdgePolicy, as it directly maps to the wire messages being sent on the network. I'm fine with keeping this as is now, as I agree that this could be considered an implementation detail the caller shouldn't need to worry about.

// If the estimated band width of the channel edge is not able
// to carry the amount that needs to be send, return.
if bandwidth < amountToSend {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

This could be checked earlier in processEdge?

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Moved up

// If the amountToSend is less than the minimum required amount,
// return.
if amountToSend < edge.MinHTLC {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

same, could check earlier for an early return.

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Moved up

expectedHops: []expectedHop{
{alias: "luoji", fwdAmount: 100000, fee: 0, timeLock: 101},
}},
// Basic route with one intermediate hop.

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Looks good now!

if err != nil {
return err
}

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Ok, semantics change compared to master, but I think it is better like this. As you say, policies are not required for the auto pilot.

But the change become bigger than I liked, because of the lookup of LightningNode which was previously done inside LightningNode.ForEachChannel. Needed to gain access to db somehow.

I kept the changes apart in commit d258efe. Maybe there is a cleaner way?

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Hm, yeah this might not even work, since calling FetchLightningNode inside an already open DB transaction might cause problems.

I don't see an immediate way of achieving what we want without doing more changes to the database to make it possible to fetch the node without the edge policy. I'm okay with keeping your original implementation (skipping a nil-policy) with an added TODO saying that we ideally should support the case where the policy is nil.

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Ok, pulling out that commit.

@joostjager joostjager force-pushed the joostjager:routing branch from dd4f26f to 763a424 Jul 30, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Jul 30, 2018

@halseth I think we have resolved all points now.

// with the node's public key and sends with the compact edge ID.
// For each chanID, there will be two entries within the bucket, as the
// graph is directed: nodes may have different policies w.r.t to fees
// for their respective directions. An unknown policy is represented as

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

👍

//
// maps: pubKey || edgeID -> edge policy for node
// maps: pubKey || chanID -> channel edge policy for node

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

For some reason I thought edgePolicies were stored in a bucket in addition to this. So I can understand why we were talking about different things now.. 😛 The final solution you came up with looks good to me though 😄

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Ah, that explains it 😄

if edgeBytes := edges.Get(edgeKey[:]); edgeBytes != nil {
// An unknown policy value does not have a update time recorded, so
// it also does not need to be removed.
if edgeBytes := edges.Get(edgeKey[:]);

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

Would be even safer to check !bytes.Equals(edgeBytes, unknownPolicy) (unknown policy wouldn't be required to be empty slice), where unknownPolicy is a constant defined as the empty slice.

@@ -2836,6 +2890,11 @@ func fetchChanEdgePolicy(edges *bolt.Bucket, chanID []byte,
return nil, ErrEdgeNotFound
}
// No need to deserialize unknown policy.
if len(edgeBytes) == 0 {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

again, compare agains unknownPolicy constant.

This comment has been minimized.

@halseth

halseth Aug 1, 2018

Collaborator

When this is addressed this PR can be squashed and rebased, and I think it should be good 👍

_, err := fetchChanEdgePolicy(edges,
channelID[:], keyBytes[:], nodes)
if err == ErrEdgeNotFound {

This comment has been minimized.

@halseth

halseth Jul 30, 2018

Collaborator

In case of other error we must return it such that the migration can be aborted.

This comment has been minimized.

@joostjager

joostjager Jul 30, 2018

Collaborator

Good one, fixed.

func migrateEdgePolicies(tx *bolt.Tx) error {
nodes := tx.Bucket(nodeBucket)
if nodes == nil {
return ErrGraphNotFound

This comment has been minimized.

@halseth

halseth Jul 31, 2018

Collaborator

I think we can safely return nil if the buckets are not found? We are probably then trying to migrate an empty DB.

This comment has been minimized.

@joostjager

joostjager Jul 31, 2018

Collaborator

Yes, indeed.

log.Tracef("Adding unknown edge policy present for node %x, channel %v",
keyBytes, channelId)
putChanEdgePolicyUnknown(edges, channelId, keyBytes)

This comment has been minimized.

@halseth

halseth Jul 31, 2018

Collaborator

error must be handled here also.

This comment has been minimized.

@joostjager

joostjager Jul 31, 2018

Collaborator

Good point, added.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 1, 2018

unknownPolicy constant restored as empty slice and rebased.

@joostjager joostjager force-pushed the joostjager:routing branch from 5588c18 to 253b11a Aug 4, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 4, 2018

Rebased after gofmt commits on master

joostjager added some commits Jun 18, 2018

channeldb: store unknown policies in database
The commit ensures that for every channel, there will always
be two entries in the edges bucket. If the policy from one or
both ends of the channel is unknown, it is marked as such.

This allows efficient lookup of incoming edges. This is
required for backwards payment path finding.

@joostjager joostjager force-pushed the joostjager:routing branch from 253b11a to 508401a Aug 5, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 5, 2018

Amended all commits with gofmt

@halseth

This change looks mostly LGTM now, awesome work! Should get review from someone else before merge, however.

if edgeBytes := edges.Get(edgeKey[:]); edgeBytes != nil {
// An unknown policy value does not have a update time recorded, so
// it also does not need to be removed.
if edgeBytes := edges.Get(edgeKey[:]); edgeBytes != nil && !bytes.Equal(edgeBytes[:], unknownPolicy) {

This comment has been minimized.

@halseth

halseth Aug 6, 2018

Collaborator

split over two lines

var incomingPolicy, outgoingPolicy *ChannelEdgePolicy
if policy1 != nil {

This comment has been minimized.

@halseth

halseth Aug 6, 2018

Collaborator

I know I suggested earlier to do it this way, but I didn't anticipate we would need these switch statements 😛

Could we instead do:

chanID := nodeEdge[33:]
edgeInfo, err := fetchChanEdgeInfo(edgeIndex, chanID)
    if err != nil {
        return err
    }

toEdgePolicy, err := fetchChanEdgePolicy(
    edges, chanID, nodeNub, nodes,
)
...

otherNode, err := edgeInfo.OtherNodeKeyBytes(nodePub)
if err != nil {
    return err
}

toEdgePolicy, err := fetchChanEdgePolicy(
    edges, chanID, otherNode, nodes,
)

(or just deserialize the toEdgePolicy directly if first checked if not unknownPolicy, as you suggested initially)

This comment has been minimized.

@joostjager

joostjager Aug 7, 2018

Collaborator

Yes, this is better indeed than the ugly switches. Glad to hear to my initial creation was not so bad 😉 but I agree with you that readability and not duplicating code is important too.

@@ -2899,6 +2943,20 @@ func putChanEdgePolicy(edges *bolt.Bucket, edge *ChannelEdgePolicy, from, to []b
return edges.Put(edgeKey[:], b.Bytes()[:])
}
// putChanEdgePolicyUnknown marks the edge policy as unknown in the edges bucket.

This comment has been minimized.

@halseth

halseth Aug 6, 2018

Collaborator

nit: both lines a bit long

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 7, 2018

Review comments addressed

@Roasbeef

Excellent work! This PR includes quite a few nice clean ups w.r. to the testing code, and fixes a few legacy bugs related to incorrectly applying the fee limit during path finding.

The PR looks mostly good to me, however I noticed that in one area, it'll create a DB transaction inside of an existing one. Instead, it should carry over the same transaction as in the past, this has lead to deadlocks, and also creating a new transaction is an expensive activity. This is my only major comment in this PR.

I'll also move to start running this on a few nodes doing path finding, just to see if anything weird pops up.

if err != nil && err != ErrEdgeNotFound &&
err != ErrGraphNodeNotFound {
return err
policy1, policy2, err := fetchChanEdgePolicies(edgeIndex,

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

Forgetting to check the error here.

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

We already have the edge info here, it's the value of the cursor seek, why fetch it again? Not blocking, more of a nit that code seems to have been unnecessarily changed.

This comment has been minimized.

@joostjager

joostjager Aug 10, 2018

Collaborator

Initially it was unchanged. I modified it after discussion with @halseth for readability (over performance), see comments above.

@@ -2824,6 +2864,20 @@ func putChanEdgePolicy(edges *bolt.Bucket, edge *ChannelEdgePolicy, from, to []b
return edges.Put(edgeKey[:], b.Bytes()[:])
}
// putChanEdgePolicyUnknown marks the edge policy as unknown in the edges bucket.
func putChanEdgePolicyUnknown(edges *bolt.Bucket, channelID uint64, from []byte) error {

This comment has been minimized.

@Roasbeef
for _, testCase := range basicGraphPathFindingTests {
t.Run(testCase.target, func(subT *testing.T) {
testBasicGraphPathFindingCase(subT, graph, aliases, &testCase)

This comment has been minimized.

@Roasbeef
return fmt.Errorf("Unexpected node in policy")
}
}
incomingPolicy, err := fetchChanEdgePolicy(

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

Ignoring the error here and above.

// If the edge has no time lock delta, the payment will always
// fail, so return.
// TODO(joostjager): Is this really true? Can't it be that

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

Well, it's just a dumb value to set, as it more or less guarantees that the node won't be able to properly handle a timed out HTLC, and will likely lose money.

@@ -684,13 +798,6 @@ func findPath(tx *bolt.Tx, graph *channeldb.ChannelGraph,
"too many hops")
}
// As our traversal of the prev map above walked backwards from the

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

Yay, no more reversal!

return err
}
// Lookup the full node details in order to be able to

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

This looks a bit dangerous and could possibly lead to a deadlock: we're creating a new DB transaction inside of an existing db transaction.

This comment has been minimized.

@Roasbeef

Roasbeef Aug 10, 2018

Member

bolt allow concurrent readers, but I'm weary of making additional transactions inside of callbacks like this. We started to expose the transaction object in these methods to safely allow nested calls without creating new database transactions. We could possibly add methods onto the EdgeInfo struct, which accepts the existing transaction (or if nil creates a new one), in order to allow the caller to fetch the full LightningNode struct without creating a new database transaction.

@Roasbeef

This comment has been minimized.

Member

Roasbeef commented Aug 10, 2018

Actually, I'll merge this as is, and then add a follow up commit to avoid the double DB transaction issue. Thanks for bearing with us through this review process! The final PR really turned out well!

@Roasbeef

This comment has been minimized.

Member

Roasbeef commented Aug 10, 2018

Committed as 29b6bae!

@Roasbeef Roasbeef closed this Aug 10, 2018

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 10, 2018

I am happy that this one has been merged now. Will work on the follow-up.

@joostjager

This comment has been minimized.

Collaborator

joostjager commented Aug 10, 2018

Ah, I see you fixed the open issues yourselves. I was thinking of adding a tx parameter to FetchLightningNode, but this works too of course.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment