add modified greedy topK centrality heuristic to autopilot #4384

bhandras · 2020-06-17T17:41:17Z

This PR adds a very simple heuristic to the existing autopilot heuristics: it calculates the betweenness centrality of the current graph and returns normalized node scores, excluding the nodes we already have channels with.

This method successfully approximates Maximum Betweenness Improvement (MBI) given a sufficient number of new edges added, but is considerably worse than estimating MBI with a real greedy algo, which would add ghost edges to all nodes and select the one which would result in the largest centrality value for our node.
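A minimal sketch of the scoring described above, not the code from this PR. The `NodeID` type and the `scoreByCentrality` helper are hypothetical names, and dividing by the largest raw value is only one possible normalization (the description does not say which one is used).

```go
package main

import "fmt"

// NodeID is a hypothetical stand-in for a node's public key.
type NodeID string

// scoreByCentrality normalizes raw centrality values by the largest value
// (an assumption of this sketch) and returns scores for the candidates,
// skipping nodes we already have channels with and candidates that have
// no centrality entry.
func scoreByCentrality(centrality map[NodeID]float64,
	candidates []NodeID, haveChannel map[NodeID]bool) map[NodeID]float64 {

	// Find the largest raw value so that scores can be normalized.
	var maxVal float64
	for _, c := range centrality {
		if c > maxVal {
			maxVal = c
		}
	}

	scores := make(map[NodeID]float64)
	for _, id := range candidates {
		// Nodes we already have channels with are never scored.
		if haveChannel[id] {
			continue
		}

		// Skip candidates that are not part of the centrality result.
		c, ok := centrality[id]
		if !ok {
			continue
		}

		if maxVal > 0 {
			c /= maxVal
		}
		scores[id] = c
	}

	return scores
}

func main() {
	centrality := map[NodeID]float64{"a": 4, "b": 2, "c": 1}
	scores := scoreByCentrality(
		centrality, []NodeID{"a", "b", "c", "d"},
		map[NodeID]bool{"a": true},
	)

	// "a" is excluded (existing channel), "d" is unknown to the metric.
	fmt.Println(scores["b"], scores["c"], len(scores)) // 0.5 0.25 2
}
```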

Roasbeef · 2020-06-23T23:29:37Z

This method successfully approximates Maximum Betweenness Improvement (MBI) given sufficient number of new edges added but is considerably worse than estimating

Just to confirm, you say "approximates" since it'll heavily weight towards the node that gives the largest improvement, but may fail to select them, correct?

but is considerably worse than estimating MBI with a real greedy algo which would add ghost edges to all nodes and select the one which would result in the largest centrality value for our node.

Hmm, yeah implementing this may require some tweaks to the way the scoring system works atm. I think we still do want the stochastic aspect to avoid all nodes opening a channel to the exact same set of nodes (say they start with zero channels, or one channel to the same starting node). One alternative here would maybe be attempting to use the pubkey of the node driving the agent to add some jitter (by removing a sub-set of nodes?).
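One way the pubkey-based jitter idea could look, purely as an illustration: hash the local node's key together with each candidate's key and drop a deterministic subset. The `keepCandidate` helper and the `dropOneIn` parameter are hypothetical and not part of this PR.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// keepCandidate reports whether a candidate survives the jitter filter for
// the given local node. The decision is deterministic per (self, candidate)
// pair, so different local nodes tend to drop different subsets. Roughly one
// in dropOneIn candidates is removed; 0 disables the filter.
func keepCandidate(self, candidate []byte, dropOneIn uint8) bool {
	if dropOneIn == 0 {
		return true
	}

	h := sha256.New()
	h.Write(self)
	h.Write(candidate)
	sum := h.Sum(nil)

	return sum[0]%dropOneIn != 0
}

func main() {
	self := []byte("local-node-pubkey")
	candidates := []string{"alice", "bob", "carol", "dave", "erin"}

	var kept []string
	for _, c := range candidates {
		if keepCandidate(self, []byte(c), 3) {
			kept = append(kept, c)
		}
	}

	fmt.Println("candidates left after jitter:", kept)
}
```

Because the filter depends only on the two keys, a node sees the same reduced candidate set on every run, while two nodes starting from the same graph are unlikely to end up with identical sets.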

Roasbeef · 2020-06-23T23:33:55Z

I think we'd also want to have a proper incremental calculation algo before we did ghost edge simulation since we'd need to re-calculate the entire graph with each iteration with the current algo.

Roasbeef

I think this is ready to leave the draft stage? Will start to use it to bootstrap some new nodes on testnet.

autopilot/agent.go

bhandras · 2020-07-07T14:07:17Z

This method successfully approximates Maximum Betweenness Improvement (MBI) given sufficient number of new edges added but is considerably worse than estimating

Just to confirm, you say "approximates" since it'll heavily weight towards the node that gives the largest improvement, but may fail to select them, correct?

but is considerably worse than estimating MBI with a real greedy algo which would add ghost edges to all nodes and select the one which would result in the largest centrality value for our node.

Hmm, yeah implementing this may require some tweaks to the way the scoring system works atm. I think we still do want the stochastic aspect to avoid all nodes opening a channel to the exact same set of nodes (say they start with zero channels, or one channel to the same starting node). One alternative here would maybe be attempting to use the pubkey of the node driving the agent to add some jitter (by removing a sub-set of nodes?).

This type of greedy algo is only an approximation because, for a single new connection, simply selecting the node with the largest centrality may not result in the best MBI. In practice this means that this algo may need more connections to reach the best possible centrality improvement than if we were to select the actual node which would result in the max improvement for our node (which requires recalculating centrality for each/most nodes).
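To make the difference concrete, here is a small self-contained sketch of the exact ghost-edge approach, assuming an unweighted, undirected toy graph. It is not the lnd implementation: `Graph`, `betweenness` and `bestGhostEdge` are names made up for this example, and the centrality is recomputed from scratch for every candidate, which is exactly the cost discussed above.

```go
package main

import "fmt"

// Graph is an undirected, unweighted adjacency list over nodes 0..n-1.
type Graph [][]int

// betweenness computes betweenness centrality with Brandes' algorithm.
// Every unordered pair is visited from both endpoints, so the sums are
// halved at the end.
func betweenness(g Graph) []float64 {
	n := len(g)
	bc := make([]float64, n)
	for s := 0; s < n; s++ {
		dist := make([]int, n)
		sigma := make([]float64, n)
		delta := make([]float64, n)
		preds := make([][]int, n)
		for i := range dist {
			dist[i] = -1
		}
		dist[s], sigma[s] = 0, 1

		// BFS from s, recording visit order and shortest path counts.
		order := []int{s}
		for i := 0; i < len(order); i++ {
			v := order[i]
			for _, w := range g[v] {
				if dist[w] < 0 {
					dist[w] = dist[v] + 1
					order = append(order, w)
				}
				if dist[w] == dist[v]+1 {
					sigma[w] += sigma[v]
					preds[w] = append(preds[w], v)
				}
			}
		}

		// Accumulate dependencies in reverse BFS order, skipping s.
		for i := len(order) - 1; i > 0; i-- {
			w := order[i]
			for _, v := range preds[w] {
				delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
			}
			bc[w] += delta[w]
		}
	}
	for i := range bc {
		bc[i] /= 2
	}
	return bc
}

// bestGhostEdge adds a ghost edge from self to every node it is not yet
// connected to, recomputes centrality each time, and returns the peer that
// maximizes self's centrality (-1 if there is no candidate).
func bestGhostEdge(g Graph, self int) (int, float64) {
	connected := map[int]bool{self: true}
	for _, v := range g[self] {
		connected[v] = true
	}

	best, bestScore := -1, -1.0
	for v := range g {
		if connected[v] {
			continue
		}

		// Copy the graph and add the ghost edge self<->v.
		h := make(Graph, len(g))
		for i := range g {
			h[i] = append([]int(nil), g[i]...)
		}
		h[self] = append(h[self], v)
		h[v] = append(h[v], self)

		if score := betweenness(h)[self]; score > bestScore {
			best, bestScore = v, score
		}
	}
	return best, bestScore
}

func main() {
	// Path 0-1-2-3-4 with our node 5 attached to node 0.
	g := Graph{{1, 5}, {0, 2}, {1, 3}, {2, 4}, {3}, {0}}

	fmt.Println(betweenness(g)) // [4 6 6 4 0 0]

	peer, score := bestGhostEdge(g, 5)
	fmt.Println(peer, score) // 3 2
}
```

In this toy graph nodes 1 and 2 have the highest centrality, yet a channel to node 1 leaves our node with zero centrality, while a channel to node 3 gives the largest improvement. The top centrality heuristic would favor nodes 1 and 2 here and only catch up after more channels are opened.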

The stochastic selection just adds another layer of distortion.

carlaKC

First pass looks good!

autopilot/top_centrality.go

autopilot/interface.go

carlaKC · 2020-07-08T11:04:19Z

autopilot/top_centrality.go

+			continue
+		}
+
+		// Skip passed nodes not in the graph.


When would a peer not be in the centrality graph but be passed into this function? Happy with the check, perhaps just a comment explaining when this happens.

bhandras · 2020-07-14

Good point, comment added, ptal.

carlaKC · 2020-07-08T11:08:12Z

autopilot/top_centrality_test.go

+// TestTopCentralityNonEmptyGraph tests that we return the correct normalized
+// centrality values given a non empty graph, correctly filtered down to the
+// passed nodes and omitting nodes which we have channels with.
+func TestTopCentralityNonEmptyGraph(t *testing.T) {


Could these tests be flattened into one? Since we're getting node scores and asserting length for both? The tests could just have a buildGraph function which provides the graph we want, rather than having two similar tests except for this one input.

bhandras · 2020-07-14

You raised a great question. Actually they were already kind of flattened, as by adding the empty node set and empty channel set we can include the other test in this one. Made some structural changes to (hopefully) make it more readable too, and extended the tests to cover connectivity from none to full.

carlaKC

Nice changes to the tests 🥇 Just two nitty-nits from me, change looks good!

autopilot/centrality_testdata_test.go

autopilot/top_centrality_test.go

halseth

Big ACK on this approach. I think the "non-exact" top K algorithm is good enough for now, as we do want some jitter in channel selection anyway.

We can explore a more exact greedy variant later, but I think for now the simplicity and speed of this approach is a big pro 👍

halseth · 2020-07-16T12:45:03Z

autopilot/top_centrality.go

+
+	// As we don't currently support incremental graph updates, we
+	// don't need to cache anything.
+	bc, err := NewBetweennessCentralityMetric(


Should this be moved to NewTopCentrality, such that only a refresh is needed when this method is called?

bhandras · 2020-07-17

Changed construction of the metric a bit to be able to do this cleanly. PTAL
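The shape of the change discussed in this thread might look roughly like the sketch below: the metric is built once in the constructor and only refreshed when scores are requested. All names here (`metric`, `degreeMetric`, `topCentrality`, `newTopCentrality`, `scores`) are hypothetical, and node degree stands in for betweenness to keep the example short.

```go
package main

import "fmt"

// metric is a hypothetical interface for a refreshable centrality metric.
type metric interface {
	Refresh(graph map[string][]string) error
	Values() map[string]float64
}

// degreeMetric is a toy metric (node degree) standing in for betweenness.
type degreeMetric struct {
	values map[string]float64
}

// Refresh recomputes the metric for the passed graph.
func (m *degreeMetric) Refresh(graph map[string][]string) error {
	m.values = make(map[string]float64, len(graph))
	for node, peers := range graph {
		m.values[node] = float64(len(peers))
	}
	return nil
}

// Values returns the values computed by the last Refresh.
func (m *degreeMetric) Values() map[string]float64 {
	return m.values
}

// topCentrality holds the metric so that it is constructed only once.
type topCentrality struct {
	centralityMetric metric
}

// newTopCentrality builds the heuristic together with its metric.
func newTopCentrality() *topCentrality {
	return &topCentrality{centralityMetric: &degreeMetric{}}
}

// scores refreshes the stored metric against the current graph and returns
// the resulting values.
func (t *topCentrality) scores(
	graph map[string][]string) (map[string]float64, error) {

	if err := t.centralityMetric.Refresh(graph); err != nil {
		return nil, err
	}
	return t.centralityMetric.Values(), nil
}

func main() {
	h := newTopCentrality()
	graph := map[string][]string{"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}

	s, err := h.scores(graph)
	if err != nil {
		panic(err)
	}
	fmt.Println(s["a"], s["b"]) // 2 1
}
```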

halseth · 2020-07-16T12:47:05Z

autopilot/top_centrality.go

+
+		result[nodeID] = &NodeScore{
+			NodeID: nodeID,
+			Score:  centrality[nodeID],


nit: do score, ok := centrality[nodeID] above for readability and one less lookup

halseth · 2020-07-16T12:48:55Z

autopilot/top_centrality.go

+
+// Name returns the name of the heuristic.
+func (g *TopCentrality) Name() string {
+	return "topk_centrality"


should we rather name the heuristic simply "centrality"? Feels more right to not have the user have to care about the underlying algorithm, and we can change to the greedy algorithm later without changing the name.

bhandras · 2020-07-17

Yeah, good point. Done

autopilot/betweenness_centrality.go

halseth · 2020-07-17T13:49:07Z

autopilot/top_centrality.go

 	// Calculate betweenness centrality for the whole graph.
-	if err := bc.Refresh(graph); err != nil {
+	if err := g.centralityMetric.Refresh(graph); err != nil {


Squash with or move this change before the initial TopCentrality commit

This commit removes an extra filter on address availability which is not needed, as the scored nodes are an already prefiltered subset of the whole graph where address availability has already been checked.

This commit creates a new autopilot heuristic which simply returns normalized betweenness centrality values for the current graph. This new heuristic will make it possible to prefer nodes with large centrality when we're trying to open channels. The heuristic is also somewhat dumb as it doesn't try to figure out the best nodes, as that'd require adding ghost edges to the graph and recalculating the centrality once for every node (minus the ones we already have channels with).

The commit also reindents the source to conform with ts=8 guideline.

halseth

LGTM ✅

bhandras force-pushed the atpl_bc_topk branch 3 times, most recently from 2026f0f to 382e3c5 Compare June 23, 2020 15:58

Roasbeef reviewed Jun 23, 2020

autopilot/agent.go (resolved)

Roasbeef added the v0.11 label Jun 30, 2020

Roasbeef added this to In progress in v0.11.0-beta via automation Jun 30, 2020

Roasbeef added this to the 0.11.0 milestone Jun 30, 2020

cfromknecht added v0.12 and removed v0.11 labels Jul 7, 2020

cfromknecht modified the milestones: 0.11.0, 0.12.0 Jul 7, 2020

cfromknecht removed this from In progress in v0.11.0-beta Jul 7, 2020

bhandras marked this pull request as ready for review July 7, 2020 14:07

bhandras requested a review from halseth as a code owner July 7, 2020 14:07

bhandras requested review from Roasbeef and carlaKC July 7, 2020 14:07

carlaKC reviewed Jul 8, 2020

bhandras force-pushed the atpl_bc_topk branch from 382e3c5 to b4458ba Compare July 14, 2020 15:22

bhandras requested review from carlaKC and removed request for halseth July 14, 2020 15:22

bhandras force-pushed the atpl_bc_topk branch from b4458ba to aea0175 Compare July 14, 2020 15:33

carlaKC approved these changes Jul 15, 2020

autopilot/centrality_testdata_test.go (outdated, resolved)

autopilot/top_centrality_test.go (outdated, resolved)

halseth self-requested a review July 15, 2020 19:04

bhandras force-pushed the atpl_bc_topk branch from aea0175 to 57e13a2 Compare July 15, 2020 22:24

halseth reviewed Jul 16, 2020

bhandras force-pushed the atpl_bc_topk branch 2 times, most recently from b7dd44c to 2172de6 Compare July 17, 2020 09:52

bhandras requested a review from halseth July 17, 2020 09:52

halseth reviewed Jul 17, 2020

autopilot/betweenness_centrality.go (outdated, resolved)

bhandras requested a review from halseth July 17, 2020 11:46

bhandras force-pushed the atpl_bc_topk branch 3 times, most recently from 665ac18 to 1d574c1 Compare July 17, 2020 14:00

halseth suggested changes Jul 17, 2020

bhandras added 5 commits July 17, 2020 16:12

atpl: remove unneeded extra filter on address availability

906b0b7

This commit removes an extra filter on address availability which is not needed, as the scored nodes are an already prefiltered subset of the whole graph where address availability has already been checked.

autopilot+test: make centrality test data available for other tests

82ddcce

autopilot: add TopCentrality to the available heuristics

ccabad8

autopilot+test: testify betweenness centrality tests

afbbeae

The commit also reindents the source to conform with ts=8 guideline.

bhandras force-pushed the atpl_bc_topk branch from 1d574c1 to afbbeae Compare July 17, 2020 14:13

bhandras requested a review from halseth July 17, 2020 14:13

halseth approved these changes Jul 17, 2020

bhandras merged commit 77549f1 into lightningnetwork:master Jul 17, 2020

Roasbeef modified the milestones: 0.12.0, 0.11.0 Jul 17, 2020

halseth mentioned this pull request Jul 20, 2020

autopilot: add new greedy centrality and expected fee revenue heuristic #4187

Open

3 tasks

bhandras deleted the atpl_bc_topk branch September 12, 2023 15:26
