New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scoring based autopilot attachment heuristic #2006

Merged
merged 12 commits into from Dec 7, 2018

Conversation

Projects
None yet
4 participants
@halseth
Copy link
Collaborator

halseth commented Oct 4, 2018

This PR makes the current autopilot attachment heuristic scoring based, meaning that instead of returning a set of candidate nodes to open channels to, it takes a list of nodes (currently all nodes in the graph), and scores them according to the existing model.

This is done as a step in making it possible for the autopilot to select nodes from a diverse set of models. Making them scoring based makes it easier to combine the result from multiple heuristics when considering candidates for channels.

The new API NodeScores takes the current channel graph, the existing local channels and available funds, and a set of nodes. It returns a map of scored nodes, where a higher score indicates it is more preferred as a channel counterparty. Note that we currently only require the returned scores to be non-negative, higher meaning more preferred. As we want to pick a node at random according to this score distribution, we therefore must normalize the scores before picking a channel candidate.

TODO:

  • Update existing unit tests to new API
  • Add more unit test for new NodeScores implementation
  • Add more unit tests for the agent's use of NodeScores
  • Add unit tests for weightedChoice and chooseN: #2293
  • Increase logging around channel opening loop
@joostjager
Copy link
Collaborator

joostjager left a comment

How should the different active heuristics interact? Currently the score based heuristic will always be preferred, but if it is exhausted and cannot produce more channel candidates, we'll fall back to the standard heuristic.

The current setup in this PR contains quite a few layers of selection/scoring/weighting/greedy loops. I think it is worth thinking about a simplification. Similar to the probability machine that we discussed for mission control.

What may be an idea is to lift the 'foreach node' loop out of the prefattach heuristic into agent.go. So every time autopilot wakes up, it will iterate over all nodes and for every candidate node make a call into a heuristic. It will supply the heuristic with candidate node and info on current channels, graph and available on-chain balance. As a result of this call, agent.go will expect two values:

  • A score indicating the expected improvement of the self node performance after creating a channel with this node (this is different than just a node score). Can be 0, meaning that opening a channel leads to no improvement or even degradation.
  • The size of the channel to open.

What agent.go subsequently does is take the node with the highest expected improvement and open a channel to that node with the returned size. If all nodes return 0, autopilot is done for now.

This change will decrease the (pending) available funds and change the set of (pending) channels.

This process is executed in a loop until there are no nodes anymore for which a new channel would improve the self node situation.

Modeling it like this creates a powerful interface for the heuristic, without needing to be aware of the complete environment it is operating in. Logic becomes more distributed. Expression of the heuristic will be different compared to the current interface, but I think it will be a better fit with how we ourselves tend to think about autopilot behaviour.

Both the prefattachment heuristic and the scoring heuristic can be converted into this model. If we'd want to combine both models, we could again make a composite heuristic that for example averages the "improvement score" and channel size for both child-heuristics, and return that to the agent.

var heuristic AttachmentHeuristic
var availableFunds btcutil.Amount
var numChans uint32
for _, h := range a.cfg.Heuristics {

This comment has been minimized.

@joostjager

joostjager Oct 4, 2018

Collaborator

I would introduce a struct called CompositeAttachmentHeuristic that executes this first come first serve logic. The list of child-heuristics is maintained in that struct. Then at least this code doesn't need to be aware of multiple heuristics being active.

It is then also an option to move the moreChans / distribute calls in that struct.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch from 6b4fef5 to 8958bac Oct 12, 2018

@halseth halseth added the blocked label Oct 31, 2018

@halseth

This comment has been minimized.

Copy link
Collaborator

halseth commented Oct 31, 2018

Blocked by #2039

@Roasbeef

This comment has been minimized.

Copy link
Member

Roasbeef commented Nov 2, 2018

So every time autopilot wakes up, it will iterate over all nodes and for every candidate node make a call into a heuristic.

The idea is to remove the idea of candidate nodes (which are generated via a single heuristic), and instead move to a model where we have a set of scoring heuristics, we obtain the top N from each source, then normalize the scores into a composite heuristic.

I think the idea of a pre/post candiate attachment evaluation is worthwhile! We can eventually later experiment with actions like: "if I connect to this node, whole will that affect a network/graph wide metric?". From my PoV, they're a distinct, but complementary. This PR is attempting to abstract out the set of heuristics so we can combine data sources, while I think what you're suggesting is more of an incremental optimization engine. Although, thinking about it a bit more, I think this can be layered one, we take the candidates with the best normalized score, then query the global heuristic to determine how attaching to this node will affect the global node score, and open a channel to them based on that information.

One component that's still generally unaddressed is: how large of a channel to open? Our current thinking is that this will be based on node acceptance policies once they start to more closely monitor their inflows and outflow to achieve a nice equilibrium. The current code as is, will simply attempt to open the largest possible channel, and if it has enough funds, open as many of them as possible. On testnet this is fine, but obviously on mainnet, we don't want to put all our satoshis in one basket.

@Roasbeef

This comment has been minimized.

Copy link
Member

Roasbeef commented Nov 2, 2018

So every time autopilot wakes up, it will iterate over all nodes and for every candidate node make a call into a heuristic. It will supply the heuristic with candidate node and info on current channels, graph and available on-chain balance

Where would the set of initial candidates be drawn from? Let's assume the current pref attachment heuristic goes away all together for a sec

@Roasbeef
Copy link
Member

Roasbeef left a comment

I think this is a great start! However, it isn't very close to what I'd had in mind originally. Most of my comments are discussion of the design of the system as a whole. Once we merge the autopilot RPC stuff, then we can start to rebase this and make the diff a bit smaller. I've also moved some of the discussion w.r.t architecture offline to eventually report back to this PR with the final design direction.

return fundsAvailable, numAdditionalChans, true
}

// distribute will take the available funds and passed directives, and return a

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

Perhaps the heuristics should instead suggest the size of the channel to open? What's implement here was meant to be nothing more than a stop gap.

// set constraints. In the case that is possible, the funds available for new
// channels, and the number of additional channels that can opened is also
// returned.
func (h *HeuristicConstraints) moreChans(channels []Channel,

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

This was initially written with the assumption that there'd only ever be a single heuristic that drove the entire agent. With this multi-heuristics scoring based model, I think we may need to re-think this approach all together.

targetAllocation := btcutil.Amount(float64(totalFunds) * p.threshold)
fundsAvailable := targetAllocation - totalChanAllocation
return fundsAvailable, numAdditionalChans, true
// We'll try to open more channels as long as the constraints allow it.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

So with this change to the scoring based heuristics, I envisioned removing this method NeedMoreChans all together. Instead the agent determines this.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

So with this change, we'd have an interface something like:

type AttachmentHeuritic interface {
    SelectTopN(self *btcec.PublicKey, graph ChannelGraph, skipNodes map[NodeID]struct{}) ([]AttachmentScore, error) 
}

Eventually once we move to the retry/queue based model, we remove the skipNodes all together.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

In this case AttachmentScore is more or less a two-tuple, of the node info (pubkey, etc) and the float64 score (or an int, whichever).

// node is found, but has no addresses to connected to.
//
// NOTE: Part of the autopilot.ChannelGraph interface.
func (d *databaseChannelGraph) FetchNode(pubkeyBytes NodeID) (Node, error) {

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

We may eventually also want to expose the feature bit information as well. As eventually the pilot may be aware of things like watch towers. As an example, if we don't have any channels yet, and want to open a channel, then it may make sense to open a channel up to a tower so we can pay it for tokens to watch over other channels. ofc, we'd need to ensure that we don't have it watch channels that we have directly with the node.

"github.com/btcsuite/btcutil"
)

// ScoreAttachment is an implementation of the AttachmentHeuristic

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

Referring back to my earlier comment, I'd intended this to instead be a modification of the existing interface. So they'd all be scoring based.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

Although for what we have in mind, could pin this as an ExternalScoringHeuristic.


// We'll continue our attachment loop until we've exhausted the current
// amount of available funds.
for _, node := range prefNodes {

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

For the external scores, we may want to insert a bit of jitter in here. So for example adding a supplementary score based on the xor distance (or just regular distance) of a node's pubkey to our own.

totalChans, a.totalBalance,
)
if !needMore {
var heuristic AttachmentHeuristic

This comment has been minimized.

@Roasbeef

Roasbeef Nov 2, 2018

Member

IMO the constraints should be lifted to the agent level. So instead, this loop would query all heuristics for their top candidates, and then (to start something simple) sum the scores from each heuristic, and use that as a basis (along with jitter) to select a possible candidate.

@@ -42,3 +53,10 @@ message EnableRequest{
}

message EnableResponse {}

message SetNodeScoresRequest{

This comment has been minimized.

@Roasbeef
@joostjager

This comment has been minimized.

Copy link
Collaborator

joostjager commented Nov 2, 2018

Where would the set of initial candidates be drawn from? Let's assume the current pref attachment heuristic goes away all together for a sec

The candidate set is in this case all nodes known. So for every node known, heuristics will be queried and they'll return the 'gain' of connecting to this node.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch from 8958bac to 4fc1f58 Nov 5, 2018

@halseth halseth changed the title [experimental] Scoring based autopilot attachment directive Scoring based autopilot attachment directive Nov 5, 2018

@halseth halseth changed the title Scoring based autopilot attachment directive Scoring based autopilot attachment heuristic Nov 5, 2018

@halseth halseth removed the blocked label Nov 5, 2018

@halseth

This comment has been minimized.

Copy link
Collaborator

halseth commented Nov 5, 2018

Thanks for the review and suggestions!

The scope of this PR has been reduced to only making the existing attachment heuristic scoring based. It is intended to keep the current autopilot behavior, but making the API scoring based to later making it easy to add heuristics like the discussed "External scoring based" one.

@Roasbeef
Copy link
Member

Roasbeef left a comment

I think this PR is heading in the right direction now! Most of my questions surround the current selection algorithm given the scores of each possible node. You'll find a hypothetical example in-line, I might be totally mis-reading the intention here though. Also I notice the tests haven't been updated, I'm guessing the plan is to leave that until we have a set interface/algorithm?

// size. The scores will be in the range [0, M], where 0 indicates no
// improvement in connectivity if a channel is opened to this node,
// while M is the maximum possible improvement in connectivity.
NodeScores(g ChannelGraph, chans []Channel, fundsAvailable btcutil.Amount,

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

Depending on how the future heuristics specify this score, we may want to add a degree of normalization across all the scores. Otherwise, we may be comparing something like the in-degree of nodes to the aggregate amount that we've routed through them in the past (just an example).

This comment has been minimized.

@halseth

halseth Nov 22, 2018

Collaborator

Yeah, I was thinking about this as well. I think in order to do that we must decide on the range scores should be in. I think it makes sense for scores to be in the range 0.0-1.0, but they must be "calibrated" somehow. Such that I know that if the heuristic give me a score of 0.01 I will know it is not a good node, even though it might score highest among those I checked.

This is what I tried to do with the "score threshold" constraint, but it doesn't translate easily across heuristics without a calibration. For instance the current pref attachment heuristic uses a node's fraction of the channels in the graph as its score, but this leads to pretty small scores for most nodes.

a.cfg.Constraints.MaxPendingOpens)
continue
}
s := scores[candidate]

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

Check the second value here just to guard against bugs in weightedChoice?


// Pick a random number in the range [0.0, 1.0), and iterate the map
// until the number goes below 0. This means that each index is picked
// with a probablity eqaual to their normalized score.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

eqaual -> equal

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Typo still present.


// Now use the score to make a weighted choice which
// node to open a channel to.
candidate, err := weightedChoice(scores)

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

The proposed design was to select the node (aggregating all the heuristics) that results in the highest "gain". Atm, this will randomly select a node from the single heuristic. is the idea to leave the generic selection algorithm to a follow up PR?

This comment has been minimized.

@halseth

halseth Nov 22, 2018

Collaborator

Idea is to either here combine scores from different heuristics before making the choice, or create a combined heuristic that abstract this away.

// until the number goes below 0. This means that each index is picked
// with a probablity eqaual to their normalized score.
r := rand.Float64()
for k, v := range norm {

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

I find this selection a bit hard to analyze. First, the map iteration of norm is already randomized. So the probability that a node is selected, is dependent on both the order of map iteration and the random value selected. As an example, if the iteration wasn't randomized, and the value .50 was selected each time, and both nodes had the same score (let's say .50 for the sake of the example), then the first node would always be selected.

One thing we could do, in order to assert guarantees w.r.t the selection here, is to write a set of testing/quick tests that target selection probabilities based on scores+ranges.

This comment has been minimized.

@halseth

halseth Nov 22, 2018

Collaborator

Added some explanation. Will add some unit/quick tests when the design is settled.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Much clearer now with the comment, thanks!

@@ -82,7 +82,7 @@ func (d dbNode) Addrs() []net.Addr {
//
// NOTE: Part of the autopilot.Node interface.
func (d dbNode) ForEachChannel(cb func(ChannelEdge) error) error {
return d.node.ForEachChannel(d.tx, func(tx *bolt.Tx,
return d.node.ForEachChannel(nil, func(tx *bolt.Tx,

This comment has been minimized.

@Roasbeef

Roasbeef Nov 14, 2018

Member

If it's closed or not? The idea is to not require a new db transaction for each node we explore, and instead do it under a single database transaction.

This comment has been minimized.

@halseth

halseth Nov 22, 2018

Collaborator

Was able to avoid this change by no longer keeping the Node objects around.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch from 4fc1f58 to fda3262 Nov 22, 2018

@halseth

This comment has been minimized.

Copy link
Collaborator

halseth commented Nov 22, 2018

Significant rewrite pushed, ready for another round of review :)

Tests should also be passing now :D

@halseth halseth referenced this pull request Nov 23, 2018

Open

[WIP] Bos score enabled Autopilot #2212

4 of 6 tasks complete
@Roasbeef
Copy link
Member

Roasbeef left a comment

Nice work with the new refactoring! Completed another round of initial review, will start to run this on one of my testnet nodes to get a feel for the new changes.

log.Infof("No eligible candidates to connect to")
continue
}
// openChans queries the agent's heuristic for a set of channel candidates, and

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

👍

I think some of the comments and code fragments below can be opened up a bit to take advantage of the new column space.

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Done!

continue
}
a.pendingConns[nodeID] = struct{}{}

go func(directive AttachmentDirective) {

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Now that the maps etc are a part of the agent, we can also break this goroutine out into its own function. This change would reduce the amount of vertical space utilized by this method in general.

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Done!

case graphChans == 0:
score = 0

// Set score 0 if channel is too small.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Seems we can just check this above (outside of the select loop), and continue altogether, not even including this node in the final scoring map.

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Done.


// Pick a random number in the range [0.0, 1.0), and iterate the map
// until the number goes below 0. This means that each index is picked
// with a probablity eqaual to their normalized score.

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Typo still present.

// until the number goes below 0. This means that each index is picked
// with a probablity eqaual to their normalized score.
r := rand.Float64()
for k, v := range norm {

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Much clearer now with the comment, thanks!

return NodeID{}, fmt.Errorf("no choice made")
}

// chooseN picks at random min[n, len(s)] nodes if from the

This comment has been minimized.

@Roasbeef
chanCandidates, err := a.cfg.Heuristic.Select(
a.cfg.Self, a.cfg.Graph, availableFunds,
numChans, nodesToSkip,
// Gather the set of all nodes in the graph, except those we

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

Typo in commit message: copmatible -> compatible

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Fixed.

return fmt.Errorf("unable to get graph nodes: %v", err)
}

// Use the heuristic to calculate a score for each node in the

This comment has been minimized.

@Roasbeef

Roasbeef Nov 28, 2018

Member

I dig the control flow of this new loop, haven't got to the rest of the commits yet, but as joost suggested in a prior commit, we can make a composite heuristic that itself combines other heuristics and allows us to hide the details w.r.t the eventual normalization scoring we add in.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch from 8152177 to 1fd66da Nov 29, 2018

@Roasbeef

This comment has been minimized.

Copy link
Member

Roasbeef commented Dec 4, 2018

Ready to land once final comments addressed.

pendingMtx.Unlock()
a.pendingMtx.Lock()
log.Debugf("Pending channels: %v", spew.Sdump(a.pendingOpens))
a.pendingMtx.Unlock()

This comment has been minimized.

@cfromknecht

cfromknecht Dec 4, 2018

Collaborator

do we need to unlock here? seems we can just hold through mergeChanState

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Fixed in following commit to keep this one purely a move.

// NOTE: we'll set the addrs, score and chansize below.
candidates := make(map[NodeID]*AttachmentDirective)
for nID := range nodes {
pub, err := btcec.ParsePubKey(nID[:], btcec.S256())

This comment has been minimized.

@cfromknecht

cfromknecht Dec 4, 2018

Collaborator

I wonder if we should delay parsing the pubkeys as long as possible to avoid the extra CPU and gc cycles. this has been a performance issue in the past

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Fixed by removing the NodeKey from the AttachmentDirective completely, and parsing only when calling ConnectToPeer.

// If the node is among or existing channel peers we give it a
// score of 0, as we don't need another channel.
case ok:
score = 0

This comment has been minimized.

@cfromknecht

cfromknecht Dec 4, 2018

Collaborator

Instead of giving a score of 0 in all non-default cases, why not remove them from the candidate map and filter them here?

This comment has been minimized.

@halseth

halseth Dec 6, 2018

Collaborator

Done!

autopilot: define HeuristicConstraints
This commit defines a new struct HeuristicConstraints that will be used
to keep track of the initial constraints the autopilot agent needs to
adhere to. This is currently done in the ConstrainedPrefAttachement
heuristic itself, but this lets us share these constraints and common
method netween several heuristics.
autopilot/prefattach: use HeuristicConstraints
This commit makes the pref attach heuristic and the agent use the
HeuristicConstraints internally.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch 7 times, most recently from 74673ec to a6708a9 Dec 6, 2018

halseth added some commits Dec 6, 2018

autopilot/agent: add maps to agent struct
This commit moves the maps used by the controller loop to the Agent
struct, in preparation for breaking it up into smaller parts.
autopilot/agent: split opening logic into own method
This commit takes the logic after the autopilot agent has decided that it
needs to open more channels, and moves it into a new method openChan.
autopilot/prefattach+interface: add API NodeScores
This commit adds a new method NodeScores to the AttachementHeuristic
interface. Its intended use is to score a set of nodes according to
their preference as channel counterparties.

The PrefAttach heuristic gets a NodeScores method that will score the
ndoes according to their number of already existing channels, similar to
what is done already in Select.
autopilot/prefattach_test: use NodeScores API for prefAttach tests
This commit converts the existing unit tests of Select into tests of
NodeScores.
autopilot/agent: add weightedChoice and chooseN algorithm
The algorithms will be used to select nodes at random from the weighted
distribution set by the node's scores given by the heuristic.
autopilot/interface+agent: remove NodeKey from AttachmentDirective
Instead parse the pubkey bytes only when needed.
autopilot/agent: use NodeScores to select channel candidates
This commit makes the autopilot agent use the new NodeScores heuristic
API to select channel candiates, instead of the Select API. The result
will be similar, but instead of selecting a set of nodes to open
channels to, we get a score based results which can later be used
together with other heuristics to choose nodes to open channels to.

This commit also makes the existing autopilot agent tests compatible
with the new NodeScores API.

@halseth halseth force-pushed the halseth:autpilot-score-attachement branch from a6708a9 to 6130189 Dec 6, 2018

@Roasbeef
Copy link
Member

Roasbeef left a comment

LGTM 🔱

I think this is read to land so we can open up the door to coding up a new set of heuristics with the new node scoring styles. We can make follow up PR's for the lingering TODO's there, and I may also add some more logging as I continue to test the new version.

@Roasbeef Roasbeef merged commit c071a17 into lightningnetwork:master Dec 7, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment