Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitcoin backend plugins planning #3354

Closed
cdecker opened this issue Dec 18, 2019 · 22 comments · Fixed by #3620
Closed

Bitcoin backend plugins planning #3354

cdecker opened this issue Dec 18, 2019 · 22 comments · Fixed by #3620
Assignees

Comments

@cdecker
Copy link
Member

cdecker commented Dec 18, 2019

I've been planning to work on a plugin-based bitcoin backend for a while but
haven't gotten around to actually working on it. Since others are trying to
get started on this I thought I'd share some of my thoughts around how this
could be accomplished.

Current situation

Our current interactions with the bitcoin network is nicely encapsulated in
lightningd/bitcoin.c. From there we can see that the RPC methods we call are
the following:

  • estimatesmartfee used for fee estimation on a couple of different
    horizons (1, 3, 6 blocks).
  • getblock (raw and hex encoded) used to retrieve a block after calling
    getblockhash to get its hash given the height.
  • getblockhash to translate a height into a blockhash
  • gettxout to check whether a tx output is still unspent, i.e., in the UTXO.
  • getnetworkinfo used to check that bitcoind is a supported version only.
  • getblockchaininfo used to check if bitcoind is in IBD.
  • sendrawtransaction used to send a transaction to the network.

Backend Interface

I'd like to make the interface for backend a bit more coarse-grained, in order
to give the plugins more freedom as to how they accomplish their task. My go
to example is getblockhash + getblock which is always called in this order
and requires multiple roundtrips to bitcoind for a single logical operation.

Some of the methods above can be translated 1-to-1 into the plugin:

  • sendrawtransaction can become a simple hook, if only for the reason that
    we don't want to expose it to the JSON-RPC users. In addition with
    prioritized and chained hooks we can have a failover chain that tries one
    plugin after the other in case of failure.

Other methods can be combined into more abstract methods:

  • getblockbyheight is a combination of getblockhash and getblock which
    either returns the hash and the block at a given height or an error if
    there is no block (yet). In addition we could give a tiny hint as to the
    context in which the query is done (scan) to tell the plugin whether we
    are in a linear scan to catch up with the blockchain head. That'd allow the
    plugin to pre-fetch the next few blocks, effectively pipelining the sync.
  • getchaininfo can subsume the calls to getnetworkinfo and
    getblockchaininfo because they are mostly done on startup, and
    effectively contain the same information. A version number for the set of
    capabilities exposed by the backend plugin would still be in order to guard
    against using an outdated backend with a newer version that expects some
    things to be available.
  • getoutputbyscid is a combination of getblockhash, getblock
    (verbosity=1) and gettxout used when verifying a channel's outpoint
    from its announcement. This is one of the most costly calls we have since
    it threads through 3 RPC calls and is called for every channel we learn
    about (it's cached later, but on new nodes this is the heavy hitter). We
    now optimized this to use filtered blocks, but it's still in use by some
    parts of the machinery.
  • getfilteredblock is an alternative to getoutputbyscid issue. It gets
    the entire block, filters out transactions that are not interesting (don't
    have unspent outputs or don't have P2WSH outputs) and returns them to be
    processed. We would probably implement either this bulk interface or
    getoutputbyscid. The latter would allow plugins to implement something
    smart, but comes at the cost of more hook calls.

And finally we can make some things notifications:

  • fee could give a bulk update on all horizons we are interested in and
    push a fee update from the plugin to lightningd. That means we could also
    respond to things happening on the network that could affect fees (new
    block being found), or we could implement fee smoothing at the plugin level.
  • block and tx notifications could alert us about incoming blocks (header
    and height) or transactions we might be interested in. The latter would
    mean we need to tell the plugin about addresses and outputs we are
    interested in, but it'd be only an optimization anyway since we'll catch
    them during the regular processing of blocks.

Hook vs JSON-RPC passthrough

Most of the above methods could be implemented either as JSON-RPC methods
(with passthrough to the user RPC interface) or as hooks. JSON-RPC methods are
nice since they allow the user to query the same backend that lightningd
uses internally (facilitating debugging), however they may end up cluttering
the RPC interface with semi-related things. Hooks on the other hand are
internal only and with the planned chaining of hooks we can have multiple
plugins with different priorities providing transparent failover (something
that with JSON-RPC is not planned right now), the downside is that the
failover isn't implemented yet and the backend is not accessible for
debugging.

@cdecker cdecker changed the title I've been planning to work on a plugin-based bitcoin backend for a while but Bitcoin backend plugins planning Dec 18, 2019
@cdecker
Copy link
Member Author

cdecker commented Dec 18, 2019

Ping @darosior :-)

What do you think about making the interface as coarse-grained as possible?

@cdecker cdecker self-assigned this Dec 18, 2019
@cdecker
Copy link
Member Author

cdecker commented Dec 18, 2019

See #2796 for discussion of hook chaining.

@darosior
Copy link
Collaborator

darosior commented Dec 18, 2019

Thank you for your guidance :-)

I'd like to make the interface for backend a bit more coarse-grained, in order
to give the plugins more freedom as to how they accomplish their task.

Yes that's what I had in mind too.

getchaininfo can subsume the calls to getnetworkinfo and
getblockchaininfo

Hmm iirc we use getnetworkinfo just to get core version so that's an open question as it might be out of scope for plugin implementing a backend different from core (I have esplora in mind).

And finally we can make some things notifications

Like logs from plugin to lightningd ? It still seems weird to me, but we could even do everything this way (event-driven). Or maybe you already meant the notification would make us trigger the hook ?

Hook vs JSON-RPC passthrough

I "started" implementing it using a bitcoin_data hook (open to suggestions for the name). I think it's cleaner than using the jsonrpc interface: like we have hooks for this, let's use them ?

@ZmnSCPxj
Copy link
Collaborator

I would prefer JSON-RPC rather than hooks, since as noted it allows the user to also use the backend.

I would suggest using some kind of namespace, e.g. name them backend::getblockbyheight.

How does this interface inform C-lightning of the fact of a reorg?

I am unsure if you can make fee and block notifications, those are lightningd->plugin, and you want them to be plugin->lightningd, unless I misunderstand something.

@darosior
Copy link
Collaborator

I am unsure if you can make fee and block notifications, those are lightningd->plugin, and you want them to be plugin->lightningd, unless I misunderstand something

Notficiations from plugins to lightningd are how we plugin_log() today, but we'd need to implement some logic lightningd-side so that they are not log-only anymore. (That's why I at first proposed another subdaemon in Berlin ^^)

@ZmnSCPxj
Copy link
Collaborator

Or implemented as commands, so plugins "notify" lightningd by issuing a command.

@cdecker
Copy link
Member Author

cdecker commented Dec 19, 2019

Hmm iirc we use getnetworkinfo just to get core version so that's an open
question as it might be out of scope for plugin implementing a backend
different from core (I have esplora in mind).

You're right, we are unlikely to require a version check on the backend if we
split the functionality up into several plugins (one could do feerate
estimations, others could be good as a source for blocks, etc). All we really
need to ensure is that we have a hook or method registered for all the
individual methods we intend to call.

Like logs from plugin to lightningd ? It still seems weird to me, but we
could even do everything this way (event-driven). Or maybe you already meant
the notification would make us trigger the hook ?

Indeed the logs shipped from plugin to lightningd are formatted as a
JSON-RPC 2.0 notification, and they have the log method defined, so we can
add more notification types (with other method identifiers such as
block-notify or tx-notify) that plugins could inject into the message
stream to notify about events on the backend.

I would keep the notifications as lightweight as possible and still rely on
lightningd polling on timer and on notification so that a lazy backend can
skip the notifications and things still work fine (if we are polling a remote
service we can also let that be driven by lightningd polling), but optimized
plugins could hint that there is something interesting happening. It also
means that if we have multiple backend plugins we don't get flooded by huge
block notifications containing the blocks if a block is found but we get
hints, and then pick a single backend that we'd like to give us the full block
(reducing the cost if we poll some remote service as well, since we only
listen for metadata and fetch the full block only if lightningd picked us to
deliver it using the hook/method)

I "started" implementing it using a bitcoin_data hook (open to suggestions
for the name). I think it's cleaner than using the jsonrpc interface: like we
have hooks for this, let's use them ?

You multiplex all the calls I mentioned above into a since hook? That's
certainly a start, but by splitting it into smaller units we can pick and
choose plugins as we like. Again having one provide the fee estimations,
another one as a fast way to learn about block headers and a third one as a
provider for block data.

I would prefer JSON-RPC rather than hooks, since as noted it allows the user
to also use the backend.

Yeah, there's that advantage, but I think we will likely need to make a
case-by-case decision. For one I'm a bit scared to expose things like fee
setting to a user, just so plugins can update the fee. We've had a couple of
cases where users accidentally set their on-chain fee way too low, thinking
that they are manipulating the off-chain routing fee.

On the other hand it'd be ok if lightningd were polling or listening for fee
notifications since users can't interfere with that. So I think anything that
changes state internally should not be exposed as a JSON-RPC call that can be
called on lightningd itself, but rather be polled by lightningd on the
plugin or use notifications. Maybe the general pattern of the plugin notifying
that something interesting has happened and then triggering a poll in
lightningd is the best option.

I am unsure if you can make fee and block notifications, those are
lightningd->plugin, and you want them to be plugin->lightningd, unless I
misunderstand something.

The JSON-RPC spec doesn't specify that notifications can flow in only one
direction (I read the spec really carefully because I didn't believe it myself
back then), and so we already have notifications flowing in both directions:
log notifications flow from plugin to lightningd, whereas the ones
documented in the plugin doc flow from lightningd to the plugin. We can
definitely make use of plugins to notify lightningd about things happening
in the backend that might then trigger a method call or a hook call.

The notifications from plugins are handled here:

static void plugin_notification_handle(struct plugin *plugin,
const jsmntok_t *toks)
{
const jsmntok_t *methtok, *paramstok;
methtok = json_get_member(plugin->buffer, toks, "method");
paramstok = json_get_member(plugin->buffer, toks, "params");
if (!methtok || !paramstok) {
plugin_kill(plugin,
"Malformed JSON-RPC notification missing "
"\"method\" or \"params\": %.*s",
toks->end - toks->start,
plugin->buffer + toks->start);
return;
}
/* Dispatch incoming notifications. This is currently limited
* to just a few method types, should this ever become
* unwieldy we can switch to the AUTODATA construction to
* register notification handlers in a variety of places. */
if (json_tok_streq(plugin->buffer, methtok, "log")) {
plugin_log_handle(plugin, paramstok);
} else {
plugin_kill(plugin, "Unknown notification method %.*s",
json_tok_full_len(methtok),
json_tok_full(plugin->buffer, methtok));
}
}

All we'd need to do is to add more cases there, or we could generalize it and
use the existing autodata infrastructure to register notification handlers
that then get dispatched here. I didn't fully code it up back then because I
only had a single notification type 😉

@darosior
Copy link
Collaborator

darosior commented Dec 20, 2019

You multiplex all the calls I mentioned above into a since hook? That's
certainly a start, but by splitting it into smaller units we can pick and
choose plugins as we like. Again having one provide the fee estimations,
another one as a fast way to learn about block headers and a third one as a
provider for block data.

Yep, with content like method and params. Will change anyway as we seem to tend more toward JSONRPC methods instead of a hook :-)

I would prefer JSON-RPC rather than hooks, since as noted it allows the user
to also use the backend.

Yeah, there's that advantage, but I think we will likely need to make a
case-by-case decision. For one I'm a bit scared to expose things like fee
setting to a user

Hmm I see using JSONRPC methods subscription like having estimate[smart?]fees calls, an user wouldn't set it anyway.

--

Overall I think this can be separated in two parts:

  • First part
    • Make the bitcoin-cli interaction a plugin, almost done (btw after familiarizing with the code I now understand why lightning-cli renders JSON :p)
    • Make lightningd/bitcoind poll RPC instead of bitcoin-cli directly (getchaininfo, sendrawtransaction, getblockbyheight, getfilteredblock).
  • Second part
    • Make lightningd reacts to plugin notification
    • Use some of this ""reactions"" as events to trigger polling

@darosior
Copy link
Collaborator

darosior commented Dec 26, 2019

I think, as you mentioned, a versioning is necessary to be able to update our backend without breaking alternative backends. An explicit getversion command sent as the first request could serve both this purpose and the plugin ACKing it exposes all the commands necessary for normal functioning of this version.

@cdecker
Copy link
Member Author

cdecker commented Dec 28, 2019

Well we could also pass in the backend version in the manifest :-)

@darosior
Copy link
Collaborator

But this means adding a plugin-specific field to the getmanifest ? Also I'm planning to add to the getversion / setup / init / whatever call made before operations start the chain and a is_synced boolean. So that we poll until the plugin responds with:

{
    "chain": "main",
    "is_synced": true,
    "version": "1"
}

And this completely eliminates the need for the chaininfo call.

@cdecker
Copy link
Member Author

cdecker commented Dec 30, 2019

I still think that the version of the call should be identified by the call
itself. An rpcmethod being registered should have a version attached which
indicates the format of the arguments and the result. This can be done easily
since the rpcmethod object in the reply to getmanifest is an object and we
can add a semantic version to it, indicating what format the plugin
understands and what format the plugin returns.

We currently do not have this luxury for hooks, since they're just the string
name to which we subscribe, but it shouldn't be too hard to transition to
objects instead of strings, and then we can pass a bit more information about
what capabilities we have.

I like the semantic versioning here because it allows us to easily determine
whether lightningd and the plugin are compatible. I'm not suggesting we
implement multiple versions and call according to what the plugin expects, but
at least we can version things that are non-breaking and/or breaking (though a
simpler scheme might allow to do the same).

Having a per-call version allows us to pick-and-chose which plugin to use for
a specific call. For example we might have a really advanced fee-estimation
plugin, but with a terribly slow getblock implementation, or not implement
it at all.

This unbundling of calls is a rather nice thing imho. However, it introduces
additional complexity, and I understand if the general feeling is that it is
too much flexibility for little gain.

@darosior
Copy link
Collaborator

Ok, I didn't understood you were talking about a per-call versioning. This seems a good idea for running up to date plugins with non up to date lightningd.

Having a per-call version allows us to pick-and-chose which plugin to use for
a specific call. For example we might have a really advanced fee-estimation
plugin, but with a terribly slow getblock implementation, or not implement
it at all.

So lightningd would pick one when receiving the getmanifest response ? How would it differentiate the advanced fee estimation plugin from the other one ? As far as I understand this means we'd need startup options to hint which plugin to use for which call ? And we'd allow different plugins to register the same calls

@cdecker
Copy link
Member Author

cdecker commented Dec 31, 2019

I'm thinking more of the inverse, where a plugin implements only part of the functionality we require from the backend: have a plugin that handles getfees well and that exposes only that, while chain tracking is in another plugin, and a third plugin implements the sendtransaction method. I'd like to unbundle the plugins as much as possible, and allow plugins to implement only what they are good at, while making sure in aggregate all the necessary functions are implemented (maybe by having a low-priority fallback on bitcoind).

@ZmnSCPxj
Copy link
Collaborator

ZmnSCPxj commented Jan 4, 2020

The approach of unbundling seems fine to me. Complexity only matters when implementing and debugging. Though I suppose that is equivalent to saying "complexity always matters".

@darosior
Copy link
Collaborator

Just to give some updates, I'm still on this. Just that I noticed that to not lose asynchronicity, I'd have to make libplugin use ccan/io (need for io_loop) which turns out to be more time-consuming than expected..

@darosior
Copy link
Collaborator

darosior commented Mar 8, 2020

@cdecker, after #3570 what do you think should be done before we can document the API ?

@darosior
Copy link
Collaborator

Hmm maybe we could use perkw feerates while it's not too late and externalizes the pervbyte-to-perweight computation to the plugin.. Wdyt ?

@fiatjaf
Copy link
Contributor

fiatjaf commented Mar 11, 2020

It would be nice if the plugin could provide only the list of relevant transactions instead of the raw block, according to some query it would be given. Then I would be able to do the filtering on the machine my bitcoind is running and only send the relevant data to the machine where lightningd is running, it would be much faster and healthier for my internet connection (I would have to run a custom listener on the bitcoind machine that would interface with the plugin, but that's fine).

Failing that, a list of all transactions in JSON instead of the raw block would be good so I would be able to use a ton of different block explorers on trustedcoin and similar plugins (because currently only blockchain.info provides the full block and the alternatives @cdecker gave me didn't work).

@darosior
Copy link
Collaborator

darosior commented Mar 11, 2020

(because currently only blockchain.info provides the full block and the alternatives @cdecker gave me didn't work).

Yep I had to use blockchain.info for the esplora plugin.. Cf Blockstream/esplora#171

It would be nice if the plugin could provide only the list of relevant transactions instead of the raw block, according to some query it would be given.

That's what was intended in the first place, and I gave it some thoughts since then.

We need blocks to retrieve all likely-to-be-lightning-related transactions. We now use get_filtered_block, which parses raw blocks, in place of get_output_by_scid as it's more efficient. We have two choices if we want less bandwidth load between the Bitcoin plugin serving blocks and lightningd :

  • Implement getfilteredblock plugin-side (I tried that at first but there were some complications that I can't remember of, maybe I'll retry soon)
  • Implement getoutputbyscid plugin-side so that plugins can do something smart. That's simpler and really great for plugins using trusted source such as yours, mine, or for example one hitting Electrum servers etc..
    However doing so would be a big efficiency loss for the default lightningd running against bcli !
    The (crazy) thing I thought of is to poll bitcoind and store filetered raw blocks bcli-side like in a database and be able to immediately serve outputs. But I think it's overkill.

@cdecker @ZmnSCPxj do you have an idea for this ?

EDIT: Also on a more meta standpoint I don't think we should adapt our backend because of centralized Bitcoin APIs not providing a fundamental Bitcoin component.
But I do think we should optimize the bandwidth between the plugin and lightningd though :-)

@cdecker
Copy link
Member Author

cdecker commented Mar 25, 2020

Seems like we got the discussion about the intrinsic trade-off between bulk and micro interfaces. The filteredblock interface was mainly designed for the backfilling of our utxo set (block missing when verifying a channel announcement, so we fetch all interesting outputs instead). Since we may have non-p2wsh outputs of interest for us, as deposit or withdrawal change for example, we can't fully rely on filteredblock to stay in sync with the blockchain.

I like @darosior's idea of moving some of the caching and speed optimizations to the backend Plugin, and keep lightningd as agnostic as possible from the backend implementation. That'd allow the plugin to decide whether to use a bulk interface (process full block when asked for a single output to amortize the cost) or a micro interface (got asked for one output, just gonna fetch one output). This makes sense since the plugin is in a better position to make an informed decision as to what works best, but it also means additional work and complexity for the plugin (maintaining its own DB etc).

Bandwidth between lightningd and the plugin is cheap (existing connection, local roundtrip), communication between plugin and blockchain may be expensive (spawn bitcoin-cli to talk to bitcoind, HTTP request to the other side of the world to contact a block explorer or electrum server, etc)

@darosior
Copy link
Collaborator

darosior commented Mar 30, 2020

Ok then, the status quo seems better. Think I'll PR the documentation for the next release !

In a v2 of the Bitcoin backend, it would be great to take advantage of the performance Electrum-backed systems such as Esplora can bring, without breaking compat with regular bitcoind..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants