The final chapter in the plugin saga (hooks) #2237

cdecker · 2019-01-09T20:02:01Z

This PR implements the hooks infrastructure for plugins and adds the htlc_accepted hook as a sample of how to implement hooks and how they can be used.

The REGISTER_PLUGIN_HOOK macro is the glue that binds everything together, defining the full flow of call -> serialize_request -> send_request -> deserialize_response -> callback.

This is not yet complete, missing one commit to actually react to the response to the htlc_accepted hook, and I'm planning to add another commit that tests the various hook return values.

I'm adding this now since I'd like to get the reviews going.

cdecker · 2019-01-09T21:18:54Z

Notice that in htlc_accepted_hook_deserialize all parsing failures are fatal. I'm not exactly sure if this is what we want, but I expect that we can't really recover from an error in this case, since going back to default behavior might not be what we want (if the plugin hook checks some policies, and would reject it, the default action might be to accept and thus to violate the policy). Therefore I made it as pedantic as possible in its current form.

Another nit in this case is that at this point we have not committed to the DB yet, which means that crashing will cause the DB transaction to rollback, potentially dropping the commitment we just got. This could however be solved by a variant of fatal that commits in case we have a DB transaction open.
Edit: as mentioned below, calling the plugin actually commits the DB transactions, hence this cannot end up rolling back.

What do you guys think @rustyrussell and @niftynei?

niftynei

Nice stuff. I know you've got a few more commits outstanding but wanted to get these comments to you.

Re: fatal... the biggest downside is how long it takes to fail, imo. if this was something we could check at startup, no big deal, but as implemented you don't find out there's something wrong with how you respond to htlc's until you've already got one inflight. what other options did you consider?

pardon my ignorance here, but can you remind me what the scope of fatal is again? Is it the whole of c-lightning? The process that it's running in?

niftynei · 2019-01-09T23:34:29Z

contrib/pylightning/lightning/plugin.py

+        """
+        if name in self.methods:
+            raise ValueError(
+                "A hook for method {} was already registered".format(name, self.methods[name])


your error message says a hook for method... but it seems like it could also collide on a rpcmethod, no? maybe better to say Method {} was already registered? unless you're using a canonical list of hook names?

niftynei · 2019-01-09T23:40:45Z

lightningd/plugin_hook.h

+ *
+ * - `response_cb` is called once the plugin has responded and the
+ *   response has been parsed by `deserialize_response`. In addition
+ *   an arrbitrary additional argument of type `cb_arg_type` can be


Suggested change

* an arrbitrary additional argument of type `cb_arg_type` can be

* an arbitrary additional argument of type `cb_arg_type` can be

Isn't it talk like a pirate day? My bad 😉

niftynei · 2019-01-09T23:45:32Z

lightningd/plugin_hook.h

+#define PLUGIN_HOOK_CALL_DEF(name, payload_type, response_cb_arg_type)         \
+	void plugin_hook_call_##name(struct lightningd *ld,                    \
+				     payload_type payload,                     \
+				     response_cb_arg_type cb_arg);             \


is this duplication intentional? 76-78 looks identical to 79-81

Yes, like mentioned above this one is the declaration of the function (no function body) and the lower one is the function definition (which contains the body). It's a bit of an antipattern since the definition should be in the matching .c file, but since we'd like to define this in a macro it has to be in the same place, and since we may want to call this from somewhere other than the module defining it (hooks should be defined together with the data they act on, and called from the entrypoints of the hooks), this is the best I could come up with.

Maybe @rustyrussell has a better option? An alternative that I haven't tried yet is to use a #define instead of this tiny function, avoiding all the linker issues altogether.

This Just Worked for me?

diff --git a/lightningd/plugin_hook.h b/lightningd/plugin_hook.h index 48dffe23..74be1e74 100644 --- a/lightningd/plugin_hook.h +++ b/lightningd/plugin_hook.h @@ -71,12 +71,8 @@ void plugin_hook_call_(struct lightningd *ld, const struct plugin_hook *hook, * *`. Not really necessary, but nice since it also makes sure that * the method-name is correct for the call. */ -/* FIXME: Find a way to avoid back-to-back declaration and definition */ #define PLUGIN_HOOK_CALL_DEF(name, payload_type, response_cb_arg_type) \ - void plugin_hook_call_##name(struct lightningd *ld, \ - payload_type payload, \ - response_cb_arg_type cb_arg); \ - void plugin_hook_call_##name(struct lightningd *ld, \ + static inline void plugin_hook_call_##name(struct lightningd *ld, \ payload_type payload, \ response_cb_arg_type cb_arg) \ { \

Interesting, I thought static inline would only be callable in the same compilation unit, but you're right, since we are #includeing the file, this probably means the client header as well. Will fixup 👍

niftynei · 2019-01-10T00:31:19Z

lightningd/plugin_hook.c

+		/* No such hook name registered */
+		return false;
+	} else if (hook->plugin != NULL) {
+		/* Another plugin already registered for this name */


This seems like it could get real crowded really quick. I'm sure it's fine for v1 but it's probably worth noting in the docs somewhere (if not already) about how collisions fail

Yeah, that might become a problem once people start using a myriad of different plugins, but I see no clean to fan-out to multiple plugins (serial vs parallel), and more importantly how to do fan-in, i.e., given two or more results which one to pick (serial call, and first non-default is picked vs. some sort of priority vs. ...).

Further complicating this is that hooks might have different ways of handling this, i.e., a hook that is just used to synchronously save information somewhere is trivially safe to use in parallel and have multiple ones, while a hook that actually modifies the continuation is not.

So the solution I'm eyeing currently is that we just punt this sort of multiplexing to a multiplexer plugin, that will then run the child-plugins instead of registering them with lightningd directly, and that would then know how to handle things.

We should probably start out like this and if there is a need to, we can start moving this kind of functionality into the plugin_hooks as necessary (I personally don't think there will be a huge competition for hooks, since they cater really specific needs, and most users will be happy with the default implemented in lightningd anyway).

Right I think this makes sense at the outset. It seems worth noting somewhere doc wise to cut down on potential for confusion later on.

niftynei · 2019-01-10T00:34:31Z

lightningd/plugin.c

@@ -1013,7 +983,7 @@ static void plugin_config(struct plugin *plugin)
 	json_object_end(req->stream);

 	jsonrpc_request_end(req);
-	plugin_send(plugin, req->stream);
+	plugin_request_send(plugin, req);


should these req's be annotated with takes(req)?

Suggested change

plugin_request_send(plugin, req);

plugin_request_send(plugin, take(req));

See @rustyrussell's comment below, I seem to be missing some detail about the TAKES annotation, but I'd generally agree that that's what I needed.

niftynei · 2019-01-10T00:39:36Z

lightningd/plugin_hook.c

+		plugin_request_send(hook->plugin, req);
+	} else {
+		/* If no plugin has registered for this hook, just
+		 * call the callback with a NULL result. Safes us the


Suggested change

* call the callback with a NULL result. Safes us the

* call the callback with a NULL result. Saves us the

niftynei · 2019-01-10T03:10:01Z

lightningd/peer_htlcs.c

+			fatal("Plugin specified an invalid 'payment_key': %s",
+			      json_tok_full(buffer, resulttok));
+	} else {
+		fatal("Plugin responded with an unknowk result to the "


Suggested change

fatal("Plugin responded with an unknowk result to the "

fatal("Plugin responded with an unknown result to the "

rustyrussell

Minor changes, but I just don't think this is ready for 0.6.3 given our haste in release. I believe full plugin support is what justifies a 0.7 release, and this should be part of that process...

rustyrussell · 2019-01-10T04:33:48Z

lightningd/plugin.c

+			 struct jsonrpc_request *req TAKES)
+{
+	/* Add to map so we can find it later when routing the response */
+	tal_steal(plugin, req);


This leaks take() if caller does it as implied by the TAKES above. You can just document that this function consumed req, and remove the TAKES annotation?

Hm, seems I'm missing the nuances of TAKES here, I thought the annotation means that it "takes ownership" and that the caller should not assume that the argument is accessible after the call anymore.

It means "IF you call with take(x), it will take ownership".

Maybe we need an annotation for "consumes"?

rustyrussell · 2019-01-10T04:39:04Z

lightningd/peer_htlcs.c


+	/* Falling through here is ok, after all the HTLC locked */
 	*failcode = 0;


This needs careful design. For example, I'm pretty sure we want to separate hooks for terminating onions and non-terminating, since most hooks will want one or the other. For terminating we'll also want a hook after we've entered it into the db, I think, and maybe an API to query the db on restart.

In short, someone needs to use the APIs in anger before they can be finalized.

I don't think there is much difference, immediately failing the HTLC after it was accepted (calling local_fail_htlc -> fail_in_htlc) or passing through the hook and then calling fail_in_htlc from the hook callback doesn't really do much. Things would be different if we could prevent an HTLC being committted to in the first place (a real immediate failure, instead of commit-then-fail), but as it stands the only difference is that the two updates to the HTLC (committed then failed) are no longer in the same DB transaction.

As for terminating vs non-terminating onions (i.e., I'm the intended recipient vs just an intermediate hop), I did not differentiate on purpose in order to allow things like short-circuiting a payment, i.e., I know the payment_key because I'm in an ensemble of load-balancing nodes so I can claim the funds directly.

We need a real user of this htlc_accepted API; I'm not smart enough to design it without a concrete one :(

Consider a just-in-time invoicing system. The current hook is in the wrong place: any of the checks in handle_localpay could still fail.

And I can't see how the plugin's db can stay in sync with lightningd's: a restart or crash can lead to one committed and one not. For example, after plugin commits to its db, and says continue, we crash or simply shutdown. Now htlcs_reconnect() fails the HTLC on restart.

Adding an external dependency like this is going to be painstaking work; I think we need to pick the smallest subset so we can make mistakes with limited consequences.

IF you were interested in starting with a simpler case with potentially less downstream implications, I could imagine various hooks for when channels change state. For instance, a simple hook for when the funding transaction is confirmed, or if a channel is asked to closed.

On the plugin side, something simple like an email/sms alert when a channel closes could be nice. Perhaps this meets the balance between some utility and less consequential.

This is not meant to interfere with the built-in invoicing system, it is rather a complement to it, and is meant to be used at a much lower level. With this you can pretty much create an external invoicing system, that'll get queried for open invoices.

Consider a just-in-time invoicing system. The current hook is in the wrong place: any of the checks in handle_localpay could still fail.

I added a small test that short-circuits a payment going through l1 -> l2 -> l3 at the middle node l2 and just claims the HTLC, leaving the invoice on l3 unpaid. Notice that we don't use handle_localpay or forward_htlc at all if we get told to fail or resolve.

And I can't see how the plugin's db can stay in sync with lightningd's: a restart or crash can lead to one committed and one not. For example, after plugin commits to its db, and says continue, we crash or simply shutdown. Now htlcs_reconnect() fails the HTLC on restart.

As for the problems that may emerge from htlcs_reconnect, notice that I avoided having a forward-to-this channel call on purpose, so that we cannot end up with an outgoing HTLC that cannot be associated with an incoming one. The incoming HTLC on the other hand immediately fails, or gets told the preimage if the plugin returns, hence we end up always in a case that we already handle correctly.

I think as it stands this is the simplest hook I can come up with: we get an HTLC and decide whether to continue, fail or resolve. Continue is the old, and proven, logic. Fail is about as simple as it gets, and resolve is just resolving an HTLC with a payment_key that did not come from an invoice, but somewhere else 😉

After discussing this in person with @rustyrussell I now understand the issue a bit better. We may end up imposing that the hook calls must be idempotent so that we can call them again in case we have an HTLC that was accepted and wasn't resolved or failed or forwarded.

rustyrussell · 2019-01-10T04:41:04Z

common/json.c

+{
+	size_t hexlen = tok->end - tok->start;
+
+	if (hexlen != sizeof(preimage->r))


Pretty sure hex_decode already does this check?

👍 will drop duplicate check :-)

cdecker · 2019-01-10T11:21:45Z

Another nit in this case is that at this point we have not committed to the DB yet, which means that crashing will cause the DB transaction to rollback, potentially dropping the commitment we just got. This could however be solved by a variant of fatal that commits in case we have a DB transaction open.

My mistake, the last part is not an issue, since calling out to the plugin will result in the DB transaction that triggered the event to be committed. So rather than us forgetting stuff, this means that actions spanning across a plugin hook are no longer in the same DB transaction, which makes them not atomic anymore. Shouldn't be an issue for now since we never rollback a DB transaction currently and only failures can result in aborted transactions. Just something to keep in mind when working with hooks in the future.

cdecker · 2019-01-10T11:28:38Z

Re: fatal... the biggest downside is how long it takes to fail, imo. if this was something we could check at startup, no big deal, but as implemented you don't find out there's something wrong with how you respond to htlc's until you've already got one inflight. what other options did you consider?

There's not much we can do imho, since it is only when we are actually parsing a reply that we can check it's return value to the call. The good thing is that we can localize parse errors in the deserializer of the hook, hence decide on a per-hook level whether this it should be fatal or not, and there are probably hooks in which continue is acceptable if the plugin misbehaves.

pardon my ignorance here, but can you remind me what the scope of fatal is again? Is it the whole of c-lightning? The process that it's running in?

It exits the process it is running in, so in this case lightningd which is equivalent to killing the entire c-lightning. In subdaemons it'll exit the subdaemon only, but if it's a required daemon (gossipd, hsmd, ...) it'll cause lightningd to exit as well.

Minor changes, but I just don't think this is ready for 0.6.3 given our haste in release. I believe full plugin support is what justifies a 0.7 release, and this should be part of that process...

Agreed. In that case I can take a bit more care and not rush this out the door just to make the cut. If you're ok with it I'd like to disable the entire plugin subsystem for 0.6.3 (put it behind the experimental flag) in order not to ship incomplete functionality.

I'll address the remainder of the comments inline and add fixups as required 😉

conscott · 2019-01-11T09:51:46Z

lightningd/peer_htlcs.c

@@ -590,16 +592,194 @@ static void channel_resolve_reply(struct subd *gossip, const u8 *msg,
 	tal_free(gr);
 }

+/**


Nit : Perhaps all the hook structures / functions can live in another file? It seems there are many options for future hooks and I can imagine them bloating some files that are already quite large, so it might set the precedent to keep those in their own space.

I considered something similar, but most of the logic is actually either taken from the peer_htlcs.c file (in the callback) or wraps and unwraps data that may not be available elsewhere, which is why I ended up putting it here. I'd prefer to keep the logic at least in the same place.

cdecker · 2019-01-14T20:57:44Z

Since the htlc_accepted hook proves to be a bit more controversial than I thought (and rightfully so), I split the PR into two branches, with this one retaining only the infrastructure changes to enable hooks, and moving the htlc_accepted hook into #2267.

Since this part was more or less accepted, this should be much easier to review. I just added a single fixup to address the unused-function issue for our dummy hook.

Turns out that checking boundaries is important... Signed-off-by: Christian Decker <decker.christian@gmail.com>

Instead of creating a new map I opted to re-use the Plugin.methods map, since the semantics are really similar and we don't allow duplicates. The only difference is in how they are announced to lightningd, so we use an enum to differentiate rpcmethods from hooks, since only the former will get added to the JSON-RPC dispatch table in lightningd. Signed-off-by: Christian Decker <decker.christian@gmail.com>

Also fixes a crash if the doc was empty.

I might have gone a bit overboard with the type-checking, but typesafe_cb_cast is quite nice to use, so why not. The macro to register a new hook encapsulates the entire flow from param serialization, to dispatch, parsing and callback dispatch in one bundle. I was tempted to have the callback outside of the registration, but it's unlikely that we'll have two calls to the same hook with different callbacks. Signed-off-by: Christian Decker <decker.christian@gmail.com>

None of the existing callbacks was making use of it and we will be exposing the method callback interface to outside compilation unit where the struct definition is not visible. So just remove it. Signed-off-by: Christian Decker <decker.christian@gmail.com>

There is very little that is plugin specific in the jsonrpc_request so this just extracts the common parts so we can reuse them outside of the plugin compilation unit as well.

This is the first use of the `hooks` autodata field, and it required a dummy element in order for the section not to be dropped, it'll be removed once we have actual hooks.

plugin_request_new did nothing special aside from registering the request ID with the dispatch code. This duty has now been moved to plugin_request_send instead, which is also exposed so we can use that code in plugin_hook. Signed-off-by: Christian Decker <decker.christian@gmail.com>

This ties all the things together, using the serializer to transform the payload into a valid `jsonrpc_request`, sending it to the plugin, and then using the deserializer on the way back before calling the hook callback with the appropriate information. Notice that the serializer and deserializer is skipped if we don't have a plugin that registered for this hook. Signed-off-by: Christian Decker <decker.christian@gmail.com>

We therefore keep a reference to the DB and will wrap and unwrap when a hook returns. Notice that this might cause behavior changes when moving logic into a hook callback, since the continuation runs in a different transaction than the event that triggered the hook in the first place. Should not matter too much, since we don't use DB rollbacks at the moment, but it's something to keep in mind. Signed-off-by: Christian Decker <decker.christian@gmail.com>

Signed-off-by: Christian Decker <decker.christian@gmail.com>

rustyrussell · 2019-01-17T05:42:40Z

Ack fdc66cf

cdecker added plugin pylightning labels Jan 9, 2019

cdecker added this to the v0.6.3 milestone Jan 9, 2019

cdecker self-assigned this Jan 9, 2019

cdecker requested review from rustyrussell and niftynei January 9, 2019 20:02

cdecker mentioned this pull request Jan 9, 2019

Channel state redundancy for production backends #2225

Closed

cdecker force-pushed the plugin-8 branch from d4654b8 to 7141b4b Compare January 9, 2019 22:20

niftynei reviewed Jan 10, 2019

View reviewed changes

rustyrussell reviewed Jan 10, 2019

View reviewed changes

niftynei approved these changes Jan 10, 2019

View reviewed changes

rustyrussell modified the milestones: v0.6.3, v0.6.4 Jan 10, 2019

conscott reviewed Jan 11, 2019

View reviewed changes

cdecker force-pushed the plugin-8 branch from 055b251 to dbe0237 Compare January 14, 2019 20:02

cdecker changed the title ~~WIP: The final chapter in the plugin saga (hooks)~~ The final chapter in the plugin saga (hooks) Jan 14, 2019

cdecker mentioned this pull request Jan 14, 2019

plugin: Implement the htlc_accepted hook #2267

Merged

cdecker force-pushed the plugin-8 branch from 6335d35 to ed445e8 Compare January 15, 2019 10:43

cdecker added 8 commits January 15, 2019 12:11

notifications: Fix iteration over notification topics

f9974d1

Turns out that checking boundaries is important... Signed-off-by: Christian Decker <decker.christian@gmail.com>

pylightning: Always assign the init method in plugin

e32e490

Also fixes a crash if the doc was empty.

jsonrpc: Generalize plugin_request to jsonrpc_request

82569dc

There is very little that is plugin specific in the jsonrpc_request so this just extracts the common parts so we can reuse them outside of the plugin compilation unit as well.

plugin: Add hooks that a plugin might register

95d112b

This is the first use of the `hooks` autodata field, and it required a dummy element in order for the section not to be dropped, it'll be removed once we have actual hooks.

cdecker added 2 commits January 15, 2019 12:46

cdecker force-pushed the plugin-8 branch from ed445e8 to 3a0c2e3 Compare January 15, 2019 11:46

json: Add a helper to decode a hex-encoded value from JSON

fdc66cf

Signed-off-by: Christian Decker <decker.christian@gmail.com>

cdecker force-pushed the plugin-8 branch from 3a0c2e3 to fdc66cf Compare January 15, 2019 12:44

cdecker mentioned this pull request Jan 16, 2019

Add default init method for pylightning plugins #2271

Closed

rustyrussell merged commit ed356da into ElementsProject:master Jan 17, 2019

	* an arrbitrary additional argument of type `cb_arg_type` can be
	* an arbitrary additional argument of type `cb_arg_type` can be

	plugin_request_send(plugin, req);
	plugin_request_send(plugin, take(req));

	* call the callback with a NULL result. Safes us the
	* call the callback with a NULL result. Saves us the

	fatal("Plugin responded with an unknowk result to the "
	fatal("Plugin responded with an unknown result to the "


		/* Falling through here is ok, after all the HTLC locked */
		*failcode = 0;

The final chapter in the plugin saga (hooks) #2237

The final chapter in the plugin saga (hooks) #2237

Conversation

cdecker commented Jan 9, 2019

cdecker commented Jan 9, 2019 • edited

niftynei left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rustyrussell left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdecker commented Jan 10, 2019

cdecker commented Jan 10, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdecker commented Jan 14, 2019

rustyrussell commented Jan 17, 2019

cdecker commented Jan 9, 2019 •

edited

niftynei left a comment •

edited