Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The final chapter in the plugin saga (hooks) #2237

Merged
merged 11 commits into from Jan 17, 2019

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented Jan 9, 2019

This PR implements the hooks infrastructure for plugins and adds the htlc_accepted hook as a sample of how to implement hooks and how they can be used.

The REGISTER_PLUGIN_HOOK macro is the glue that binds everything together, defining the full flow of call -> serialize_request -> send_request -> deserialize_response -> callback.

This is not yet complete, missing one commit to actually react to the response to the htlc_accepted hook, and I'm planning to add another commit that tests the various hook return values.

I'm adding this now since I'd like to get the reviews going.

@cdecker cdecker added this to the v0.6.3 milestone Jan 9, 2019
@cdecker cdecker self-assigned this Jan 9, 2019
@cdecker
Copy link
Member Author

cdecker commented Jan 9, 2019

Notice that in htlc_accepted_hook_deserialize all parsing failures are fatal. I'm not exactly sure if this is what we want, but I expect that we can't really recover from an error in this case, since going back to default behavior might not be what we want (if the plugin hook checks some policies, and would reject it, the default action might be to accept and thus to violate the policy). Therefore I made it as pedantic as possible in its current form.

Another nit in this case is that at this point we have not committed to the DB yet, which means that crashing will cause the DB transaction to rollback, potentially dropping the commitment we just got. This could however be solved by a variant of fatal that commits in case we have a DB transaction open.
Edit: as mentioned below, calling the plugin actually commits the DB transactions, hence this cannot end up rolling back.

What do you guys think @rustyrussell and @niftynei?

Copy link
Collaborator

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice stuff. I know you've got a few more commits outstanding but wanted to get these comments to you.

Re: fatal... the biggest downside is how long it takes to fail, imo. if this was something we could check at startup, no big deal, but as implemented you don't find out there's something wrong with how you respond to htlc's until you've already got one inflight. what other options did you consider?

pardon my ignorance here, but can you remind me what the scope of fatal is again? Is it the whole of c-lightning? The process that it's running in?

"""
if name in self.methods:
raise ValueError(
"A hook for method {} was already registered".format(name, self.methods[name])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your error message says a hook for method... but it seems like it could also collide on a rpcmethod, no? maybe better to say Method {} was already registered? unless you're using a canonical list of hook names?

*
* - `response_cb` is called once the plugin has responded and the
* response has been parsed by `deserialize_response`. In addition
* an arrbitrary additional argument of type `cb_arg_type` can be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* an arrbitrary additional argument of type `cb_arg_type` can be
* an arbitrary additional argument of type `cb_arg_type` can be

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it talk like a pirate day? My bad 😉

#define PLUGIN_HOOK_CALL_DEF(name, payload_type, response_cb_arg_type) \
void plugin_hook_call_##name(struct lightningd *ld, \
payload_type payload, \
response_cb_arg_type cb_arg); \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this duplication intentional? 76-78 looks identical to 79-81

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, like mentioned above this one is the declaration of the function (no function body) and the lower one is the function definition (which contains the body). It's a bit of an antipattern since the definition should be in the matching .c file, but since we'd like to define this in a macro it has to be in the same place, and since we may want to call this from somewhere other than the module defining it (hooks should be defined together with the data they act on, and called from the entrypoints of the hooks), this is the best I could come up with.

Maybe @rustyrussell has a better option? An alternative that I haven't tried yet is to use a #define instead of this tiny function, avoiding all the linker issues altogether.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Just Worked for me?

diff --git a/lightningd/plugin_hook.h b/lightningd/plugin_hook.h
index 48dffe23..74be1e74 100644
--- a/lightningd/plugin_hook.h
+++ b/lightningd/plugin_hook.h
@@ -71,12 +71,8 @@ void plugin_hook_call_(struct lightningd *ld, const struct plugin_hook *hook,
  * *`. Not really necessary, but nice since it also makes sure that
  * the method-name is correct for the call.
  */
-/* FIXME: Find a way to avoid back-to-back declaration and definition */
 #define PLUGIN_HOOK_CALL_DEF(name, payload_type, response_cb_arg_type)         \
-       void plugin_hook_call_##name(struct lightningd *ld,                    \
-                                    payload_type payload,                     \
-                                    response_cb_arg_type cb_arg);             \
-       void plugin_hook_call_##name(struct lightningd *ld,                    \
+       static inline void plugin_hook_call_##name(struct lightningd *ld,      \
                                     payload_type payload,                     \
                                     response_cb_arg_type cb_arg)              \
        {                                                                      \

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I thought static inline would only be callable in the same compilation unit, but you're right, since we are #includeing the file, this probably means the client header as well. Will fixup 👍

/* No such hook name registered */
return false;
} else if (hook->plugin != NULL) {
/* Another plugin already registered for this name */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it could get real crowded really quick. I'm sure it's fine for v1 but it's probably worth noting in the docs somewhere (if not already) about how collisions fail

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that might become a problem once people start using a myriad of different plugins, but I see no clean to fan-out to multiple plugins (serial vs parallel), and more importantly how to do fan-in, i.e., given two or more results which one to pick (serial call, and first non-default is picked vs. some sort of priority vs. ...).

Further complicating this is that hooks might have different ways of handling this, i.e., a hook that is just used to synchronously save information somewhere is trivially safe to use in parallel and have multiple ones, while a hook that actually modifies the continuation is not.

So the solution I'm eyeing currently is that we just punt this sort of multiplexing to a multiplexer plugin, that will then run the child-plugins instead of registering them with lightningd directly, and that would then know how to handle things.

We should probably start out like this and if there is a need to, we can start moving this kind of functionality into the plugin_hooks as necessary (I personally don't think there will be a huge competition for hooks, since they cater really specific needs, and most users will be happy with the default implemented in lightningd anyway).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think this makes sense at the outset. It seems worth noting somewhere doc wise to cut down on potential for confusion later on.

@@ -1013,7 +983,7 @@ static void plugin_config(struct plugin *plugin)
json_object_end(req->stream);

jsonrpc_request_end(req);
plugin_send(plugin, req->stream);
plugin_request_send(plugin, req);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these req's be annotated with takes(req)?

Suggested change
plugin_request_send(plugin, req);
plugin_request_send(plugin, take(req));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See @rustyrussell's comment below, I seem to be missing some detail about the TAKES annotation, but I'd generally agree that that's what I needed.

plugin_request_send(hook->plugin, req);
} else {
/* If no plugin has registered for this hook, just
* call the callback with a NULL result. Safes us the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* call the callback with a NULL result. Safes us the
* call the callback with a NULL result. Saves us the

fatal("Plugin specified an invalid 'payment_key': %s",
json_tok_full(buffer, resulttok));
} else {
fatal("Plugin responded with an unknowk result to the "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fatal("Plugin responded with an unknowk result to the "
fatal("Plugin responded with an unknown result to the "

Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes, but I just don't think this is ready for 0.6.3 given our haste in release. I believe full plugin support is what justifies a 0.7 release, and this should be part of that process...

struct jsonrpc_request *req TAKES)
{
/* Add to map so we can find it later when routing the response */
tal_steal(plugin, req);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaks take() if caller does it as implied by the TAKES above. You can just document that this function consumed req, and remove the TAKES annotation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, seems I'm missing the nuances of TAKES here, I thought the annotation means that it "takes ownership" and that the caller should not assume that the argument is accessible after the call anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means "IF you call with take(x), it will take ownership".

Maybe we need an annotation for "consumes"?


/* Falling through here is ok, after all the HTLC locked */
*failcode = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs careful design. For example, I'm pretty sure we want to separate hooks for terminating onions and non-terminating, since most hooks will want one or the other. For terminating we'll also want a hook after we've entered it into the db, I think, and maybe an API to query the db on restart.

In short, someone needs to use the APIs in anger before they can be finalized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is much difference, immediately failing the HTLC after it was accepted (calling local_fail_htlc -> fail_in_htlc) or passing through the hook and then calling fail_in_htlc from the hook callback doesn't really do much. Things would be different if we could prevent an HTLC being committted to in the first place (a real immediate failure, instead of commit-then-fail), but as it stands the only difference is that the two updates to the HTLC (committed then failed) are no longer in the same DB transaction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for terminating vs non-terminating onions (i.e., I'm the intended recipient vs just an intermediate hop), I did not differentiate on purpose in order to allow things like short-circuiting a payment, i.e., I know the payment_key because I'm in an ensemble of load-balancing nodes so I can claim the funds directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a real user of this htlc_accepted API; I'm not smart enough to design it without a concrete one :(

Consider a just-in-time invoicing system. The current hook is in the wrong place: any of the checks in handle_localpay could still fail.

And I can't see how the plugin's db can stay in sync with lightningd's: a restart or crash can lead to one committed and one not. For example, after plugin commits to its db, and says continue, we crash or simply shutdown. Now htlcs_reconnect() fails the HTLC on restart.

Adding an external dependency like this is going to be painstaking work; I think we need to pick the smallest subset so we can make mistakes with limited consequences.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IF you were interested in starting with a simpler case with potentially less downstream implications, I could imagine various hooks for when channels change state. For instance, a simple hook for when the funding transaction is confirmed, or if a channel is asked to closed.

On the plugin side, something simple like an email/sms alert when a channel closes could be nice. Perhaps this meets the balance between some utility and less consequential.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not meant to interfere with the built-in invoicing system, it is rather a complement to it, and is meant to be used at a much lower level. With this you can pretty much create an external invoicing system, that'll get queried for open invoices.

Consider a just-in-time invoicing system. The current hook is in the wrong place: any of the checks in handle_localpay could still fail.

I added a small test that short-circuits a payment going through l1 -> l2 -> l3 at the middle node l2 and just claims the HTLC, leaving the invoice on l3 unpaid. Notice that we don't use handle_localpay or forward_htlc at all if we get told to fail or resolve.

And I can't see how the plugin's db can stay in sync with lightningd's: a restart or crash can lead to one committed and one not. For example, after plugin commits to its db, and says continue, we crash or simply shutdown. Now htlcs_reconnect() fails the HTLC on restart.

As for the problems that may emerge from htlcs_reconnect, notice that I avoided having a forward-to-this channel call on purpose, so that we cannot end up with an outgoing HTLC that cannot be associated with an incoming one. The incoming HTLC on the other hand immediately fails, or gets told the preimage if the plugin returns, hence we end up always in a case that we already handle correctly.

I think as it stands this is the simplest hook I can come up with: we get an HTLC and decide whether to continue, fail or resolve. Continue is the old, and proven, logic. Fail is about as simple as it gets, and resolve is just resolving an HTLC with a payment_key that did not come from an invoice, but somewhere else 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this in person with @rustyrussell I now understand the issue a bit better. We may end up imposing that the hook calls must be idempotent so that we can call them again in case we have an HTLC that was accepted and wasn't resolved or failed or forwarded.

common/json.c Outdated
{
size_t hexlen = tok->end - tok->start;

if (hexlen != sizeof(preimage->r))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure hex_decode already does this check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 will drop duplicate check :-)

@cdecker
Copy link
Member Author

cdecker commented Jan 10, 2019

Another nit in this case is that at this point we have not committed to the DB yet, which means that crashing will cause the DB transaction to rollback, potentially dropping the commitment we just got. This could however be solved by a variant of fatal that commits in case we have a DB transaction open.

My mistake, the last part is not an issue, since calling out to the plugin will result in the DB transaction that triggered the event to be committed. So rather than us forgetting stuff, this means that actions spanning across a plugin hook are no longer in the same DB transaction, which makes them not atomic anymore. Shouldn't be an issue for now since we never rollback a DB transaction currently and only failures can result in aborted transactions. Just something to keep in mind when working with hooks in the future.

@cdecker
Copy link
Member Author

cdecker commented Jan 10, 2019

Re: fatal... the biggest downside is how long it takes to fail, imo. if this was something we could check at startup, no big deal, but as implemented you don't find out there's something wrong with how you respond to htlc's until you've already got one inflight. what other options did you consider?

There's not much we can do imho, since it is only when we are actually parsing a reply that we can check it's return value to the call. The good thing is that we can localize parse errors in the deserializer of the hook, hence decide on a per-hook level whether this it should be fatal or not, and there are probably hooks in which continue is acceptable if the plugin misbehaves.

pardon my ignorance here, but can you remind me what the scope of fatal is again? Is it the whole of c-lightning? The process that it's running in?

It exits the process it is running in, so in this case lightningd which is equivalent to killing the entire c-lightning. In subdaemons it'll exit the subdaemon only, but if it's a required daemon (gossipd, hsmd, ...) it'll cause lightningd to exit as well.

Minor changes, but I just don't think this is ready for 0.6.3 given our haste in release. I believe full plugin support is what justifies a 0.7 release, and this should be part of that process...

Agreed. In that case I can take a bit more care and not rush this out the door just to make the cut. If you're ok with it I'd like to disable the entire plugin subsystem for 0.6.3 (put it behind the experimental flag) in order not to ship incomplete functionality.

I'll address the remainder of the comments inline and add fixups as required 😉

@rustyrussell rustyrussell modified the milestones: v0.6.3, v0.6.4 Jan 10, 2019
@@ -590,16 +592,194 @@ static void channel_resolve_reply(struct subd *gossip, const u8 *msg,
tal_free(gr);
}

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : Perhaps all the hook structures / functions can live in another file? It seems there are many options for future hooks and I can imagine them bloating some files that are already quite large, so it might set the precedent to keep those in their own space.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered something similar, but most of the logic is actually either taken from the peer_htlcs.c file (in the callback) or wraps and unwraps data that may not be available elsewhere, which is why I ended up putting it here. I'd prefer to keep the logic at least in the same place.

@cdecker cdecker changed the title WIP: The final chapter in the plugin saga (hooks) The final chapter in the plugin saga (hooks) Jan 14, 2019
@cdecker
Copy link
Member Author

cdecker commented Jan 14, 2019

Since the htlc_accepted hook proves to be a bit more controversial than I thought (and rightfully so), I split the PR into two branches, with this one retaining only the infrastructure changes to enable hooks, and moving the htlc_accepted hook into #2267.

Since this part was more or less accepted, this should be much easier to review. I just added a single fixup to address the unused-function issue for our dummy hook.

Turns out that checking boundaries is important...

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Instead of creating a new map I opted to re-use the Plugin.methods
map, since the semantics are really similar and we don't allow
duplicates. The only difference is in how they are announced to
lightningd, so we use an enum to differentiate rpcmethods from hooks,
since only the former will get added to the JSON-RPC dispatch table in
lightningd.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Also fixes a crash if the doc was empty.
I might have gone a bit overboard with the type-checking, but
typesafe_cb_cast is quite nice to use, so why not. The macro to
register a new hook encapsulates the entire flow from param
serialization, to dispatch, parsing and callback dispatch in one
bundle. I was tempted to have the callback outside of the
registration, but it's unlikely that we'll have two calls to the same
hook with different callbacks.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
None of the existing callbacks was making use of it and we will be
exposing the method callback interface to outside compilation unit
where the struct definition is not visible. So just remove it.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
There is very little that is plugin specific in the jsonrpc_request so
this just extracts the common parts so we can reuse them outside of
the plugin compilation unit as well.
This is the first use of the `hooks` autodata field, and it required a
dummy element in order for the section not to be dropped, it'll be
removed once we have actual hooks.
plugin_request_new did nothing special aside from registering the
request ID with the dispatch code. This duty has now been moved to
plugin_request_send instead, which is also exposed so we can use that
code in plugin_hook.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
This ties all the things together, using the serializer to transform
the payload into a valid `jsonrpc_request`, sending it to the plugin,
and then using the deserializer on the way back before calling the
hook callback with the appropriate information.

Notice that the serializer and deserializer is skipped if we don't
have a plugin that registered for this hook.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
We therefore keep a reference to the DB and will wrap and unwrap when
a hook returns.

Notice that this might cause behavior changes when moving logic into a
hook callback, since the continuation runs in a different transaction
than the event that triggered the hook in the first place. Should not
matter too much, since we don't use DB rollbacks at the moment, but
it's something to keep in mind.

Signed-off-by: Christian Decker <decker.christian@gmail.com>
Signed-off-by: Christian Decker <decker.christian@gmail.com>
@rustyrussell
Copy link
Contributor

Ack fdc66cf

@rustyrussell rustyrussell merged commit ed356da into ElementsProject:master Jan 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants