Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow remote signer to run without chain backend #6006

Merged
merged 17 commits into from
Jan 8, 2022

Conversation

guggero
Copy link
Collaborator

@guggero guggero commented Nov 19, 2021

Depends on btcsuite/btcwallet#781.

This fixes/finishes the remote signing in that it makes it no longer required for the signing instance to have its own chain backend. Therefore all address synchronization issues are removed as well, the signing instance can now be completely offline (except for the incoming RPC connection).
Additionally, only the signrpc interface needs to be implemented by the remote signer.

Fixes #5243

Fixes #5321

Fixes lightninglabs/loop#437

@Roasbeef Roasbeef added this to the v0.15.0 milestone Nov 19, 2021
Copy link
Collaborator

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a high level pass. Very necessary change as the remote signers can now live in a "cold" environment without running a bitcoin node. Thanks for the effort!

lnrpc/signrpc/signer_server.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
lnwallet/btcwallet/signer.go Show resolved Hide resolved
@carlaKC carlaKC self-requested a review November 29, 2021 11:32
@guggero guggero force-pushed the remote-signer-no-chain branch 2 times, most recently from 302ce8b to e6a54da Compare November 29, 2021 20:05
@guggero guggero marked this pull request as ready for review November 29, 2021 20:05
@guggero guggero changed the title PoC: Allow remote signer to run without chain backend Allow remote signer to run without chain backend Nov 29, 2021
@guggero guggero modified the milestones: v0.15.0, v0.14.2 Nov 29, 2021
@guggero guggero force-pushed the remote-signer-no-chain branch 2 times, most recently from 59b1cbc to aa9aa45 Compare November 30, 2021 11:36
@guggero
Copy link
Collaborator Author

guggero commented Nov 30, 2021

One question I was pondering: Should we add a new itest mode where every node in the itest gets its own remote signer? Since we don't need a chain backend anymore it should be doable. That would certainly increase the test coverage of the signing. But I'm not sure if that warrants the extra time spent on it (and also the new itest GitHub slot that would be taken up by it).

Copy link
Collaborator

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass looks good! Will take a more detailed look at tests + RPCKeyRing in next look.

lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
chainreg/chainregistry.go Show resolved Hide resolved
docs/remote-signing.md Show resolved Hide resolved
@carlaKC
Copy link
Collaborator

carlaKC commented Dec 1, 2021

Should we add a new itest mode where every node in the itest gets its own remote signer? Since we don't need a chain backend anymore it should be doable.

Could be an easier solution to just add a single itest case with 1x remote signer and open channel/ transact close? Will be a bit of additional wiring on the harness for single use but I think that'll give us good enough coverage?

Copy link
Collaborator

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I'll need to get more familiar with the hd wallet implemented in lnwallet to understand the scope change. Meanwhile, have questions re unit tests and the 256 key families init.

Should we add a new itest mode where every node in the itest gets its own remote signer?

Then I guess we would need to visit every test to validate the expected behavior of the remote signer?

lnwallet/btcwallet/psbt.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
lnwallet/btcwallet/psbt_test.go Show resolved Hide resolved
lnwallet/rpcwallet/rpcwallet.go Show resolved Hide resolved
lnwallet/rpcwallet/rpcwallet.go Show resolved Hide resolved
chainreg/chainregistry.go Show resolved Hide resolved
lnwallet/btcwallet/btcwallet.go Show resolved Hide resolved
@guggero
Copy link
Collaborator Author

guggero commented Dec 2, 2021

Should we add a new itest mode where every node in the itest gets its own remote signer? Since we don't need a chain backend anymore it should be doable.

Could be an easier solution to just add a single itest case with 1x remote signer and open channel/ transact close? Will be a bit of additional wiring on the harness for single use but I think that'll give us good enough coverage?

Not sure I understand what you mean. Isn't this what's already been done in the lntest/itest/lnd_remote_signer_test.go test?

@carlaKC
Copy link
Collaborator

carlaKC commented Dec 3, 2021

Isn't this what's already been done in the lntest/itest/lnd_remote_signer_test.go test?

Yeah ofc, forgot we had that!

Copy link
Collaborator

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed to the best of my ability + tested locally (np2wkh/ p2wkh + channel open).

I haven't done much wallet work, so can't be confident that I'd pick up any subtle bugs, so you might want to pick up another reviewer with more domain knowledge.

One thing I did notice while testing is that watch-only won't shut down if signer is down. Could use Criticalf if signing fails, or add a health check if we're running with remote signer.

lnwallet/btcwallet/signer_test.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/signer.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
lnwallet/btcwallet/psbt.go Show resolved Hide resolved
lnwallet/rpcwallet/rpcwallet.go Outdated Show resolved Hide resolved
lnwallet/btcwallet/btcwallet.go Show resolved Hide resolved
With the remote signing instance now not needing to know anything about
addresses or current derivation indices, we don't need to forward any
such calls to that instance and can simplify the RPCKeyRing
considerably.
This fixes lightninglabs/loop#437 by adding all accounts that are used
in liquidity products such as Loop or Pool. Since both of these products
use key families below 255, we can get by with that number.
The alternative to creating way too many accounts (which increases the
default wallet size by ~250kB) would be to hard code the exact accounts
used by Loop (99) and Pool (210). But that sounds like a bad idea given
that there could always be more accounts being added to those (or other)
products. By making sure the first 255 accounts exist, we have a lot
more flexibility in those products for choosing key families.
To enable converting an existing wallet with private key material into a
watch-only wallet on first startup with remote signing enabled, we add a
new flag. Since the conversion is a destructive process, this shouldn't
happen automatically just because remote signing is enabled.
@guggero
Copy link
Collaborator Author

guggero commented Jan 6, 2022

I added an itest for SignPsbt in 8994d86. @yyforyongyu PTAL.

Copy link
Collaborator

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM🥳 The added itest is very helpful as now I just treat the itest as example usages for all the endpoints. Again very nice job!!

"purging all private key material")

ns := tx.ReadWriteBucket(waddrmgrNamespaceKey)
err = b.wallet.Manager.ConvertToWatchingOnly(ns)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. With this commit I guess we can now turn a non-watch-only node into a watch-only node.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐊

Great touch with the new flag to migrate the wallet to a watch only version, this'll make it much easier to production systems to retroactively adopt the remote signer architecture.

chainreg/chainregistry.go Show resolved Hide resolved
lnwallet/btcwallet/btcwallet.go Show resolved Hide resolved
@Roasbeef Roasbeef merged commit fd63301 into lightningnetwork:master Jan 8, 2022
@guggero guggero deleted the remote-signer-no-chain branch January 11, 2022 09:34
@joostjager
Copy link
Contributor

It's a nice step forward to be able to isolate the key material in a minimal lnd instance, but what's your thought on adding policy controls to prevent a hacker from using the signing instance in an unlimited way? If there are no limits, it's almost as if the hacker owns the keys themselves I'd think.

The current remote signing API is low-level and I am not sure if it is possible to add useful limits.

The lightning signer project for example works on higher level: https://gitlab.com/lightning-signer/rust-lightning-signer/-/blob/master/lightning-signer-server/src/server/remotesigner.proto

@Roasbeef
Copy link
Member

Roasbeef commented Feb 2, 2022

but what's your thought on adding policy controls to prevent a hacker from using the signing instance in an unlimited way? If there are no limits, it's almost as if the hacker owns the keys themselves I'd think.

This'll take place at a higher level. With the current architecture, it just so happens that the rpcwallet is checked into the lnd repo. It's possible for someone to either modify that package, or insert another gRPC service with the same interface in front of it (applying proxy checks then denying or forwarding the request along). This service can implement the rate limiting or policy controls, unbeknownst to lnd. Verification of certain actions will require minimally adding additional fields to the SignDescriptor proto we send out.

The purpose of the lower level API was always to enable additional verification at a higher level. Check out this old lnd ML post on the roadmap we've been following here: https://groups.google.com/a/lightning.engineering/g/lnd/c/3rF65UxvDtM/m/ktDLAGtlAQAJ.

@joostjager
Copy link
Contributor

joostjager commented Feb 2, 2022

I understand that you can have another grpc service with the same interface that acts as the remote signer. What I am wondering is what needs to be added to that interface in order to implement more advanced signing policies. For example an hourly limit on the total amount of all htlcs offered on a particular channel. I don't see this mentioned in the ML post.

If I understand it correctly, you are suggesting to extend the SignDescriptor to include metadata that is required for enforcing such policies?

Also from the lightning-signer project, this is an interesting list of potential policies: https://gitlab.com/lightning-signer/docs/-/blob/master/policy-controls.md

Have you considered implementing the (outgoing) lightning-signer interface in lnd so that that remote signing service can be used as is?

@Roasbeef
Copy link
Member

Roasbeef commented Feb 2, 2022

If I understand it correctly, you are suggesting to extend the SignDescriptor to include metadata that is required for enforcing such policies?

Yeah exactly. The policy logic needs to have essentially a mirrored view of the commitment state, this information can be passed over in the SignDescriptor messages, with it being given the initial channel state (connected to an existing node), or it identifying which signing operations are funding related (can be identified via the key family used, the PSBT has all this information.

You may want to chat w/ the Voltage team, last I heard they're working on some draft code to start to implement policies in the way I've outlined above. The upside with the way lnd is laid out, is that we can re-use most of the code in the lnwallet and hltcswitch packages, so we won't need to re-write all the logic from scratch. Ultimately, you'll end up with a mini-lnd instance that spots checks everything that comes by, using all the same routines as base lnd.

Have you considered implementing the (outgoing) lightning-signer interface in lnd so that that remote signing service can be used as is?

Last I checked their repo, they have an adapter for lnd, but it hasn't been touched in a few years, and they ended up renaming a bunch of things unnecessarily, which caused breaking changes. AFAICT now, the team has sort of unofficially been rolled into Spiral based on where they spend their time today.

One difference with the lower level vs the higher level approach is that the higher level approach makes more assumptions w.r.t how the protocol looks today. For example see this call: SignJusticeSweep. Post anchors, it's possible that the signer needs to re-spend the justice transaction several times as the "attacker" has more freedom w.r.t going to the second level, doing other sub-aggregations, etc, etc. The lower level approach lets us keep that logic in lnd (handling all the various breach spending sub-cases), while keeping the exposed API (sign this w/ this info) more agnostic. AFAICT, the interface doesn't handle these cases as LDK+CL haven't yet implemented anchors. Also consider that interface will also need to nearly entirely redone when we move over to taproot based channels.

@joostjager
Copy link
Contributor

Still, I wonder if the high-level approach with many different calls is genuinely different from lnd's low-level API with metadata in sign descriptors. Won't you end up packing the same data that lightning-signer has in function arguments into the metadata?

It would be nice if at least some code is usable across implementations, but it seems that, unfortunately, the work that the lightning-signer team has done needs to be replicated in full based on lnwallet and htlcswitch?

@Roasbeef
Copy link
Member

Roasbeef commented Feb 7, 2022

Still, I wonder if the high-level approach with many different calls is genuinely different from lnd's low-level API with metadata in sign descriptors.

IIRC, their high-level API was essentially a mirror of the main signing/wallet interface that LDK used internally, so I'd just chalk that up to path dependence, as that was the first implementation they worked on, and also where they initially received their funding.

Won't you end up packing the same data that lightning-signer has in function arguments into the metadata?

It depends, maybe more data can be inferred as w/e hypothetical implementation by Voltage can use more of the existing lnd libraries. Maybe it ends up just being a slimmer lnd binary to being with (gated by build tags) that validates everything as if it was actually receiving it over the p2p network. In the end, the design space is rather large, and lightning-signer is just the first stab at something like this. I'm not sure how much production usage it's received at this point (what I know of first hand so far is a series of demos/PoCs), nor how "finalized" their architecture/API is. If they direct their attention back to lnd to finish up their work that would be cool. Concurrently other teams that need something more urgently and maybe have a different tech stack may be inclined to build something more custom for their environment and security model (maybe you just want to make sure you don't sign a breach and nothing else).

It would be nice if at least some code is usable across implementations, but it seems that, unfortunately, the work that the lightning-signer team has done needs to be replicated in full based on lnwallet and htlcswitch?

It's feasible that someone can just make an API bridge that takes our lower level messages, and annotates them properly for the lightning-signer project. I've given the team pointers in the past on how one would extend our messages to provide that extra information. In the end, the code to handle verification of this nature already exists across every implementation.

@Roasbeef
Copy link
Member

Roasbeef commented Feb 7, 2022

It's feasible that someone can just make an API bridge that takes our lower level messages, and annotates them properly for the lightning-signer project.

With this PR, we started to define some custom PSBT tags, like the stuff needed to do the private key operations to send for revocation keys. Continuing down this line, an alternative path could be to use more custom tags to provide the extra annotation information. This can be passed via functional arguments to the main signing interface, or by having the caller set the additional new fields in the SignDescriptor. This could be an interesting path, as then we can standardize things around PSBT fields w/ additional signing context (building off the lightning-signer work w.r.t what needs to be sent), vs a slightly more implementation specific gRPC interface.

(I think we can move this convo to a new issue at this point?)

@devrandom
Copy link

Just a quick note after reading the above.

We've been heads-down working on a beta release and also upstreaming changes to C-Lightning for integration. C-L is pretty close to LDK for signer architecture and API, so it was the low-hanging fruit. But we're definitely interested in moving the LND integration forward.

I do have some concerns about the metadata approach. For one, I feel that approach is less "type safe", there seem to be more opportunities for breakage compared to an API that can be checked at compile-time. Another issue is that new calls are required for a validating signer. For example, the signer needs to see that the counterparty signed the current tx and revoked the old one, otherwise the state didn't really move forward. Providing these pieces of information requires new calls that are not actual signing operation. Also doesn't seem to fit the PSBT paradigm.

About SignJusticeSweep - the method takes a tx and an input index, so the node is free to construct txs and re-sign them as needed. I do agree that there will be some API changes as the protocol is revised, e.g. with taproot. But changes will be required in the metadata approach too - different pieces of information made available to the signer. It just seems to be irreducible complexity.

Finally, I want to mention that we are renaming the project to Validating Lightning Signer (VLS).

@joostjager
Copy link
Contributor

Created new issue where we can continue the discussion: #6243

@Roasbeef
Copy link
Member

Roasbeef commented Feb 8, 2022

@devrandom replying here, but we can move to the issue afterwards

I do have some concerns about the metadata approach. For one, I feel that approach is less "type safe", there seem to be more opportunities for breakage compared to an API that can be checked at compile-time

It would be no less type safe, Go is a typed language, as a protos. VLS uses protos already, and this would basically just be proto translation.

Another issue is that new calls are required for a validating signer. For example, the signer needs to see that the counterparty signed the current tx and revoked the old one, otherwise the state didn't really move forward. Providing these pieces of information requires new calls that are not actual signing operation.

That's correct, from an implementation perspective, this could take the form of db action hooks, or ofc we can just shove everything into a fake signing operation ;).

If you look at in from another perspective, with the way the protocol works, we can't actually sign again until the remote party revokes their commitment (we don't have the revocation point needed to make a new state otherwise). As a result, the existence of that next revocation point (which can be validated) may be sufficient to validate that a state transition actually happened.

But changes will be required in the metadata approach too - different pieces of information made available to the signer. It just seems to be irreducible complexity.

Yep that's unavoidable, but in the lnd case, the slimmer signer and the full node may just be using the exact same set of libraries to construct+validate. I don't see how it's any more complex. Ultimately you need to send the same amount of information in the abstract, the question here is how that abstraction is exposed to the implementation. Calls like this already map pretty cleanly as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants