[lncli] unable to restore chan backups: rpc error: code = Unknown desc = unable to unpack chan backup: unable to derive shachain root key: unable to derive private key issue #3881

Closed
MSauce opened this issue Dec 28, 2019 · 16 comments

@MSauce

MSauce commented Dec 28, 2019

Background

I had a DB corruption issue and am trying to recover from my SCB (static channel backup). Using the same seed, I recovered the on-chain funds successfully. When attempting to restore the off-chain funds via lncli create or restorechanbackup, I get:

[lncli] unable to restore chan backups: rpc error: code = Unknown desc = unable to unpack chan backup: unable to derive shachain root key: unable to derive private key issue.

I found #3583 and I seem to be having a similar problem. It starts saving some of the channels back to disk but fails around the 5th one with that error and can't continue. I've noticed that the channel point it fails on is a channel from 2018 - I'm curious if there are some backwards compatibility issues.

Any advice on where to go from here? Is there a way for me to edit my channel.backup and remove the old channels to see if that helps?

Your environment

  • version of lnd: 0.8.2-beta
  • which operating system (uname -a on *Nix): Ubuntu 18.04
  • version of btcd, bitcoind, or other backend: bitcoind 0.19.0.1
  • any other relevant environment details

Steps to reproduce

Try to recover channels from channel.backup on same seed using lncli create or restorechanbackup.

Expected behaviour

Recover channel.backup successfully.

Actual behaviour

[lncli] unable to restore chan backups: rpc error: code = Unknown desc = unable to unpack chan backup: unable to derive shachain root key: unable to derive private key issue.

2019-12-28 19:22:46.769 [INF] CHBU: Restoring ChannelPoint(removed) to disk:
2019-12-28 19:22:46.827 [INF] LTND: Inserting 1 SCB channel shells into DB
2019-12-28 19:22:46.827 [INF] CHBU: Restoring ChannelPoint(removed) to disk:
2019-12-28 19:22:46.899 [INF] LTND: Inserting 1 SCB channel shells into DB
2019-12-28 19:22:46.899 [INF] CHBU: Restoring ChannelPoint(removed) to disk:
2019-12-28 19:22:46.965 [INF] LTND: Inserting 1 SCB channel shells into DB
2019-12-28 19:22:46.965 [INF] CHBU: Restoring ChannelPoint(removed) to disk:
2019-12-28 19:22:47.023 [INF] LTND: Inserting 1 SCB channel shells into DB
2019-12-28 19:22:47.023 [INF] CHBU: Restoring ChannelPoint(removed) to disk:
2019-12-28 19:22:47.082 [INF] LTND: Inserting 1 SCB channel shells into DB
2019-12-28 19:22:47.083 [INF] CHBU: Restoring ChannelPoint(cb2a5f78d093234b9c3d6b9227b70573bb7cb0b1b2d3c369278c09a4198683cb:0) to disk:

And then I receive the previous error.

@guggero
Collaborator

guggero commented Jan 3, 2020

This is quite strange, especially because it only seems to affect one or a few channels. The age of the channel could be the reason for this.

It would really help if you could run the dumpbackup command of my channel tool and post the content of the ShaChainRootDesc of the affected channel here.

@MSauce
Author

MSauce commented Jan 3, 2020

Thanks Guggero.

ShaChainRootDesc: (chantools.dumpDescriptor) {
Path: (string) (len=17) "m/1017'/0'/5'/0/0",
Pubkey: (string) (len=66) "02faff90e2d7eb7dcd8c5ca5856179812d100ed25902042706a84c3b32ed6304f6"
}

@guggero
Collaborator

guggero commented Jan 3, 2020

Thanks a lot!
This looks quite normal. I'll need to dig deeper into the code to see what's going on.

@MSauce
Author

MSauce commented Jan 3, 2020

Let me know if there is something else I can provide to help.

@guggero
Collaborator

guggero commented Jan 4, 2020

I've added a new derivekey command to the channel tool.
Could you please run the following command (replace xprv... with your root key):

for i in {1..20000}; do chantools derivekey --neuter --rootkey xprv... --path m/1017\'/0\'/5\'/0/$i | grep Public >> keys.txt; done

This will take a few minutes depending on the speed of your machine. After it finishes, there will be a file keys.txt with the first 20k public keys. Could you check if 02faff90e2d7eb7dcd8c5ca5856179812d100ed25902042706a84c3b32ed6304f6 is in that list? And if so, at what line number?
Thank you very much in advance!
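
A minimal sketch of the lookup requested above, assuming keys.txt holds one derived public key per line (the exact text chantools prints around each key is not shown in this thread, so the match is done on the hex string alone). The function name find_key_line is hypothetical and used only for illustration.

```shell
# Hypothetical helper: print the 1-based line number(s) on which a public key
# appears in a key list. Assumes one derived key per line in the file.
find_key_line() {
  key="$1"
  file="$2"
  # -n prints line numbers, -F treats the key as a fixed string (not a regex).
  # Return non-zero if grep finds nothing or cannot read the file.
  matches=$(grep -n -F -- "$key" "$file") || return 1
  # grep -n output looks like "<line>:<content>"; keep only the line number.
  printf '%s\n' "$matches" | cut -d: -f1
}

# Usage with the key from this issue; only runs if keys.txt exists.
if [ -f keys.txt ]; then
  find_key_line 02faff90e2d7eb7dcd8c5ca5856179812d100ed25902042706a84c3b32ed6304f6 keys.txt \
    || echo "key not found in keys.txt"
fi
```

Because the loop above starts at index 1 and appends to keys.txt, a match on line N would correspond to the path ending in /N, provided each invocation wrote exactly one line and keys.txt did not exist beforehand.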

@MSauce
Author

MSauce commented Jan 5, 2020

It is not in keys.txt

@guggero
Collaborator

guggero commented Jan 6, 2020

Ok, that's really strange. Somehow this incorrect public key for the ShaChainRootDesc made it into the channel backup. I'm pretty sure this has something to do with the channel being as old as it is. Maybe we changed something in the SHA chain encoding.
I'm still trying to figure out where this incorrect value could come from.

In the meantime, can you please check whether the restore continues if you remove the offending channel from the backup file? That way you can at least recover the funds from the other channels.
I wrote the filterbackup command for that (in the chantools binary).

@guggero
Collaborator

guggero commented Jan 6, 2020

I think I found the problem. We changed how the ShaChainRootDesc is created in #769, which was released with version v0.4-beta.
If you created the channel with version v0.3-beta or earlier, this would explain the behavior.

But as far as I know, the SHA chain root is not really used when an SCB is restored, since lnd initiates DLP (data loss protection) and the remote party force-closes and hands over their per_commit_point.
So I'm going to write a command that fixes this old channel by just writing any valid public key into the ShaChainRootDesc.

@MSauce
Author

MSauce commented Jan 6, 2020

Awesome! I'll try recovering the funds now with the filterbackup and then give your pubkey replacement command a try when it's ready for the channels with issues (I believe there are at least two).

Thanks for your help.

@guggero
Collaborator

guggero commented Jan 6, 2020

I've added the fixoldbackup command to my tool.
I was unable to test it with an old file as it's hard to replicate it. So please let me know if there are any issues with running the command.
Hopefully this does indeed work and you're able to rescue the funds.
You're welcome, thanks for your patience!

@MSauce
Author

MSauce commented Jan 6, 2020

Using your filter tool I was able to remove 10 channels from my backup and recover the remaining channels successfully.

I just tried your fixoldbackup and it "fixed" the 10 channels I had filtered (as expected), but the recovery with the fixed backup failed on the first fixed channel point, 73a1a0e6669235d494efb25c8bc1d5c3a1ff9f65f50bf0662c52896a3bb2e3d5:0, with:

[lncli] unable to restore chan backups: rpc error: code = Unknown desc = unable to unpack chan backup: unable to derive shachain root key: unable to derive private key

I dumped it with dumpbackup and the fixed ShaChainRootDesc of 73a1a0e6669235d494efb25c8bc1d5c3a1ff9f65f50bf0662c52896a3bb2e3d5:0 is:

ShaChainRootDesc: (dump.KeyDescriptor) {
Path: (string) (len=17) "m/1017'/0'/5'/0/0",
PubKey: (string) (len=66) "02faff90e2d7eb7dcd8c5ca5856179812d100ed25902042706a84c3b32ed6304f6"
}

This is the same pubkey the other channel was failing on.

@guggero
Collaborator

guggero commented Jan 6, 2020

Ok, strange. Maybe I did something wrong. For the second restore, you did use the file named backup-fixed-....backup that was created in the results subdirectory, right?

@MSauce
Author

MSauce commented Jan 6, 2020

I did for sure, yes.

I double-checked: running fixoldbackup on the output file "fixes" the same 10 channels again, so something is wrong with the replacement.

Roasbeef added this to the 0.9.1 milestone Jan 7, 2020

@guggero
Collaborator

guggero commented Jan 7, 2020

Yeah, sorry. Just wanted to make sure.
It was my fault, learned something new about golang in the process... Maybe I should start adding unit tests to the project.
Can you try again please?

@MSauce
Author

MSauce commented Jan 7, 2020

Ok - no errors this time on the restore.

However, the log eventually said 2020-01-07 13:17:09.890 [INF] NTFN: Cancelling epoch notification, epoch_id=35 and the node restarted. On restart, it attempts a remote commitment on one of the fixed channel points and then shuts down. It now seems to be stuck in that loop. I'll DM you a more detailed log, as I'm not sure which parts of it are sensitive.

@guggero
Collaborator

guggero commented Jan 14, 2020

Closing this as I didn't hear back and this can possibly be resolved manually with the tools I provided. The channels causing the problem are very old and not supported (see release notes of v0.4-beta) and a workaround will not be implemented in lnd.

guggero closed this as completed Jan 14, 2020