Automated BIP39 Recovery (scan derivation paths, automatic, restore) #6219

Merged
merged 2 commits into from
Aug 20, 2020

Conversation

lukechilds
Contributor

@lukechilds lukechilds commented Jun 8, 2020

Resolves #6155

Implements automated scanning of all known BIP32 chains that can be imported into Electrum.

I'm opening this PR for review; it's not ready to be merged yet. Account discovery is functional but still needs to be integrated into the GUI.

Currently it's implemented as a script for easy testing. You can quickly test it by running the electrum/scripts/bip39_recovery.py script with a mnemonic parameter and an optional second passphrase parameter. It will return a list of any previously used accounts it can find. You can use this mnemonic, in which I've created a few test accounts:

much bottom such hurt hunt welcome cushion erosion pulse admit name deer

e.g:

$ ./bip39_recovery.py "much bottom such hurt hunt welcome cushion erosion pulse admit name deer"
[
    {
        "derivation_path": "m/84'/0'/0'",
        "description": "Standard BIP84 native segwit (Account 0)",
        "script_type": "p2wpkh"
    },
    {
        "derivation_path": "m/84'/0'/1'",
        "description": "Standard BIP84 native segwit (Account 1)",
        "script_type": "p2wpkh"
    },
    {
        "derivation_path": "m/0'",
        "description": "Non-standard legacy (Account 0)",
        "script_type": "p2pkh"
    },
    {
        "derivation_path": "m/84'/0'/2147483646'",
        "description": "Samourai Whirlpool post-mix",
        "script_type": "p2wpkh"
    }
]

Diversions from BIP44

Because discovery only needs to determine which accounts have been used, not the final balance of each account, we take a few shortcuts:

  • Don't scan the change chain. If a BIP44-compliant account has been used, it will have received a TX in one of the first 20 (BIP44 gap limit) addresses of the external chain.
  • Move on to the next account as soon as we find a single address with funds. No need to keep scanning until we exhaust the gap limit.

If the user selects one of the discovered accounts for recovery, Electrum will then properly scan both the internal and external chains and exhaust the gap limit when it creates the wallet.
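The two shortcuts above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `is_account_used`, `discover_accounts`, and the `address_has_history` callback are hypothetical names, the callback stands in for whatever server query reports whether an address was ever used, and stopping at the first unused account is an assumption borrowed from BIP44 account discovery.

```python
from typing import Callable, List

GAP_LIMIT = 20  # BIP44 gap limit, as mentioned above


def is_account_used(
    account_path: str,
    address_has_history: Callable[[str], bool],
    gap_limit: int = GAP_LIMIT,
) -> bool:
    """Return True as soon as one external-chain address has history."""
    for index in range(gap_limit):
        # Chain 0 is the external (receiving) chain; the change chain (1) is skipped.
        address_path = f"{account_path}/0/{index}"
        if address_has_history(address_path):
            return True  # no need to exhaust the gap limit
    return False


def discover_accounts(
    path_template: str,
    address_has_history: Callable[[str], bool],
) -> List[str]:
    """Walk accounts 0, 1, 2, ... and stop at the first unused one (assumption)."""
    used = []
    account = 0
    while True:
        account_path = path_template.format(account=account)
        if not is_account_used(account_path, address_has_history):
            break
        used.append(account_path)
        account += 1
    return used


# Hypothetical history: accounts 0 and 1 each received one payment.
fake_history = {"m/84'/0'/0'/0/3", "m/84'/0'/1'/0/0"}
found = discover_accounts("m/84'/0'/{account}'", fake_history.__contains__)
print(found)
```

A full wallet scan would still need both chains and the complete gap limit; the sketch only answers "was this account ever used?".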

Limitations

Electrum takes an account-level import path like m/84'/0'/0' and derives the wallet from the non-hardened is_change/address_index subtrees. Because of this, we cannot import Bitcoin Core accounts, as they use the format m/0'/0'/address_index'. Notice that all subtrees use hardened derivation.
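To make the hardened/non-hardened distinction concrete, here is a small sketch that turns a path string into BIP32 child indices and checks whether every level is hardened. The helper names `parse_path` and `is_fully_hardened` are made up for this example and are not Electrum functions.

```python
from typing import List

HARDENED = 0x80000000  # BIP32: indices >= 2**31 denote hardened derivation


def parse_path(path: str) -> List[int]:
    """Convert a path such as "m/84'/0'/0'" into BIP32 child indices."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indices = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part.rstrip("'"))
        indices.append(number + HARDENED if hardened else number)
    return indices


def is_fully_hardened(path: str) -> bool:
    """True if every level of the path uses hardened derivation."""
    return all(index >= HARDENED for index in parse_path(path))


# An account-level path followed by non-hardened change/address_index levels.
print(is_fully_hardened("m/84'/0'/0'/0/5"))  # False
# Bitcoin Core style path, hardened all the way down.
print(is_fully_hardened("m/0'/0'/5'"))       # True
```

Because the last level is hardened in the second case, the addresses cannot be derived from an account-level public key, which is why an account-level import path does not fit that layout.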

All other wallets listed on walletsrecovery.org can be successfully recovered.


Please let me know if this code looks OK so far. The next steps will be to migrate the account discovery code from the script to electrum.bip39_recovery and then integrate it into the recovery wizard GUI.

CC @ecdsa @SomberNight @aantonop

Member

@SomberNight SomberNight left a comment

Looks fine

electrum/scripts/bip39_recovery.py (outdated review thread, resolved)
@lukechilds
Contributor Author

lukechilds commented Jun 8, 2020

Thanks for reviewing @SomberNight, I have two questions re GUI integration.

1. What's the best way to hook into the derivation path view?

I want to add a button in this view that says "Detect Existing Accounts" which when clicked, opens a popup with a loading icon, calls the account discovery function, displays a list of results to the user, allows them to select one, and populate the path/script type inputs with the result.

[Image: Screenshot 2020-06-08 at 21 08 46]

For example, the previous view to enter the seed has its own file in electrum/gui/qt/seed_dialog.py, so I could just edit that. However the derivation path is implemented in a GUI agnostic way in electrum/base_wizard.py which sets up some params and then inits the UI with self.choice_and_line_dialog() where choice_and_line_dialog() is a generic view provided by the GUI implementation.

def derivation_and_script_type_dialog(self, f):
    message1 = _('Choose the type of addresses in your wallet.')
    message2 = ' '.join([
        _('You can override the suggested derivation path.'),
        _('If you are not sure what this is, leave this field unchanged.')
    ])
    if self.wallet_type == 'multisig':
        # There is no general standard for HD multisig.
        # For legacy, this is partially compatible with BIP45; assumes index=0
        # For segwit, a custom path is used, as there is no standard at all.
        default_choice_idx = 2
        choices = [
            ('standard', 'legacy multisig (p2sh)', normalize_bip32_derivation("m/45'/0")),
            ('p2wsh-p2sh', 'p2sh-segwit multisig (p2wsh-p2sh)', purpose48_derivation(0, xtype='p2wsh-p2sh')),
            ('p2wsh', 'native segwit multisig (p2wsh)', purpose48_derivation(0, xtype='p2wsh')),
        ]
    else:
        default_choice_idx = 2
        choices = [
            ('standard', 'legacy (p2pkh)', bip44_derivation(0, bip43_purpose=44)),
            ('p2wpkh-p2sh', 'p2sh-segwit (p2wpkh-p2sh)', bip44_derivation(0, bip43_purpose=49)),
            ('p2wpkh', 'native segwit (p2wpkh)', bip44_derivation(0, bip43_purpose=84)),
        ]
    while True:
        try:
            self.choice_and_line_dialog(
                run_next=f, title=_('Script type and Derivation path'), message1=message1,
                message2=message2, choices=choices, test_text=is_bip32_derivation,
                default_choice_idx=default_choice_idx)
            return
        except ScriptTypeNotSupported as e:
            self.show_error(e)
            # let the user choose again

I can see the QT implementation of choice_and_line_dialog here:

@wizard_dialog
def choice_and_line_dialog(self, title: str, message1: str, choices: List[Tuple[str, str, str]],
                           message2: str, test_text: Callable[[str], int],
                           run_next, default_choice_idx: int=0) -> Tuple[str, str]:
    vbox = QVBoxLayout()
    c_values = [x[0] for x in choices]
    c_titles = [x[1] for x in choices]
    c_default_text = [x[2] for x in choices]
    def on_choice_click(clayout):
        idx = clayout.selected_index()
        line.setText(c_default_text[idx])
    clayout = ChoicesLayout(message1, c_titles, on_choice_click,
                            checked_index=default_choice_idx)
    vbox.addLayout(clayout.layout())
    vbox.addSpacing(50)
    vbox.addWidget(WWLabel(message2))
    line = QLineEdit()
    def on_text_change(text):
        self.next_button.setEnabled(test_text(text))
    line.textEdited.connect(on_text_change)
    on_choice_click(clayout)  # set default text for "line"
    vbox.addWidget(line)
    self.exec_layout(vbox, title)
    choice = c_values[clayout.selected_index()]
    return str(line.text()), choice

I could just edit that method and wrap my code with if title == "Script type and Derivation path", but that seems like a pretty nasty solution and would also break when i18n is active.

What's the best way to hook into this?

2. How should I provide network access to my method?

Currently in the script I'm creating a network instance with:

from electrum.simple_config import SimpleConfig
from electrum.network import Network
config = SimpleConfig()
network = Network(config)
network.start()

When I add this to the GUI I'll obviously want to use the existing network instance (if it exists during account creation?). I see Network is a singleton, and I can get the instance with Network.get_instance(), however I'd also like to keep the script functional by just updating it to import electrum.bip39_recovery and running it. It's useful to quickly test during dev and seems like a pretty handy script to expose to users too.

What's the best way to go about this? Inside the function use the global instance if it exists or create one if it doesn't? Or maybe pass a network instance in to the function, and pass in the one I create for the script or pass in the global instance for the GUI?

Not quite sure the best way to go about this.

@SomberNight
Member

  2. How should I provide network access to my method?
    Or maybe pass a network instance in to the function, and pass in the one I create for the script or pass in the global instance for the GUI?

Yes, that sounds good.

GUI I'll obviously want to use the existing network instance (if it exists during account creation?)

Yes, typically it exists, unless the user used the --offline CLI flag.
When you get a reference to network, you should test if it's None, and if it is, just display an error message that they are offline.
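The approach agreed on here (the caller passes the network in, and the function rejects a missing one) might look roughly like this. `FakeNetwork`, `OfflineError`, and this `account_discovery` signature are placeholders invented for the sketch; they are not Electrum's real classes or the function added by this PR.

```python
from typing import List, Optional


class OfflineError(Exception):
    """Raised when discovery is requested without a network connection."""


class FakeNetwork:
    """Stand-in for a network object; hypothetical, for illustration only."""

    def lookup(self, path: str) -> bool:
        return path == "m/84'/0'/0'"


def account_discovery(network: Optional[FakeNetwork], paths: List[str]) -> List[str]:
    # The caller decides which network instance to use: the script creates
    # its own, while the GUI passes in the one it already has.
    if network is None:
        raise OfflineError("Cannot scan for accounts while offline.")
    return [path for path in paths if network.lookup(path)]


paths = ["m/44'/0'/0'", "m/84'/0'/0'"]
print(account_discovery(FakeNetwork(), paths))

try:
    account_discovery(None, paths)
except OfflineError as exc:
    print(f"error: {exc}")
```

Passing the dependency in keeps the discovery function free of global state, so the standalone script and the GUI can share it unchanged.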

  1. What's the best way to hook into the derivation path view?
    the derivation path is implemented in a GUI agnostic way in electrum/base_wizard.py which sets up some params and then inits the UI with self.choice_and_line_dialog() where choice_and_line_dialog() is a generic view provided by the GUI implementation.

choice_and_line_dialog is already only used for this specific usecase.
I think you can just rename it to something "derivation"-specific, and modify it directly.
I imagine you only want to implement all this for the Qt GUI? (which is fine)
Make sure that the change does not break the kivy GUI.
i.e. rename the method, pass it whatever arguments you need, make sure both GUIs accept all the arguments even if they ignore some, make sure the return types are the same
Then you can just implement extra logic in the Qt-specific implementation of this method -- well you probably want to implement it somewhere else as I imagine it would be many lines of code, and just call it from there.
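As a rough illustration of "both GUIs accept all the arguments even if they ignore some", the sketch below gives two stand-in wizard classes the same method signature and return type. The class names, the `derivation_dialog` method name, and the `discover` hook are all assumptions made for this example, not the names used in Electrum.

```python
from typing import Callable, List, Optional, Tuple

Choice = Tuple[str, str, str]  # (script type, title, default derivation path)
Discover = Callable[[], List[Choice]]  # hypothetical account-detection hook


class QtLikeWizard:
    """Uses the optional hook to pre-select a detected account."""

    def derivation_dialog(self, choices: List[Choice], default_choice_idx: int = 0,
                          discover: Optional[Discover] = None) -> Tuple[str, str]:
        idx = default_choice_idx
        if discover is not None:
            detected = discover()
            if detected:
                choices, idx = detected, 0
        script_type, _title, path = choices[idx]
        return path, script_type


class KivyLikeWizard:
    """Accepts the same arguments but ignores the hook."""

    def derivation_dialog(self, choices: List[Choice], default_choice_idx: int = 0,
                          discover: Optional[Discover] = None) -> Tuple[str, str]:
        script_type, _title, path = choices[default_choice_idx]
        return path, script_type


choices = [
    ("standard", "legacy (p2pkh)", "m/44'/0'/0'"),
    ("p2wpkh", "native segwit (p2wpkh)", "m/84'/0'/0'"),
]
detect = lambda: [("p2wpkh", "detected account", "m/84'/0'/1'")]

for wizard in (QtLikeWizard(), KivyLikeWizard()):
    print(wizard.derivation_dialog(choices, default_choice_idx=1, discover=detect))
```

Since both methods return a (path, script type) pair, the GUI-agnostic caller does not need to know which implementation it is talking to.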

@lukechilds
Contributor Author

@SomberNight thanks, makes sense!

Hadn't realised choice_and_line_dialog was only used for this one dialog.

And yes, I was only planning on implementing it in the QT GUI. CLI users can use the script, and I don't think the Kivy GUI allows setting the BIP39 option on a seed so not really applicable there.

@lukechilds lukechilds force-pushed the bip39-recovery branch 2 times, most recently from da21c58 to e124f61 Compare June 9, 2020 14:35
@lukechilds
Contributor Author

@SomberNight when calling an async function from a QT WindowModalDialog I can do:

accounts = asyncio.run_coroutine_threadsafe(account_discovery(), network.asyncio_loop).result()

Which works, but it blocks the Qt thread, so the UI hangs and there's no way for the user to click the cancel button while they're waiting.

I'd like to:

  • Show the modal
  • Show some text like "Scanning for existing accounts..." or maybe an animated spinner
  • Fire off the account discovery to run asynchronously
  • Allow the user to interact with the UI, cancel the modal if they want
  • Once account discovery has completed, update the modal with the results

Are you able to point me in the right direction for how to achieve this?

Disclaimer: Apologies if this question seems obvious. I'm not really a Python developer; in fact, the only other time I've ever written Python was for my SSL fingerprint PR. I'm not familiar with how Python's async API works.

@SomberNight
Member

if this question seems obvious

No worries, it's not obvious at all. Qt does not really support python asyncio.

Take a look at

WaitingDialog(self, msg, task, on_success, on_failure)

It does not behave exactly as you described but similar.
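The general pattern behind a task/on_success/on_failure helper can be sketched with the standard library alone. This is not Electrum's WaitingDialog: `run_in_background` and `slow_discovery` are hypothetical, and in a real Qt application the callbacks would additionally have to be delivered on the GUI thread (for example via signals) before touching any widgets.

```python
import threading
import time
from typing import Any, Callable, List


def run_in_background(
    task: Callable[[], Any],
    on_success: Callable[[Any], None],
    on_failure: Callable[[BaseException], None],
) -> threading.Thread:
    """Run task on a worker thread and report back through callbacks."""

    def worker() -> None:
        try:
            result = task()
        except Exception as exc:
            on_failure(exc)
        else:
            on_success(result)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def slow_discovery() -> List[str]:
    time.sleep(0.1)  # stands in for the blocking account scan
    return ["m/84'/0'/0'"]


results: List[str] = []
errors: List[BaseException] = []
thread = run_in_background(slow_discovery, results.extend, errors.append)
print("calling thread is free while the scan runs")
thread.join()  # only for this demo; a GUI would keep running its event loop
print(results)
```

The calling thread returns immediately after starting the worker, which is what keeps a dialog responsive while the scan is in progress.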

@lukechilds
Contributor Author

lukechilds commented Jun 18, 2020

I've integrated the functionality into the GUI.

Demo:

[Image: electrum_bip39]

Let me know if it looks ok or if you have any suggestions, thanks!

@lukechilds lukechilds marked this pull request as ready for review June 18, 2020 04:32
@rdymac
Contributor

rdymac commented Jun 18, 2020

The resulting file name could use another label (as the [standard] one), so the user knows which one he selected after the scan.

@jlopp

jlopp commented Jun 18, 2020

I'd highly encourage using a gap limit that far exceeds the standard. Consider the fact that there are a variety of applications such as payment processing that can result in huge gaps. If I recall correctly, btcpayserver set their gap limit to either 1,000 or 10,000. In my opinion the additional computation required is well worth the pain it will save for folks who have funds stranded beyond the standard gap limit.

@Enegnei

Enegnei commented Jun 18, 2020

If I recall correctly, btcpayserver set their gap limit to either 1,000 or 10,000.

For reference, yes, their gap limit appears to be 10,000.

@SomberNight
Member

SomberNight commented Jun 18, 2020

I'd highly encourage using a gap limit that far exceeds the standard.

Do you mean for the discovery, or after the wallet is created for normal operations?

If you mean for the discovery, then ok, perhaps. It seems a bit unlucky to have a wallet where none of the first few addresses are used but later ones are, but I guess it can happen if a merchant was griefed as soon as they started using a wallet.

But to be clear, regardless of what insane gap limits some services are using, we are not able to use insanely high gap limits simply due to architecture limitations. IMO what these services do when they use a gap limit of 10000 by default is anti-social behaviour: they are pulling the whole ecosystem in the same direction and slowly forcing everyone to use larger and larger gap limits (what if poor user used insane service no.17, we should use a higher gap limit then...). By using those high gap limits they introduce the risk for the user that they might not be able to restore easily using other services. That's just how it goes, we are not participating in this game.

@lukechilds
Contributor Author

lukechilds commented Jun 18, 2020

Some back-of-the-envelope calculations re gap limit:

It currently takes around 2 seconds per account for discovery to complete with a gap limit of 20.

That means in the best case scenario where there are 0 used accounts, we will only scan 13 paths so it'll take ~26 seconds.

Setting the gap limit to 10,000 will mean the best case recovery process increases from ~26 seconds to ~3.6 hours.

(assuming each account scan is exactly 500x slower since it's scanning 500x more addresses)
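The estimate above can be reproduced with a few lines of arithmetic. The inputs are the figures quoted in this comment; the linear scaling with the gap limit is the same simplifying assumption stated in the parenthetical.

```python
SECONDS_PER_ACCOUNT = 2    # measured with a gap limit of 20 (figure from this comment)
PATHS_SCANNED = 13         # best case: no used accounts
STANDARD_GAP_LIMIT = 20
LARGE_GAP_LIMIT = 10_000

baseline_seconds = SECONDS_PER_ACCOUNT * PATHS_SCANNED
slowdown = LARGE_GAP_LIMIT // STANDARD_GAP_LIMIT  # assumes time scales linearly
large_seconds = baseline_seconds * slowdown

print(baseline_seconds)                 # 26
print(slowdown)                         # 500
print(round(large_seconds / 3600, 1))   # 3.6
```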

@SomberNight
Member

Setting the gap limit to 10,000 will mean the best case recovery process increases from ~26 seconds to ~3.6 hours.

Default ElectrumX DOS limits will get hit a lot sooner than that, slowing you down to oblivion.

@lukechilds
Contributor Author

One solution could be to read the gap limit from the Electrum config if it exists instead of hardcoding to 20. Or maybe add a new BIP39_recovery_gap_limit param specifically for this.

Then if you connect to your own local Electrum instance, disable DoS limits, and manually set the config param to 10,000 it would work.

But if you're capable of all this, you're probably also capable of just manually restoring your wallet.

@aantonop

aantonop commented Jun 18, 2020 via email

@lukechilds
Contributor Author

The resulting file name could use another label (as the [standard] one), so the user knows which one he selected after the scan.

@rdymac that would indeed be nice but unfortunately that isn't really possible without making quite large changes to the way the Electrum import wizard works.

Currently this PR is just adding a modal to one window and then auto-filling some fields in that window on completion.

Keeping this change simple and not too intrusive into the way Electrum works was a specific goal for this PR: #6155 (comment)

@SomberNight
Member

The resulting file name could use another label (as the [standard] one), so the user knows which one he selected after the scan.

@rdymac that would indeed be nice but unfortunately that isn't really possible without making quite large changes to the way the Electrum import wizard works.

Yes, let's not change [standard], it is related to the wallet type.
Note that the chosen script type is visible in Wallet>Information.
The derivation path is not visible in the GUI atm, but we could add it to the same dialog. If you want this, please see #4700 (and move that discussion there).

@ecdsa
Member

ecdsa commented Jun 18, 2020

IMO what these services do when they use a gap limit of 10000 by default is anti-social behaviour

that's my opinion too. a high limit might work if you are scanning blocks yourself, but we should not engage in such behaviour with public servers.

electrum/base_wizard.py (outdated review thread, resolved)
@lukechilds
Contributor Author

lukechilds commented Jun 22, 2020

I've addressed the feedback and tested/fixed setting up with a HWW wallet.

Unless storing seed data on BaseWizard is an issue (#6219 (comment)), I think this should be OK to merge.

Let me know if you require any other changes or if you think there are any other scenarios I should test.

@SomberNight
Member

SomberNight commented Jun 24, 2020

To expand on https://github.com/spesmilo/electrum/pull/6219/files#r444897453 ,
I think what we could pass around instead is a function (derivation_prefix) -> (account_xpub) (or similar), as then this could be (later?) implemented for hardware wallets as well.

and then, in the GUI, instead of guarding with self.seed_type == "bip39", you could check whether this function was supplied (whether it is None).
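A minimal sketch of that idea, assuming hypothetical names throughout: the wizard passes around an optional callable instead of the seed itself, and the GUI only checks whether the callable is present. The string returned below is a placeholder, not a real xpub derivation.

```python
from typing import Callable, Optional

GetAccountXpub = Callable[[str], str]  # (derivation_prefix) -> (account_xpub)


def make_seed_backed_getter(seed_label: str) -> GetAccountXpub:
    """Hypothetical factory: closes over the secret so callers never see it."""

    def get_account_xpub(derivation_prefix: str) -> str:
        # Placeholder result; real code would derive an xpub from the seed.
        return f"xpub-for-{seed_label}-at-{derivation_prefix}"

    return get_account_xpub


def can_offer_recovery(get_account_xpub: Optional[GetAccountXpub]) -> bool:
    """The GUI shows the detection option only if a getter was supplied."""
    return get_account_xpub is not None


getter = make_seed_backed_getter("demo")
print(can_offer_recovery(getter))  # True
print(can_offer_recovery(None))    # False, e.g. no getter implemented for this keystore
print(getter("m/84'/0'/0'"))
```

The same check would work unchanged if a hardware wallet later supplied its own getter, which is the extensibility argument made in the comment.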

@andronoob

andronoob commented Jun 27, 2020

@aantonop

This is not a tool for merchants or advanced users.

Why can't it be? In my opinion, it can be convenient for them as well. I think this could be implemented progressively (with a manually specified derivation path, preferably selected from a list), with some rate limit to avoid unintentionally DoSing the server.

Sorry, I didn't get your point. It's not about the first funded address only. Maybe it's just not suitable for those wallets with huge gap-limit to be imported into Electrum.

Allowing the user to manually specify child key index also enables mis-uses.

@andronoob

andronoob commented Jun 27, 2020

I also wonder whether the "BIP39-Electrum2.0 duality" issue (#1300) is properly handled here?

An Electrum 2.0 seed phrase is not compatible with BIP39, but it uses exactly the same wordlist. Therefore, an Electrum 2.0 seed can be both a valid Electrum 2.0 seed and a valid BIP39 mnemonic, with a probability of one in sixteen. (A 12-word BIP39 mnemonic phrase has a 4-bit checksum; 2^4 = 16.)
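The one-in-sixteen figure follows from the BIP39 layout, where each word encodes 11 bits and the checksum is one bit per 32 bits of entropy. The sketch below works that out; it assumes the phrase behaves like random bits with respect to the BIP39 checksum, and the function names are invented for the example.

```python
from fractions import Fraction

BITS_PER_WORD = 11  # the BIP39 wordlist has 2048 = 2**11 words


def checksum_bits(word_count: int) -> int:
    """BIP39: ENT + CS = 11 * words and CS = ENT / 32, so CS = total / 33."""
    total_bits = word_count * BITS_PER_WORD
    return total_bits // 33


def chance_random_phrase_is_valid(word_count: int) -> Fraction:
    """Probability that random words happen to satisfy the checksum."""
    return Fraction(1, 2 ** checksum_bits(word_count))


print(checksum_bits(12))                  # 4
print(chance_random_phrase_is_valid(12))  # 1/16
```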

It's then possible (although not very likely, because the BIP39 wallet would show an empty transaction history and zero balance, so the user is highly unlikely to continue) for a user to import an Electrum 2.0 seed into another BIP39-compliant wallet.

I think the seed generation step of the wallet wizard should also show some reminders/warnings about this issue.

@andronoob

andronoob commented Jun 27, 2020

I dreamed that the new derivation path step of the wizard may look like this:

Specify your derivation path


Standard derivation paths

  • Legacy (P2PKH)
    Example: 1CuC3b...

  • SegWit, compatible (P2SH-P2WPKH)
    Example: 3Dq4Y9...

  • SegWit, native Bech32 (P2WPKH)
    Example: bc1qff6a...

    Account index (don't change unless you know what you are doing): _____

Non-standard derivation paths

  • Click here to select from the list...

Not sure?

  • Let me guess!

(empty, if no detection is ever executed) / No fund detected / Recoverable funds detected, congratulations! You may click "next" now. Detect funds / (Button hidden after executing the detection)


Don't touch things below if you have no idea what they mean.

BIP32 derivation path: _________ Address type: P2WPKH (dropdown menu)

Gap limit: ____ (auto-filled, depends on wallet type)

@ecdsa
Member

ecdsa commented Jun 27, 2020

related: #6001

@SomberNight SomberNight added mnemonic/seed 🌼 topic-wizard 🧙‍♂️ related to wallet creation/restore wizard labels Jun 28, 2020
@lukechilds
Contributor Author

lukechilds commented Jul 3, 2020

@SomberNight @ecdsa are you able to re-review when you have time?

I've resolved the issue of attaching the mnemonic/passphrase to the instance and instead now pass an (account_path) -> (account_xpub) function around.

I haven't implemented this function for when a HWW is being restored but I have tested the wizard during HWW restore and it correctly leaves out the recovery option.

I don't currently have an Android development environment setup to test the Kivy GUI, however I had a quick look at the Kivy view and it just accesses all the keyword args with kwargs.get(key). This change just adds a single get_account_xpub keyword arg so it should be safely ignored on Android.

@lukechilds
Contributor Author

lukechilds commented Jul 29, 2020

@SomberNight @ecdsa sorry to keep pinging you guys but not heard anything for ~1 month now so just wanted to check you haven't forgotten about this.

AFAIK I've implemented all required changes and there aren't any conflicts, is there anything else you want from me for this or can we get it merged in?

lukechilds and others added 2 commits August 20, 2020 17:50
- use logger
- allow qt dialog to be GC-ed
- (trivial) add typing; minor formatting
@SomberNight SomberNight force-pushed the bip39-recovery branch 2 times, most recently from cc7a6e8 to 7b122d2 Compare August 20, 2020 17:08
@SomberNight
Member

Starting with 41827cf (copy here: https://github.com/SomberNight/electrum/tree/pr/6219_bak20200820),
I've squashed all your commits, cc7a6e8,
and rebased on master, 7b122d2.

Member

@SomberNight SomberNight left a comment

Looks good. Reviewed code and tested.

I have some minor nits (in addition to what I pushed) but that's fine.
Noticed that the Qt dialog has a sizing bug on dark theme (too small by default) - might be platform dependent, and probably related to QDarkStyle, so ignoring for now (my naive attempts to fix it failed).

"description": description,
"derivation_path": bip32_ints_to_str(account_path),
"script_type": wallet_format["script_type"],
}
Member

Would be better to return a NamedTuple or similar; e.g. so that fields can be typed and can be enumerated statically (by IDE).
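For illustration, the dict shown in the snippet could be expressed as a typed record along these lines. The class name `DiscoveredAccount` is an assumption; the field names are taken from the dict keys above.

```python
from typing import List, NamedTuple


class DiscoveredAccount(NamedTuple):
    description: str
    derivation_path: str
    script_type: str


accounts: List[DiscoveredAccount] = [
    DiscoveredAccount(
        description="Standard BIP84 native segwit (Account 0)",
        derivation_path="m/84'/0'/0'",
        script_type="p2wpkh",
    ),
]

first = accounts[0]
print(first.derivation_path)  # attribute access can be checked by an IDE
print(first._asdict())        # still convertible to a dict, e.g. for JSON output
```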

@SomberNight SomberNight merged commit 928e43f into spesmilo:master Aug 20, 2020
@SomberNight
Member

Thank you for your work.

@SomberNight SomberNight changed the title Automated BIP39 Recovery Automated BIP39 Recovery (scan derivation paths, automatic, restore) Oct 31, 2020
SomberNight added a commit that referenced this pull request Nov 4, 2020
follow-up #6219

for multisig, it's just confusing and useless as-is
@AlistairTB

I would like to do exactly this with a multisig wallet using Zpub keys (for a read-only wallet). In case it's not obvious, the Zpubs were derived from BIP39 words and extended passphrase. Is this possible?

I've integrated the functionality into the GUI.

Demo:

[Image: electrum_bip39]

Let me know if it looks ok or if you have any suggestions, thanks!

@SomberNight
Member

@AlistairTB No, this is only implemented for singlesig wallets.

Also, it is not clear exactly how you would want to use this.
The functionality here is for bruteforcing the derivation path and script type.
For a multisig wallet given Zpubs, you already know the script type, and you would typically have the Zpubs already at the "account level" of the derivation, so you already know the derivation path.
Further note that it is not possible to do hardened derivation starting from a master public key.

@AlistairTB

Thank you for the reality check. I was using it the way it was intended, however, I had the wallet type wrong (3/3 vs 2/3) and was looking to solve the wrong problem.

@St333p

St333p commented Mar 7, 2023

Today I was helping a friend recover funds from a ledger and I'd say this feature would be very well appreciated also when creating a watch-only wallet from a hardware device. Shall I open a dedicated issue?

@SomberNight
Member

I'd say this feature would be very well appreciated also when creating a watch-only wallet from a hardware device. Shall I open a dedicated issue?

Sure, go ahead.

Development

Successfully merging this pull request may close these issues.

Feature/Plugin request: Mnemonic recovery UI with derivation path "scanning"
10 participants