
replace json on disk, with leveldb/sql/...? wallet file does not scale. partial writes #4823

Open
SomberNight opened this issue Nov 2, 2018 · 26 comments
Labels
enhancement ✨ topic-walletstorage 💾 not wallet itself but storage/db-related

@SomberNight
Member

json on disk does not scale to large files, as every time we need to flush, the whole file needs to be rewritten.

This is a problem for:

  • channel_db (routing table for LN)
  • watcher_db (also LN)
  • even the wallet file itself...
@neocogent
Contributor

Would using sqlite3 be an option for this? I think it's pretty fast/efficient, though of course not a full-blown SQL server. I don't think installing a server is going to fly well for most users anyway, but there are a lot of Python apps that use sqlite3 as a backend (some even handling a lot of data, like darktable). It has the advantage that there are utilities around for viewing/editing the db file (for advanced users doing fixes etc). The Python module is widely available or often preinstalled. Anyway, just something to look at.

JSON has some advantages but only when files are small enough to visually inspect/edit.

@ecdsa
Member

ecdsa commented May 16, 2019

this is done for channel_db and watchtower_db
we will not do it for the wallet file (not until we forward payments)

@ecdsa ecdsa closed this as completed May 16, 2019
@SomberNight SomberNight added the topic-walletstorage 💾 not wallet itself but storage/db-related label Aug 8, 2019
@SomberNight SomberNight changed the title replace json on disk, with leveldb/sql/...? replace json on disk, with leveldb/sql/...? wallet file does not scale. partial writes Aug 8, 2019
@SomberNight
Member Author

SomberNight commented Aug 8, 2019

I looked at this again, specifically for the wallet file, to replace json. Being able to do partial writes will become necessary for large wallets, especially with lightning (and especially when we forward).

The main issue is that we currently encrypt the whole wallet file (by default). This is good for protecting metadata, and also for privacy reasons. This does not scale, however: every time we need to make sure changes are persisted on disk, we serialize the whole wallet file json, compress it with zlib, encrypt it with AES, and write the whole thing to disk. (Even with encryption disabled, we write the whole json to disk every time.)
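For illustration, the full-rewrite flow just described looks roughly like this (a minimal sketch with hypothetical names; not the actual Electrum code):

import json
import os
import zlib

def write_whole_file(path: str, db: dict, encrypt) -> None:
    # serialize the entire wallet, compress, encrypt, rewrite the whole file;
    # `encrypt` stands in for AES encryption keyed from the wallet password
    plaintext = json.dumps(db).encode('utf-8')
    ciphertext = encrypt(zlib.compress(plaintext))
    with open(path, 'wb') as f:
        f.write(ciphertext)
        f.flush()
        os.fsync(f.fileno())  # the full rewrite happens on every flush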

We could simply replace json with sqlite/leveldb, but then we could no longer just encrypt the whole wallet file.

To be able to do partial writes, and have encryption, we either

  1. encrypt individual records (e.g. keys and values if using a kv store), or
  2. encrypt transparently between the filesystem layer and the db

The latter would clearly be preferred. (At the very least, there are many footguns and potential mistakes to be made with the first approach.)

(note: ElectrumSV implemented option1 using sqlite)

To encrypt transparently below the db (option2), we would probably need the db itself to know about the encryption and do it for us. (I guess it's not strictly needed as you could create a virtual filesystem and do the encryption there yourself (see e.g. TrueCrypt/VeraCrypt) but that might be even more complicated)

I have found two viable ways to do option2.


Option2A: SQLCipher

https://github.com/sqlcipher/sqlcipher
https://www.zetetic.net/sqlcipher/design/
SQLCipher is an extension to sqlite that allows encrypting the db (using e.g. AES-CBC for each page).
It looks maintained and appears to have a long track record.
It's a C project, like sqlite, so we would need python bindings.
There seem to be a few competing projects (but none of them seems too reassuring):
https://github.com/leapcode/pysqlcipher
https://github.com/coleifer/pysqlite3
https://github.com/coleifer/sqlcipher3
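To give a sense of what option2A would look like in code, here is a minimal sketch using the sqlcipher3 bindings (an assumption; the bindings above expose roughly the stdlib sqlite3 API, with the key supplied via PRAGMA):

from sqlcipher3 import dbapi2 as sqlcipher  # stdlib-sqlite3-compatible API

conn = sqlcipher.connect('wallet.db')
conn.execute("PRAGMA key = 'wallet-password'")  # keys the page-level encryption
conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)")
conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ('seed', b'9dk'))
conn.commit()  # only the touched pages are re-encrypted and rewritten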

Option2B: RocksDB

RocksDB is a key-value store (like LevelDB).
It has had support for the kind of encryption we want since this PR. However, not everything is implemented, so it does not work out of the box.
AFAICT we would need to subclass the BlockCipher class and implement (probably) AES.
Frustratingly, I could not find any examples of using RocksDB with this encryption option.
The PR author likely contributed this to be used in arangodb, but there it is a feature only available in the Enterprise edition; I could not find the relevant code.

In any case, RocksDB is a C++ project, so we would then need Python bindings; see e.g. https://github.com/twmht/python-rocksdb
We would need to expose the encryption functionality through these bindings.
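For comparison, basic (unencrypted) usage through the python-rocksdb bindings looks like the sketch below; the encryption env from the PR mentioned above is not exposed by these bindings, and that is exactly the part we would have to add:

import rocksdb

opts = rocksdb.Options(create_if_missing=True)
db = rocksdb.DB('wallet.rocksdb', opts)
db.put(b'seed', b'9dk')    # a partial write: only this record is persisted
assert db.get(b'seed') == b'9dk'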

@ysangkok
Contributor

ysangkok commented Oct 24, 2019

Your proposed solutions are difficult because they both replace the storage layer and add encryption in one go; it is going to be hard to get that right and have it be robust on all platforms (especially since the proposed alternatives need native code).

I think "event sourcing" should be considered as an alternative. What that means, is storing only changes. This is simple, but requires more disk space. But a typical big wallet pretty much only grows, so it shouldn't be too bad. Event sourcing is extremely inefficient when data is deleted/replaced. This is space-inefficient like a CoW filesystem where snapshots are made after each change.

Let's imagine the wallet file is now

{ "seed": "9dk",
  "qt_window_position": [500, 250]
  "transactions": ["01000000..."]}

Each time "qt_window_position" is changed, we need to reencrypt all the transactions, which are way larger.

In event sourcing, the final database state is implicit from a bunch of changes:

initial_state:

{"seed": "9dk"}

change0:

[["add", {"qt_window_position": [100, 100]}],["add", {"transactions": "010000..."}] ]

change1:

[["replace", {"qt_window_position": [500, 250]}]]

Something like json-path can be used to identify keys in a deep structure. EDIT: or json-patch, which has an IETF spec (RFC 6902).

The change files can be encrypted individually and never change. On wallet startup, they are all decrypted and applied serially; the wallet then lives unencrypted in memory as a plain Python object, as before. In contrast to migrating to LevelDB, it doesn't require switching to a flatter DB layout.

A naive implementation would write a single change file every single time save() is called. By introducing a transaction object (using the with construct of Python), the code can be migrated to fewer transactions, which will then require fewer change files (see the sketch below).

EDIT: There is a heavy-weight Python-based event sourcing library that even supports encryption: eventsourcing#Application-level encryption
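To make the transaction-object idea concrete, here is a minimal hypothetical sketch (the class, its names, and the change-file framing are invented for illustration; the real design would encrypt each change file):

import json
from contextlib import contextmanager

class WalletChangeLog:
    def __init__(self, path: str):
        self.path = path
        self._pending = []      # json-patch-style operations not yet on disk
        self._in_batch = False
        self._seq = 0           # sequence number of the next change file

    def record(self, op: str, data: dict) -> None:
        self._pending.append([op, data])
        if not self._in_batch:
            self._flush()       # naive mode: one change file per modification

    @contextmanager
    def batch(self):
        # group many modifications into a single change file
        self._in_batch = True
        try:
            yield
        finally:
            self._in_batch = False
            self._flush()

    def _flush(self) -> None:
        if not self._pending:
            return
        with open(f"{self.path}.change{self._seq}", "w") as f:
            json.dump(self._pending, f)
        self._seq += 1
        self._pending = []

With this, wrapping several record() calls in "with log.batch():" produces one change file for the whole block instead of one per call.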

@ecdsa
Member

ecdsa commented Oct 28, 2019

I like the idea of event sourcing
does json-patch require multiple files?
it seems that the changes could be added serially to the wallet file

@ysangkok
Contributor

The json-patch spec is concerned with the encoding of the changes, not with how to store them. The json-patch library takes JSON objects or strings (i.e. encoded objects). So one would have to develop a framing mechanism to put everything in one file.
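For reference, applying a change with the Python json-patch library looks like the sketch below; newline-delimited serialized patches are one plausible framing (my assumption, not something the spec prescribes):

import jsonpatch

doc = {"seed": "9dk", "qt_window_position": [100, 100]}
patch = jsonpatch.JsonPatch([
    {"op": "replace", "path": "/qt_window_position", "value": [500, 250]},
])
doc = patch.apply(doc)   # {'seed': '9dk', 'qt_window_position': [500, 250]}

# one possible framing: append one serialized patch per line to the wallet file
line = patch.to_string() + "\n"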

@ecdsa
Member

ecdsa commented Oct 30, 2019

I can see how this would work with encryption, but one issue is what to do with the compression currently performed before encryption.

  • option 1: no compression at all. in that case only the last cipher block will need to be rewritten when we append something to the file
  • option 2: we compress only the main json, and patches added to the wallet file are not compressed.

@rt121212121

ElectrumSV will be dropping its encryption except for things like private key data. My research led me to believe that most encryption options were unproven and would impose too much development overhead, and likely too much risk. At least for now, we'll be leaving it up to the user to ensure their wallets are secured appropriately if they require privacy.

@ecdsa
Member

ecdsa commented Nov 24, 2019

I have implemented "append to encrypted" here: f28ebf9

@ecdsa
Member

ecdsa commented Nov 25, 2019

Here is a proof of concept for storage using jsonpatch: 4f3b0bf

To make it efficient, we might want to replace all lists with dicts in the wallet file.

@ecdsa
Member

ecdsa commented Nov 27, 2019

Here is my jsonpatch branch: https://github.com/spesmilo/electrum/commits/jsonpatch
It is still a draft, and I have not tested it with lightning.
I think it should probably not be part of the next release.

@SomberNight
Member Author

Rotonen on IRC suggested using persistent, potentially with ZODB. There is a module for these that can be used to have encryption at rest.

@ecdsa ecdsa modified the milestones: Lightning release2, backlog May 30, 2020
SomberNight added a commit that referenced this issue Jul 5, 2021
Fixes: after adding a payment request, if the process was killed,
the payreq might get lost. In case of using the GUI, neither the
callee nor the caller called wallet.save_db().

Unclear where wallet.save_db() should be called...
Now each method tries to persist their changes by default,
but as an optimisation, the caller can pass write_to_disk=False
e.g. when calling multiple such methods and then call wallet.save_db() itself.

If we had partial writes, which would either rm the need for wallet.save_db()
or at least make it cheaper, this code might get simpler...

related: #6435
related: #4823
@ln2max

ln2max commented Sep 14, 2021

I am still looking into the background and previous proposals for this issue. Based on skimming the comments so far here and in #5999, I would like to propose unifying the wallet format to internally-tagged JSON-per-line objects, which can be written incrementally and periodically "squashed" down to a single snapshot entry. Using @ysangkok's example above, a wallet with an initial snapshot entry and several incremental entries would look something like:

{"entry_format": "unencrypted", "entry_type": "snapshot", "data": {"seed": "9dk"}, "version": 1}
{"entry_format": "unencrypted", "entry_type": "incremental_update", "operations":  [{"type": "add", "data": {"qt_window_position": [100, 100]}],{"type": "add", "data": {"transactions": "010000..."}} ], "version": 1}
{"entry_format": "unencrypted", "entry_type": "incremental_update", "operations": [{"type": "replace", "data": {"qt_window_position": [500, 250]}}], "version": 1}

Conversely, an encrypted wallet entry might look like:

{"entry_format": "encrypted", "mac": "d3adbeef", "ciphertext": "foobarbaz", "version": 1, "wallet_type": "xpubpw", "ephemeral_pubkey_bytes": "12ab"}

where the ciphertext block encapsulates an unencrypted-format entry, so that the parsing logic is exactly the same. We would merely add encrypt/decrypt operations to the existing pipeline.

Note the above is very much a sketch of how it might work and by no means complete. More fields will almost certainly be necessary.

Rather than computing the MAC over the entire entry (or entire wallet) as we do now, the MAC would be computed over the ciphertext and other relevant fields in a defined fashion, e.g.:

import hmac

DIGESTMOD = 'sha512_256'

def _entry_mac_version_1(key: bytes, ciphertext: str, version: int,
                         wallet_type: str, ephemeral_pubkey_bytes: str) -> bytes:
    msg = "+".join((ciphertext, str(version), wallet_type, ephemeral_pubkey_bytes))
    return hmac.digest(key, msg.encode('utf-8'), DIGESTMOD)

def _entry_mac_version_2(key: bytes, ciphertext: str, version: int,
                         wallet_type: str, ephemeral_pubkey_bytes: str,
                         some_new_fancy_parameter: str) -> bytes:
    msg = "+".join((ciphertext, str(version), wallet_type,
                    ephemeral_pubkey_bytes, some_new_fancy_parameter))
    return hmac.digest(key, msg.encode('utf-8'), DIGESTMOD)

def entry_mac(version: int, *args) -> bytes:
    if version == 1:
        return _entry_mac_version_1(*args)
    elif version == 2:
        return _entry_mac_version_2(*args)
    raise NotImplementedError("Unknown version {}".format(version))

This approach provides some major benefits:

  • it is extremely robust to system crashes and other sudden write interruptions. We can simply parse the wallet by iterating over the lines, and stop when we reach invalid JSON (see the sketch after this list)
  • JSON parsing is highly optimized and some very good, very fast libraries are available (ujson). Despite the very verbose proposed format, the performance impact will (hopefully) be negligible
  • the format is modular and easy to expand as needs change in the future; we just add fields
  • while the format is less human-readable than the pretty-printed JSON plaintext the wallet currently uses, it is reasonably human-readable in both the encrypted and unencrypted modes
  • unlike array-indexing based approaches, everything is extremely explicit, which reduces the chances that bugs or mistakes will happen in the wallet parsing. This also makes it easier for third-party implementers to work with.
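A minimal sketch of the torn-write-tolerant loading from the first bullet (the function name is hypothetical):

import json

def load_entries(path: str) -> list:
    entries = []
    with open(path, "r") as f:
        for line in f:
            try:
                entries.append(json.loads(line))
            except json.JSONDecodeError:
                break  # a partially written last line: ignore the torn tail
    return entries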

@ecdsa
Member

ecdsa commented Sep 21, 2021

I think you have a good point about robustness to crashes and write interruptions.
I am in the process of rebasing the jsonpatch branch against the current codebase.
That branch is 2 years old; it is more of a rewrite than a rebase...

@ecdsa
Member

ecdsa commented Sep 22, 2021

@ln2max, I have updated and pushed the jsonpatch branch:
https://github.com/spesmilo/electrum/commits/jsonpatch
The work has been split into four separate commits; only the last commit deals with the encryption scheme.

I think we can improve it with your idea of using standalone encrypted blobs, one per update. This indeed makes it robust to crashes.

I do not like the extra layer of JSON in your proposal. Basically, you are embedding encrypted JSON in another layer of JSON. This is unnecessarily complex, and it leaks some info about the wallet.

I think we could use a sequence of encrypted binary blobs, without separator, each with its own MAC and length info.
Someone who has access to the file but does not have the password would not be able to see the separation between blobs.

@ln2max

ln2max commented Sep 22, 2021 via email

@ecdsa
Member

ecdsa commented Sep 23, 2021

No, the length of a cipher block does not need to be in the previous block; it can be put at the beginning of each block. But actually, we might not need it at all (see below).

What I am concerned about is not only a lack of privacy, but the possibility that an adversary who has access to the encrypted wallet file could remove part of it, and the result would still be seen as valid. For example, they could delete a channel update, which would cause the wallet to use a revoked state.

Note that this attack is also possible with the encrypted blobs method I proposed above. In order to avoid that, @SomberNight suggested that we use a single MAC at the beginning of the file, instead of one per encrypted blob.

To illustrate this, assume that the file contains text1 and we want to append text2.
This is what the current jsonpatch branch is doing, when text2 is added:

before:        [magic | ephemeral_pubkey | cipher(text1) | mac1]
after:         [magic | ephemeral_pubkey | cipher(text1+text2) | mac2]

The new proposal would look like this:

before:         [magic | ephemeral_pubkey | mac1] [cipher(text1)]
after:          [magic | ephemeral_pubkey | mac2] [cipher(text1)][cipher(text2)]   

The write method would first append the new cipher, then update the MAC. The new MAC is obtained by hashing the previous MAC and the new ciphertext. When decrypting, one would compare the MAC at the beginning of the file to the MAC obtained at the end of each cipher block, and stop when they match.

Note that each cipher block has its own PKCS7 padding, so its length is a multiple of 16. The length of each cipher block does not need to be included in the block, if we add a separator at the end of the encrypted text (for example \n). We could also use no separator at all and recompute the hash every 16 bytes.
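A sketch of the chained-MAC bookkeeping described above (using a keyed HMAC rather than a bare hash, anticipating the forgery concern raised below; the initial value, digest, and function names are my assumptions):

import hmac

def chain_mac(key: bytes, prev_mac: bytes, new_ciphertext: bytes) -> bytes:
    # new file-level MAC = keyed MAC over (previous MAC || new cipher block)
    return hmac.digest(key, prev_mac + new_ciphertext, 'sha256')

def find_valid_prefix(key: bytes, header_mac: bytes, blocks: list) -> list:
    # walk the cipher blocks, chaining the MAC; accept the shortest prefix
    # whose running MAC matches the MAC stored in the file header
    mac = b'\x00' * 32  # assumed initial value
    prefix = []
    for block in blocks:
        mac = chain_mac(key, mac, block)
        prefix.append(block)
        if hmac.compare_digest(mac, header_mac):
            return prefix
    raise ValueError("no prefix matches the header MAC: file corrupted")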

@ecdsa
Member

ecdsa commented Sep 23, 2021

> One of the original considerations was to provide an extensible header format for encrypted wallets, so that new fields/descriptors/metadata can be added later while ensuring backwards compatibility.

Indeed, but a header should remain a header (magic bytes at the beginning of the file).
I would not call "header" the operation of embedding every cipher block in JSON...

@SomberNight
Member Author

SomberNight commented Sep 23, 2021

> The write method would first append the new cipher, then update the MAC. The new MAC is obtained by hashing the previous MAC and the new ciphertext.

We need to be mindful that the attacker might delete some updates from the end and then try to recalculate and update the MAC.
Note that if the MAC is cleartext and is calculated over the ciphertexts, then the attacker can recalculate and update it.
So maybe the MAC should be over the plaintexts. (EDIT: or it could be encrypted somehow)

@ln2max

ln2max commented Sep 23, 2021 via email

@ln2max

ln2max commented Sep 23, 2021 via email

@ln2max

ln2max commented Sep 23, 2021 via email

@SomberNight
Member Author

>>> The write method would first append the new cipher, then update the MAC. The new MAC is obtained by hashing the previous MAC and the new ciphertext.

>> We need to be mindful that the attacker might delete some updates from the end and then try to recalculate and update the MAC.
>> Note that if the MAC is cleartext and is calculated over the ciphertexts, then the attacker can recalculate and update it.
>> So maybe the MAC should be over the plaintexts.

> I certainly hope not... as far as I know, HMAC is calculated using a key so that an attacker cannot recalculate a plaintext MAC. The idea being that Bob can verify the HMAC before decrypting the ciphertext, in order to ensure the ciphertext has not been modified to e.g. exploit the decryption process.

You are right, using an HMAC addresses my concern.

>> The write method would first append the new cipher, then update the MAC. When decrypting, one would compare the MAC at the beginning of the file to the MAC obtained at the end of each cipher block, and stop when they match.

> What is the computational overhead associated with computing the MAC many times over? Is it going to slow down wallet reads significantly?

No, reads would be done purely in memory in any of the json-based schemes. Reads from the file are ~only done when opening/loading a wallet file.
Writes would be append-only to the file (apart from maybe updating a fixed-position global MAC), with the occasional consolidation ("apply all updates"). Writes would also update the in-memory representation, of course.

That is, the whole wallet file is loaded into memory; writes update both the in-memory representation and the file on disk; reads only touch the in-memory representation. (This is btw how it works already; the goal is to optimise the writes so that updating the file on disk does not require rewriting the whole file.)

@ln2max

ln2max commented Sep 24, 2021 via email

@SomberNight
Member Author

> Ah, sorry, I meant overhead with respect to computing the HMAC length_of_file_in_bytes / 16 times, every time the file is read. I don't know how fast Python's HMAC implementation is.

Well, every time the file is read (i.e. the wallet gets opened), we decrypt it. I imagine decrypting it and hashing it have comparable performance. I would not be concerned about hashing the whole file once when opening it taking too long. In any case, we have been doing this for years (the current encryption scheme has an HMAC over the whole file already).

>> Writes would be append-only to the file (apart from maybe updating a fixed-position global MAC)

> How good is Python's random-access write, i.e. altering just the fixed-position global MAC?
>
> As far as I know, it would be necessary to use the "w" file mode, and thereby re-write the whole file every time we update the fixed-position global MAC. That would eliminate ~all the benefits of the append-oriented format.

We have similar logic that updates short chunks of a raw file at fixed positions for the block header storage. It performs well.
See:

with open(filename, 'rb+') as f:
    if truncate and offset != self._size * HEADER_SIZE:
        f.seek(offset)
        f.truncate()
    f.seek(offset)
    f.write(data)
    f.flush()
    os.fsync(f.fileno())

> Come to think of it, if in-memory operations are not a bottleneck here, then the only bottleneck which justifies an append-based format would be if the "a" (append-only) file open mode offers better performance than "w".
>
> Has anyone benchmarked this to confirm it's the case?

I have not benchmarked the different file open modes;
but I have some stats for writing a large wallet file (~17 MB) to disk on master,
and in those, the actual filesystem write is the least of our concerns.

Still, I believe that, in principle, partial writes (e.g. appending updates to the end) should help with each of these steps.

@ln2max

ln2max commented Oct 15, 2021 via email
