Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache latest checkpoint in-memory to avoid deserializing and reserializing it too often #2161

Merged
merged 2 commits into from
Sep 25, 2020

Conversation

KtorZ
Copy link
Member

@KtorZ KtorZ commented Sep 22, 2020

Issue Number

Overview

This little change has some drastic performance improvements on
various tasks of the wallet:

Wallet Overview
   number of addresses: 29209
   number of transactions: 48819
   number of utxos: 2778 UTxOs

Bench                   | No Cache | With Cache
---                     | --       | ---
restoreTime             | 63min    | 22min
readWalletTime          | 715ms    | 53.70 ms
listAddressesTime       | 1.863s   | 2.169 ms
listTransactionsTime    | 75.72s   | 12.68 s
importOneAddressTime    | 2.526s   | 260.4 ms
importManyAddressesTime | 3.104s   | 1.690 s
estimateFeesTime        | 2.103s   | 63.11 ms

This cache approach however supposes that there's only a single wallet
per database, which is the case and has been the case for a long time
now. I'll remove the WalletId from the database interface in a next
commit to clean things up a bit.

Comments

@KtorZ KtorZ requested a review from Anviking September 22, 2020 08:30
@KtorZ KtorZ added the IMPROVEMENT Mark a PR as an improvement, for auto-generated CHANGELOG label Sep 22, 2020
@KtorZ KtorZ self-assigned this Sep 22, 2020
:: W.WalletId
-> SqlPersistT IO (Maybe (W.Wallet s))
selectLatestCheckpoint wid =
liftIO (readIORef cache) >>= maybe fromDatabase (pure . Just)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache doesn't live in atomically, in contrast to the rest of the DB layer. Couldn't this lead to race-conditions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can only be a single execution thread in atomically (constructed with an MVar lock), hence the use of a mere IORef here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining something more like:

----------------------------------------------------------------------> time
1: putCheckpoint cp1 
2:           selectLatest >>= f >>= putCheckpoint
                                                     selectLatest >>= g >>= putCheckpoint
  1. Updates cache, but operation is cancelled before the DB is updated. Cache and DB diverges.
  2. DB cp will be overwritten by the cached checkpoint the next time it is modified

But I guess the worst thing that can happen is that a somehow-failing putCheckpoint will end up succeeding 🤔

Copy link
Member Author

@KtorZ KtorZ Sep 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates cache, but operation is cancelled before the DB is updated. Cache and DB diverges.

Cache is updated after the put operation has succeeded though. If it doesn't succeed, the cache isn't updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, Matthias pointed out that "We create an illusion of transaction by having a top-level MVar lock.", so there shouldn't be Sqlite transactions being run and then somehow cancelled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added some notes in the code about this in place where the cache is created.

1600712574.291298178s 5000
1600712574.365636401s 6000
1600712574.503498748s 7000
1600712574.663584257s 8000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 Perhaps worth adding to .gitignore
?
Skärmavbild 2020-09-22 kl  12 13 36

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. Sounds like I a wrong git add lib indeed :)

Copy link
Collaborator

@Anviking Anviking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice speedup. LGTM.

clearCache wid = liftIO $ modifyIORef cache $ Map.delete wid

let writeCache :: W.WalletId -> W.Wallet s -> SqlPersistT IO ()
writeCache wid cp = liftIO $ modifyIORef cache $ Map.alter alter wid
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance of buildup of thunks here? Though I suppose we'd notice that when testing this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought of it and:

  • Map is strict, so values are evaluated before they are put in the Map
  • We never just write in the cache without reading and evaluating data from it first.

So no, I don't think there's any chance that large thunks will be built here. I could switch to modifyIORef' for the tranquility of mind, we don't need the lazyness anyway since we have in principle already evaluated the cache to WHNF (and that's actually part of why we want to cache it!)

@KtorZ
Copy link
Member Author

KtorZ commented Sep 23, 2020

bors try

iohk-bors bot added a commit that referenced this pull request Sep 23, 2020
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Sep 23, 2020

try

Build failed:

…lizing too often

  This little change has some drastic performance improvements on
  various tasks of the wallet:

  ```
  Wallet Overview
     number of addresses: 29209
     number of transactions: 48819
     number of utxos: 2778 UTxOs

  Bench                   | No Cache | With Cache
  ---                     | --       | ---
  restoreTime             | 63min    | 22min
  readWalletTime          | 715ms    | 53.70 ms
  listAddressesTime       | 1.863s   | 2.169 ms
  listTransactionsTime    | 75.72s   | 12.68 s
  importOneAddressTime    | 2.526s   | 260.4 ms
  importManyAddressesTime | 3.104s   | 1.690 s
  estimateFeesTime        | 2.103s   | 63.11 ms
  ```
@KtorZ
Copy link
Member Author

KtorZ commented Sep 24, 2020

bors r+

iohk-bors bot added a commit that referenced this pull request Sep 24, 2020
2161: cache latest checkpoint in-memory to avoid deserializing and reserializing it too often r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->


# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->



  This little change has some drastic performance improvements on
  various tasks of the wallet:

  ```
  Wallet Overview
     number of addresses: 29209
     number of transactions: 48819
     number of utxos: 2778 UTxOs

  Bench                   | No Cache | With Cache
  ---                     | --       | ---
  restoreTime             | 63min    | 22min
  readWalletTime          | 715ms    | 53.70 ms
  listAddressesTime       | 1.863s   | 2.169 ms
  listTransactionsTime    | 75.72s   | 12.68 s
  importOneAddressTime    | 2.526s   | 260.4 ms
  importManyAddressesTime | 3.104s   | 1.690 s
  estimateFeesTime        | 2.103s   | 63.11 ms
  ```

  This cache approach however supposes that there's only a single wallet
  per database, which is the case and has been the case for a long time
  now. I'll remove the WalletId from the database interface in a next
  commit to clean things up a bit.



# Comments

<!-- Additional comments or screenshots to attach if any -->

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Assign the PR to a corresponding milestone
 ✓ Acknowledge any changes required to the Wiki
-->


Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Sep 24, 2020

Timed out.
#2216

@KtorZ
Copy link
Member Author

KtorZ commented Sep 24, 2020

bors retry

iohk-bors bot added a commit that referenced this pull request Sep 24, 2020
2161: cache latest checkpoint in-memory to avoid deserializing and reserializing it too often r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->


# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->



  This little change has some drastic performance improvements on
  various tasks of the wallet:

  ```
  Wallet Overview
     number of addresses: 29209
     number of transactions: 48819
     number of utxos: 2778 UTxOs

  Bench                   | No Cache | With Cache
  ---                     | --       | ---
  restoreTime             | 63min    | 22min
  readWalletTime          | 715ms    | 53.70 ms
  listAddressesTime       | 1.863s   | 2.169 ms
  listTransactionsTime    | 75.72s   | 12.68 s
  importOneAddressTime    | 2.526s   | 260.4 ms
  importManyAddressesTime | 3.104s   | 1.690 s
  estimateFeesTime        | 2.103s   | 63.11 ms
  ```

  This cache approach however supposes that there's only a single wallet
  per database, which is the case and has been the case for a long time
  now. I'll remove the WalletId from the database interface in a next
  commit to clean things up a bit.



# Comments

<!-- Additional comments or screenshots to attach if any -->

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Assign the PR to a corresponding milestone
 ✓ Acknowledge any changes required to the Wiki
-->


Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Sep 24, 2020

Build failed:

Log-limit exceeded, perhaps due to running out of faucet funds, which should be fixed on recent master.
#2216

  This was introduced as a hotfix for the Daedalus flight release of April 2019. By now,
  we have got rid of the old cardano-sl implementation for which this was causing an issue
  and can get back to 'normal'. I also removed the associated test.
@KtorZ
Copy link
Member Author

KtorZ commented Sep 24, 2020

bors r+

iohk-bors bot added a commit that referenced this pull request Sep 24, 2020
2161: cache latest checkpoint in-memory to avoid deserializing and reserializing it too often r=KtorZ a=KtorZ

# Issue Number

<!-- Put here a reference to the issue this PR relates to and which requirements it tackles -->


# Overview

<!-- Detail in a few bullet points the work accomplished in this PR -->



  This little change has some drastic performance improvements on
  various tasks of the wallet:

  ```
  Wallet Overview
     number of addresses: 29209
     number of transactions: 48819
     number of utxos: 2778 UTxOs

  Bench                   | No Cache | With Cache
  ---                     | --       | ---
  restoreTime             | 63min    | 22min
  readWalletTime          | 715ms    | 53.70 ms
  listAddressesTime       | 1.863s   | 2.169 ms
  listTransactionsTime    | 75.72s   | 12.68 s
  importOneAddressTime    | 2.526s   | 260.4 ms
  importManyAddressesTime | 3.104s   | 1.690 s
  estimateFeesTime        | 2.103s   | 63.11 ms
  ```

  This cache approach however supposes that there's only a single wallet
  per database, which is the case and has been the case for a long time
  now. I'll remove the WalletId from the database interface in a next
  commit to clean things up a bit.



# Comments

<!-- Additional comments or screenshots to attach if any -->

<!-- 
Don't forget to:

 ✓ Self-review your changes to make sure nothing unexpected slipped through
 ✓ Assign yourself to the PR
 ✓ Assign one or several reviewer(s)
 ✓ Once created, link this PR to its corresponding ticket
 ✓ Assign the PR to a corresponding milestone
 ✓ Acknowledge any changes required to the Wiki
-->


Co-authored-by: KtorZ <matthias.benkort@gmail.com>
@iohk-bors
Copy link
Contributor

iohk-bors bot commented Sep 24, 2020

Build failed:

#2218 ADP-619

  src/Test/Integration/Scenario/API/Byron/Addresses.hs:410:46:
  1) API Specifications, BYRON_ADDRESSES, ADDRESS_IMPORT_05 - I can import 15000 of addresses
       uncaught exception: IOException of type UserError
       user error (Waited longer than 90s (more than 2 epochs) for an action to resolve. Action: "Addresses are imported". Error condition: Just (HUnitFailure (Just (SrcLoc {srcLocPackage = "cardano-wallet-core-integration-2020.9.22-6DPlnrdhceJLXA8CsrMoUY", srcLocModule = "Test.Integration.Scenario.API.Byron.Addresses", srcLocFile = "src/Test/Integration/Scenario/API/Byron/Addresses.hs", srcLocStartLine = 430, srcLocStartCol = 13, srcLocEndLine = 430, srcLocEndCol = 35})) (ExpectedButGot Nothing "15000" "0")))

  To rerun use: --match "/API Specifications/BYRON_ADDRESSES/ADDRESS_IMPORT_05 - I can import 15000 of addresses/"

@KtorZ
Copy link
Member Author

KtorZ commented Sep 25, 2020

bors merge

@iohk-bors
Copy link
Contributor

iohk-bors bot commented Sep 25, 2020

Build succeeded:

@iohk-bors iohk-bors bot merged commit d4d5cf3 into master Sep 25, 2020
@iohk-bors iohk-bors bot deleted the KtorZ/checkpoint-caching branch September 25, 2020 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IMPROVEMENT Mark a PR as an improvement, for auto-generated CHANGELOG
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants