Skip to content
This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Bigger blockchain gets, topAccounts endpoint become less efficient #233

Closed
ghost opened this issue Jul 10, 2016 · 9 comments
Closed

Bigger blockchain gets, topAccounts endpoint become less efficient #233

ghost opened this issue Jul 10, 2016 · 9 comments
Assignees
Milestone

Comments

@ghost
Copy link

ghost commented Jul 10, 2016

https://127.0.0.1:8000/api/accounts/top?&offset=0&limit=100

Currently with current main-net height this function is not usable.
It's getting worse and worse as blockchain grows. This issue has been already reported in lisk-explorer repo but its strictly related to this repository.

@fix
Copy link
Contributor

fix commented Jul 11, 2016

Why do you assume it is gettting worse as the blockchain grows?
Looking at the code this is a bare postgres query...
https://github.com/LiskHQ/lisk/blob/development/modules/accounts.js#L231

@ghost
Copy link
Author

ghost commented Jul 11, 2016

I have been checking this endpoint while blockchain was rebuilding, as height was bigger endpoint become slower and slower, on current height it's impossible to fetch any data. Infinite loading. Check yourself with enabled top accounts functionality. It's not useable.

@fix
Copy link
Contributor

fix commented Jul 11, 2016

ok i'll check it!

@ghost
Copy link
Author

ghost commented Jul 11, 2016

This is why https://explorer.lisk.io/topAccounts does not work currently.

@fix
Copy link
Contributor

fix commented Jul 11, 2016

@karmacoma any idea why? it seems the original dev knows it because there is the TOP switch. But looking at the code i cannot tell.

@karmacoma
Copy link
Contributor

Not sure why the TOP switch is there, probably because in the beginning we only used this feature for the explorer, so we disabled for other use cases. The performance issue was never a factor before, as we did not have the number of accounts. It's likely due to a performance issue related to sorting account balances i.e. column indexing.

@karmacoma
Copy link
Contributor

Here is the query used to SELECT top accounts.

SELECT "username",
  "isDelegate",
  "u_isDelegate",
  "secondSignature",
  "u_secondSignature",
  "u_username",
  UPPER("address") AS "address",
  ENCODE("publicKey", 'hex') AS "publicKey",
  ENCODE("secondPublicKey", 'hex') AS "secondPublicKey",
  ("balance")::BIGINT AS "balance",
  ("u_balance")::BIGINT AS "u_balance",
  ("vote")::BIGINT AS "vote",
  ("rate")::BIGINT AS "rate",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2delegates
    WHERE "accountId" = a."address"
    ) AS "delegates",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2u_delegates
    WHERE "accountId" = a."address"
    ) AS "u_delegates",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2multisignatures
    WHERE "accountId" = a."address"
    ) AS "multisignatures",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2u_multisignatures
    WHERE "accountId" = a."address"
    ) AS "u_multisignatures", "multimin", "u_multimin", "multilifetime", "u_multilifetime", "blockId", "nameexist", "u_nameexist", "producedblocks", "missedblocks", ("fees")::BIGINT AS "fees", ("rewards")::BIGINT AS "rewards"
FROM "mem_accounts" AS "a"
ORDER BY "balance" DESC;

It appears to be missing a LIMIT clause. Therefore, this would be the first step towards improving performance.

@karmacoma
Copy link
Contributor

Here is the query plan without a limit:

                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Sort  (cost=42181551.68..42181592.53 rows=16341 width=203)
   Sort Key: a.balance DESC
   ->  Seq Scan on mem_accounts a  (cost=0.00..42180408.12 rows=16341 width=203)
         SubPlan 1
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 2
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 3
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 4
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)

Here is the query plan with a limit of 100.

QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Limit  (cost=42181032.66..42181032.91 rows=100 width=203)
   ->  Sort  (cost=42181032.66..42181073.51 rows=16341 width=203)
         Sort Key: a.balance DESC
         ->  Seq Scan on mem_accounts a  (cost=0.00..42180408.12 rows=16341 width=203)
               SubPlan 1
                 ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                       ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 2
                 ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                       ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 3
                 ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                       ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 4
                 ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                       ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                             Filter: (("accountId")::text = (a.address)::text)

Reducing number of output rows from 16341 to 100. Same cost in terms of disk pages.

@karmacoma
Copy link
Contributor

karmacoma commented Aug 2, 2016

Adding an index to mem_accounts.balance

Limit  (cost=0.29..258156.63 rows=100 width=203)
   ->  Index Scan Backward using mem_accounts_balance on mem_accounts a  (cost=0.29..42185328.09 rows=16341 width=203)
         SubPlan 1
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 2
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 3
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 4
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)

Reduces total cost significantly from 42181032.66..42181032.91 to 0.29..258156.63.

@karmacoma karmacoma added this to the Version 0.3.2 milestone Aug 2, 2016
karmacoma pushed a commit that referenced this issue Aug 2, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants