Bigger blockchain gets, topAccounts endpoint become less efficient #233

ghost · 2016-07-10T14:38:56Z

https://127.0.0.1:8000/api/accounts/top?&offset=0&limit=100

Currently with current main-net height this function is not usable.
It's getting worse and worse as blockchain grows. This issue has been already reported in lisk-explorer repo but its strictly related to this repository.

fix · 2016-07-11T18:37:58Z

Why do you assume it is gettting worse as the blockchain grows?
Looking at the code this is a bare postgres query...
https://github.com/LiskHQ/lisk/blob/development/modules/accounts.js#L231

ghost · 2016-07-11T19:00:45Z

I have been checking this endpoint while blockchain was rebuilding, as height was bigger endpoint become slower and slower, on current height it's impossible to fetch any data. Infinite loading. Check yourself with enabled top accounts functionality. It's not useable.

fix · 2016-07-11T19:02:13Z

ok i'll check it!

ghost · 2016-07-11T19:08:56Z

This is why https://explorer.lisk.io/topAccounts does not work currently.

fix · 2016-07-11T19:11:02Z

@karmacoma any idea why? it seems the original dev knows it because there is the TOP switch. But looking at the code i cannot tell.

karmacoma · 2016-08-02T09:34:10Z

Not sure why the TOP switch is there, probably because in the beginning we only used this feature for the explorer, so we disabled for other use cases. The performance issue was never a factor before, as we did not have the number of accounts. It's likely due to a performance issue related to sorting account balances i.e. column indexing.

karmacoma · 2016-08-02T11:52:25Z

Here is the query used to SELECT top accounts.

SELECT "username",
  "isDelegate",
  "u_isDelegate",
  "secondSignature",
  "u_secondSignature",
  "u_username",
  UPPER("address") AS "address",
  ENCODE("publicKey", 'hex') AS "publicKey",
  ENCODE("secondPublicKey", 'hex') AS "secondPublicKey",
  ("balance")::BIGINT AS "balance",
  ("u_balance")::BIGINT AS "u_balance",
  ("vote")::BIGINT AS "vote",
  ("rate")::BIGINT AS "rate",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2delegates
    WHERE "accountId" = a."address"
    ) AS "delegates",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2u_delegates
    WHERE "accountId" = a."address"
    ) AS "u_delegates",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2multisignatures
    WHERE "accountId" = a."address"
    ) AS "multisignatures",
  (
    SELECT ARRAY_AGG("dependentId")
    FROM mem_accounts2u_multisignatures
    WHERE "accountId" = a."address"
    ) AS "u_multisignatures", "multimin", "u_multimin", "multilifetime", "u_multilifetime", "blockId", "nameexist", "u_nameexist", "producedblocks", "missedblocks", ("fees")::BIGINT AS "fees", ("rewards")::BIGINT AS "rewards"
FROM "mem_accounts" AS "a"
ORDER BY "balance" DESC;

It appears to be missing a LIMIT clause. Therefore, this would be the first step towards improving performance.

karmacoma · 2016-08-02T11:59:49Z

Here is the query plan without a limit:

                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Sort  (cost=42181551.68..42181592.53 rows=16341 width=203)
   Sort Key: a.balance DESC
   ->  Seq Scan on mem_accounts a  (cost=0.00..42180408.12 rows=16341 width=203)
         SubPlan 1
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 2
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 3
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 4
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)

Here is the query plan with a limit of 100.

QUERY PLAN                                                 
-----------------------------------------------------------------------------------------------------------
 Limit  (cost=42181032.66..42181032.91 rows=100 width=203)
   ->  Sort  (cost=42181032.66..42181073.51 rows=16341 width=203)
         Sort Key: a.balance DESC
         ->  Seq Scan on mem_accounts a  (cost=0.00..42180408.12 rows=16341 width=203)
               SubPlan 1
                 ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                       ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 2
                 ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                       ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 3
                 ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                       ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                             Filter: (("accountId")::text = (a.address)::text)
               SubPlan 4
                 ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                       ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                             Filter: (("accountId")::text = (a.address)::text)

Reducing number of output rows from 16341 to 100. Same cost in terms of disk pages.

karmacoma · 2016-08-02T12:03:09Z

Adding an index to mem_accounts.balance

Limit  (cost=0.29..258156.63 rows=100 width=203)
   ->  Index Scan Backward using mem_accounts_balance on mem_accounts a  (cost=0.29..42185328.09 rows=16341 width=203)
         SubPlan 1
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 2
           ->  Aggregate  (cost=1276.30..1276.31 rows=1 width=65)
                 ->  Seq Scan on mem_accounts2u_delegates  (cost=0.00..1276.21 rows=33 width=65)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 3
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)
         SubPlan 4
           ->  Aggregate  (cost=14.26..14.27 rows=1 width=146)
                 ->  Seq Scan on mem_accounts2u_multisignatures  (cost=0.00..14.25 rows=2 width=146)
                       Filter: (("accountId")::text = (a.address)::text)

Reduces total cost significantly from 42181032.66..42181032.91 to 0.29..258156.63.

fix mentioned this issue Jul 13, 2016

/api/accounts/top setting flag #234

Closed

fix added the type: bug label Jul 13, 2016

karmacoma self-assigned this Aug 2, 2016

karmacoma added this to the Version 0.3.2 milestone Aug 2, 2016

karmacoma pushed a commit that referenced this issue Aug 2, 2016

Adding default limit #233.

4334efa

karmacoma closed this as completed in 9cd1671 Aug 2, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bigger blockchain gets, topAccounts endpoint become less efficient #233

Bigger blockchain gets, topAccounts endpoint become less efficient #233

ghost commented Jul 10, 2016 •

edited by fix

Loading

fix commented Jul 11, 2016

ghost commented Jul 11, 2016 •

edited by ghost

Loading

fix commented Jul 11, 2016

ghost commented Jul 11, 2016

fix commented Jul 11, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016 •

edited

Loading

Bigger blockchain gets, topAccounts endpoint become less efficient #233

Bigger blockchain gets, topAccounts endpoint become less efficient #233

Comments

ghost commented Jul 10, 2016 • edited by fix Loading

fix commented Jul 11, 2016

ghost commented Jul 11, 2016 • edited by ghost Loading

fix commented Jul 11, 2016

ghost commented Jul 11, 2016

fix commented Jul 11, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016

karmacoma commented Aug 2, 2016 • edited Loading

ghost commented Jul 10, 2016 •

edited by fix

Loading

ghost commented Jul 11, 2016 •

edited by ghost

Loading

karmacoma commented Aug 2, 2016 •

edited

Loading