New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor performance of GET /api/delegates #2351

Closed
nazarhussain opened this Issue Aug 29, 2018 · 2 comments

Comments

Projects
4 participants
@nazarhussain
Contributor

nazarhussain commented Aug 29, 2018

Expected behavior

Performance of API GET /api/delegates should be satisfactory (under 1 second).

Actual behavior

We run some benchmarks with following spec:

CPU: 1vCPU (Digital Ocean) 
RAM: 1 GB Memory
Disk: 25 GB Disk
Connections: 10 
Time: 30s

And got the following results;

Fastest: 12.07
Slowest: 13.45
Average: 14.31

Steps to reproduce

Query endpoints mentioned above.

Which version(s) does this affect? (Environment, OS, etc...)

1.0.1 (after indexes)

@MaciejBaj MaciejBaj added this to Open Issues in Version 1.0.0 via automation Aug 30, 2018

@MaciejBaj MaciejBaj added this to the Version 1.0.2 milestone Aug 30, 2018

@4miners

This comment has been minimized.

Show comment
Hide comment
@4miners

4miners Aug 30, 2018

Member

When we call api/dlegates endpoint following query is executed:

SELECT "rewards"::bigint AS "rewards",
       "vote"::bigint AS "vote",
       "producedBlocks"::bigint AS "producedBlocks",
       "missedBlocks"::bigint AS "missedBlocks",
       "username" AS "username",
       (encode("publicKey", 'hex')) AS "publicKey",
       (encode("secondPublicKey", 'hex')) AS "secondPublicKey",
       "address" AS "address",
       (
          (SELECT m.row_number
           FROM
             (SELECT row_number() OVER (
                                        ORDER BY r.vote DESC, r."publicKey" ASC), address
              FROM
                (SELECT d."isDelegate",
                        d.vote,
                        d."publicKey",
                        d.address
                 FROM mem_accounts AS d
                 WHERE d."isDelegate" = 1) AS r) m
           WHERE m.address = mem_accounts.address )::bigint) AS "rank"
FROM "mem_accounts"
WHERE "isDelegate"=1::int
ORDER BY "rank" DESC
LIMIT 10
OFFSET 0

We're getting the performance hit because rank is a dynamic field, for which we execute the following query:

SELECT m.row_number FROM (SELECT row_number()
  OVER (ORDER BY r.vote DESC, r."publicKey" ASC), address
    FROM (SELECT d."isDelegate", d.vote, d."publicKey", d.address
      FROM mem_accounts AS d
      WHERE d."isDelegate" = 1) AS r) m
    WHERE m.address = mem_accounts.address

Its get executed as a subquery for every delegate that is registered, so for mainnet: 1715 times. Here is the query plan:

                                                                                        QUERY PLAN                                                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=556580.64..556580.67 rows=10 width=135) (actual time=7266.040..7266.044 rows=10 loops=1)
   Buffers: shared hit=2678676
   ->  Sort  (cost=556580.64..556584.50 rows=1542 width=135) (actual time=7266.038..7266.040 rows=10 loops=1)
         Sort Key: ((SubPlan 1)) DESC
         Sort Method: top-N heapsort  Memory: 27kB
         Buffers: shared hit=2678676
         ->  Index Scan using mem_accounts_is_delegate on mem_accounts  (cost=0.42..556547.32 rows=1542 width=135) (actual time=3.776..7261.623 rows=1715 loops=1)
               Index Cond: ("isDelegate" = 1)
               Buffers: shared hit=2678676
               SubPlan 1
                 ->  Subquery Scan on m  (cost=310.65..360.77 rows=1 width=8) (actual time=3.554..4.213 rows=1 loops=1715)
                       Filter: ((m.address)::text = (mem_accounts.address)::text)
                       Rows Removed by Filter: 1714
                       Buffers: shared hit=2677115
                       ->  WindowAgg  (cost=310.65..341.49 rows=1542 width=70) (actual time=2.889..3.941 rows=1715 loops=1715)
                             Buffers: shared hit=2677115
                             ->  Sort  (cost=310.65..314.51 rows=1542 width=62) (actual time=2.884..3.204 rows=1715 loops=1715)
                                   Sort Key: d.vote DESC, d."publicKey"
                                   Sort Method: quicksort  Memory: 290kB
                                   Buffers: shared hit=2677115
                                   ->  Index Scan using mem_accounts_is_delegate on mem_accounts d  (cost=0.42..229.00 rows=1542 width=62) (actual time=0.015..1.975 rows=1715 loops=1715)
                                         Index Cond: ("isDelegate" = 1)
                                         Buffers: shared hit=2677115
 Planning time: 0.208 ms
 Execution time: 7266.108 ms
(25 rows)

I already proposed solution for it in the past, please check my comments in #1382. I still think that this solution will be the most efficient one, please confirm @nazarhussain.

Member

4miners commented Aug 30, 2018

When we call api/dlegates endpoint following query is executed:

SELECT "rewards"::bigint AS "rewards",
       "vote"::bigint AS "vote",
       "producedBlocks"::bigint AS "producedBlocks",
       "missedBlocks"::bigint AS "missedBlocks",
       "username" AS "username",
       (encode("publicKey", 'hex')) AS "publicKey",
       (encode("secondPublicKey", 'hex')) AS "secondPublicKey",
       "address" AS "address",
       (
          (SELECT m.row_number
           FROM
             (SELECT row_number() OVER (
                                        ORDER BY r.vote DESC, r."publicKey" ASC), address
              FROM
                (SELECT d."isDelegate",
                        d.vote,
                        d."publicKey",
                        d.address
                 FROM mem_accounts AS d
                 WHERE d."isDelegate" = 1) AS r) m
           WHERE m.address = mem_accounts.address )::bigint) AS "rank"
FROM "mem_accounts"
WHERE "isDelegate"=1::int
ORDER BY "rank" DESC
LIMIT 10
OFFSET 0

We're getting the performance hit because rank is a dynamic field, for which we execute the following query:

SELECT m.row_number FROM (SELECT row_number()
  OVER (ORDER BY r.vote DESC, r."publicKey" ASC), address
    FROM (SELECT d."isDelegate", d.vote, d."publicKey", d.address
      FROM mem_accounts AS d
      WHERE d."isDelegate" = 1) AS r) m
    WHERE m.address = mem_accounts.address

Its get executed as a subquery for every delegate that is registered, so for mainnet: 1715 times. Here is the query plan:

                                                                                        QUERY PLAN                                                                                         
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=556580.64..556580.67 rows=10 width=135) (actual time=7266.040..7266.044 rows=10 loops=1)
   Buffers: shared hit=2678676
   ->  Sort  (cost=556580.64..556584.50 rows=1542 width=135) (actual time=7266.038..7266.040 rows=10 loops=1)
         Sort Key: ((SubPlan 1)) DESC
         Sort Method: top-N heapsort  Memory: 27kB
         Buffers: shared hit=2678676
         ->  Index Scan using mem_accounts_is_delegate on mem_accounts  (cost=0.42..556547.32 rows=1542 width=135) (actual time=3.776..7261.623 rows=1715 loops=1)
               Index Cond: ("isDelegate" = 1)
               Buffers: shared hit=2678676
               SubPlan 1
                 ->  Subquery Scan on m  (cost=310.65..360.77 rows=1 width=8) (actual time=3.554..4.213 rows=1 loops=1715)
                       Filter: ((m.address)::text = (mem_accounts.address)::text)
                       Rows Removed by Filter: 1714
                       Buffers: shared hit=2677115
                       ->  WindowAgg  (cost=310.65..341.49 rows=1542 width=70) (actual time=2.889..3.941 rows=1715 loops=1715)
                             Buffers: shared hit=2677115
                             ->  Sort  (cost=310.65..314.51 rows=1542 width=62) (actual time=2.884..3.204 rows=1715 loops=1715)
                                   Sort Key: d.vote DESC, d."publicKey"
                                   Sort Method: quicksort  Memory: 290kB
                                   Buffers: shared hit=2677115
                                   ->  Index Scan using mem_accounts_is_delegate on mem_accounts d  (cost=0.42..229.00 rows=1542 width=62) (actual time=0.015..1.975 rows=1715 loops=1715)
                                         Index Cond: ("isDelegate" = 1)
                                         Buffers: shared hit=2677115
 Planning time: 0.208 ms
 Execution time: 7266.108 ms
(25 rows)

I already proposed solution for it in the past, please check my comments in #1382. I still think that this solution will be the most efficient one, please confirm @nazarhussain.

@MaciejBaj MaciejBaj removed this from the Version 1.0.2 milestone Aug 30, 2018

@MaciejBaj MaciejBaj added this to New Issues in Version 1.1.0 via automation Aug 30, 2018

@MaciejBaj MaciejBaj removed this from Open Issues in Version 1.0.0 Aug 30, 2018

@nazarhussain

This comment has been minimized.

Show comment
Hide comment
@nazarhussain

nazarhussain Aug 30, 2018

Contributor

Yes the solution you suggested is absolutely correct, but do we release it in a patch, that's the question. It seems to be a lot of refactoring. @MaciejBaj Confirm if you want to do it in current patch release.

Contributor

nazarhussain commented Aug 30, 2018

Yes the solution you suggested is absolutely correct, but do we release it in a patch, that's the question. It seems to be a lot of refactoring. @MaciejBaj Confirm if you want to do it in current patch release.

@MaciejBaj MaciejBaj added this to the Version 1.1.0 milestone Aug 30, 2018

@diego-G diego-G added this to Sprint Backlog in Lisk Pipelines Sep 4, 2018

@diego-G diego-G added the *medium label Sep 4, 2018

@shuse2 shuse2 closed this in 9e9c624 Sep 4, 2018

Version 1.1.0 automation moved this from New Issues to Closed Issues Sep 4, 2018

@diego-G diego-G removed this from Sprint Backlog in Lisk Pipelines Sep 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment