Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize NftRepository queries in web3 #8070

Conversation

xin-hedera
Copy link
Collaborator

@xin-hedera xin-hedera commented Apr 11, 2024

Description:

This PR optimizes queries in web3 NftRepository

  • Add a gist index on account_id, timestamp_range and drop the timestamp_range index
  • Change historical nft balance by account query to utilize the new index
  • Change historical nft balance by account and token to utilize the new index
  • Rewrite historical nft total supply query
  • Fix contract call integration test setup
  • Update init.sh to create btree_gist extension for v1
  • Update rosetta entrypoint.sh to ensure btree_gist extension in existing db
  • Update rosetta integration test to use init.sh

Related issue(s):

Fixes #8064

Notes for reviewer:

Performance comparison, the first explain analyze is from main, the second is from the PR branch:

countByAccountIdAndTimestampNotDeleted

mirror_node=> explain analyze             select count(*)
mirror_node->             from (
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 union all
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft_history
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and :blockTimestamp >= lower(timestamp_range) and :blockTimestamp < upper(timestamp_range)
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>             ) as n
mirror_node->             left join entity e on e.id = n.token_id
mirror_node->             where (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=512950.13..512950.14 rows=1 width=8) (actual time=2033.338..2107.948 rows=1 loops=1)
   ->  Gather  (cost=512949.91..512950.12 rows=2 width=8) (actual time=2023.366..2107.939 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=511949.91..511949.92 rows=1 width=8) (actual time=2016.530..2016.536 rows=1 loops=3)
               ->  Parallel Hash Left Join  (cost=328022.80..511548.82 rows=160436 width=0) (actual time=2015.777..2016.525 rows=42 loops=3)
                     Hash Cond: (nft.token_id = e.id)
                     Filter: ((e.deleted IS NOT TRUE) OR (lower(e.timestamp_range) > '1712803311992101593'::bigint))
                     ->  Parallel Append  (cost=0.00..166414.86 rows=385964 width=8) (actual time=296.089..443.987 rows=42 loops=3)
                           ->  Index Scan using nft__account_token_serialnumber on nft  (cost=0.56..271.26 rows=235 width=8) (actual time=0.230..0.580 rows=126 loops=1)
                                 Index Cond: (account_id = 3960334)
                                 Filter: ((deleted IS NOT TRUE) AND (timestamp_range @> '1712803311992101593'::bigint))
                           ->  Parallel Seq Scan on nft_history  (cost=0.00..160354.83 rows=160720 width=8) (actual time=443.782..443.782 rows=0 loops=3)
                                 Filter: ((deleted IS NOT TRUE) AND (account_id = 3960334) AND ('1712803311992101593'::bigint >= lower(timestamp_range)) AND ('1712803311992101593'::bigint < upper(timestamp_range)))
                                 Rows Removed by Filter: 1795656
                     ->  Parallel Hash  (cost=287033.69..287033.69 rows=2232569 width=23) (actual time=1237.044..1237.045 rows=1793153 loops=3)
                           Buckets: 1048576  Batches: 8  Memory Usage: 45088kB
                           ->  Parallel Seq Scan on entity e  (cost=0.00..287033.69 rows=2232569 width=23) (actual time=0.013..720.068 rows=1793153 loops=3)
 Planning Time: 0.467 ms
 Execution Time: 2108.069 ms

mirror_node=> explain analyze             select count(*)
mirror_node->             from (
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 union all
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft_history
mirror_node(>                     where account_id = cast(:accountId as bigint) -- cast to utilize the btree_gist index
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>             ) as n
mirror_node->             left join entity e on e.id = n.token_id
mirror_node->             where (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=905.87..905.88 rows=1 width=8) (actual time=0.633..0.635 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.99..905.28 rows=235 width=0) (actual time=0.042..0.619 rows=126 loops=1)
         Filter: ((e.deleted IS NOT TRUE) OR (lower(e.timestamp_range) > '1712803311992101593'::bigint))
         ->  Append  (cost=0.56..277.44 rows=236 width=8) (actual time=0.027..0.355 rows=126 loops=1)
               ->  Index Scan using nft__account_token_serialnumber on nft  (cost=0.56..271.26 rows=235 width=8) (actual time=0.026..0.286 rows=126 loops=1)
                     Index Cond: (account_id = 3960334)
                     Filter: ((deleted IS NOT TRUE) AND (timestamp_range @> '1712803311992101593'::bigint))
               ->  Index Scan using nft_history__account_timestamp_range on nft_history  (cost=0.41..2.63 rows=1 width=8) (actual time=0.047..0.047 rows=0 loops=1)
                     Index Cond: ((account_id = '3960334'::bigint) AND (timestamp_range @> '1712803311992101593'::bigint))
                     Filter: (deleted IS NOT TRUE)
         ->  Index Scan using entity_pkey on entity e  (cost=0.43..2.65 rows=1 width=23) (actual time=0.002..0.002 rows=1 loops=126)
               Index Cond: (id = nft.token_id)
 Planning Time: 0.352 ms
 Execution Time: 0.680 ms

nftBalanceByAccountIdTokenIdAndTimestamp

mirror_node=> explain analyze             select count(*)
mirror_node->             from (
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and token_id = :tokenId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 union all
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft_history
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and token_id = :tokenId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 ) as n
mirror_node->             join entity e on e.id = n.token_id
mirror_node->             where (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=186.40..186.41 rows=1 width=8) (actual time=1.593..1.595 rows=1 loops=1)
   ->  Nested Loop  (cost=0.99..186.28 rows=48 width=0) (actual time=0.065..1.565 rows=126 loops=1)
         ->  Append  (cost=0.56..58.84 rows=48 width=8) (actual time=0.036..1.150 rows=126 loops=1)
               ->  Index Scan using nft__account_token_serialnumber on nft  (cost=0.56..55.48 rows=47 width=8) (actual time=0.035..0.735 rows=126 loops=1)
                     Index Cond: ((account_id = 3960334) AND (token_id = 4189968))
                     Filter: ((deleted IS NOT TRUE) AND (timestamp_range @> '1712803311992101593'::bigint))
               ->  Index Scan using nft_history__timestamp_range on nft_history  (cost=0.41..2.64 rows=1 width=8) (actual time=0.390..0.390 rows=0 loops=1)
                     Index Cond: (timestamp_range @> '1712803311992101593'::bigint)
                     Filter: ((deleted IS NOT TRUE) AND (account_id = '3960334'::bigint) AND (token_id = '4189968'::bigint))
                     Rows Removed by Filter: 33
         ->  Index Scan using entity_pkey on entity e  (cost=0.43..2.66 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=126)
               Index Cond: (id = nft.token_id)
               Filter: ((deleted IS NOT TRUE) OR (lower(timestamp_range) > '1712803311992101593'::bigint))
 Planning Time: 0.751 ms
 Execution Time: 1.689 ms


mirror_node=> explain analyze             select count(*)
mirror_node->             from (
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft
mirror_node(>                     where account_id = :accountId
mirror_node(>                         and token_id = :tokenId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 union all
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft_history
mirror_node(>                     where account_id = cast(:accountId as bigint)
mirror_node(>                         and token_id = :tokenId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 ) as n
mirror_node->             join entity e on e.id = n.token_id
mirror_node->             where (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=186.40..186.41 rows=1 width=8) (actual time=1.013..1.015 rows=1 loops=1)
   ->  Nested Loop  (cost=0.99..186.28 rows=48 width=0) (actual time=0.035..0.998 rows=126 loops=1)
         ->  Append  (cost=0.56..58.84 rows=48 width=8) (actual time=0.020..0.689 rows=126 loops=1)
               ->  Index Scan using nft__account_token_serialnumber on nft  (cost=0.56..55.48 rows=47 width=8) (actual time=0.019..0.332 rows=126 loops=1)
                     Index Cond: ((account_id = 3960334) AND (token_id = 4189968))
                     Filter: ((deleted IS NOT TRUE) AND (timestamp_range @> '1712803311992101593'::bigint))
               ->  Index Scan using nft_history__account_timestamp_range on nft_history  (cost=0.41..2.64 rows=1 width=8) (actual time=0.341..0.341 rows=0 loops=1)
                     Index Cond: ((account_id = '3960334'::bigint) AND (timestamp_range @> '1712803311992101593'::bigint))
                     Filter: ((deleted IS NOT TRUE) AND (token_id = 4189968))
         ->  Index Scan using entity_pkey on entity e  (cost=0.43..2.66 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=126)
               Index Cond: (id = nft.token_id)
               Filter: ((deleted IS NOT TRUE) OR (lower(timestamp_range) > '1712803311992101593'::bigint))
 Planning Time: 0.602 ms
 Execution Time: 1.074 ms

findNftTotalSupplyByTokenIdAndTimestamp

mirror_node=> explain analyze select count(*)
mirror_node->             from (
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft
mirror_node(>                     where token_id = :tokenId
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>                 union all
mirror_node(>                 (
mirror_node(>                     select token_id
mirror_node(>                     from nft_history
mirror_node(>                     where token_id = cast(:tokenId as bigint)
mirror_node(>                         and timestamp_range @> :blockTimestamp
mirror_node(>                         and deleted is not true
mirror_node(>                 )
mirror_node(>             ) as n
mirror_node->             join entity e on e.id = n.token_id
mirror_node->             where (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=508402.27..508402.28 rows=1 width=8) (actual time=1875.175..1945.054 rows=1 loops=1)
   ->  Gather  (cost=508402.06..508402.27 rows=2 width=8) (actual time=1869.355..1945.042 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=507402.06..507402.07 rows=1 width=8) (actual time=1864.376..1864.382 rows=1 loops=3)
               ->  Parallel Hash Join  (cost=334738.70..506293.06 rows=443598 width=0) (actual time=1755.216..1836.312 rows=362558 loops=3)
                     Hash Cond: (nft_history.token_id = e.id)
                     ->  Parallel Append  (cost=0.41..151714.45 rows=1067173 width=8) (actual time=58.158..305.202 rows=362558 loops=3)
                           ->  Index Scan using nft_history__timestamp_range on nft_history  (cost=0.41..2.64 rows=1 width=8) (actual time=0.167..0.204 rows=8 loops=1)
                                 Index Cond: (timestamp_range @> '1712803311992101593'::bigint)
                                 Filter: ((deleted IS NOT TRUE) AND (token_id = '4189968'::bigint))
                                 Rows Removed by Filter: 25
                           ->  Parallel Bitmap Heap Scan on nft  (cost=16246.63..135704.22 rows=444655 width=8) (actual time=87.073..264.435 rows=362555 loops=3)
                                 Recheck Cond: (token_id = 4189968)
                                 Filter: ((deleted IS NOT TRUE) AND (timestamp_range @> '1712803311992101593'::bigint))
                                 Rows Removed by Filter: 3
                                 Heap Blocks: exact=17964
                                 ->  Bitmap Index Scan on nft_pkey  (cost=0.00..15979.84 rows=1084414 width=0) (actual time=74.780..74.780 rows=1100943 loops=1)
                                       Index Cond: (token_id = 4189968)
                     ->  Parallel Hash  (cost=298196.54..298196.54 rows=2227260 width=8) (actual time=1059.342..1059.343 rows=1786862 loops=3)
                           Buckets: 2097152  Batches: 8  Memory Usage: 42656kB
                           ->  Parallel Seq Scan on entity e  (cost=0.00..298196.54 rows=2227260 width=8) (actual time=0.015..657.498 rows=1786862 loops=3)
                                 Filter: ((deleted IS NOT TRUE) OR (lower(timestamp_range) > '1712803311992101593'::bigint))
                                 Rows Removed by Filter: 6275
 Planning Time: 0.617 ms
 Execution Time: 1945.199 ms


mirror_node=> explain analyze             select count(*)
            from nft as n
            left join entity e on e.id = n.token_id
            where token_id = :tokenId and
              n.created_timestamp <= :blockTimestamp and
              (n.deleted is false or lower(n.timestamp_range) > :blockTimestamp) and
              (e.deleted is not true or lower(e.timestamp_range) > :blockTimestamp);
                                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=148463.90..148463.91 rows=1 width=8) (actual time=496.078..499.523 rows=1 loops=1)
   ->  Gather  (cost=148463.69..148463.90 rows=2 width=8) (actual time=495.181..499.510 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=147463.69..147463.70 rows=1 width=8) (actual time=489.136..489.141 rows=1 loops=3)
               ->  Hash Left Join  (cost=16250.70..146348.83 rows=445942 width=0) (actual time=110.099..459.435 rows=362558 loops=3)
                     Hash Cond: (n.token_id = e.id)
                     Filter: ((e.deleted IS NOT TRUE) OR (lower(e.timestamp_range) > '1712803311992101593'::bigint))
                     ->  Parallel Bitmap Heap Scan on nft n  (cost=16248.04..137964.82 rows=447005 width=8) (actual time=109.901..334.903 rows=362558 loops=3)
                           Recheck Cond: (token_id = 4189968)
                           Filter: ((created_timestamp <= '1712803311992101593'::bigint) AND ((deleted IS FALSE) OR (lower(timestamp_range) > '1712803311992101593'::bigint)))
                           Heap Blocks: exact=18413
                           ->  Bitmap Index Scan on nft_pkey  (cost=0.00..15979.84 rows=1084414 width=0) (actual time=99.893..99.894 rows=1100943 loops=1)
                                 Index Cond: (token_id = 4189968)
                     ->  Hash  (cost=2.65..2.65 rows=1 width=23) (actual time=0.074..0.075 rows=1 loops=3)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Index Scan using entity_pkey on entity e  (cost=0.43..2.65 rows=1 width=23) (actual time=0.054..0.056 rows=1 loops=3)
                                 Index Cond: (id = 4189968)
 Planning Time: 0.267 ms
 Execution Time: 499.607 ms

Checklist

  • Documented (Code comments, README, etc.)
  • Tested (unit, integration, etc.)

Signed-off-by: Xin Li <xin@swirldslabs.com>
@xin-hedera xin-hedera self-assigned this Apr 11, 2024
@xin-hedera xin-hedera linked an issue Apr 11, 2024 that may be closed by this pull request
@codecov-commenter
Copy link

codecov-commenter commented Apr 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.32%. Comparing base (844b4f3) to head (c6c5fbc).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8070      +/-   ##
============================================
- Coverage     92.33%   92.32%   -0.02%     
+ Complexity     7182     7180       -2     
============================================
  Files           893      893              
  Lines         29100    29100              
  Branches       3508     3508              
============================================
- Hits          26870    26867       -3     
- Misses         1447     1449       +2     
- Partials        783      784       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -0,0 +1,3 @@
drop index if exists nft_history__timestamp_range;
create index if not exists nft_history__account_timestamp_range
on nft_history using gist (account_id, timestamp_range);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btree_gist index gives us the best performance and it's much better than different btree indexes tested.

There may be concern if requiring btree_gist index will break other mirror node operators, I believe it should be fine as it's a trusted extension in default installation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding breaking label. Let's remember to mention in the release notes upgrading section.

@xin-hedera xin-hedera added bug Type: Something isn't working performance web3 Area: Web3 API labels Apr 11, 2024
@xin-hedera xin-hedera added this to the 0.103.0 milestone Apr 11, 2024
}

func (d dbParams) toDsn() string {
return fmt.Sprintf("postgres://%s:%s@%s/%s?sslmode=disable", d.username, d.password, d.endpoint, d.name)
return fmt.Sprintf("postgres://%s:%s@%s/%s?sslmode=disable", ownerUsername, d.ownerPassword, d.endpoint, dbName)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resort to use mirror_node user in rosetta integration tests due to the cleanup script and deleting data in some test case setup.

ideally since now it's changed to use the same init.sh as other modules, we should really use mirror_rosetta which is readonly to run the integration tests. Will create a ticket as follow-up.

Signed-off-by: Xin Li <xin@swirldslabs.com>
@xin-hedera xin-hedera force-pushed the 8064-slow-nft-related-db-queries-cause-web3-contractcall-to-return-500 branch from 3c47825 to 84251dc Compare April 11, 2024 05:10
}

options := &dockertest.RunOptions{
Name: getDbHostname(network.Network),
Repository: "postgres",
Tag: "14-alpine",
Env: env,
Mounts: []string{fmt.Sprintf("%s/../hedera-mirror-importer/src/main/resources/db/scripts/init.sh:/docker-entrypoint-initdb.d/init.sh", moduleRoot)},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the module dockertest requires host path to be absolute

Signed-off-by: Xin Li <xin@swirldslabs.com>
Signed-off-by: Xin Li <xin@swirldslabs.com>
@xin-hedera xin-hedera marked this pull request as ready for review April 11, 2024 13:46
@xin-hedera xin-hedera requested a review from a team April 11, 2024 13:47
@@ -2196,6 +2193,32 @@ private EntityId nftPersistHistorical(
.timestampRange(Range.openClosed(
historicalBlock.lowerEndpoint() - 1, historicalBlock.upperEndpoint() + 1)))
.persist();

// nft table
domainBuilder
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for any nft in the nft_history table, there must be the same nft by (token_id, serial) in nft table.

Signed-off-by: Xin Li <xin@swirldslabs.com>
Copy link

sonarcloud bot commented Apr 11, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link
Member

@steven-sheehy steven-sheehy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@steven-sheehy steven-sheehy added the breaking Contains a breaking change that warrants mention in the release notes label Apr 11, 2024
Copy link
Contributor

@edwin-greene edwin-greene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@xin-hedera xin-hedera merged commit 56321ab into main Apr 12, 2024
26 checks passed
@xin-hedera xin-hedera deleted the 8064-slow-nft-related-db-queries-cause-web3-contractcall-to-return-500 branch April 12, 2024 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Contains a breaking change that warrants mention in the release notes bug Type: Something isn't working performance web3 Area: Web3 API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Slow NFT related db queries cause web3 /contract/call to return 500
4 participants