Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

payout sweep performance degrades quickly #76

Closed
roadscape opened this issue Feb 2, 2018 · 1 comment
Closed

payout sweep performance degrades quickly #76

roadscape opened this issue Feb 2, 2018 · 1 comment

Comments

@roadscape
Copy link
Contributor

Breaking this task out of #16 (index state management)
Also related to #52 (db performance)

The post payout sweep query quickly becomes the heaviest by far. Need move the payout sweep to in-memory payout queue and/or examine db perf. Currently it's the only read query against hive_posts_cache in the indexer, and may be a sign of performance issues the API layer will face.

The issue with the table is that it's large, wide, and every update is a full one. #16 has other subtasks ('events which affect payout/votes vs. content of posts', 'flush inserts asap, edits over an n-block period') which will keep writes smaller and help prevent fragmentation. Besides this, may need to look into partitioning the tables, definitely tune autovac, and more.

[DEBUG] total SQL time: 1800s
 85.2%  1533237ms    521.33avg     2941x -- SELECT post_id, author, permlink FROM hive_posts_cache WHERE is_paidout = '0' AND payout_at <= :date
  5.3%   94818ms      1.01avg    93974x -- UPDATE hive_posts_cache SET author = :author, permlink = :permlink, category = :category, depth = :depth, children = :children, title = :title, preview = :preview, body = :body, im
  1.5%   26630ms      0.57avg    46829x -- UPDATE hive_accounts SET proxy = :proxy, post_count = :post_count, reputation = :reputation, proxy_weight = :proxy_weight, vote_weight = :vote_weight, kb_used = :kb_used, active_at
  1.4%   25183ms      0.35avg    72779x -- SELECT tag FROM hive_post_tags WHERE post_id = :id
  1.2%   22499ms      0.56avg    40379x -- SELECT id FROM hive_posts WHERE author = :a AND permlink = :p
  1.1%   19431ms      0.73avg    26566x -- INSERT INTO hive_posts_cache (post_id, author, permlink, category, depth, children, title, preview, body, img_url, payout, promoted, payout_at, updated_at, created_at, rshares, vot
@roadscape
Copy link
Contributor Author

Moving to index-only scan on hpc and pulling the rest of the data from hp is holding up well.. after 12 hrs the total cost for each sweep averages 3.44ms.

  2.7%   44330ms      2.99avg    14819x -- SELECT post_id FROM hive_posts_cache WHERE is_paidout = '0' AND payout_at <= :date
  0.4%    6603ms      0.45avg    14770x -- SELECT id, author, permlink FROM hive_posts WHERE id IN :ids

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant