Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate reaping sql timeouts in horizon #5320

Open
aditya1702 opened this issue May 16, 2024 · 1 comment
Open

Investigate reaping sql timeouts in horizon #5320

aditya1702 opened this issue May 16, 2024 · 1 comment
Labels

Comments

@aditya1702
Copy link
Contributor

aditya1702 commented May 16, 2024

What version are you using?

Horizon: 2.30.0
Go: 1.22

What did you do?

What did you expect to see?

If history-retention is set, reaping should succeed without any issues:

time="2024-05-16T19:45:41.770Z" level=info msg="reaper succeeded" new_elder=51525683 pid=633297

What did you see instead?

We have notice reports of reaping failing due to sql timeout errors:

time="2024-05-09T18:50:31.614+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_assets where id IN (select id from (select id, (select 1 from history_trades where base_asset_id = hcb.id limit 1) as c0, (select 1 from history_trades where counter_asset_id = hcb.id limit 1) as c1, (select 1 from history_trades_60000 where base_asset_id = hcb.id limit 1) as c2, (select 1 from history_trades_60000 where counter_asset_id = hcb.id limit 1) as c3, 1 as cx from history_assets hcb where id >= 1001 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=261239 service=ingest
@airstring
Copy link

airstring commented May 17, 2024

stellar-core, stellar-horizon, and postgres share the same host, which is 8c16GB , and a ssd with a maximum of 5000 iops.

# curl http://localhost:8000
"core_version": "stellar-core 20.4.0 (7fc7671b8bc1ccc3b1f16a6ab83bc9f671db8b70)",
  "ingest_latest_ledger": 51707393,
  "history_latest_ledger": 51707393,
  "history_latest_ledger_closed_at": "2024-05-17T06:44:26Z",
  "history_elder_ledger": 51655384,
# grep "reap" stellar-horizon.log
May 17 07:47:22 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T07:47:22.126+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_accounts where id IN (select id from (select id, (select 1 from history_effects where history_account_id = hcb.id limit 1) as c0, (select 1 from history_operation_participants where history_account_id = hcb.id limit 1) as c1, (select 1 from history_trades where base_account_id = hcb.id limit 1) as c2, (select 1 from history_trades where counter_account_id = hcb.id limit 1) as c3, (select 1 from history_transaction_participants where history_account_id = hcb.id limit 1) as c4, 1 as cx from history_accounts hcb where id >= 0 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and c4 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=747541 service=ingest
May 17 08:25:26 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T08:25:26.291+08:00" level=info msg="reaper: clearing" end_ledger=51651784 pid=747541 start_ledger=51651179
May 17 08:25:33 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T08:25:33.840+08:00" level=info msg="reaper succeeded" new_elder=51651785 pid=747541
May 17 09:25:33 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T09:25:33.842+08:00" level=info msg="reaper: clearing" end_ledger=51652383 pid=747541 start_ledger=51651785
May 17 09:25:46 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T09:25:46.519+08:00" level=info msg="reaper succeeded" new_elder=51652384 pid=747541
May 17 10:25:46 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T10:25:46.519+08:00" level=info msg="reaper: clearing" end_ledger=51652985 pid=747541 start_ledger=51652384
May 17 10:26:06 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T10:26:06.641+08:00" level=info msg="reaper succeeded" new_elder=51652986 pid=747541
May 17 10:52:12 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T10:52:12.671+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_accounts where id IN (select id from (select id, (select 1 from history_effects where history_account_id = hcb.id limit 1) as c0, (select 1 from history_operation_participants where history_account_id = hcb.id limit 1) as c1, (select 1 from history_trades where base_account_id = hcb.id limit 1) as c2, (select 1 from history_trades where counter_account_id = hcb.id limit 1) as c3, (select 1 from history_transaction_participants where history_account_id = hcb.id limit 1) as c4, 1 as cx from history_accounts hcb where id >= 0 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and c4 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=747541 service=ingest
May 17 11:26:06 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T11:26:06.641+08:00" level=info msg="reaper: clearing" end_ledger=51653587 pid=747541 start_ledger=51652986
May 17 11:26:31 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T11:26:31.489+08:00" level=info msg="reaper succeeded" new_elder=51653588 pid=747541
May 17 11:53:46 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T11:53:46.428+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_accounts where id IN (select id from (select id, (select 1 from history_effects where history_account_id = hcb.id limit 1) as c0, (select 1 from history_operation_participants where history_account_id = hcb.id limit 1) as c1, (select 1 from history_trades where base_account_id = hcb.id limit 1) as c2, (select 1 from history_trades where counter_account_id = hcb.id limit 1) as c3, (select 1 from history_transaction_participants where history_account_id = hcb.id limit 1) as c4, 1 as cx from history_accounts hcb where id >= 0 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and c4 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=747541 service=ingest
May 17 12:26:31 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T12:26:31.491+08:00" level=info msg="reaper: clearing" end_ledger=51654186 pid=747541 start_ledger=51653588
May 17 12:26:59 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T12:26:59.427+08:00" level=info msg="reaper succeeded" new_elder=51654187 pid=747541
May 17 12:56:00 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T12:56:00.591+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_accounts where id IN (select id from (select id, (select 1 from history_effects where history_account_id = hcb.id limit 1) as c0, (select 1 from history_operation_participants where history_account_id = hcb.id limit 1) as c1, (select 1 from history_trades where base_account_id = hcb.id limit 1) as c2, (select 1 from history_trades where counter_account_id = hcb.id limit 1) as c3, (select 1 from history_transaction_participants where history_account_id = hcb.id limit 1) as c4, 1 as cx from history_accounts hcb where id >= 0 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and c4 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=747541 service=ingest
May 17 13:26:59 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T13:26:59.430+08:00" level=info msg="reaper: clearing" end_ledger=51654786 pid=747541 start_ledger=51654187
May 17 13:27:07 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T13:27:07.481+08:00" level=info msg="reaper succeeded" new_elder=51654787 pid=747541
May 17 13:57:21 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T13:57:21.901+08:00" level=warning msg="Error reaping lookup tables" err="error running query: delete from history_accounts where id IN (select id from (select id, (select 1 from history_effects where history_account_id = hcb.id limit 1) as c0, (select 1 from history_operation_participants where history_account_id = hcb.id limit 1) as c1, (select 1 from history_trades where base_account_id = hcb.id limit 1) as c2, (select 1 from history_trades where counter_account_id = hcb.id limit 1) as c3, (select 1 from history_transaction_participants where history_account_id = hcb.id limit 1) as c4, 1 as cx from history_accounts hcb where id >= 0 order by id limit 1000) as sub where c0 IS NULL and c1 IS NULL and c2 IS NULL and c3 IS NULL and c4 IS NULL and 1=1);: canceling statement due to lack of response within timeout period" pid=747541 service=ingest
May 17 14:27:07 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T14:27:07.481+08:00" level=info msg="reaper: clearing" end_ledger=51655383 pid=747541 start_ledger=51654787
May 17 14:27:26 ip-172-24-103-136 stellar-horizon[747541]: time="2024-05-17T14:27:26.726+08:00" level=info msg="reaper succeeded" new_elder=51655384 pid=747541

@sreuland sreuland modified the milestone: platform sprint 48 May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants