Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Following restart node continues to log "Commitlog shutdown complete" messages #11508

Closed
KnifeyMoloko opened this issue Sep 8, 2022 · 6 comments
Assignees
Labels
Milestone

Comments

@KnifeyMoloko
Copy link
Contributor

Installation details

Kernel Version: 5.15.0-1019-aws
Scylla version (or git commit hash): 2022.2.0~rc1-20220902.a9bc6d191031 with build-id 074a0cb9e6a5ab36ba5e7f81385e68079ab6eeda
Cluster size: 5 nodes (i3.4xlarge)

Scylla Nodes used in this run:

  • longevity-mv-si-4d-2022-2-db-node-371f29c2-7 (18.202.36.168 | 10.4.10.185) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-6 (34.243.200.187 | 10.4.8.244) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-5 (54.216.151.114 | 10.4.9.31) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-4 (34.249.121.120 | 10.4.9.199) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-3 (54.217.149.108 | 10.4.11.117) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-2 (63.34.22.20 | 10.4.11.206) (shards: 14)
  • longevity-mv-si-4d-2022-2-db-node-371f29c2-1 (3.249.105.79 | 10.4.8.237) (shards: 14)

OS / Image: ami-03d21402486bbce67 (aws: eu-west-1)

Test: longevity-mv-si-4days-test
Test id: 371f29c2-5d87-4a5a-a487-a974ebf7edd1
Test name: enterprise-2022.2/longevity/longevity-mv-si-4days-test
Test config file(s):

Issue description

>>>>>>>
During RollingConfigChangeInternodeCompression we change the internode compression settings (duh!) and restart the nodes in the cluster in a rolling fashion.

During the shutdown process we get the expected log line informing about the commitlog shutdown:

node 1 db log:

Sep 07 10:57:44 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[7791]:  [shard  3] commitlog - Commitlog shutdown complete
Sep 07 10:57:44 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[7791]:  [shard  5] commitlog - Commitlog shutdown complete
Sep 07 10:57:44 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[7791]:  [shard 11] commitlog - Commitlog shutdown complete

However this message gets logged after the restart every few seconds:

node 1 db log:

Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  5] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  2] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 11] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 11] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  4] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  7] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 11] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 13] commitlog - Commitlog shutdown complete
Sep 07 10:59:57 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 13] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  8] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 12] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  8] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  8] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard 12] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  3] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  2] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  4] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  2] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  1] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  1] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  9] commitlog - Commitlog shutdown complete
Sep 07 11:00:07 longevity-mv-si-4d-2022-2-db-node-371f29c2-1 scylla[11779]:  [shard  9] commitlog - Commitlog shutdown complete

This continues on all the db nodes, through 7+ hours of the test, taking up ~85% of the db nodes log output:

╰─○ cat system.log | wc -l && cat system.log | grep "Commitlog shutdown complete" | wc -l
338698
288716

<<<<<<<

  • Restore Monitor Stack command: $ hydra investigate show-monitor 371f29c2-5d87-4a5a-a487-a974ebf7edd1
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 371f29c2-5d87-4a5a-a487-a974ebf7edd1

Logs:

Jenkins job URL

@michoecho
Copy link
Contributor

2e1ec36 added shutdown logging to the commitlog. It was probably overlooked that hinted handoff shuts down a commitlog every 10s. At:

return cptr->shutdown().finally([cptr] {

I have no idea about hints; I just followed the code up from the log message. So I don't have any fix ideas.

cc @xemul

@xemul
Copy link
Contributor

xemul commented Sep 9, 2022

@michoecho , I probably mis-understood the commitlog call sequence and put the logging into wrong place. The intention was to print something into logs when database shuts down its commitlog on scylla stop/abort. Feel free to remove this log line if you think it's not applicable there.

@kostja
Copy link
Contributor

kostja commented Sep 9, 2022

I see it in my tests as well.

@michoecho
Copy link
Contributor

@xemul I think you put it in the place you intended. The problem is, the main database commitlog isn't the only commitlog in the program. Hinted handoff reuses the implemention for it's own little transient commitlogs, which are destroyed and recreated (or just shut down and reopened, I don't know how exactly it works) every 10s, and this causes the log spam.

michoecho added a commit to michoecho/scylla that referenced this issue Sep 13, 2022
The intention was for these logs to be printed during the
database shutdown sequence, but it was overlooked that it's not
the only place where commitlog::shutdown is called.
Commitlogs are started and shut down periodically by hinted handoff.
When that happens, these messages spam the log.

Fix that by adding INFO commitlog shutdown logs to database::stop,
and change the level of the commitlog::shutdown log call to DEBUG.

Fixes scylladb#11508
@roydahan roydahan added this to the 5.1 milestone Sep 14, 2022
@michoecho
Copy link
Contributor

A fix (#11536) is already queued.

avikivity pushed a commit that referenced this issue Sep 18, 2022
The intention was for these logs to be printed during the
database shutdown sequence, but it was overlooked that it's not
the only place where commitlog::shutdown is called.
Commitlogs are started and shut down periodically by hinted handoff.
When that happens, these messages spam the log.

Fix that by adding INFO commitlog shutdown logs to database::stop,
and change the level of the commitlog::shutdown log call to DEBUG.

Fixes #11508

Closes #11536

(cherry picked from commit 9b6fc55)
@avikivity
Copy link
Member

Backported to 5.1. Earlier releases are unaffected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants