backend_sync.cpp(342): copy blocks too long, flush #1105

Open
Cavinlz opened this issue Jul 6, 2017 · 1 comment

Comments


Cavinlz commented Jul 6, 2017

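Today, during data synchronization, the master server hung: client connections timed out and the server's CPU usage spiked. Checking ssdb.log showed the following: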

2017-07-06 17:29:33.292 [INFO ] backend_sync.cpp(342): copy blocks too long, flush
2017-07-06 17:29:33.293 [WARN ] server.cpp(319): long loop time: 3.520
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(130): (server IP):53243 fd: 570, send error: Broken pipe
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(139): Sync Client quit,  (server IP):53243 fd: 570, delete link
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(342): copy blocks too long, flush

Looking further at the system log, I found the following errors:

Jul  6 17:33:35  kernel: INFO: task ssdb-server:39772 blocked for more than 120 seconds.
Jul  6 17:33:35  kernel:      Not tainted 2.6.32-431.el6.x86_64 #1
Jul  6 17:33:35  kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  6 17:33:35  kernel: ssdb-server   D 000000000000000a     0 39772      1 0x00000080
Jul  6 17:33:35  kernel: ffff8808ec519cf0 0000000000000086 ffffffff810792c7 0000000000000000
Jul  6 17:33:35  kernel: ffff881071845408 ffff8808ec519ca8 ffffffff81054839 ffffea000000002a
Jul  6 17:33:35  kernel: ffff8808ec93e5f8 ffff8808ec519fd8 000000000000fbc8 ffff8808ec93e5f8
Jul  6 17:33:35  kernel: Call Trace:
Jul  6 17:33:35  kernel: [<ffffffff810792c7>] ? current_fs_time+0x27/0x30
Jul  6 17:33:35  kernel: [<ffffffff81054839>] ? __wake_up_common+0x59/0x90
Jul  6 17:33:35  kernel: [<ffffffff81529f85>] rwsem_down_failed_common+0x95/0x1d0
Jul  6 17:33:35  kernel: [<ffffffff81193d0a>] ? pipe_write+0x31a/0x6a0
Jul  6 17:33:35  kernel: [<ffffffff8152a116>] rwsem_down_read_failed+0x26/0x30
Jul  6 17:33:35  kernel: [<ffffffff8128e854>] call_rwsem_down_read_failed+0x14/0x30
Jul  6 17:33:35  kernel: [<ffffffff81529614>] ? down_read+0x24/0x30
Jul  6 17:33:35  kernel: [<ffffffff8104a92e>] __do_page_fault+0x18e/0x480
Jul  6 17:33:35  kernel: [<ffffffff8152d45e>] do_page_fault+0x3e/0xa0
Jul  6 17:33:35  kernel: [<ffffffff8152a815>] page_fault+0x25/0x30
Jul  6 17:33:35  kernel: INFO: task ssdb-server:39773 blocked for more than 120 seconds.
Jul  6 17:33:35  kernel:      Not tainted 2.6.32-431.el6.x86_64 #1
Jul  6 17:33:35  kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Could you advise what might be causing this? SSDB version: 1.9.4

Owner

ideawu commented Jul 18, 2017

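This log message means that, during replication, the master is reading from disk too slowly. The main causes are:

  1. The master is receiving a heavy volume of write requests, so the write pressure is too high.
  2. The master's database reads are too slow.

The fix for 1 is obvious. The fix for 2 is to find a time to run the compact command in ssdb-cli to compact the data (this may affect normal service).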
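
Before scheduling a compact, it can help to see how often the master actually hits this condition. The following is a minimal sketch that tallies the messages quoted in this issue. The line format is assumed from the ssdb.log excerpts above, and `summarize_sync_log` is a hypothetical helper name, not part of SSDB.

```python
import re
from collections import Counter

# Line format assumed from the excerpts quoted in this issue, e.g.
# "2017-07-06 17:29:33.292 [INFO ] backend_sync.cpp(342): copy blocks too long, flush"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\[(?P<level>\w+)\s*\] "
    r"(?P<src>[\w.]+)\((?P<line>\d+)\): (?P<msg>.*)$"
)
LOOP_RE = re.compile(r"long loop time: (?P<secs>\d+(?:\.\d+)?)")


def summarize_sync_log(lines):
    """Count replication-related symptoms in ssdb.log lines (hypothetical helper)."""
    counts = Counter()
    max_loop = 0.0
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that do not follow the assumed format
        msg = m.group("msg")
        if "copy blocks too long" in msg:
            counts["copy_blocks_too_long"] += 1
        elif "send error" in msg:
            counts["send_error"] += 1
        loop = LOOP_RE.search(msg)
        if loop:
            counts["long_loop"] += 1
            max_loop = max(max_loop, float(loop.group("secs")))
    return {"counts": dict(counts), "max_loop_seconds": max_loop}


# Sample taken from the log excerpt in this issue.
SAMPLE = """\
2017-07-06 17:29:33.292 [INFO ] backend_sync.cpp(342): copy blocks too long, flush
2017-07-06 17:29:33.293 [WARN ] server.cpp(319): long loop time: 3.520
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(130): (server IP):53243 fd: 570, send error: Broken pipe
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(139): Sync Client quit,  (server IP):53243 fd: 570, delete link
2017-07-06 17:29:33.294 [INFO ] backend_sync.cpp(342): copy blocks too long, flush
"""

if __name__ == "__main__":
    print(summarize_sync_log(SAMPLE.splitlines()))
```

If the "copy blocks too long" count keeps climbing while write traffic is moderate, that points toward cause 2 (slow reads) rather than cause 1, though this is only a rough heuristic.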
