ignore PreVote because the leader's lease is still valid #79

Closed
marsbible opened this issue Mar 29, 2019 · 12 comments
Labels: bug (Something isn't working), question (Further information is requested)

Comments

@marsbible

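Hi,
I'm currently working with two nodes. After several rounds of adding and removing nodes, the log below appeared. How can I recover from this situation? Thanks.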

2019-03-29 15:34:27,010 INFO
Bolt-default-executor-6-thread-16 - Node <test/127.0.0.1:8080> ignore PreVote from 127.0.0.1:8081 in term 5 currTerm 4, because the leader 127.0.0.1:8080's lease is still valid.

@fengjiachun
Contributor

fengjiachun commented Mar 29, 2019

Hello, this log line means that the current leader node rejected a pre-vote because the leader's lease is still valid. This is normal behavior, and there is nothing to recover.
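
For readers unfamiliar with the mechanism, here is a minimal sketch of how a lease check can cause a PreVote to be ignored. It is a simplified model, not sofa-jraft's actual code: the class and method names are hypothetical, the lease is assumed to last one election timeout, and the log up-to-date comparison that Raft also requires is left out.

```java
// Simplified model of a leader-lease check on PreVote requests.
// All names here are hypothetical; this is not sofa-jraft's actual code.
final class LeaseCheck {
    private final long electionTimeoutMs; // assumption: the lease lasts one election timeout
    private long lastLeaderContactMs;     // last time a live leader was observed

    LeaseCheck(long electionTimeoutMs, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.lastLeaderContactMs = nowMs;
    }

    // Called whenever the leader proves it is still alive, which renews the lease.
    void onLeaderContact(long nowMs) {
        this.lastLeaderContactMs = nowMs;
    }

    boolean isLeaseValid(long nowMs) {
        return nowMs - lastLeaderContactMs < electionTimeoutMs;
    }

    // Returns true if the PreVote is granted, false if it is ignored or rejected.
    // The log up-to-date comparison is omitted for brevity.
    boolean handlePreVote(long requestTerm, long currentTerm, long nowMs) {
        if (isLeaseValid(nowMs)) {
            return false; // leader is still considered alive: ignore the PreVote
        }
        return requestTerm >= currentTerm; // a stale term is rejected
    }
}
```

In this model, with a 5000 ms election timeout, a PreVote for term 5 (current term 4) that arrives 1000 ms after the last leader contact is ignored, which matches the situation in the log above. The same request arriving 6000 ms after the last contact would be granted.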

@fengjiachun fengjiachun added the question Further information is requested label Mar 29, 2019
@marsbible
Author

> Hello, this log line means that the current leader node rejected a pre-vote because the leader's lease is still valid. This is normal behavior, and there is nothing to recover.

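Right. I couldn't find a configuration option for this lease time, but it is indeed not a problem. However, an exception was thrown while syncing the snapshot. Could you take a look?
Let me describe my steps. My goal is to scale from one node to two and have the second node automatically catch up with the first node's progress.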

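  1. The first node's initialConf contains only itself. It is started and some data is written.
  2. The second node starts from scratch, with an initialConf that contains the first node and itself.
  3. The first node uses addPeer to add the second node. At this point the second node starts downloading the snapshot, but an exception is thrown (the same exception is thrown over and over, as if it were retrying frantically).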

1111

@fengjiachun
Contributor

Could you give more details? What is your OS? Could you paste your full jraft configuration?

@fengjiachun
Contributor

@marsbible https://github.com/alipay/sofa-jraft/tree/fixbug/get_file_eof
Could you try again with this branch?

@fengjiachun
Contributor

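My guess is that our tests did not cover the chunked snapshot installation scenario. If convenient, please paste your full jraft configuration.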

@marsbible
Author

> Could you give more details? What is your OS? Could you paste your full jraft configuration?

The OS is Windows. I can also try it on Linux.
The configuration is simple, with nothing special. The snapshot is just a custom binary file of about 2 to 3 MB.
data_dir: './raft/'
snapshot_interval: 60 # in second
election_timeout: 5 # in second
peers: ['127.0.0.1:8080','127.0.0.1:8081']
local_addr: '127.0.0.1:8080'
rpc_timeout: 5 # in second

@fengjiachun
Contributor

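OK, thanks. I'm reproducing your scenario now. I suggest you also test with the branch I linked above; it should fix the problem.
I also recommend testing jraft on Linux. We have not yet tested jraft on Windows; see #55 for details.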

@marsbible
Author

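> OK, thanks. I'm reproducing your scenario now. I suggest you also test with the branch I linked above; it should fix the problem.
> I also recommend testing jraft on Linux. We have not yet tested jraft on Windows; see #55 for details.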

I just tested on WSL Ubuntu 18.04 on Windows 10, and the problem is still there. I'll test with your branch shortly.

@fengjiachun
Contributor

@marsbible Yes, this is a bug. It is triggered when a snapshot is transferred in chunks. I fixed it in the branch above, so it should work now.
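
The actual fix is in the fixbug/get_file_eof branch and is not shown in this thread. Purely as an illustration of what chunked transfer with an end-of-file flag involves, here is a minimal, self-contained sketch. It reads from an in-memory byte array rather than over RPC, and every name in it is hypothetical rather than taken from sofa-jraft.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative chunked reader with an EOF flag; not sofa-jraft's implementation.
final class ChunkedReader {
    // One chunk of data plus a flag saying whether the source is exhausted.
    static final class Chunk {
        final byte[] data;
        final boolean eof;

        Chunk(byte[] data, boolean eof) {
            this.data = data;
            this.eof = eof;
        }
    }

    // Reads at most maxCount bytes starting at offset.
    // eof is true once the last byte of the source has been delivered.
    static Chunk readChunk(byte[] source, int offset, int maxCount) {
        if (offset < 0 || offset > source.length || maxCount <= 0) {
            throw new IllegalArgumentException("invalid offset or maxCount");
        }
        int len = Math.min(maxCount, source.length - offset);
        byte[] data = Arrays.copyOfRange(source, offset, offset + len);
        boolean eof = offset + len == source.length;
        return new Chunk(data, eof);
    }

    // Receiver side: keeps requesting chunks until one arrives with eof set.
    static byte[] copyAll(byte[] source, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int offset = 0;
        while (true) {
            Chunk chunk = readChunk(source, offset, chunkSize);
            out.write(chunk.data, 0, chunk.data.length);
            offset += chunk.data.length;
            if (chunk.eof) {
                return out.toByteArray();
            }
        }
    }
}
```

The point of the sketch is that the receiver's loop depends entirely on the EOF flag being set at the right moment. If the flag is computed incorrectly for a file that spans several chunks, the receiver either stops early or keeps requesting data, which is the kind of failure that only shows up once a snapshot is larger than a single chunk.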

@fengjiachun fengjiachun added the bug Something isn't working label Mar 29, 2019
@killme2008
Contributor

I'm embarrassed that our tests did not cover this bug. We will try to release a new version as soon as possible. Thanks for the feedback.

@marsbible
Author

marsbible commented Mar 29, 2019

I briefly tested the fixbug version and have not found any problems so far. Thanks.

@fengjiachun
Contributor

Our apologies. We will release v1.2.5 as soon as possible, most likely next Monday. Before that, we need to test it carefully.

@fengjiachun fengjiachun added this to the 1.2.5 milestone Mar 30, 2019
@fengjiachun fengjiachun added this to To do in v1.2.5 via automation Mar 30, 2019
@fengjiachun fengjiachun mentioned this issue Apr 1, 2019
4 tasks
v1.2.5 automation moved this from To do to Done Apr 1, 2019