[Bug]: Standalone restarted once during test #27145
Also failed for standalone with Kafka as the MQ. Log:
I don't see any errors in the logs, except the following one:
/assign @czs007
I don't see a complete log. What is the test case?
/assign @yiwangdr
Timeline:
This segment
Note that
^ This log is 1 sec before the crash. So the DataNode was trying to flushBufferData for a sealed segment.
Complete log from standalone, FYI:
The DataNode failed to fetch metadata for a sealed segment. This issue is probably caused by #27063, which introduced changes to both flush and channel meta. More investigation is needed.
The DataNode was going to sync this segment; however, before the sync executed, the segment had been compacted, leading to:
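The race described in this comment can be sketched as follows. This is a minimal illustration only; the type and function names (`Segment`, `SegmentState`, `syncSegment`) are hypothetical and are not Milvus's actual DataNode API:

```go
package main

import "fmt"

// SegmentState is a hypothetical, simplified state enum for illustration.
type SegmentState int

const (
	Sealed  SegmentState = iota // sealed and awaiting sync/flush
	Dropped                     // compacted away; metadata no longer available
)

// Segment is a hypothetical stand-in for a DataNode segment record.
type Segment struct {
	ID    int64
	State SegmentState
}

// syncSegment illustrates the guard that is missing in the race: if
// compaction dropped the segment between scheduling the sync and running
// it, the sync must be skipped rather than failing on missing metadata.
func syncSegment(seg *Segment) error {
	if seg.State == Dropped {
		// Compaction ran before this sync task executed; skip the segment.
		return fmt.Errorf("segment %d already dropped by compaction", seg.ID)
	}
	// ... flush buffered data for the sealed segment here ...
	return nil
}

func main() {
	// Segment ID taken from the compaction log line quoted below.
	seg := &Segment{ID: 444280681153630609, State: Sealed}
	fmt.Println(syncSegment(seg)) // sealed: sync proceeds, prints <nil>

	seg.State = Dropped // compaction happened before the sync executed
	fmt.Println(syncSegment(seg)) // dropped: sync skipped with an error
}
```

Without such a check, the sync path tries to fetch metadata for a segment that compaction has already marked dropped, matching the failure reported above.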
Before #27063, a sealed segment was only synced every 10 minutes; after #27063, a sealed segment is synced when
A possible quick fix:
I tried to revert most of #27063 and the tests passed.
It prevents false positives, but the downside is that we may introduce true negatives. @yanliang567 @bigsheeper @czs007 Is this PR critical to the 2.3.1 release? If not, I'd prefer reverting the PR to unblock the release.
[2023/09/15 16:49:02.277 +00:00] [DEBUG] [datacoord/meta.go:1110] ["meta update: alter meta store for compaction updates"] ["compact from segments (segments to be updated as dropped)"="[444280681153630609,444280681155031143,444280681155031012,444280681155030896]"] ["new segmentID"=444280681153633878] [binlog=7] ["stats log"=1] ["delta logs"=4] ["compact to segment"=444280681153633878] |
At 16:49, the segment was compacted. I agree that #27152 might not be the root cause, but flush is triggered more often than before.
/assign @zhuwenxing
Is there an existing issue for this?
Environment
Current Behavior
Some tests failed because the standalone pod restarted.
Expected Behavior
Steps To Reproduce
No response
Milvus Log
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_for_release_cron/detail/deploy_test_for_release_cron/909/pipeline
log:
artifacts-rocksmq-standalone-reinstall-909-pytest-logs.tar.gz
artifacts-rocksmq-standalone-reinstall-909-server-logs (1).tar.gz
Anything else?
No response