-
Notifications
You must be signed in to change notification settings - Fork 426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RATIS-523 RATIS-524 RATIS-525 RATIS-526 RATIS-527 RATIS-533 Lots of cleanup on the LogService #18
Conversation
b59d0a7
to
f0d8ea8
Compare
Was getting following exception in the service May 02, 2019 3:22:56 AM org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue While the verification tool is saying that writes are successful. |
RaftClientConfigKeys.Rpc.setRequestTimeout(properties, TimeDuration.valueOf(100, TimeUnit.SECONDS)); | ||
|
||
// Increase the segment size to avoid rolling so quickly | ||
SizeInBytes segmentSize = SizeInBytes.valueOf("32MB"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why hardcoding? can't we do it through some config? or it is just default and will be overridden in later step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one was weird. I was playing with making it configurable, but changes to it had some really harmful implications (things were just hanging when I increased this, for example).
Edit: was thinking about the write-buffer size. I don't think I played around with this.
I left it here as something for us to come back to later, but not something I wanted to have people tweak. Maybe it's OK to still let this be configurable? (with a big-fat-warning :P)
TimeDuration leaderElectionMaxTimeout = TimeDuration.valueOf( | ||
leaderElectionTimeoutMin.toLong(TimeUnit.MILLISECONDS) + 200, | ||
TimeUnit.MILLISECONDS); | ||
RaftServerConfigKeys.Rpc.setTimeoutMax(properties, leaderElectionMaxTimeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, why are we hardcoding these values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a copy-paste out of Ozone. I don't have a good understand of why they chose these.
Was going for just calling out things we want to set in Ratis, but leaving the actual configuration to the work Vlad is going to get to.
a snapshot is an equivalent representation of all of the updates from the log. However, the LogService | ||
is written to expect that we maintain these records. As such, we must not allow snapshots to automatically | ||
happen as we may lose records from teh RAFT log. `RaftServerConfigKeys.Snapshot.setAutoTriggerEnabled()` | ||
defaults to `false` and should not be set to `true`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good tuning guide.
But Snapshot.setAutoTriggerEnabled() config doesn't look like a tuning, as it impacts the log service, shouldn't we throw an exception if it is set to true or explicitly set to false in code(just to avoid human errors)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we throw an exception if it is set to true or explicitly set to false in code(just to avoid human errors)
Very good suggestion. Let me change that.
Yeah, I've seen this too, but I don't know what group this is actually referring to. Something else to dig into. I was ignoring it for now.
I'm guessing something around gRPC in the Ratis client code isn't closing stuff correctly in all situations. Another thing to come back to as it doesn't seem to be an immediate problem :) I've seen both of these since doing stuff in the LogService. I don't think they're new. |
* Add lots of new options * Fix LOG/stdout use * Remove logs if they already exist before writing
Include brief summary from the Jira issue as to what/why we changed, things we might want to tune in the future, and validate that config isn't set which would break the log service.
f0d8ea8
to
538ef5e
Compare
Rebased the branch. 538ef5e has some basic validation to make sure we don't have |
The refactor and new docker scripts for docker handling are good @joshelser. |
Thanks for looking, Rajeshbabu! |
Include brief summary from the Jira issue as to what/why we changed, things we might want to tune in the future, and validate that config isn't set which would break the log service. Closes apache#18
Lots of various things that help clean up the logservice docker integration. Things seem to be repeatable now that I'm using docker on a linux machine (instead of docker for mac).
cc/ @ankitsinghal @VladRodionov @chrajeshbabu