Redefining SURVIVABLE_NONATOMIC to use atomic writes but to skip fsync(2) #163
src/main/java/org/jenkinsci/plugins/workflow/flow/FlowDurabilityHint.java
There is a problem with core Javadoc. I am looking into it.
The title is incorrect: writes are definitely not atomic with this change (writes are never atomic).
The move of the file after writing may be atomic, but it also may not be. This depends on both the operating system and the file system, and the atomicity of the move (overwrite/replace) is best effort.
Thus it is not correct to say it will tolerate JVM crashes. It may for some users of some operating systems on some filesystems.
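To illustrate the point about best-effort atomicity, here is a minimal sketch using the standard `java.nio.file` API. It is not the code under review; the class name `AtomicMoveSketch` and the method `replace` are hypothetical. It only shows that a caller requesting an atomic move has to be prepared for the platform to refuse it.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; not taken from the plugin or from Jenkins core.
public class AtomicMoveSketch {

    /**
     * Moves source over target, preferring an atomic rename.
     * Returns true if the atomic move was accepted, false if the
     * platform refused it and a plain (non-atomic) replace was used.
     */
    static boolean replace(Path source, Path target) throws IOException {
        try {
            // Whether an existing target is replaced here is implementation
            // specific according to the Files.move documentation.
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (AtomicMoveNotSupportedException e) {
            // The OS/filesystem combination cannot do this atomically,
            // so a crash mid-move may leave a partial or missing target.
            Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            return false;
        }
    }
}
```

The boolean return value is only there so a caller could log which path was taken; whether the fallback is acceptable at all is exactly the question raised above.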
Update comments to reflect what is actually happening and that this is both OS and filesystem dependent.
@@ -1,6 +1,6 @@
FlowDurabilityHint.PERFORMANCE_OPTIMIZED.description=Performance-optimized: much faster (requires clean shutdown to save running pipelines)
FlowDurabilityHint.PERFORMANCE_OPTIMIZED.tooltip=Avoids writing data with every step, avoids atomic writes of data. Pipelines can resume if Jenkins shuts down cleanly, but running pipelines lose step information and cannot resume if Jenkins unexpectedly fails.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (specialty use only)
FlowDurabilityHint.SURVIVABLE_NONATOMIC.tooltip=Writes data with every step but avoids atomic writes. On some filesytems this is faster than maximum durability mode, but running pipeline data may be lost if disk writes are interrupted or fail.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (requires stable OS and storage but tolerates dirty JVM shutdown)
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (requires stable OS and storage but tolerates dirty JVM shutdown)
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (requires stable OS and storage but may tolerate a dirty JVM shutdown if certain OS/Filesystem combinations are met)
It is important to let users know that this will not always tolerate a JVM shutdown. It depends on OS/filesystem support for atomic moves (POSIX compliance). I am not going to provide a list of combinations that do or do not support this. If some guidance is necessary, the requirement is generally a local block device with a reasonably modern file system on a Unix-like operating system.
It may be that, because this is Pipeline and these small files are only ever written and never modified, things are better than they would be when updating something. But if anything else in Pipeline is written and would assume that the file is there (program.dat?), then I would expect you are still in the realms of dragons.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (specialty use only)
FlowDurabilityHint.SURVIVABLE_NONATOMIC.tooltip=Writes data with every step but avoids atomic writes. On some filesytems this is faster than maximum durability mode, but running pipeline data may be lost if disk writes are interrupted or fail.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.description=Less durability, a bit faster (requires stable OS and storage but tolerates dirty JVM shutdown)
FlowDurabilityHint.SURVIVABLE_NONATOMIC.tooltip=Writes data with every step but avoids flushing the page cache to the storage device. On some filesytems this is faster than maximum durability mode, but running pipeline data may be lost if disk writes are interrupted or fail.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.tooltip=Writes data with every step but avoids flushing the page cache to the storage device. On some filesytems this is faster than maximum durability mode, but running pipeline data may be lost if disk writes are interrupted or fail.
FlowDurabilityHint.SURVIVABLE_NONATOMIC.tooltip=Writes data with every step but avoids flushing the page cache for the specific file to the storage device. On some filesytems this is faster than maximum durability mode, but running pipeline data may be lost if disk writes are interrupted or fail, or if the JVM terminates abruptly in some OS and filesystem combinations.
@basil are you still interested in this PR? Do you want to look at the suggestions?
FYI I have recently been checking behavior of Jenkins when restored from EBS snapshots (using default durability settings) and do occasionally see weird cases of stray
I do not accept your suggestions.
See JENKINS-66001 and jenkinsci/jenkins#5599. Part 2 of a 5-part series to make Pipeline's SURVIVABLE_NONATOMIC mode behave as advertised in the documentation (parts 3 through 5 are in jenkinsci/workflow-support-plugin#120, jenkinsci/workflow-cps-plugin#452, and jenkinsci/workflow-job-plugin#199 respectively).

I tested this by running a Pipeline job with 100 steps in SURVIVABLE_NONATOMIC mode while simultaneously attaching a remote debugger to XmlFile and monitoring fsync(2) calls with syncsnoop.bt. I confirmed that AtomicFileWriter was being used in the Java debugger, and I confirmed that fsync(2) was not being used with syncsnoop.bt.
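For readers unfamiliar with the pattern, the general idea of "atomic rename without fsync" can be sketched as follows. This is an illustration under assumptions, not the `AtomicFileWriter` implementation: the class `NoFsyncWriteSketch`, the method `write`, and the `sync` flag are hypothetical names. On Linux, `FileChannel.force(true)` typically results in an fsync(2) call, which is the call that a tool such as syncsnoop.bt would observe.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the Jenkins AtomicFileWriter.
public class NoFsyncWriteSketch {

    /**
     * Writes content to a temporary file next to target, optionally
     * forces it to the storage device, then renames it over target.
     */
    static void write(Path target, String content, boolean sync) throws IOException {
        // Same directory as the target, so the rename stays on one filesystem.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "atomic", ".tmp");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            if (sync) {
                // Flush file data and metadata to the device.
                // Skipping this leaves the data in the OS page cache only.
                ch.force(true);
            }
        }
        // May throw AtomicMoveNotSupportedException on some platforms.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With `sync` set to `false`, a JVM crash after the rename should still leave a complete file visible on platforms where the rename is atomic, because the page cache belongs to the OS rather than to the JVM; an OS crash or power loss gives no such assurance.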