Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gce: make append_or_replace.. atomic #49897

Merged
merged 1 commit into from
Aug 2, 2017
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
7 changes: 5 additions & 2 deletions cluster/gce/gci/configure-helper.sh
Original file line number Diff line number Diff line change
Expand Up @@ -239,10 +239,13 @@ function append_or_replace_prefixed_line {
local -r file="${1:-}"
local -r prefix="${2:-}"
local -r suffix="${3:-}"
local -r dirname="$(dirname ${file})"
local -r tmpfile="$(mktemp -t filtered.XXXX --tmpdir=${dirname})"

touch "${file}"
awk "substr(\$0,0,length(\"${prefix}\")) != \"${prefix}\" { print }" "${file}" > "${file}.filtered" && mv "${file}.filtered" "${file}"
echo "${prefix}${suffix}" >> "${file}"
awk "substr(\$0,0,length(\"${prefix}\")) != \"${prefix}\" { print }" "${file}" > "${tmpfile}"
echo "${prefix}${suffix}" >> "${tmpfile}"
mv "${tmpfile}" "${file}"
Copy link
Member

@liggitt liggitt Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still not threadsafe... if two processes both write to different tmp files, one's mv back to the original can stomp the other's change. where are we calling this from multiple threads on a single file?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said in #49895 we're seeing corruption in this file - how exactly that happens is still unclear - this should at least prevent corruption. You're right in assessing that we now might loose changes to this file, I tend to believe that is a lesser evil.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's one theory. We are scrambling to find a fix for #49895. Copied from that bug:

Very rarely we are seeing known tokens files filled with the NULL character, with ASCII value 000. Restarting the master doesn't fix the issue because the known_tokens.csv is saved between restarts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, as long as we're clear that this isn't actually making the bash function atomic. the other thing this change does is make it so that a partially failed run of the awk command will still overwrite the file. previously, the mv command would only run if the awk command succeeded.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, as long as we're clear that this isn't actually making the bash function atomic

I'm considering adding an flock around "${file}".

previously, the mv command would only run if the awk command succeeded.

This behavior should still be preserved since the script runs with set -o errexit.

}

function create-node-pki {
Expand Down