Skip to content

Rename atomicWrite to writeToCompletion#5273

Merged
fuweid merged 1 commit intocontainerd:masterfrom
wzshiming:fix/local-store-atomic-write
May 27, 2021
Merged

Rename atomicWrite to writeToCompletion#5273
fuweid merged 1 commit intocontainerd:masterfrom
wzshiming:fix/local-store-atomic-write

Conversation

@wzshiming
Copy link
Contributor

@wzshiming wzshiming commented Mar 26, 2021

The function does not guarantee atomicity, because the same temporary file will be opened multiple times.

func atomicWrite(path string, data []byte, mode os.FileMode) error {
tmp := fmt.Sprintf("%s.tmp", path)
f, err := os.OpenFile(tmp, os.O_RDWR|os.O_CREATE|os.O_TRUNC|os.O_SYNC, mode)
if err != nil {
return errors.Wrap(err, "create tmp file")
}
_, err = f.Write(data)
f.Close()
if err != nil {
return errors.Wrap(err, "write atomic data")
}
return os.Rename(tmp, path)
}

Should be changed to write a random temporary file Instead of writing a fixed temporary file.
Like this https://github.com/containerd/continuity/blob/93e15499afd51cdc830773dc804c89c11673fc63/ioutils.go#L34-L63
Signed-off-by: Shiming Zhang wzshiming@foxmail.com

@k8s-ci-robot
Copy link

Hi @wzshiming. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wzshiming wzshiming force-pushed the fix/local-store-atomic-write branch from 163efd8 to 6bff768 Compare March 26, 2021 10:03
@fuweid
Copy link
Member

fuweid commented Mar 26, 2021

Sorry I didn't get your point. could you mind to explain more? thanks

@wzshiming
Copy link
Contributor Author

wzshiming commented Mar 26, 2021

Sorry I didn't get your point. could you mind to explain more? thanks

@fuweid
Should be changed to write a random temporary file Instead of writing a fixed temporary file.
Like this https://github.com/containerd/continuity/blob/93e15499afd51cdc830773dc804c89c11673fc63/ioutils.go#L34-L63

@theopenlab-ci
Copy link

theopenlab-ci bot commented Mar 26, 2021

Build succeeded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be changed to write a random temporary file Instead of writing a fixed temporary file.

The path will be locked by content store. In content store, I think it is safe to use %s.tmp in here.

Anyway, I am ok to align with continuity package. :) And why not just use the existing function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it does not concurrent writing of %s.tmp, then there is actually no need to atomic write, right?
I found that it is to fix some failures in Windows CI #3324
However, this modification cannot guarantee Linux atomic writing, nor does it guarantee windows atomic writing.
It's just a temporary solution that can pass CI.
I think it is necessary to implement the atomic write of Windows and then use it here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The atomic write is to make sure that the content is completeness. Sometimes, the disk has pressure and the content will be truncated somehow. The atomic write is like transaction, not just for concurrent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I understand, i found the original PR #2580.
it was forcibly interrupted when the file was just opened and no data was written. error caused by empty data when reading.
Although I think we should deal with empty data in this case, instead of guaranteeing to write data here, but thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The atomicWrite is used to ensure the content should be completeness, not empty or part of it. :)

@wzshiming wzshiming requested a review from fuweid March 28, 2021 14:00
@mikebrow
Copy link
Member

mikebrow commented Mar 29, 2021

The function does not guarantee atomicity, because the same temporary file will be opened multiple times

Any two attempts to write to the same path (for example: async close operations with updated timestamps for the close operation) should already be atomic using sem locks. As Fu Wei says the "atomic" part of this call is to ensure if the renamed file (path) exists it has the contents (timestamp). I don't think it would hurt to use a unique random file name extension here, but I also don't believe it is necessary, given the callers are in charge of making sure the write to path is atomic. Maybe we should rename the call to writeToCompletion()

@dmcgowan
Copy link
Member

Renaming is fine is to prevent unintended use of the function. I don't think this change is necessary and if it is, then there is a larger problem to solve with the ref lock. If the locks aren't working and there is a race at the filesystem level, the behavior should already be considered undefined.

If there was a "bad resume" detected as indicated by the above Info level log message, then this change would necessitate cleaning up any of the relevant tmp files since the new attempt to recreate would no longer handle it (now it just truncs and retries). We should only rely on cleanup (via Abort or Commit) to handle removal of known files or anything unexpectedly placed in the directory, our goal is always to keep the filesystem as clean as possible at all times.

@wzshiming wzshiming force-pushed the fix/local-store-atomic-write branch 2 times, most recently from 6d871f2 to 648e0db Compare March 30, 2021 02:14
@wzshiming wzshiming changed the title Fix local/store atomicWrite Rename atomicWrite to writeToCompletion Mar 30, 2021
@wzshiming
Copy link
Contributor Author

wzshiming commented Mar 30, 2021

@fuweid @mikebrow @dmcgowan
Thanks, i got it. Updated this PR to rename atomicWrite to writeToCompletion.

@theopenlab-ci
Copy link

theopenlab-ci bot commented Mar 30, 2021

Build succeeded.

@theopenlab-ci
Copy link

theopenlab-ci bot commented Mar 30, 2021

Build succeeded.

@wzshiming wzshiming force-pushed the fix/local-store-atomic-write branch from 648e0db to c097a66 Compare April 15, 2021 10:15
@theopenlab-ci
Copy link

theopenlab-ci bot commented Apr 15, 2021

Build succeeded.

@fuweid
Copy link
Member

fuweid commented May 27, 2021

please rebase, thanks~

Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
@wzshiming wzshiming force-pushed the fix/local-store-atomic-write branch from c097a66 to 9fcea1d Compare May 27, 2021 10:36
@theopenlab-ci
Copy link

theopenlab-ci bot commented May 27, 2021

Build succeeded.

@fuweid fuweid merged commit c8cbf79 into containerd:master May 27, 2021
@wzshiming wzshiming deleted the fix/local-store-atomic-write branch May 28, 2021 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants