Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

sync --preserve checks mtime #35

Closed
wants to merge 1 commit into from

1 participant

@mdomsch
Owner

This causes an extra HEAD request for each remote file, which greatly
slows down execution, and increases monetary cost ($0.01/10000
requests), but guarantees files whose mtime has changed will get
resync'd.

This is necessary for yum repositories, where repodata/* files may be
updated but not change size. It also correctly handles large files
whose md5 values as returned by S3 are incorrect having their content
(and thus mtime) changed, perhaps by RPM signing.

@mdomsch mdomsch sync --preserve checks mtime
This causes an extra HEAD request for each remote file, which greatly
slows down execution, and increases monetary cost ($0.01/10000
requests), but guarantees files whose mtime has changed will get
resync'd.

This is necessary for yum repositories, where repodata/* files may be
updated but not change size.  It also correctly handles large files
whose md5 values as returned by S3 are incorrect having their content
(and thus mtime) changed, perhaps by RPM signing.
282136a
@mdomsch
Owner

we're trading a ton of local disk I/O to calculate md5 on each file, for a HEAD call to S3 for each file
we do get the LastModified (uploaded) time from S3 w/o the HEAD call
I wonder if we can simply look at files with mtimes newer than LastModified...
and assume if file mtime is newer than LastModified, then it needs to be updated.

For regularly occurring sync runs, I think that's valid...

@mdomsch
Owner

With this patch, syncing takes 10x longer. Probably the wrong approach then. Maybe LastModified as a proxy for mtime is good enough...

@mdomsch
Owner

Killing this pull request. What I've done elsewhere in my tree works better w/o the I/O penalty.

@mdomsch mdomsch closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Mar 2, 2012
  1. @mdomsch

    sync --preserve checks mtime

    mdomsch authored
    This causes an extra HEAD request for each remote file, which greatly
    slows down execution, and increases monetary cost ($0.01/10000
    requests), but guarantees files whose mtime has changed will get
    resync'd.
    
    This is necessary for yum repositories, where repodata/* files may be
    updated but not change size.  It also correctly handles large files
    whose md5 values as returned by S3 are incorrect having their content
    (and thus mtime) changed, perhaps by RPM signing.
This page is out of date. Refresh to see the latest.
Showing with 32 additions and 11 deletions.
  1. +1 −1  S3/Config.py
  2. +30 −2 S3/FileLists.py
  3. +1 −8 s3cmd
View
2  S3/Config.py
@@ -66,7 +66,7 @@ class Config(object):
enable_multipart = True
multipart_chunk_size_mb = 15 # MB
# List of checks to be performed for 'sync'
- sync_checks = ['size', 'md5'] # 'weak-timestamp'
+ sync_checks = ['size', 'mtime', 'md5'] # 'weak-timestamp'
# List of compiled REGEXPs
exclude = []
include = []
View
32 S3/FileLists.py
@@ -15,7 +15,7 @@
import os
import glob
-__all__ = ["fetch_local_list", "fetch_remote_list", "compare_filelists", "filter_exclude_include"]
+__all__ = ["fetch_local_list", "fetch_remote_list", "compare_filelists", "filter_exclude_include", "parse_attrs_header"]
def _fswalk_follow_symlinks(path):
'''
@@ -149,6 +149,13 @@ def _get_filelist_local(local_uri):
return local_list, single_file
+def parse_attrs_header(attrs_header):
+ attrs = {}
+ for attr in attrs_header.split("/"):
+ key, val = attr.split(":")
+ attrs[key] = val
+ return attrs
+
def fetch_remote_list(args, require_attribs = False, recursive = None):
def _get_filelist_remote(remote_uri, recursive = True):
## If remote_uri ends with '/' then all remote files will have
@@ -197,6 +204,14 @@ def _get_filelist_remote(remote_uri, recursive = True):
'object_uri_str' : object_uri_str,
'base_uri' : remote_uri,
}
+ if cfg.preserve_attrs:
+ objinfo = S3(cfg).object_info(S3Uri(object_uri_str))
+ if objinfo['headers'].has_key('x-amz-meta-s3cmd-attrs'):
+ attrs = parse_attrs_header(objinfo['headers']['x-amz-meta-s3cmd-attrs'])
+ if attrs.has_key('mtime'):
+ rem_list[key].update({
+ 'mtime':int(attrs['mtime'])
+ })
if break_now:
break
return rem_list
@@ -259,7 +274,13 @@ def _get_filelist_remote(remote_uri, recursive = True):
'md5': response['headers']['etag'].strip('"\''),
'timestamp' : dateRFC822toUnix(response['headers']['date'])
})
- remote_list[key] = remote_item
+ if response['headers'].has_key('x-amz-meta-s3cmd-attrs'): # we have the data, it costs nothing to hold onto it
+ attrs = parse_attrs_header(response['headers']['x-amz-meta-s3cmd-attrs'])
+ if attrs.has_key('mtime'):
+ remote_item.update({
+ 'mtime':int(attrs['mtime'])
+ })
+ remote_list[key] = remote_item
return remote_list
def compare_filelists(src_list, dst_list, src_remote, dst_remote):
@@ -295,6 +316,13 @@ def __direction_str(is_remote):
debug(u"XFER: %s (size mismatch: src=%s dst=%s)" % (file, src_list[file]['size'], dst_list[file]['size']))
attribs_match = False
+ ## Check mtime next
+ if 'mtime' in cfg.sync_checks:
+ if dst_list[file].has_key('mtime') and src_list[file].has_key('mtime'):
+ if dst_list[file]['mtime'] != src_list[file]['mtime']:
+ debug(u"XFER: %s (mtime mismatch: src=%s dst=%s)" % (file, src_list[file]['mtime'], dst_list[file]['mtime']))
+ attribs_match = False
+
## Check MD5
compare_md5 = 'md5' in cfg.sync_checks
# Multipart-uploaded files don't have a valid MD5 sum - it ends with "...-NN"
View
9 s3cmd
@@ -656,13 +656,6 @@ def cmd_sync_remote2remote(args):
info(outstr)
def cmd_sync_remote2local(args):
- def _parse_attrs_header(attrs_header):
- attrs = {}
- for attr in attrs_header.split("/"):
- key, val = attr.split(":")
- attrs[key] = val
- return attrs
-
s3 = S3(Config())
destination_base = args[-1]
@@ -747,7 +740,7 @@ def cmd_sync_remote2local(args):
response = s3.object_get(uri, dst_stream, extra_label = seq_label)
dst_stream.close()
if response['headers'].has_key('x-amz-meta-s3cmd-attrs') and cfg.preserve_attrs:
- attrs = _parse_attrs_header(response['headers']['x-amz-meta-s3cmd-attrs'])
+ attrs = parse_attrs_header(response['headers']['x-amz-meta-s3cmd-attrs'])
if attrs.has_key('mode'):
os.chmod(dst_file, int(attrs['mode']))
if attrs.has_key('mtime') or attrs.has_key('atime'):
Something went wrong with that request. Please try again.