New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backend/local: Avoid polluting page cache when uploading local files to remote backends #3413
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very neat - thank you :-)
I put some comments inline.
What do you think the possible negative consequences of this are? I see fadvise is a POSIX standard so likely this works well on all unix-like platforms.
How did you measure the memory usage improvement with pcstat
- running it repeatedly on a file you were copying?
Any chance you could move the excellent documentation you wrote in the commit to the relevant places in the source code?
Thank you
backend/local/fadvise_unix.go
Outdated
l = limit | ||
} | ||
if err := unix.Fadvise(f.fd, f.curPos, l, unix.FADV_SEQUENTIAL); err != nil { | ||
fs.Infof(f.o, "fadvise sequential failed on file descriptor %d: %s", f.fd, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be fs.Errorf
if we really want the user to see it, or fs.Debugf
if we don't care. I think probably the latter - what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing that to fs.Debugf
.
backend/local/fadvise_unix.go
Outdated
|
||
func (f *fadvise) freePages() { | ||
if err := unix.Fadvise(f.fd, f.lastPos, f.curPos-f.lastPos, unix.FADV_DONTNEED); err != nil { | ||
fs.Infof(f.o, "fadvise dontneed failed on file descriptor %d: %s", f.fd, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
backend/local/fadvise_unix.go
Outdated
}, | ||
inner: f, | ||
} | ||
r.sequential(limit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this maybe be bombing out here and just return f
if r.sequential
fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a good idea.
…to remote backends This patch makes rclone keep linux page cache usage under control when uploading local files to remote backends. When opening a file it issues FADV_SEQUENTIAL to configure read ahead strategy. While reading the file it issues FADV_DONTNEED every 128kB to free page cache from already consumed pages. ``` fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0 read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768 read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768 read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768 read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768 fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0 read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768 read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768 read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768 read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768 fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0 read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768 read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768 read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768 read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768 fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0 ``` Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat). This patch does not have a performance impact. Please find below results of an experiment comparing local copy of 1GB file with and without this patch. With the patch: ``` (mmt/fadvise)$ pcstat 1GB.bin.1 +-----------+----------------+------------+-----------+---------+ | Name | Size (bytes) | Pages | Cached | Percent | |-----------+----------------+------------+-----------+---------| | 1GB.bin.1 | 1073741824 | 262144 | 0 | 000.000 | +-----------+----------------+------------+-----------+---------+ (mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone" User time (seconds): 13.19 System time (seconds): 1.12 Percent of CPU this job got: 96% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 27660 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 2212 Voluntary context switches: 5755 Involuntary context switches: 9782 Swaps: 0 File system inputs: 4155264 File system outputs: 2097152 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 (mmt/fadvise)$ pcstat 1GB.bin.1 +-----------+----------------+------------+-----------+---------+ | Name | Size (bytes) | Pages | Cached | Percent | |-----------+----------------+------------+-----------+---------| | 1GB.bin.1 | 1073741824 | 262144 | 0 | 000.000 | +-----------+----------------+------------+-----------+---------+ ``` Without the patch: ``` (master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone" User time (seconds): 14.46 System time (seconds): 0.81 Percent of CPU this job got: 93% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 27600 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 2228 Voluntary context switches: 7190 Involuntary context switches: 1980 Swaps: 0 File system inputs: 2097152 File system outputs: 2097152 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 (master)$ pcstat 1GB.bin.1 +-----------+----------------+------------+-----------+---------+ | Name | Size (bytes) | Pages | Cached | Percent | |-----------+----------------+------------+-----------+---------| | 1GB.bin.1 | 1073741824 | 262144 | 262144 | 100.000 | +-----------+----------------+------------+-----------+---------+ ```
That looks great - will merge now, thank you very much. |
What is the purpose of this change?
This patch makes rclone keep linux page cache usage under control when
uploading local files to remote backends. When opening a file it issues
FADV_SEQUENTIAL to configure read ahead strategy. While reading
the file it issues FADV_DONTNEED every 128kB to free page cache from
already consumed pages.
The read pattern is the following, the 32768 buffer size comes from io.Copy function.
Page cache consumption per file can be checked with tools like pcstat.
Was the change discussed in an issue or in the forum before?
No.
Checklist