forked from torvalds/linux
Commit
Now that all callers of get_user_pages*() have been updated to use put_user_page() instead of put_page(), add tracking of such "gup-pinned" pages. The purpose of this tracking is to answer the question "has this page been pinned by a call to get_user_pages()?"

In order to answer that, refcounting is required. get_user_pages() and all its variants increment a reference count, and put_user_page() and its variants decrement that reference count. If the net count is *effectively* non-zero (see below), then the page is considered gup-pinned.

What to do in response to encountering such a page is left to later patchsets. There is discussion about this in [1], and in an upcoming patch that adds:

    Documentation/vm/get_user_pages.rst

So, this patch simply adds tracking of such pages. In order to achieve this without using up any more bits or fields in struct page, the page->_refcount field is overloaded. gup pins are incremented by adding a large chunk (1024) instead of 1. This provides a way to say, "either this page is gup-pinned, or you have a *lot* of references on it, and thus this is a false positive". False positives are generally OK, as long as they are expected to be rare: taking action for a page that looks gup-pinned, but is not, is not going to be a problem. It's false negatives (failing to detect a gup-pinned page) that would be a problem, and those won't happen with this approach.

This takes advantage of two distinct, pre-existing lock-free algorithms:

a) get_user_pages() and things such as page_mkclean() both operate on page table entries without taking locks. This relies partly on letting the CPU hardware (which of course also never takes locks to use its own page tables) take page faults if something has changed.

b) page_cache_get_speculative(), called by get_user_pages(), is a way to avoid having pages get freed out from under get_user_pages() or other things that want to pin pages.
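The biased counting described above can be modeled in userspace. The following is only a minimal sketch, not the kernel code from this patch: struct model_page and every model_*() helper are hypothetical names made up for illustration, and the constant assumes that GUP_PIN_COUNTING_BIAS equals the 1024 mentioned in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the bias is the "large chunk (1024)" from the commit message. */
#define GUP_PIN_COUNTING_BIAS 1024

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct model_page {
	int refcount;
};

/* An ordinary reference adds 1, as get_page()/put_page() would. */
static inline void model_get_page(struct model_page *p)
{
	p->refcount += 1;
}

static inline void model_put_page(struct model_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* A gup pin adds the whole bias to the same counter. */
static inline void model_gup_pin(struct model_page *p)
{
	p->refcount += GUP_PIN_COUNTING_BIAS;
}

/* Releasing a gup pin removes the bias; refuse to underflow. */
static inline void model_put_user_page(struct model_page *p)
{
	assert(p->refcount >= GUP_PIN_COUNTING_BIAS);
	p->refcount -= GUP_PIN_COUNTING_BIAS;
}

/*
 * "Effectively non-zero": at least one bias worth of references.
 * True for every pinned page, and also for an unpinned page that
 * happens to hold 1024 or more ordinary references (a false positive).
 */
static inline bool model_page_gup_pinned(const struct model_page *p)
{
	return p->refcount >= GUP_PIN_COUNTING_BIAS;
}
```

In this model a page with one ordinary reference and one pin has a count of 1025 and reports as pinned, while a page that only ever takes 1024 ordinary references also reports as pinned. The second case is the rare false positive that the message accepts in exchange for never missing a real pin.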
Because both of these mechanisms already exist and are lock-free, performance is not expected to change in any noticeable way with this patch.

This includes the following fix from Ira Weiny:

DAX requires detection of a page crossing to a ref count of 1. Fix this for GUP pages by introducing put_devmap_managed_user_page(), which accounts for the GUP_PIN_COUNTING_BIAS now used by GUP.

Tracking: add several new /proc/vmstat items, to provide some visibility into what get_user_pages() and put_user_page() are doing.

    $ cat /proc/vmstat | grep gup
    nr_gup_slow_pages_requested 4842
    nr_gup_fast_pages_requested 262718
    nr_gup_fast_page_backoffs 0
    nr_gup_page_count_overflows 0
    nr_gup_page_count_neg_overflows 0
    nr_gup_pages_returned 267560

Interpretation of the above:

    Total gup requests (slow + fast): 267560
    Total put_user_page calls:        267560

Normally, those last two numbers should be equal, but a couple of things may cause them to differ:

1) Inherent race condition in reading /proc/vmstat values.

2) Bugs at any of the get_user_pages*() call sites. Those sites need to match get_user_pages() and put_user_page() calls.

[1] https://lwn.net/Articles/753027/ "The trouble with get_user_pages()"

Suggested-by: Jan Kara <jack@suse.cz>
Suggested-by: Jérôme Glisse <jglisse@redhat.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Cc: Christian Benvenuti <benve@cisco.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Tom Talpey <tom@talpey.com>
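The "requested versus returned" comparison from the vmstat interpretation in the message can be checked mechanically. The sketch below is a hypothetical userspace helper, not part of this patch: vmstat_value() and gup_outstanding() are made-up names, and it assumes the counter names and the "name value" line layout shown in the sample output.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find "name value" in a /proc/vmstat-style buffer.
 * Returns 0 and stores the value on success, -1 if the counter is absent.
 */
static int vmstat_value(const char *buf, const char *name, long long *out)
{
	size_t len = strlen(name);
	const char *line = buf;

	while (line && *line) {
		/* Require a space after the name so prefixes do not match. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ' &&
		    sscanf(line + len, "%lld", out) == 1)
			return 0;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/*
 * Compute (slow + fast requested) - returned.
 * Returns 0 on success, -1 if any of the three counters is missing.
 */
static int gup_outstanding(const char *buf, long long *diff)
{
	long long slow, fast, returned;

	if (vmstat_value(buf, "nr_gup_slow_pages_requested", &slow) ||
	    vmstat_value(buf, "nr_gup_fast_pages_requested", &fast) ||
	    vmstat_value(buf, "nr_gup_pages_returned", &returned))
		return -1;

	*diff = slow + fast - returned;
	return 0;
}
```

For the sample output in the message, 4842 + 262718 - 267560 gives a difference of 0. A non-zero difference does not by itself prove a bug, since the message notes that reading /proc/vmstat is inherently racy.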
1 parent d5e8e65, commit a0fb73c
Showing 11 changed files with 224 additions and 33 deletions.