@@ -716,19 +716,24 @@ calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU
 critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`,
 before releasing the RCU lock via :c:func:`!rcu_read_unlock`.
 
-VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for
-their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it
-via :c:func:`!vma_end_read`.
+In cases when the user already holds the mmap read lock, :c:func:`!vma_start_read_locked`
+and :c:func:`!vma_start_read_locked_nested` can be used. These functions do not
+fail due to lock contention but the caller should still check their return values
+in case they fail for other reasons.
+
+VMA read locks increment the :c:member:`!vma.vm_refcnt` reference counter for their
+duration and the caller of :c:func:`!lock_vma_under_rcu` must drop it via
+:c:func:`!vma_end_read`.
 
 VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a
 VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always
 acquired. An mmap write lock **must** be held for the duration of the VMA write
 lock, releasing or downgrading the mmap write lock also releases the VMA write
 lock so there is no :c:func:`!vma_end_write` function.
 
-Note that a semaphore write lock is not held across a VMA lock. Rather, a
-sequence number is used for serialisation, and the write semaphore is only
-acquired at the point of write lock to update this.
+Note that when a VMA is write-locked, the :c:member:`!vma.vm_refcnt` is temporarily
+modified so that readers can detect the presence of a writer. The reference counter is
+restored once the VMA sequence number used for serialisation is updated.
 
 This ensures the semantics we require - VMA write locks provide exclusive write
 access to the VMA.
@@ -738,7 +743,7 @@ Implementation details
 
 The VMA lock mechanism is designed to be a lightweight means of avoiding the use
 of the heavily contended mmap lock. It is implemented using a combination of a
-read/write semaphore and sequence numbers belonging to the containing
+reference counter and sequence numbers belonging to the containing
 :c:struct:`!struct mm_struct` and the VMA.
 
 Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic
@@ -779,28 +784,31 @@ release of any VMA locks on its release makes sense, as you would never want to
 keep VMAs locked across entirely separate write operations. It also maintains
 correct lock ordering.
 
-Each time a VMA read lock is acquired, we acquire a read lock on the
-:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that
-the sequence count of the VMA does not match that of the mm.
+Each time a VMA read lock is acquired, we increment the :c:member:`!vma.vm_refcnt`
+reference counter and check that the sequence count of the VMA does not match
+that of the mm.
 
-If it does, the read lock fails. If it does not, we hold the lock, excluding
-writers, but permitting other readers, who will also obtain this lock under RCU.
+If it does, the read lock fails and :c:member:`!vma.vm_refcnt` is dropped.
+If it does not, we keep the reference counter raised, excluding writers, but
+permitting other readers, who can also obtain this lock under RCU.
 
 Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu`
 are also RCU safe, so the whole read lock operation is guaranteed to function
 correctly.
 
-On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`
-read/write semaphore, before setting the VMA's sequence number under this lock,
-also simultaneously holding the mmap write lock.
+On the write side, we set a bit in :c:member:`!vma.vm_refcnt` which can't be
+modified by readers and wait for all readers to drop their reference count.
+Once there are no readers, the VMA's sequence number is set to match that of
+the mm. During this entire operation the mmap write lock is held.
 
 This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep
 until these are finished and mutual exclusion is achieved.
 
-After setting the VMA's sequence number, the lock is released, avoiding
-complexity with a long-term held write lock.
+After setting the VMA's sequence number, the bit in :c:member:`!vma.vm_refcnt`
+indicating a writer is cleared. From this point on, the VMA's sequence number will
+indicate the VMA's write-locked state until the mmap write lock is dropped or downgraded.
 
-This clever combination of a read/write semaphore and sequence count allows for
+This clever combination of a reference counter and sequence count allows for
 fast RCU-based per-VMA lock acquisition (especially on page fault, though
 utilised elsewhere) with minimal complexity around lock ordering.
 
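The reference counter plus sequence number scheme described in the added lines can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: every name here (`model_mm`, `model_vma`, `model_start_read`, `model_start_write`, `WRITER_BIT` and its value) is hypothetical, the counter starts at zero rather than tracking attachment, and the writer returns `false` where `vma_start_write` would sleep until readers drain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical writer flag; the kernel's actual bit layout may differ. */
#define WRITER_BIT 0x40000000

struct model_mm {
    int mm_lock_seq;       /* advanced when the mmap write lock is released */
};

struct model_vma {
    atomic_int vm_refcnt;  /* reader count, plus WRITER_BIT while a writer is active */
    int vm_lock_seq;       /* equals mm_lock_seq while the VMA is write-locked */
    struct model_mm *mm;
};

/* Reader: raise the counter, then check the sequence numbers. */
static bool model_start_read(struct model_vma *vma)
{
    int old = atomic_load(&vma->vm_refcnt);

    do {
        if (old & WRITER_BIT)
            return false;  /* a writer is present, readers back off */
    } while (!atomic_compare_exchange_weak(&vma->vm_refcnt, &old, old + 1));

    if (vma->vm_lock_seq == vma->mm->mm_lock_seq) {
        /* Sequence numbers match: the VMA is write-locked, drop the count. */
        atomic_fetch_sub(&vma->vm_refcnt, 1);
        return false;
    }
    return true;           /* counter stays raised until model_end_read() */
}

static void model_end_read(struct model_vma *vma)
{
    atomic_fetch_sub(&vma->vm_refcnt, 1);
}

/*
 * Writer: the caller is assumed to hold the mmap write lock.
 * Single-threaded simplification: report failure instead of sleeping.
 */
static bool model_start_write(struct model_vma *vma)
{
    atomic_fetch_or(&vma->vm_refcnt, WRITER_BIT);

    if ((atomic_load(&vma->vm_refcnt) & ~WRITER_BIT) != 0) {
        /* Readers still hold references; the kernel would wait here. */
        atomic_fetch_and(&vma->vm_refcnt, ~WRITER_BIT);
        return false;
    }

    /* No readers remain: mark the VMA write-locked via its sequence number. */
    vma->vm_lock_seq = vma->mm->mm_lock_seq;

    /* Clear the writer bit; the sequence number now carries the locked state. */
    atomic_fetch_and(&vma->vm_refcnt, ~WRITER_BIT);
    return true;
}

/* Releasing the mmap write lock implicitly unlocks every write-locked VMA. */
static void model_mmap_write_unlock(struct model_mm *mm)
{
    mm->mm_lock_seq++;
}
```

In this model a reader that arrives after `model_start_write` succeeds is rejected purely by the sequence number comparison, and becomes admissible again once `model_mmap_write_unlock` advances the mm's sequence number, which mirrors why the patch needs no `vma_end_write`.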