add write_pcid_no_flush #472

Merged
merged 2 commits into from Mar 19, 2024

Contributor

@Freax13 Freax13 commented Mar 18, 2024

If Cr4.PCID is set and bit 63 of the source operand is clear when moving into Cr3, the processor flushes all non-global TLB entries for the given PCID (Intel SDM, Volume 3, 4.10.4.1 and AMD APM, Volume 2, 5.5.1). This is similar to operation without PCIDs, in which the processor also flushes all non-global TLB entries every time a new value is moved into Cr3. Given that the entire idea behind PCIDs is to avoid flushing all entries when Cr3 is changed, we should also provide a write method that sets bit 63 when moving into Cr3. This PR adds write_pcid_no_flush to do just that.
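As a rough illustration of the bit layout involved, the operand for the move into Cr3 could be assembled as in the sketch below. The helper `cr3_operand` and the constants are hypothetical names made up for this example, not the crate's actual implementation, and the function only computes the value without touching the register.

```rust
/// Bit 63 of the `mov cr3` source operand. When it is set (and PCIDs are
/// enabled), the processor is not required to invalidate TLB entries.
const CR3_NO_FLUSH: u64 = 1 << 63;

/// The 12-bit PCID lives in bits 11:0 of the operand.
const PCID_MASK: u64 = 0xFFF;

/// The page table base occupies bits 62:12 of the operand.
const ADDR_MASK: u64 = 0x7FFF_FFFF_FFFF_F000;

/// Hypothetical helper: builds the value that would be moved into Cr3.
/// Returns `None` if the table address has bits outside 62:12 set or if
/// the PCID does not fit into 12 bits.
fn cr3_operand(table_addr: u64, pcid: u16, no_flush: bool) -> Option<u64> {
    if (table_addr & !ADDR_MASK) != 0 || u64::from(pcid) > PCID_MASK {
        return None;
    }
    let flag = if no_flush { CR3_NO_FLUSH } else { 0 };
    Some(table_addr | u64::from(pcid) | flag)
}
```

With `no_flush` set to `false` this corresponds to the flushing behavior of write_pcid; with `true` it corresponds to what write_pcid_no_flush is meant to do.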

A very quick and dirty comparison between write_pcid and write_pcid_no_flush showed a 25+% performance improvement.

It could be argued that setting bit 63 should be the default. However, I'm a bit concerned that users rely on the flushing behavior (either intentionally or unintentionally), so suddenly changing to non-flushing behavior could break code (this seems to be the case for a codebase I'm working on). In a future breaking release, we may want to change write_pcid to take an additional parameter specifying whether flushing behavior is intended.
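One possible shape for such a parameter is sketched below. The enum `PcidFlush` and its method are purely hypothetical and only illustrate the idea; nothing like this exists in the crate as of this PR.

```rust
/// Hypothetical parameter for a future `write_pcid` signature.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PcidFlush {
    /// Leave bit 63 clear: non-global TLB entries for the PCID are
    /// invalidated (the behavior of today's `write_pcid`).
    Flush,
    /// Set bit 63: cached translations for the PCID may be reused
    /// (the behavior of `write_pcid_no_flush`).
    Keep,
}

impl PcidFlush {
    /// Bits to OR into the value that is moved into Cr3.
    fn operand_bits(self) -> u64 {
        match self {
            PcidFlush::Flush => 0,
            PcidFlush::Keep => 1 << 63,
        }
    }
}
```

An explicit enum rather than a `bool` would force every call site to state whether it relies on the flush, which is the concern raised above.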

@phil-opp
Member

Took me a bit to find this behavior in the Intel/AMD manuals because it is not mentioned in the CR3 section. The relevant info can be found in section 5.5.1 in the AMD manual:

The current PCID is the value in CR3[11:0]. When PCIDs are enabled the system software can store
12-bit Process Context Identifiers in CR3 for different address spaces. Subsequently, when system
software switches address spaces (by writing the page table base pointer in CR3[62:12]), the processor
may use TLB mappings previously stored for that address space and PCID, providing that bit 63 of the
source operand is set to 1. If bit 63 is set to 0, the legacy behavior of a move to CR3 is maintained,
invalidating TLB entries but only non-global entries for the specified PCID. Note that this bit is not
stored in the CR3 register itself.

(Emphasis mine)

For Intel, the behavior is documented in section 4.10.4.1:

  • MOV to CR3. The behavior of the instruction depends on the value of CR4.PCIDE:
    • If CR4.PCIDE = 0, the instruction invalidates all TLB entries associated with PCID 000H except those for
      global pages. It also invalidates all entries in all paging-structure caches associated with PCID 000H.
    • If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 0, the instruction invalidates all TLB
      entries associated with the PCID specified in bits 11:0 of the instruction’s source operand except those for
      global pages. It also invalidates all entries in all paging-structure caches associated with that PCID. It is not
      required to invalidate entries in the TLBs and paging-structure caches that are associated with other PCIDs.
    • If CR4.PCIDE = 1 and bit 63 of the instruction’s source operand is 1, the instruction is not required to
      invalidate any TLB entries or entries in paging-structure caches.
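The three cases quoted above can be summarized as a small decision function. This is only a model of the quoted text for illustration: the names `Invalidation` and `mov_cr3_invalidation` are made up, and the sketch does not model faults caused by reserved bits or anything the processor may invalidate beyond what is required.

```rust
/// What a move to CR3 is required to invalidate, per the quoted section.
#[derive(Debug, PartialEq, Eq)]
enum Invalidation {
    /// Non-global TLB entries and paging-structure cache entries
    /// associated with this PCID.
    Pcid(u16),
    /// No invalidation is required.
    NotRequired,
}

/// Hypothetical model: `pcide` is the value of CR4.PCIDE and `operand`
/// is the source operand of the `mov cr3` instruction.
fn mov_cr3_invalidation(pcide: bool, operand: u64) -> Invalidation {
    let pcid = (operand & 0xFFF) as u16; // bits 11:0
    let bit63 = (operand >> 63) == 1;
    match (pcide, bit63) {
        // PCIDs disabled: everything is associated with PCID 000H.
        (false, _) => Invalidation::Pcid(0),
        // PCIDs enabled, bit 63 clear: flush the PCID being switched to.
        (true, false) => Invalidation::Pcid(pcid),
        // PCIDs enabled, bit 63 set: cached entries may be kept.
        (true, true) => Invalidation::NotRequired,
    }
}
```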


@phil-opp phil-opp left a comment


Looks good to me!

I agree that a separate method is better than changing the behavior of write_pcid.

@Freax13 Freax13 merged commit 306b3e1 into master Mar 19, 2024
12 checks passed
@Freax13 Freax13 deleted the fix/pcid-no-flush branch March 19, 2024 12:44