
CachePadded: use 128-byte alignment on x86-64 #331

Merged
2 commits merged into crossbeam-rs:master on Feb 20, 2019

Conversation

@ghost commented Feb 15, 2019

/// The size of `CachePadded<T>` is the smallest multiple of 64 bytes large enough to accommodate
/// a value of type `T`.
/// Cache lines are assumed to be N bytes long, depending on the architecture:
///
/// * On x86-64, N = 128.
Member

Isn't it more the case that x64 CPUs fetch two cache lines at a time, aligned to 128-byte chunks, rather than that a single cache line is 128 bytes?

Author

Yes. Maybe the wording is not great, but N is just a pessimistic guess of how long a cache line is. So N is at least the length of a cache line, but may be larger than that in order to accommodate architectures with longer cache lines or ones that have unusual prefetching strategies.

Member

Okay, perhaps this could be reworded slightly to avoid confusion later? In any case LGTM.

Author

Sure - just added an additional comment for clarification.
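For context (not part of the diff), a minimal usage sketch of `CachePadded` showing why the padding guess matters: two atomic counters touched by different threads are each padded so that updates to one do not invalidate the cache line, or prefetched line pair, holding the other. The `Stats` struct and field names are illustrative.

use crossbeam_utils::CachePadded;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical pair of counters updated from different threads. Wrapping each
// in `CachePadded` keeps them on separate cache lines (or prefetched pairs of
// lines), so bouncing one counter between cores doesn't slow down the other.
struct Stats {
    hits: CachePadded<AtomicU64>,
    misses: CachePadded<AtomicU64>,
}

fn main() {
    let stats = Stats {
        hits: CachePadded::new(AtomicU64::new(0)),
        misses: CachePadded::new(AtomicU64::new(0)),
    };

    // `CachePadded<T>` derefs to `T`, so the wrapped values are used as usual.
    stats.hits.fetch_add(1, Ordering::Relaxed);
    stats.misses.fetch_add(1, Ordering::Relaxed);

    // With this PR, the alignment (and thus the minimum size) on x86-64 is
    // 128 bytes instead of 64.
    #[cfg(target_arch = "x86_64")]
    assert_eq!(std::mem::align_of::<CachePadded<AtomicU64>>(), 128);
}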

@@ -49,7 +52,14 @@ use core::ops::{Deref, DerefMut};
/// }
/// ```
#[derive(Clone, Copy, Default, Hash, PartialEq, Eq)]
#[repr(align(64))]
// Starting from Intel's Sandy Bridge, spatial prefetcher is now pulling pairs of 64-byte cache
// lines at a time, so we have to align to 128 bytes rather than 64.
Member

Is this also the case for AMD?

Author

Judging by a quick skim through AMD's optimization manual, I'd say no.
https://developer.amd.com/wordpress/media/2013/12/55723_SOG_Fam_17h_Processors_3.00.pdf

@Mart-Bogdan

It's sad that only Intel processors are addressed here.

From what I've found, Rust has flags to indicate CPU features:
https://rust-lang.github.io/packed_simd/perf-guide/target-feature/rustflags.html

@taiki-e (Member) commented Aug 28, 2021

@Mart-Bogdan I would accept a PR to adjust the alignment of CachePadded based on the CPU when the user specifies -C target-cpu in rustflags.
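For reference, the alignment is currently fixed at compile time per target architecture. A minimal sketch of the pattern this PR introduces (simplified, not the verbatim crossbeam-utils source):

// Sketch of the compile-time selection: 128-byte alignment on x86_64,
// 64 bytes everywhere else. The struct body is simplified.
#[derive(Clone, Copy, Default, Hash, PartialEq, Eq)]
#[cfg_attr(target_arch = "x86_64", repr(align(128)))]
#[cfg_attr(not(target_arch = "x86_64"), repr(align(64)))]
pub struct CachePadded<T> {
    value: T,
}

rustc does not expose the value of -C target-cpu as a cfg (only things like target_arch and target_feature), so honoring a specific CPU model would need additional plumbing, which is what the follow-up PR suggested above would have to add.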

@ghost (Author) commented Feb 20, 2019

bors r+

bors bot added a commit that referenced this pull request Feb 20, 2019
331: CachePadded: use 128-byte alignment on x86-64 r=stjepang a=stjepang

Motivation: https://www.reddit.com/r/rust/comments/akzas0/lockfree_rust_crossbeam_in_2019/efa26r1/

Co-authored-by: Stjepan Glavina <stjepang@gmail.com>
bors bot commented Feb 20, 2019

Build succeeded

@bors bors bot merged commit 711cd8c into crossbeam-rs:master Feb 20, 2019
bors bot added a commit that referenced this pull request Jan 3, 2021
636: CachePadded: update alignment based on golang's CacheLinePadSize r=jeehoonkang a=taiki-e

This updates the alignment of `CachePadded` on the following architectures, based on golang's [`CacheLinePadSize`](https://github.com/golang/go/blob/3510a1e32cbc86b73db143aefcc00aadc44c27bd/src/internal/cpu/cpu.go#L20).

- On [powerpc64](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_ppc64x.go#L9), use 128-byte alignment.
- On [arm](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_arm.go#L7), [mips](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_mips.go#L7), [mips64](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_mips64x.go#L9), and [riscv64](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_riscv64.go#L7), use 32-byte alignment.
- On [s390x](https://github.com/golang/go/blob/3dd58676054223962cd915bb0934d1f9f489d4d2/src/internal/cpu/cpu_s390x.go#L7), use 256-byte alignment.

On x86_64 and aarch64, use 128-byte alignment as before. See #331 and #499 for reasons.
On all others, use 64-byte alignment as before.

Closes #427

cc @stjepang (when this is merged, I'll send the same patch to smol-rs/cache-padded)
FYI @yoshuawuyts @cuviper (because you were [discussing adding `CachePadded` to the standard library](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.60std.3A.3Amem.3A.3ACachePadded.60))

Co-authored-by: Taiki Endo <te316e89@gmail.com>
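Spelled out as Rust attributes, the alignment table from that commit message corresponds roughly to the following `cfg_attr` chain (a sketch based on the list above, not the verbatim crossbeam-utils source):

// Alignment table from the commit message, as a cfg_attr chain.
// 128 bytes: x86_64, aarch64, powerpc64; 32 bytes: arm, mips, mips64, riscv64;
// 256 bytes: s390x; 64 bytes: everything else.
#[cfg_attr(
    any(target_arch = "x86_64", target_arch = "aarch64", target_arch = "powerpc64"),
    repr(align(128))
)]
#[cfg_attr(
    any(
        target_arch = "arm",
        target_arch = "mips",
        target_arch = "mips64",
        target_arch = "riscv64",
    ),
    repr(align(32))
)]
#[cfg_attr(target_arch = "s390x", repr(align(256)))]
#[cfg_attr(
    not(any(
        target_arch = "x86_64",
        target_arch = "aarch64",
        target_arch = "powerpc64",
        target_arch = "arm",
        target_arch = "mips",
        target_arch = "mips64",
        target_arch = "riscv64",
        target_arch = "s390x",
    )),
    repr(align(64))
)]
pub struct CachePadded<T> {
    value: T,
}

Only one of the repr(align(N)) attributes applies for any given target, since the cfg predicates are mutually exclusive.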