[Deepin-Kernel-SIG] [linux 6.18-y] [Deepin] deepin: arm64: cpufeature: disable LSE on some cpus#1564
Conversation
Reviewer's GuideIntroduces an arm64 runtime check to selectively disable LSE atomics on known slow microarchitectures (currently HiSilicon TSV110), gated by a new boot parameter, and wires this check into the existing LSE capability detection. Class diagram for updated arm64 LSE capability detectionclassDiagram
class arm64_cpu_capabilities {
+const char* desc
+int capability
+int type
+bool (*matches)(const arm64_cpu_capabilities* cap, int scope)
+unsigned int field
+unsigned int sign
}
class arm64_features_entry_LSE_ATOMICS {
+desc = "LSE atomic instructions"
+capability = ARM64_HAS_LSE_ATOMICS
+type = ARM64_CPUCAP_SYSTEM_FEATURE
+matches = has_lse_capability_check
}
class cpufeature_module {
+bool lse_disable_check
+int arm64_lse_disable_check(char* str)
+bool has_lse_capability_check(const arm64_cpu_capabilities* cap, int scope)
+bool has_cpuid_feature(const arm64_cpu_capabilities* cap, int scope)
}
class midr_range {
+unsigned long min_midr
+unsigned long max_midr
}
class lse_disable_list {
+midr_range entries
+MIDR_ALL_VERSIONS(MIDR_HISI_TSV110)
+sentinel
}
class early_param_system {
+void early_param(const char* name, int (*fn)(char* str))
}
cpufeature_module "1" o-- "many" arm64_cpu_capabilities : defines
arm64_features_entry_LSE_ATOMICS --|> arm64_cpu_capabilities : instance_of
cpufeature_module --> lse_disable_list : queries
cpufeature_module --> midr_range : uses
cpufeature_module --> early_param_system : registers arm64_lse_disable_check
cpufeature_module ..> arm64_features_entry_LSE_ATOMICS : assigns matches = has_lse_capability_check
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
lse_disable_checkflag has inverted semantics relative to its name (0 = perform the check/possibly disable LSE, 1 = skip the check and keep LSE), which is easy to misinterpret; consider renaming it (e.g. tolse_force_enableor similar) to better reflect what a true value does. - The
lse_disable_listarray is defined as a localstatic constinsidehas_lse_capability_check; moving it to file scope would avoid re-declaring it on each call and make it easier to extend or share with other feature checks in the future.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `lse_disable_check` flag has inverted semantics relative to its name (0 = perform the check/possibly disable LSE, 1 = skip the check and keep LSE), which is easy to misinterpret; consider renaming it (e.g. to `lse_force_enable` or similar) to better reflect what a true value does.
- The `lse_disable_list` array is defined as a local `static const` inside `has_lse_capability_check`; moving it to file scope would avoid re-declaring it on each call and make it easier to extend or share with other feature checks in the future.
## Individual Comments
### Comment 1
<location path="arch/arm64/kernel/cpufeature.c" line_range="1727-1731" />
<code_context>
+static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap,
+ int scope)
+{
+ /* List of CPUs that LSE are slow more than llsc */
+ static const struct midr_range lse_disable_list[] = {
+ MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
</code_context>
<issue_to_address>
**nitpick (typo):** Clarify the English in this comment for readability.
The phrase "LSE are slow more than llsc" is unclear. Consider rephrasing to something like "List of CPUs where LSE atomics are slower than LL/SC" or "List of CPUs on which LSE atomics are slower than LL/SC" for readability.
```suggestion
static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap,
int scope)
{
/* List of CPUs on which LSE atomics are slower than LL/SC */
static const struct midr_range lse_disable_list[] = {
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | ||
| int scope) | ||
| { | ||
| /* List of CPUs that LSE are slow more than llsc */ | ||
| static const struct midr_range lse_disable_list[] = { |
There was a problem hiding this comment.
nitpick (typo): Clarify the English in this comment for readability.
The phrase "LSE are slow more than llsc" is unclear. Consider rephrasing to something like "List of CPUs where LSE atomics are slower than LL/SC" or "List of CPUs on which LSE atomics are slower than LL/SC" for readability.
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | |
| int scope) | |
| { | |
| /* List of CPUs that LSE are slow more than llsc */ | |
| static const struct midr_range lse_disable_list[] = { | |
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | |
| int scope) | |
| { | |
| /* List of CPUs on which LSE atomics are slower than LL/SC */ | |
| static const struct midr_range lse_disable_list[] = { |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Avenger-285714 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
This PR adds an arm64 CPU-specific capability check to selectively disable LSE atomic instructions on known slow microarchitectures (currently HiSilicon TSV110), with an early boot parameter override and corresponding documentation.
Changes:
- Add
lse_disable_checkearly kernel parameter and a new LSE capability matcher inarch/arm64/kernel/cpufeature.c. - Disable the LSE capability on TSV110 by default (unless overridden) and emit an informational log when it triggers.
- Document
lse_disable_checkinDocumentation/admin-guide/kernel-parameters.txt.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| arch/arm64/kernel/cpufeature.c | Introduces lse_disable_check and routes LSE capability matching through a TSV110-based disable list. |
| Documentation/admin-guide/kernel-parameters.txt | Documents the new boot parameter and the resulting kernel log message. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| The default value is 0 (enabled), which automatically | ||
| disables LSE on some systems. Set to 1 to bypass | ||
| the automatic disabling of LSE on affected systems. |
| the automatic disabling of LSE on affected systems. | ||
|
|
||
| When this feature is active, the kernel logs: | ||
| "LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature." |
arch/arm64/kernel/cpufeature.c
Outdated
| }; | ||
|
|
||
| /* Disable LSE when lse_disable_check is 0 and in lse_disable_list */ | ||
| if (lse_disable_check == 0 && is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) { |
| early_param("lse_disable_check", arm64_lse_disable_check); | ||
|
|
||
| static bool has_lse_capability_check(const struct arm64_cpu_capabilities *cap, | ||
| int scope) |
| /* Disable LSE when lse_disable_check is 0 and in lse_disable_list */ | ||
| if (lse_disable_check == 0 && is_midr_in_range_list(read_cpuid_id(), lse_disable_list)) { | ||
| if (scope == SCOPE_SYSTEM) | ||
| pr_info("LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature.\n"); |
deepin inclusion category: performance Disable LSE (Large System Extensions) atomic instructions on some systems to improve performance of per-CPU atomic operations. LSE atomics can exhibit significant overhead on certain microarchitectures (e.g., TSV110) due to "far atomic" implementations bypassing L1 cache. LL/SC (Load-Link/Store-Conditional) is substantially faster. The default value is 0 (enabled), which automatically disables LSE on some systems. Set to 1 to skip the check enablement on our test systems regardless of performance impact. When this feature is active, the kernel logs: "LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature." PS: Test with byte-unixbench6 in kp920 24c and 64GB memory, improve whole scores by 3.8%. Test with https://github.com/leitao/debug/tree/main/LSE percpu_bench CPU0: LSE (stadd) (c 0, d 0): p50: 029.00 ns p95: 029.09 ns p99: 029.12 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.55 ns p95: 069.57 ns p99: 069.71 ns CPU1: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 029.00 ns p99: 029.01 ns LL/SC (c 0, d 0): p50: 006.56 ns p95: 006.58 ns p99: 006.58 ns LDADD (c 0, d 0): p50: 010.04 ns p95: 010.06 ns p99: 010.06 ns CPU2: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.53 ns p95: 069.56 ns p99: 069.58 ns ... CPU23: LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 064.93 ns p95: 064.95 ns p99: 064.97 ns Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop [1] Link: deepin-community#1320 Link: deepin-community#1302 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> (cherry picked from commit 6afecf6) Conflicts: Documentation/admin-guide/kernel-parameters.txt [Modified for ("arm64: Modify _midr_range() functions to read MIDR/REVIDR internally")] (cherry picked from commit 192bed85ce35986c892192933148fd743d775ddf) Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
6833b40 to
fb71850
Compare
deepin inclusion
category: performance
Disable LSE (Large System Extensions) atomic instructions on some systems to improve performance of per-CPU atomic operations. LSE atomics can exhibit significant overhead on certain microarchitectures (e.g., TSV110) due
to "far atomic" implementations bypassing L1 cache. LL/SC (Load-Link/Store-Conditional) is substantially faster.
The default value is 0 (enabled), which automatically disables LSE on some systems. Set to 1 to skip the check enablement on our test systems regardless of performance impact.
When this feature is active, the kernel logs:
"LSE atomics: use llsc for performance, use lse_disable_check=1 to disable the feature."
PS:
Test with byte-unixbench6 in kp920 24c and 64GB memory, improve whole scores by 3.8%.
Test with https://github.com/leitao/debug/tree/main/LSE percpu_bench CPU0:
LSE (stadd) (c 0, d 0): p50: 029.00 ns p95: 029.09 ns p99: 029.12 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.55 ns p95: 069.57 ns p99: 069.71 ns CPU1:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 029.00 ns p99: 029.01 ns LL/SC (c 0, d 0): p50: 006.56 ns p95: 006.58 ns p99: 006.58 ns LDADD (c 0, d 0): p50: 010.04 ns p95: 010.06 ns p99: 010.06 ns CPU2:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 069.53 ns p95: 069.56 ns p99: 069.58 ns ...
CPU23:
LSE (stadd) (c 0, d 0): p50: 005.79 ns p95: 005.79 ns p99: 005.79 ns LL/SC (c 0, d 0): p50: 006.57 ns p95: 006.57 ns p99: 006.57 ns LDADD (c 0, d 0): p50: 064.93 ns p95: 064.95 ns p99: 064.97 ns
Link: https://lore.kernel.org/r/e7d539ed-ced0-4b96-8ecd-048a5b803b85@paulmck-laptop [1]
Link: #1320
Link: #1302
(cherry picked from commit 6afecf6) Conflicts:
Documentation/admin-guide/kernel-parameters.txt
(cherry picked from commit 192bed85ce35986c892192933148fd743d775ddf)
Summary by Sourcery
Gate arm64 LSE atomic support behind a CPU-specific capability check that can be overridden via a new kernel parameter.
New Features:
Enhancements: