plan9asm: infer target features for ISA-specific asm functions by cpunion · Pull Request #8 · xgo-dev/plan9asm

cpunion · 2026-03-07T10:53:42Z

Summary

- Infer function-level target-features from ISA-specific asm instructions instead of emitting unused static attribute groups.
- Cover x86 CRC32, PCLMULQDQ, SSSE3 PSHUFB, AESNI, and arm64 CRC instructions.
- Add a stdlib hash/crc32 amd64 compile regression test.

Why

hash/crc32/crc32_amd64.s was translated without attaching target-features to functions using llvm.x86.sse42.crc32.* / llvm.x86.pclmulqdq. On Ubuntu x86_64 with LLVM 19 this led to backend selection failures during llgo CI.

Validation

go test ./...
validated from llgo in Linux amd64 docker with this plan9asm tree wired in:
- llgo test -timeout=20m github.com/goplus/llgo/test/std/archive/tar
- llgo test -timeout=20m -tags testGC github.com/goplus/llgo/test/std/hash/crc32

gemini-code-assist · 2026-03-07T10:54:01Z

Summary of Changes

This pull request refactors how target-features are handled for ISA-specific assembly functions during LLVM IR generation. Instead of relying on predefined static attribute groups, the translator now infers the necessary target-features directly from the assembly instructions within each function. This change resolves the backend selection failures encountered in llgo CI for hash/crc32/crc32_amd64.s and ensures that the generated LLVM IR specifies the CPU features each function requires.

Highlights

- Dynamic Target Feature Inference: Implemented dynamic inference of target-features for ISA-specific assembly functions based on the instructions they contain, replacing static attribute groups.
- Expanded Feature Coverage: Extended feature detection to cover x86 CRC32, PCLMULQDQ, SSSE3 PSHUFB, AESNI, and ARM64 CRC instructions.
- Attribute Group Management: Introduced a new featureAttrRegistry to manage and emit unique target-features attribute groups as needed.
- Regression Test Added: Added a new regression test for hash/crc32 on AMD64 to validate the compilation of standard library assembly with the newly inferred target features.
- Removed Static Attributes: Removed hardcoded static target-features attribute groups from AMD64 and ARM64 translation logic.

Changelog

amd64_translate.go
- Removed hardcoded static attribute groups for SSE4.2 CRC32 and PCLMULQDQ + SSE4.1.
arm64_translate.go
- Removed the hardcoded static attribute group for ARM64 CRC features.
feature_attrs.go
- Added a new file defining featureAttrRegistry to manage unique target-features attributes.
- Added inferFuncTargetFeatures to detect required ISA features from assembly instructions (e.g., CRC32, PCLMULQDQ, PSHUFB, AES instructions for AMD64, and CRC for ARM64).
stdlib_crc32_amd64_test.go
- Added a new test case to compile hash/crc32/crc32_amd64.s using the new translation logic.
- Verified that the generated LLVM IR contains the expected target-features attributes.
translate.go
- Integrated the featureAttrRegistry and inferFuncTargetFeatures to dynamically assign target-features to functions based on their content.
- Replaced the previous static approach for applying target-features.

Activity

The pull request was created by cpunion.

codecov · 2026-03-07T10:55:13Z

Codecov Report

❌ Patch coverage is 51.79856% with 402 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@a5f1e3b). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
amd64_lower_vec.go	48.14%	194 Missing and 72 partials ⚠️
cmd/plan9asmscan/main.go	39.31%	68 Missing and 3 partials ⚠️
amd64_ctx.go	75.60%	10 Missing and 10 partials ⚠️
amd64_lower_fp.go	17.39%	18 Missing and 1 partial ⚠️
amd64_lower_branch.go	28.57%	9 Missing and 1 partial ⚠️
translate.go	44.44%	5 Missing ⚠️
arm64_lower_arith.go	50.00%	4 Missing ⚠️
translate_module_direct.go	57.14%	2 Missing and 1 partial ⚠️
amd64_lower_mov.go	60.00%	1 Missing and 1 partial ⚠️
feature_attrs.go	96.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main       #8   +/-   ##
=======================================
  Coverage        ?   32.44%           
=======================================
  Files           ?       41           
  Lines           ?    12640           
  Branches        ?        0           
=======================================
  Hits            ?     4101           
  Misses          ?     7806           
  Partials        ?      733

Flag	Coverage Δ
unittests	`32.44% <51.79%> (?)`

gemini-code-assist

Code Review

This pull request introduces a mechanism to dynamically infer required CPU features from assembly instructions, which is a significant improvement over hardcoding them. The changes are well-structured, and the inclusion of a regression test is commendable. I have one suggestion to improve the implementation of the feature set for better performance and code clarity.

gemini-code-assist · 2026-03-07T10:55:18Z

+func inferFuncTargetFeatures(arch Arch, fn Func) string {
+	var featureSet []string
+	add := func(features ...string) {
+		for _, feature := range features {
+			if feature == "" {
+				continue
+			}
+			exists := false
+			for _, v := range featureSet {
+				if v == feature {
+					exists = true
+					break
+				}
+			}
+			if !exists {
+				featureSet = append(featureSet, feature)
+			}
+		}
+	}
+
+	for _, ins := range fn.Instrs {
+		op := strings.ToUpper(string(ins.Op))
+		switch arch {
+		case ArchAMD64:
+			switch {
+			case strings.HasPrefix(op, "CRC32"):
+				add("+crc32", "+sse4.2")
+			case op == "PCLMULQDQ":
+				add("+pclmul", "+sse4.1")
+			case op == "PSHUFB" || op == "VPSHUFB":
+				add("+ssse3")
+			case op == "AESENC" || op == "AESENCLAST" || op == "AESDEC" || op == "AESDECLAST" || op == "AESIMC" || op == "AESKEYGENASSIST":
+				add("+aes")
+			}
+		case ArchARM64:
+			if strings.HasPrefix(op, "CRC32") {
+				add("+crc")
+			}
+		}
+	}
+
+	if len(featureSet) == 0 {
+		return ""
+	}
+	sort.Strings(featureSet)
+	return strings.Join(featureSet, ",")
+}


The current implementation of inferFuncTargetFeatures uses a slice to store unique features, which leads to an inefficient O(n) lookup for each feature addition. A more idiomatic and performant approach in Go for handling sets of unique items is to use a map[string]struct{}. This improves the feature addition to an average of O(1) time complexity and makes the code cleaner.

func inferFuncTargetFeatures(arch Arch, fn Func) string {
	featureSet := make(map[string]struct{})
	add := func(features ...string) {
		for _, feature := range features {
			if feature != "" {
				featureSet[feature] = struct{}{}
			}
		}
	}

	for _, ins := range fn.Instrs {
		op := strings.ToUpper(string(ins.Op))
		switch arch {
		case ArchAMD64:
			switch {
			case strings.HasPrefix(op, "CRC32"):
				add("+crc32", "+sse4.2")
			case op == "PCLMULQDQ":
				add("+pclmul", "+sse4.1")
			case op == "PSHUFB" || op == "VPSHUFB":
				add("+ssse3")
			case op == "AESENC" || op == "AESENCLAST" || op == "AESDEC" || op == "AESDECLAST" || op == "AESIMC" || op == "AESKEYGENASSIST":
				add("+aes")
			}
		case ArchARM64:
			if strings.HasPrefix(op, "CRC32") {
				add("+crc")
			}
		}
	}

	if len(featureSet) == 0 {
		return ""
	}
	features := make([]string, 0, len(featureSet))
	for feature := range featureSet {
		features = append(features, feature)
	}
	sort.Strings(features)
	return strings.Join(features, ",")
}

fennoai · 2026-03-07T10:59:03Z

@@ -27,7 +27,7 @@ func emitARM64Prelude(b *strings.Builder) {
 	b.WriteString("\n")
 	// Attribute group used by some functions to enable optional ISA features.
 	// (Example: "+crc" for hash/crc32 arm64 fast paths.)


The comment on lines 28–29 describes behavior that was removed by this PR. attributes #0 is no longer emitted here — the comment now floats above a bare b.WriteString("\n") and is misleading. Consider removing it.

fennoai · 2026-03-07T10:59:05Z

+	"strings"
+)
+
+const featureAttrBase = 200


featureAttrBase = 200 is a magic number with no rationale documented. The intent is presumably to avoid collisions with manually-assigned attribute group numbers (#0, #1, etc.), but there's no comment, and no enforcement. If a caller passes a sig.Attrs like "#200", LLVM will silently see two definitions of attributes #200 in the module (which is invalid IR). A short comment explaining the choice would help future contributors.
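One way to enforce the reserved range is to reject manual attribute references that fall inside it. The sketch below is an assumption about how such a guard could look, not code from the PR: `checkManualAttrs` and `attrRefRE` are hypothetical names, and it assumes inferred groups are numbered from featureAttrBase upward.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// featureAttrBase mirrors the constant from the diff. IDs at or above
// it are assumed to be reserved for inferred target-features groups.
const featureAttrBase = 200

// attrRefRE matches attribute-group references such as "#0" or "#200".
var attrRefRE = regexp.MustCompile(`#(\d+)`)

// checkManualAttrs (hypothetical helper) rejects a caller-supplied
// attribute string that references a group in the reserved range.
func checkManualAttrs(attrs string) error {
	for _, m := range attrRefRE.FindAllStringSubmatch(attrs, -1) {
		n, err := strconv.Atoi(m[1])
		if err != nil {
			return fmt.Errorf("bad attribute group reference %q: %v", m[0], err)
		}
		if n >= featureAttrBase {
			return fmt.Errorf("attribute group #%d collides with the inferred range starting at #%d", n, featureAttrBase)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkManualAttrs("#0"))   // accepted
	fmt.Println(checkManualAttrs("#200")) // rejected
}
```

Failing early in the translator gives a clearer error than letting LLVM reject a module with a duplicated attribute group.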

fennoai · 2026-03-07T10:59:07Z

+				add("+pclmul", "+sse4.1")
+			case op == "PSHUFB" || op == "VPSHUFB":
+				add("+ssse3")
+			case op == "AESENC" || op == "AESENCLAST" || op == "AESDEC" || op == "AESDECLAST" || op == "AESIMC" || op == "AESKEYGENASSIST":


SHA-NI instructions (SHA256MSG1, SHA256MSG2, SHA256RNDS2) appear to be handled in the lowering layer but are not covered here — a function using only SHA instructions would receive no target-features attribute, likely causing LLVM backend failures. Either add the +sha case or add a comment marking this as a known gap.
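A minimal sketch of the missing case follows. The helper name `shaFeature` is hypothetical, and the mnemonic list is the x86 SHA extension set as this editor understands it; whether the lowering layer supports all seven is not confirmed by the PR. The function expects an already upper-cased mnemonic.

```go
package main

import "fmt"

// shaFeature (hypothetical helper) returns "+sha" for mnemonics from the
// x86 SHA extensions and "" for everything else. The op argument is
// assumed to be upper-case already.
func shaFeature(op string) string {
	switch op {
	case "SHA1MSG1", "SHA1MSG2", "SHA1NEXTE", "SHA1RNDS4",
		"SHA256MSG1", "SHA256MSG2", "SHA256RNDS2":
		return "+sha"
	}
	return ""
}

func main() {
	for _, op := range []string{"SHA256RNDS2", "PSHUFB"} {
		fmt.Printf("%s -> %q\n", op, shaFeature(op))
	}
}
```

In inferFuncTargetFeatures this would translate to one more case in the ArchAMD64 switch that calls add("+sha").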

fennoai · 2026-03-07T10:59:09Z

+		}
+	}
+
+	for _, ins := range fn.Instrs {


strings.ToUpper(string(ins.Op)) runs for every instruction in every function. If Op is a string-based type the conversion itself is free, and ToUpper returns its input unchanged when the mnemonic is already upper-case ASCII, but any lower-case mnemonic allocates a new string. Normalising Op to uppercase at parse time (or storing it normalised) would eliminate this per-instruction cost.
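Normalising at parse time could look roughly like this. The `Op` and `Instr` types and the `newInstr` constructor are stand-ins invented for the sketch; the real parser types in plan9asm may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Op and Instr are simplified stand-ins for the parser's types.
type Op string

type Instr struct {
	Op Op
}

// newInstr (hypothetical constructor) upper-cases the mnemonic once,
// so later passes can compare Op directly without calling ToUpper.
func newInstr(mnemonic string) Instr {
	return Instr{Op: Op(strings.ToUpper(mnemonic))}
}

// usesCRC32 relies on Op already being upper-case.
func usesCRC32(ins Instr) bool {
	return strings.HasPrefix(string(ins.Op), "CRC32")
}

func main() {
	ins := newInstr("crc32q")
	fmt.Println(ins.Op, usesCRC32(ins))
}
```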

fennoai · 2026-03-07T10:59:11Z

 		if sig.Ret == "" {
 			return "", fmt.Errorf("missing return type for %q", name)
 		}
+		if sig.Attrs == "" {


The guard sig.Attrs == "" means there's no way for a caller to explicitly opt out of feature inference while keeping an empty Attrs. An empty string from the caller and "no manual override" are indistinguishable. If a future caller wants a function with no attribute group, it can't express that. A sentinel value or an explicit DisableFeatureInference bool field on FuncSig would make the intent unambiguous.
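The explicit opt-out could be sketched as below. `FuncSig` here is a reduced stand-in with only the fields needed for the example, `DisableFeatureInference` is the field the comment proposes rather than one that exists in the PR, and `resolveAttrs` is a hypothetical helper.

```go
package main

import "fmt"

// FuncSig is a reduced stand-in for the real signature type.
type FuncSig struct {
	Ret   string
	Attrs string
	// DisableFeatureInference is the proposed opt-out flag.
	DisableFeatureInference bool
}

// resolveAttrs (hypothetical helper) picks the attribute reference for a
// function: a manual override wins, then the opt-out, then inference.
func resolveAttrs(sig FuncSig, inferred string) string {
	switch {
	case sig.Attrs != "":
		return sig.Attrs
	case sig.DisableFeatureInference:
		return ""
	default:
		return inferred
	}
}

func main() {
	fmt.Printf("%q\n", resolveAttrs(FuncSig{Ret: "i32"}, "#200"))
	fmt.Printf("%q\n", resolveAttrs(FuncSig{Ret: "i32", DisableFeatureInference: true}, "#200"))
	fmt.Printf("%q\n", resolveAttrs(FuncSig{Ret: "i32", Attrs: "#0"}, "#200"))
}
```

A boolean keeps the zero value of FuncSig meaning "infer for me", so existing callers would not need to change.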

fennoai · 2026-03-07T10:59:13Z

+	}
+	if !strings.Contains(ll, `"target-features"="+pclmul,+sse4.1"`) {
+		t.Fatalf("missing pclmul target-features attr:\n%s", ll)
+	}


The test checks that the expected feature strings appear somewhere in the generated IR, but not that they're attached to the correct functions. A regression where both attribute sets are applied to every function would still pass. Consider asserting the per-function define line contains the expected #NNN reference, matching against what attrRegistry emits for each function.
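A per-function check might resolve each define line to its attribute group and then read the features from that group. The helpers below (`funcAttrRef`, `groupFeatures`, `funcFeatures`) are hypothetical, the function names in the sample IR are made up, and the line-based parsing assumes each define and attributes entry sits on a single line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	attrRefRE = regexp.MustCompile(`#\d+`)
	featRE    = regexp.MustCompile(`"target-features"="([^"]*)"`)
)

// funcAttrRef returns the "#N" reference on the define line of name,
// or "" when the function is missing or has no attribute group.
func funcAttrRef(ll, name string) string {
	for _, line := range strings.Split(ll, "\n") {
		if !strings.HasPrefix(line, "define ") {
			continue
		}
		if strings.Contains(line, "@"+name+"(") || strings.Contains(line, `@"`+name+`"(`) {
			return attrRefRE.FindString(line)
		}
	}
	return ""
}

// groupFeatures returns the target-features value of attribute group ref.
func groupFeatures(ll, ref string) string {
	prefix := "attributes " + ref + " = "
	for _, line := range strings.Split(ll, "\n") {
		if strings.HasPrefix(line, prefix) {
			if m := featRE.FindStringSubmatch(line); m != nil {
				return m[1]
			}
		}
	}
	return ""
}

// funcFeatures combines both lookups for a single function.
func funcFeatures(ll, name string) string {
	ref := funcAttrRef(ll, name)
	if ref == "" {
		return ""
	}
	return groupFeatures(ll, ref)
}

func main() {
	// Made-up IR for illustration only.
	ll := `define i32 @crcFn(i32 %a) #200 {
  ret i32 %a
}
define i32 @plainFn(i32 %a) {
  ret i32 %a
}
attributes #200 = { "target-features"="+crc32,+sse4.2" }
`
	fmt.Printf("%q %q\n", funcFeatures(ll, "crcFn"), funcFeatures(ll, "plainFn"))
}
```

In the regression test, asserting that a function with no ISA-specific instructions resolves to an empty feature string would catch the "applied to every function" regression described above.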

fennoai · 2026-03-07T10:59:29Z

Good approach — moving from static prelude attributes to per-function inference eliminates spurious attribute groups and fixes real CI failures. The core logic in feature_attrs.go is clean and correct for the covered cases. A few items worth addressing: SHA-NI instructions appear to be lowered but aren't covered by the inference (would cause LLVM failures on SHA-heavy code); the stale comment in arm64_translate.go should be removed; and the test asserts feature strings exist in the IR but not that they're bound to the right functions.

plan9asm: infer target features for ISA-specific asm

1ec655c

gemini-code-assist Bot reviewed Mar 7, 2026

fennoai Bot reviewed Mar 7, 2026

cpunion added 6 commits March 7, 2026 20:44

build: bump llvm to v0.8.6

957755b

test: cover ISA feature inference in CI

5027b03

build: sync cmd/plan9asm llvm dependency

36627ce

ci: gate stdlib asm corpus compilation

d9b7dc5

plan9asm: cover Go 1.26 stdlib asm families

4f17884

ci: validate stdlib corpus with Go 1.26

f1146f5

cpunion merged commit 822503d into xgo-dev:main Mar 7, 2026
12 checks passed

cpunion merged 7 commits into xgo-dev:main from cpunion:fix/x86-target-feature-attrs
