Test kill capability directly instead of parsing privilege labels by aparajon · Pull Request #658 · block/spirit

aparajon · 2026-03-12T17:32:01Z

Problem

Spirit's preflight privilege check string-matches SHOW GRANTS for CONNECTION_ADMIN. This fails on managed MySQL services where the capability is granted via roles without exposing the label in SHOW GRANTS output.

On RDS MySQL 8.4, CONNECTION_ADMIN was removed from rds_superuser_role. The user can kill connections (the capability is inherited through the role), but:

- GRANT CONNECTION_ADMIN ON *.* fails because the RDS admin user lacks GRANT OPTION for it
- Granting rds_superuser_role gives the kill capability, but SHOW GRANTS only shows the role name, not expanded privileges
- SET ROLE ALL + SHOW GRANTS still doesn't surface CONNECTION_ADMIN because it's not a standard grant within the role on 8.4

Result: Spirit rejects users who can actually KILL connections because the string CONNECTION_ADMIN never appears in their grants.

Fix

Two-layer approach: grant parsing (with role activation) as a fast path, with a direct capability probe as fallback.

1. `showGrantsWithRoles()` — role-aware grant parsing

Runs SET ROLE ALL on a pinned connection before SHOW GRANTS, so role-inherited privileges (like REPLICATION CLIENT) are visible. This handles most managed service cases where privileges are assigned via roles that aren't set as DEFAULT ROLE.

CONNECTION_ADMIN is still parsed from grants as a first-pass check. When visible (normal MySQL, or after SET ROLE ALL), the probe is skipped entirely.
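
The fast path described above can be sketched roughly as follows. This is not the code from the PR, only a minimal illustration assuming a standard `database/sql` handle; `grantsContain` is a hypothetical helper name, and the grant lines in `main` are made up for illustration rather than captured from a real server.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// showGrantsWithRoles activates every granted role on one pinned connection
// and then reads SHOW GRANTS from that same session.
func showGrantsWithRoles(ctx context.Context, db *sql.DB) ([]string, error) {
	// SET ROLE is session-scoped, so both statements must share a connection.
	conn, err := db.Conn(ctx)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	if _, err := conn.ExecContext(ctx, "SET ROLE ALL"); err != nil {
		return nil, fmt.Errorf("SET ROLE ALL: %w", err)
	}
	rows, err := conn.QueryContext(ctx, "SHOW GRANTS")
	if err != nil {
		return nil, fmt.Errorf("SHOW GRANTS: %w", err)
	}
	defer rows.Close()

	var grants []string
	for rows.Next() {
		var g string
		if err := rows.Scan(&g); err != nil {
			return nil, err
		}
		grants = append(grants, g)
	}
	return grants, rows.Err()
}

// grantsContain (hypothetical helper) reports whether any GRANT line
// mentions the privilege label, ignoring case.
func grantsContain(grants []string, privilege string) bool {
	want := strings.ToUpper(privilege)
	for _, g := range grants {
		if strings.Contains(strings.ToUpper(g), want) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative grant lines only, not real server output.
	roleOnly := []string{"GRANT `rds_superuser_role`@`%` TO `app`@`%`"}
	direct := []string{"GRANT CONNECTION_ADMIN ON *.* TO `app`@`%`"}
	fmt.Println(grantsContain(roleOnly, "CONNECTION_ADMIN")) // false
	fmt.Println(grantsContain(direct, "CONNECTION_ADMIN"))   // true
}
```

The role-only case is the one that motivates the fallback probe: the label never appears in the grant text, so string matching alone cannot confirm the capability.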

2. `canKillConnections()` — direct capability probe (fallback)

When CONNECTION_ADMIN isn't visible in grants (e.g. RDS 8.4), we test the actual capability by spawning a victim connection:

1. Create a temporary _spirit_kill_probe MySQL user
2. Connect as that user to create a victim connection
3. Attempt KILL <victim_id> from the caller's connection
4. Clean up the probe user

This works because MySQL only checks kill privileges when the target thread belongs to a different user — same-user kills always succeed, and non-existent thread IDs skip the privilege check entirely (returning ER_NO_SUCH_THREAD regardless of privilege level). By targeting a real connection from a different user, we get a definitive answer:

- ER_KILL_DENIED_ERROR (1095) → caller lacks the privilege
- Success or ER_NO_SUCH_THREAD (1094) → caller has the privilege

The victim connection is purpose-built for this test — no risk to active workloads.

Additional changes

- PROCESS is still string-matched from grants rather than capability-tested. Unlike kill capability, PROCESS is always directly granted (never hidden behind roles), and there's no cheap side-effect-free probe for it.
- TestCanKillConnections: new test covering root (has the privilege), an unprivileged user with CREATE USER but no kill privilege (lacks it), and the same kind of user after being granted CONNECTION_ADMIN.

Tested

Deployed to AWS App Runner + RDS MySQL 8.4 with a user that has kill capability via rds_superuser_role — deployment succeeds with force-kill enabled

Spirit's privilege preflight check string-matches SHOW GRANTS for CONNECTION_ADMIN. This fails on managed MySQL services (e.g. RDS MySQL 8.4) where the capability is granted via roles without exposing the label in SHOW GRANTS output. Replace the CONNECTION_ADMIN string check with a KILL 0 probe: MySQL returns error 1094 (Unknown thread id) when the user has the privilege, or error 1095 (Access denied) when they don't. This tests the actual capability regardless of how it was granted. Also activate granted roles (SET ROLE ALL) before SHOW GRANTS so role-inherited privileges like REPLICATION CLIENT are visible.

Copilot

Pull request overview

Updates Spirit’s MySQL privilege preflight to avoid relying solely on string-matching SHOW GRANTS for CONNECTION_ADMIN, improving compatibility with managed MySQL services where effective privileges can be inherited via roles and/or not shown in grants output.

Changes:

- Add showGrantsWithRoles() to run SET ROLE ALL (session-scoped) before SHOW GRANTS, improving role-aware grant parsing.
- Add canKillConnections() fallback probe that verifies kill capability by creating a disposable “victim” connection from another user and attempting KILL.
- Add TestCanKillConnections coverage for the kill capability probe behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| pkg/migration/check/privileges.go | Adds role-aware grant retrieval and a direct kill-capability probe fallback for force-kill preflight checks. |
| pkg/migration/check/privileges_test.go | Extends privilege tests and adds a new test validating the kill-capability probe. |


Copilot · 2026-03-12T18:58:08Z

+	if _, err := db.ExecContext(ctx, fmt.Sprintf("CREATE USER %s", probeUser)); err != nil {
+		return fmt.Errorf("cannot verify kill capability (CREATE USER failed): %w", err)
+	}
+	defer func() {
+		_, _ = db.ExecContext(context.Background(), fmt.Sprintf("DROP USER IF EXISTS %s", probeUser))
+	}()
+
+	// Connect as the probe user to create the victim connection.
+	victimCfg := gmysql.NewConfig()
+	victimCfg.User = probeUser
+	victimCfg.Net = "tcp"
+	victimCfg.Addr = host
+	victimDB, err := sql.Open("mysql", victimCfg.FormatDSN())


The probe user is created without any authentication (CREATE USER _spirit_kill_probe) and then Spirit tries to connect with an empty password. Some MySQL configurations/policies disallow passwordless accounts, which would make the capability probe fail even when the caller has kill capability. Consider creating the probe user with a strong random password (and using it in the DSN) to make the probe work reliably across environments.
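
One way to follow this suggestion, sketched under assumptions: `randomHex` and `createProbeUserSQL` are hypothetical helper names, and the hex alphabet is chosen only so the password needs no escaping inside the quoted literal. A server that enforces a password-composition policy may require a different alphabet.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomHex returns 2*n hex characters drawn from crypto/rand.
func randomHex(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// createProbeUserSQL builds the CREATE USER statement. It assumes the user
// name contains no backticks and the password no single quotes, which holds
// for the fixed probe name and a hex-only password.
func createProbeUserSQL(user, password string) string {
	return fmt.Sprintf("CREATE USER `%s` IDENTIFIED BY '%s'", user, password)
}

func main() {
	password, err := randomHex(16)
	if err != nil {
		panic(err)
	}
	stmt := createProbeUserSQL("_spirit_kill_probe", password)
	// Avoid printing the statement itself: it contains the password.
	fmt.Println("statement length:", len(stmt))
}
```

The same generated value would then be set as the password on the victim connection's config before the DSN is formatted.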

Copilot · 2026-03-12T18:58:08Z

+	errStr := killErr.Error()
+	if strings.Contains(errStr, "1094") || strings.Contains(errStr, "Unknown thread id") {


canKillConnections() determines whether the error is ER_NO_SUCH_THREAD by string-matching the error text ("1094" / "Unknown thread id"). This is brittle across drivers/locales and can misclassify errors. Prefer checking the concrete MySQL error type (e.g., *mysql.MySQLError) and comparing the numeric error code (1094/1095).

Suggested change

-	errStr := killErr.Error()
-	if strings.Contains(errStr, "1094") || strings.Contains(errStr, "Unknown thread id") {
+	var mysqlErr *gmysql.MySQLError
+	if errors.As(killErr, &mysqlErr) && mysqlErr.Number == 1094 {

Copilot · 2026-03-12T18:58:09Z

+	config, err := mysql.ParseDSN(testutils.DSN())
+	require.NoError(t, err)
+	config.User = "root"
+	db, err := sql.Open("mysql", fmt.Sprintf("%s:%s@tcp(%s)/%s", config.User, config.Passwd, config.Addr, config.DBName))
+	require.NoError(t, err)


TestCanKillConnections rebuilds DSNs with fmt.Sprintf instead of using the parsed mysql.Config’s FormatDSN(). This drops any parameters present in testutils.DSN() (e.g., TLS, timeouts, params), which can make the test flaky if the test DSN changes. Prefer updating the mysql.Config fields (User/Passwd/DBName) and calling FormatDSN().

Copilot · 2026-03-12T18:58:09Z

+	unprivDB, err := sql.Open("mysql", fmt.Sprintf("testkillprobe:@tcp(%s)/", host))
+	require.NoError(t, err)


In TestCanKillConnections the unprivileged connection DSN is hard-coded as testkillprobe:@tcp(%s)/, which again bypasses mysql.Config parsing/escaping and omits any required DSN params (notably TLS). Building this DSN via mysql.Config (and using the same host/params as testutils.DSN()) will make the test more robust across environments.

Copilot · 2026-03-12T18:58:09Z

+	const probeUser = "_spirit_kill_probe"
+
+	// Create a temporary user to own the victim connection.
+	if _, err := db.ExecContext(ctx, fmt.Sprintf("DROP USER IF EXISTS %s", probeUser)); err != nil {
+		return fmt.Errorf("cannot verify kill capability (DROP USER failed): %w", err)
+	}
+	if _, err := db.ExecContext(ctx, fmt.Sprintf("CREATE USER %s", probeUser)); err != nil {
+		return fmt.Errorf("cannot verify kill capability (CREATE USER failed): %w", err)
+	}
+	defer func() {
+		_, _ = db.ExecContext(context.Background(), fmt.Sprintf("DROP USER IF EXISTS %s", probeUser))
+	}()


canKillConnections() uses a fixed global username ("_spirit_kill_probe") and unconditionally runs DROP USER IF EXISTS / CREATE USER. This can (a) clobber a legitimate existing user with that name, and (b) race if multiple Spirit instances (or tests) run concurrently against the same server. Use a per-run unique probe username (random suffix / connection id) and only drop the specific user you created; avoid deleting a pre-existing account that you didn't create.
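
A per-run probe name along these lines would address both points. This is a sketch with hypothetical names (`probePrefix`, `newProbeUser`, `probeStatements`), not the change that was eventually merged. It relies on a plain CREATE USER failing when the account already exists, so nothing is dropped before creation and cleanup targets only the name generated for this run.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

const probePrefix = "_spirit_kill_probe_"

// newProbeUser returns a unique name for this run. MySQL user names are
// limited to 32 characters; the prefix (19) plus 8 hex characters gives 27.
func newProbeUser() (string, error) {
	suffix := make([]byte, 4)
	if _, err := rand.Read(suffix); err != nil {
		return "", err
	}
	return probePrefix + hex.EncodeToString(suffix), nil
}

// probeStatements returns the create and cleanup SQL for one run. There is
// deliberately no DROP USER IF EXISTS before the CREATE: if the name is
// somehow taken, CREATE USER fails and the existing account is left alone.
func probeStatements(user string) (create, drop string) {
	create = fmt.Sprintf("CREATE USER `%s`", user)
	drop = fmt.Sprintf("DROP USER `%s`", user)
	return create, drop
}

func main() {
	user, err := newProbeUser()
	if err != nil {
		panic(err)
	}
	create, drop := probeStatements(user)
	fmt.Println(create)
	fmt.Println(drop)
}
```

The cleanup statement should run only after the CREATE has succeeded, so a failed creation never leads to dropping an account this run did not make.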

morgo · 2026-03-16T00:18:24Z

Implemented in #659 instead

aparajon force-pushed the rds-superuser-privilege-check branch from 53c367a to 63a6dee on March 12, 2026 18:00

aparajon force-pushed the rds-superuser-privilege-check branch from 63a6dee to 248c823 on March 12, 2026 18:50

aparajon requested a review from Copilot on March 12, 2026 18:52

Copilot started reviewing on behalf of aparajon on March 12, 2026 18:52

Copilot AI reviewed on Mar 12, 2026

morgo mentioned this pull request Mar 13, 2026

Support privileges coming from roles #659

Merged

morgo closed this Mar 16, 2026

aparajon wants to merge 1 commit into block:main from aparajon:rds-superuser-privilege-check
