Show recovery lock password if available#41913

Closed
mostlikelee wants to merge 114 commits into main from 41779-expedited

Contributor

@mostlikelee mostlikelee commented Mar 17, 2026

Related issue: Resolves #41779

Summary by CodeRabbit

  • Refactor

    • Updated Recovery Lock password availability tracking from status-based to direct availability-based detection, providing clearer indication of password status in the interface and improving state management accuracy.
  • Tests

    • Updated test cases for Recovery Lock password availability verification to align with the new availability-based approach.

mostlikelee and others added 15 commits March 16, 2026 22:10
- Fix ActivityTypeRotatedRecoveryLockPassword to ActivityTypeRotatedHostRecoveryLockPassword in ee/server/service/hosts.go
- Remove duplicate testClaimHostsForRecoveryLockClear function in apple_mdm_test.go

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClearRecoveryLockRotation was unconditionally setting status = NULL,
which would corrupt the recovery lock state. Now it:
- Only affects rows modified by InitiateRecoveryLockRotation
  (status = pending AND pending_encrypted_password IS NOT NULL)
- Restores status to 'verified' instead of NULL

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Conflicts:
#	server/datastore/mysql/apple_mdm_test.go
@mostlikelee
Contributor Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

✅ Actions performed

Full review triggered.

@mostlikelee mostlikelee changed the title from "41779 expedited" to "Show recovery lock password if available" Mar 17, 2026
@codecov

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.40%. Comparing base (3756a8e) to head (7980121).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../hosts/details/HostDetailsPage/HostDetailsPage.tsx | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #41913      +/-   ##
==========================================
- Coverage   66.41%   66.40%   -0.01%     
==========================================
  Files        2509     2510       +1     
  Lines      200785   201158     +373     
  Branches     9055     8905     -150     
==========================================
+ Hits       133350   133580     +230     
- Misses      55375    55508     +133     
- Partials    12060    12070      +10     
| Flag | Coverage Δ |
|---|---|
| backend | 68.19% <100.00%> (-0.02%) ⬇️ |
| frontend | 54.38% <60.00%> (-0.01%) ⬇️ |

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

Walkthrough

This PR adds a password_available boolean field to the Recovery Lock password feature across backend and frontend layers. The field is introduced to the IOSSettings interface, propagated through the HostMDMRecoveryLockPassword struct, queried from the database, and consumed by UI components to determine action availability. The change replaces status-based conditional logic with a simple availability boolean check.
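
To make the difference between the two checks concrete, here is a minimal Go sketch. It is not Fleet's code: `recoveryLockRow` and `statusBasedAvailable` are hypothetical, and the "verified only" rule is an assumed stand-in for the earlier status-based logic. Only `passwordAvailable` follows something quoted in this PR, namely the SQL projection `encrypted_password IS NOT NULL AS password_available`.

```go
package main

import "fmt"

// recoveryLockRow is a hypothetical, simplified view of one row in
// host_recovery_key_passwords; only the columns needed here are modeled.
type recoveryLockRow struct {
	Status            *string // "verified", "failed", "pending", or nil for NULL
	EncryptedPassword []byte  // nil models a NULL column
}

// passwordAvailable mirrors the SQL projection
// `encrypted_password IS NOT NULL AS password_available`.
func passwordAvailable(r recoveryLockRow) bool {
	return r.EncryptedPassword != nil
}

// statusBasedAvailable is an ASSUMED stand-in for a status-based check:
// it treats the password as viewable only when the status is "verified".
func statusBasedAvailable(r recoveryLockRow) bool {
	return r.Status != nil && *r.Status == "verified"
}

func main() {
	failed := "failed"
	// A row whose last operation failed but which still stores a password.
	row := recoveryLockRow{Status: &failed, EncryptedPassword: []byte("ciphertext")}

	fmt.Println(statusBasedAvailable(row)) // false
	fmt.Println(passwordAvailable(row))    // true
}
```

Under these assumptions, a host with a failed or pending rotation still reports its stored password as available, which is the case a status check alone would miss.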

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description only contains a 'Related issue' reference with no details about implementation, changes, or testing performed. | Complete the PR description following the template: add implementation details, summarize key changes, describe testing performed, and check relevant items in the provided checklist. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | The linked issue #41779 contains only TODOs without concrete acceptance criteria, making compliance validation impossible. | Complete issue #41779 with specific acceptance criteria and technical requirements to properly validate that PR objectives are met. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Show recovery lock password if available' directly aligns with the main change: adding a boolean field to indicate password availability and using it to control password display. |
| Out of Scope Changes check | ✅ Passed | All changes are focused on the recovery lock password availability feature: adding the PasswordAvailable field, updating logic to check availability instead of status, and updating tests accordingly. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (4)
server/datastore/mysql/secret_variables.go (1)

453-467: Optional: dedupe host secret types before fetching values.

If the same host placeholder appears multiple times in one document, this loop can re-fetch/decrypt the same secret unnecessarily. A small dedupe step would avoid redundant calls.

♻️ Optional refactor
-	for _, secretType := range hostSecrets {
+	seen := make(map[string]struct{}, len(hostSecrets))
+	for _, secretType := range hostSecrets {
+		if _, ok := seen[secretType]; ok {
+			continue
+		}
+		seen[secretType] = struct{}{}
 		switch secretType {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/secret_variables.go` around lines 453 - 467, The loop
over hostSecrets may call
getHostRecoveryLockPasswordDecrypted/getHostRecoveryLockPendingPasswordDecrypted
multiple times for the same secret type; before the switch in the function
containing hostSecrets and secretValues, dedupe hostSecrets (e.g., build a
set/map of seen secret types) and iterate only unique types so you avoid
redundant fetch/decrypt calls and only populate secretValues once per
secretType.
server/fleet/apple_mdm.go (1)

1155-1155: Consider stronger typing for OperationType.

OperationType on HostRecoveryLockRotationStatus can use MDMOperationType instead of string to match surrounding model conventions and reduce drift.

♻️ Suggested refinement
 type HostRecoveryLockRotationStatus struct {
 	HostUUID            string  // Host UUID
 	HasPassword         bool    // encrypted_password is not null and deleted=0
 	Status              *string // current status (verified, failed, pending, NULL)
-	OperationType       string  // install or remove
+	OperationType       MDMOperationType // install or remove
 	HasPendingRotation  bool    // pending_encrypted_password is not null
 	PendingErrorMessage *string // error from failed rotation
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/fleet/apple_mdm.go` at line 1155, Change the OperationType field on
the HostRecoveryLockRotationStatus struct from string to the MDMOperationType
type to match surrounding models; locate the HostRecoveryLockRotationStatus
definition and replace the field declaration (OperationType string) with
OperationType MDMOperationType, keeping any existing tags/field name intact and
update any assignments/usages (e.g., where OperationType is set or compared) to
use MDMOperationType values so the code compiles with the stronger type.
server/datastore/mysql/apple_mdm_test.go (2)

11087-11111: Use generated recovery-lock passwords in the rejection cases.

"another-password" and "new-password" make these tests depend on there being no password validation before the pending/eligibility guard. Using apple_mdm.GenerateRecoveryLockPassword() keeps the failure reason pinned to the rotation-state check you actually want to exercise.

♻️ Suggested tweak
-		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, "another-password")
+		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, apple_mdm.GenerateRecoveryLockPassword())
@@
-		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, "new-password")
+		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, apple_mdm.GenerateRecoveryLockPassword())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/apple_mdm_test.go` around lines 11087 - 11111, Replace
the hard-coded passwords in the two failing test cases so they use a valid
generated recovery-lock password; specifically, in the
"InitiateRecoveryLockRotation rejects if already pending" and
"InitiateRecoveryLockRotation rejects pending status" tests call
apple_mdm.GenerateRecoveryLockPassword() for the second
InitiateRecoveryLockRotation attempt (instead of "another-password" /
"new-password") so that the call to InitiateRecoveryLockRotation fails due to
rotation-state checks and not password validation; keep the existing
SetHostsRecoveryLockPasswords usage and the first rotation password generation
unchanged.

11163-11215: Assert that the active password survives failed and cleared rotations.

These subtests only check pending columns and status restoration. If FailRecoveryLockRotation or ClearRecoveryLockRotation accidentally cleared encrypted_password, the “show password if available” behavior would regress and this suite would still pass.

💡 Suggested assertions
 	t.Run("FailRecoveryLockRotation preserves pending password", func(t *testing.T) {
 		host := setupHostWithVerifiedPassword(t, "fail-rotate-host", "failrotuuid")
+		origPw, err := ds.GetHostRecoveryLockPassword(ctx, host.UUID)
+		require.NoError(t, err)

 		// Initiate rotation
 		newPassword := apple_mdm.GenerateRecoveryLockPassword()
-		err := ds.InitiateRecoveryLockRotation(ctx, host.UUID, newPassword)
+		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, newPassword)
 		require.NoError(t, err)

 		// Fail rotation
 		err = ds.FailRecoveryLockRotation(ctx, host.UUID, "rotation failed due to device error")
 		require.NoError(t, err)
+
+		currentPw, err := ds.GetHostRecoveryLockPassword(ctx, host.UUID)
+		require.NoError(t, err)
+		assert.Equal(t, origPw.Password, currentPw.Password)

 		// Verify pending password is still there (for potential retry)
 		hasPending, pendingErr := getPendingRotationState(t, host.UUID)
@@
 	t.Run("ClearRecoveryLockRotation removes pending", func(t *testing.T) {
 		host := setupHostWithVerifiedPassword(t, "clear-rotate-host", "clearrotuuid")
+		origPw, err := ds.GetHostRecoveryLockPassword(ctx, host.UUID)
+		require.NoError(t, err)

 		// Initiate rotation
 		newPassword := apple_mdm.GenerateRecoveryLockPassword()
-		err := ds.InitiateRecoveryLockRotation(ctx, host.UUID, newPassword)
+		err = ds.InitiateRecoveryLockRotation(ctx, host.UUID, newPassword)
 		require.NoError(t, err)

 		// Clear rotation
 		err = ds.ClearRecoveryLockRotation(ctx, host.UUID)
 		require.NoError(t, err)
+
+		currentPw, err := ds.GetHostRecoveryLockPassword(ctx, host.UUID)
+		require.NoError(t, err)
+		assert.Equal(t, origPw.Password, currentPw.Password)

 		// Verify pending is cleared
 		hasPending, _ := getPendingRotationState(t, host.UUID)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/apple_mdm_test.go` around lines 11163 - 11215, Add
assertions that the active encrypted password on the host is preserved after
failing or clearing a rotation: before calling InitiateRecoveryLockRotation
capture the host's encrypted password (from the host returned by
setupHostWithVerifiedPassword), then after FailRecoveryLockRotation and after
ClearRecoveryLockRotation reload the host record (using the existing datastore
lookup helper or ds methods) and assert that the encrypted password field (e.g.,
host.EncryptedRecoveryLockPassword / host.EncryptedPassword) still equals the
original value; keep these checks alongside the existing pending/status
assertions in the "FailRecoveryLockRotation preserves pending password" and
"ClearRecoveryLockRotation removes pending" subtests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ee/server/service/hosts.go`:
- Around line 543-556: The RotateRecoveryLockPassword mutation must require and
validate an auditable viewer before performing any state changes or queuing
commands; move the viewer lookup/requirement to the start of
RotateRecoveryLockPassword (before writing the pending password and before
authorizing/queuing the MDM command) following the same pattern used by
LockHost, UnlockHost, and WipeHost so an actor is recorded, and ensure the
viewer is returned/validated prior to calling svc.ds.Host modifications and
before the authz Authorize(fleet.MDMCommandAuthz{...}, fleet.ActionWrite) check.
- Around line 652-660: When RotateRecoveryLock(ctx, host.UUID, cmdUUID) fails
and it's not an APNSDeliveryError, don't ignore errors from
svc.ds.ClearRecoveryLockRotation; call ClearRecoveryLockRotation and if it
returns an error, combine or wrap that cleanup error with the original
RotateRecoveryLock error and return the combined error (instead of discarding
the cleanup failure). Locate the block around
svc.mdmAppleCommander.RotateRecoveryLock and replace the current silent discard
of ClearRecoveryLockRotation with logic that captures its error (e.g., wrap both
errors or return a new error mentioning both RotateRecoveryLock and
ClearRecoveryLockRotation failures) while preserving the existing
APNSDeliveryError check via errors.As and apnsErr.

In `@server/datastore/mysql/apple_mdm.go`:
- Around line 7350-7356: The SELECT in the stmt constant reads only
error_message so rotation failures recorded to pending_error_message can yield
an empty detail; change the projection to prefer error_message but fall back to
pending_error_message when error_message is empty (for example use
COALESCE(NULLIF(error_message, ''), pending_error_message, '') AS detail) and
apply the same change to the other similar query that reads
host_recovery_key_passwords so failed status returns a populated detail.
- Around line 7621-7644: The retry path is unreachable because
FailRecoveryLockRotation leaves pending_encrypted_password set and
InitiateRecoveryLockRotation rejects rows with pending_encrypted_password IS NOT
NULL while ClearRecoveryLockRotation only handles status='pending'; add a
ResetRecoveryLockForRetry Datastore method that clears
pending_encrypted_password and pending_error_message and resets status (or sets
status to a retryable value) for rows where status='failed', then call this
ResetRecoveryLockForRetry from the service (hosts.go) before calling
InitiateRecoveryLockRotation when HasPendingRotation / status='failed' is
detected; alternatively, allow InitiateRecoveryLockRotation to accept rows with
status='failed' by adjusting its WHERE clause to permit retryable failed
state—modify either ResetRecoveryLockForRetry (recommended) or the WHERE in
InitiateRecoveryLockRotation and update service logic accordingly (references:
InitiateRecoveryLockRotation, FailRecoveryLockRotation,
ClearRecoveryLockRotation and the hosts.go service call sites).

In `@server/service/apple_mdm.go`:
- Around line 7359-7398: The handler currently acts on any pending rotation
without verifying the originating command; persist the rotation's command UUID
when creating a rotation (e.g., in InitiateRecoveryLockRotation) and, here in
the SetRecoveryLock result path, fetch/compare that stored UUID against
results.UUID() before calling CompleteRecoveryLockRotation or
FailRecoveryLockRotation; if they differ, log/ignore the out-of-order result and
return nil, otherwise proceed to complete/fail as now. Ensure the datastore API
used for pending rotation (HasPendingRecoveryLockRotation /
InitiateRecoveryLockRotation) is extended to store or return the pending
command_uuid so you can compare it to results.UUID() prior to mutating rotation
state.

In `@server/service/hosts.go`:
- Around line 3804-3810: Replace the unconditional auth bypass and immediate
fleet.ErrMissingLicense return in RotateRecoveryLockPassword: remove
svc.authz.SkipAuthorization(ctx), perform a host-scoped write authorization
check via the service authz (e.g. call the authz Authorize method for the Host
resource identified by hostID and the write action), then perform a conditional
license check and only return fleet.ErrMissingLicense when the server/license
state indicates rotation is not allowed, and finally invoke the real rotation
flow (call the existing rotation implementation or extract the rotation logic
into a helper and call it) to perform the password rotation and return its
error/result instead of returning immediately.
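
The unreachable-retry finding for server/datastore/mysql/apple_mdm.go (lines 7621-7644) in the list above can be checked with a small state model. This is a sketch in plain Go, not Fleet's datastore code: `row`, `canInitiate`, `canClear`, `fail`, and `resetForRetry` are hypothetical names. The guards follow the WHERE clause quoted for InitiateRecoveryLockRotation and the ClearRecoveryLockRotation condition described in the commit message; `resetForRetry` is one assumed shape for the proposed ResetRecoveryLockForRetry.

```go
package main

import "fmt"

// row models only the columns the rotation guards inspect.
// All names are illustrative, not Fleet's actual types.
type row struct {
	Status        string // "verified", "failed", or "pending"
	OperationType string // "install" or "remove"
	HasPassword   bool   // encrypted_password IS NOT NULL
	HasPending    bool   // pending_encrypted_password IS NOT NULL
}

// canInitiate mirrors the WHERE clause quoted for InitiateRecoveryLockRotation.
func canInitiate(r row) bool {
	return r.HasPassword &&
		r.OperationType == "install" &&
		(r.Status == "verified" || r.Status == "failed") &&
		!r.HasPending
}

// canClear mirrors the ClearRecoveryLockRotation guard from the commit message:
// status = pending AND pending_encrypted_password IS NOT NULL.
func canClear(r row) bool {
	return r.Status == "pending" && r.HasPending
}

// fail models FailRecoveryLockRotation: status becomes "failed" and the
// pending password is preserved.
func fail(r row) row {
	r.Status = "failed"
	return r
}

// resetForRetry is an ASSUMED shape for the proposed ResetRecoveryLockForRetry:
// it drops the pending password on failed rows so a new rotation can start.
func resetForRetry(r row) row {
	if r.Status == "failed" {
		r.HasPending = false
	}
	return r
}

func main() {
	r := row{Status: "verified", OperationType: "install", HasPassword: true}

	// Start a rotation, then let it fail on the device.
	r.Status, r.HasPending = "pending", true
	r = fail(r)
	fmt.Println(canInitiate(r), canClear(r)) // false false

	// After the proposed reset, a retry becomes possible again.
	r = resetForRetry(r)
	fmt.Println(canInitiate(r)) // true
}
```

In this model a failed row with a preserved pending password satisfies neither guard, which matches the review's claim that the retry path cannot be reached without an extra reset step or a relaxed WHERE clause.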

---

Nitpick comments:
In `@server/datastore/mysql/apple_mdm_test.go`:
- Around line 11087-11111: Replace the hard-coded passwords in the two failing
test cases so they use a valid generated recovery-lock password; specifically,
in the "InitiateRecoveryLockRotation rejects if already pending" and
"InitiateRecoveryLockRotation rejects pending status" tests call
apple_mdm.GenerateRecoveryLockPassword() for the second
InitiateRecoveryLockRotation attempt (instead of "another-password" /
"new-password") so that the call to InitiateRecoveryLockRotation fails due to
rotation-state checks and not password validation; keep the existing
SetHostsRecoveryLockPasswords usage and the first rotation password generation
unchanged.
- Around line 11163-11215: Add assertions that the active encrypted password on
the host is preserved after failing or clearing a rotation: before calling
InitiateRecoveryLockRotation capture the host's encrypted password (from the
host returned by setupHostWithVerifiedPassword), then after
FailRecoveryLockRotation and after ClearRecoveryLockRotation reload the host
record (using the existing datastore lookup helper or ds methods) and assert
that the encrypted password field (e.g., host.EncryptedRecoveryLockPassword /
host.EncryptedPassword) still equals the original value; keep these checks
alongside the existing pending/status assertions in the
"FailRecoveryLockRotation preserves pending password" and
"ClearRecoveryLockRotation removes pending" subtests.

In `@server/datastore/mysql/secret_variables.go`:
- Around line 453-467: The loop over hostSecrets may call
getHostRecoveryLockPasswordDecrypted/getHostRecoveryLockPendingPasswordDecrypted
multiple times for the same secret type; before the switch in the function
containing hostSecrets and secretValues, dedupe hostSecrets (e.g., build a
set/map of seen secret types) and iterate only unique types so you avoid
redundant fetch/decrypt calls and only populate secretValues once per
secretType.

In `@server/fleet/apple_mdm.go`:
- Line 1155: Change the OperationType field on the
HostRecoveryLockRotationStatus struct from string to the MDMOperationType type
to match surrounding models; locate the HostRecoveryLockRotationStatus
definition and replace the field declaration (OperationType string) with
OperationType MDMOperationType, keeping any existing tags/field name intact and
update any assignments/usages (e.g., where OperationType is set or compared) to
use MDMOperationType values so the code compiles with the stronger type.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 947d832a-4664-4085-8f92-9bda2e7818c7

📥 Commits

Reviewing files that changed from the base of the PR and between cc02191 and 0a2c5d4.

📒 Files selected for processing (24)
  • ee/server/service/hosts.go
  • frontend/interfaces/host.ts
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/HostActionsDropdown.tests.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/HostActionsDropdown.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/helpers.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostDetailsPage.tsx
  • server/datastore/mysql/apple_mdm.go
  • server/datastore/mysql/apple_mdm_test.go
  • server/datastore/mysql/migrations/tables/20260317120000_AddRecoveryLockPasswordRotation.go
  • server/datastore/mysql/schema.sql
  • server/datastore/mysql/secret_variables.go
  • server/fleet/activities.go
  • server/fleet/apple_mdm.go
  • server/fleet/datastore.go
  • server/fleet/hosts.go
  • server/fleet/secrets.go
  • server/fleet/service.go
  • server/mdm/apple/commander.go
  • server/mock/datastore_mock.go
  • server/mock/service/service_mock.go
  • server/service/apple_mdm.go
  • server/service/apple_mdm_cmd_results_test.go
  • server/service/handler.go
  • server/service/hosts.go

Comment on lines +543 to +556
func (svc *Service) RotateRecoveryLockPassword(ctx context.Context, hostID uint) error {
	if err := svc.authz.Authorize(ctx, &fleet.Host{}, fleet.ActionList); err != nil {
		return err
	}
	host, err := svc.ds.Host(ctx, hostID)
	if err != nil {
		return ctxerr.Wrap(ctx, err, "get host")
	}

	// Authorize again with team loaded now that we have the host's team_id.
	// Authorize as "execute mdm_command", which is the correct access requirement.
	if err := svc.authz.Authorize(ctx, fleet.MDMCommandAuthz{TeamID: host.TeamID}, fleet.ActionWrite); err != nil {
		return err
	}
Contributor

⚠️ Potential issue | 🟠 Major

Require a viewer before starting this mutation.

LockHost, UnlockHost, and WipeHost all reject missing viewer context because they need an auditable actor. Here the viewer lookup happens only after the pending password is written and the command may already be queued, so a bypassed/internal context can rotate the recovery lock without any activity trail.

Suggested fix
 func (svc *Service) RotateRecoveryLockPassword(ctx context.Context, hostID uint) error {
+	vc, ok := viewer.FromContext(ctx)
+	if !ok {
+		return fleet.ErrNoContext
+	}
+
 	if err := svc.authz.Authorize(ctx, &fleet.Host{}, fleet.ActionList); err != nil {
 		return err
 	}
@@
-	vc, ok := viewer.FromContext(ctx)
-	if ok {
-		if err := svc.NewActivity(
-			ctx,
-			vc.User,
-			fleet.ActivityTypeRotatedHostRecoveryLockPassword{
-				HostID:          host.ID,
-				HostDisplayName: host.DisplayName(),
-			},
-		); err != nil {
-			return ctxerr.Wrap(ctx, err, "create activity for rotate recovery lock password")
-		}
-	}
+	if err := svc.NewActivity(
+		ctx,
+		vc.User,
+		fleet.ActivityTypeRotatedHostRecoveryLockPassword{
+			HostID:          host.ID,
+			HostDisplayName: host.DisplayName(),
+		},
+	); err != nil {
+		return ctxerr.Wrap(ctx, err, "create activity for rotate recovery lock password")
+	}

Also applies to: 645-676

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ee/server/service/hosts.go` around lines 543 - 556, The
RotateRecoveryLockPassword mutation must require and validate an auditable
viewer before performing any state changes or queuing commands; move the viewer
lookup/requirement to the start of RotateRecoveryLockPassword (before writing
the pending password and before authorizing/queuing the MDM command) following
the same pattern used by LockHost, UnlockHost, and WipeHost so an actor is
recorded, and ensure the viewer is returned/validated prior to calling
svc.ds.Host modifications and before the authz
Authorize(fleet.MDMCommandAuthz{...}, fleet.ActionWrite) check.

Comment on lines +652 to +660
	if err := svc.mdmAppleCommander.RotateRecoveryLock(ctx, host.UUID, cmdUUID); err != nil {
		// Only clear the pending rotation if the enqueue itself failed.
		// If it's an APNS delivery error, the command was successfully enqueued
		// and will be delivered when the device checks in.
		var apnsErr *apple_mdm.APNSDeliveryError
		if !errors.As(err, &apnsErr) {
			_ = svc.ds.ClearRecoveryLockRotation(ctx, host.UUID)
		}
		return ctxerr.Wrap(ctx, err, "enqueue recovery lock rotation command")
Contributor

⚠️ Potential issue | 🟠 Major

Don't hide rollback failures after a failed enqueue.

Line 658 ignores ClearRecoveryLockRotation errors. If the enqueue fails and the cleanup also fails, the host is left with a stale pending rotation and every retry will hit the “already in progress” path even though nothing was actually queued.

Suggested fix
 	if err := svc.mdmAppleCommander.RotateRecoveryLock(ctx, host.UUID, cmdUUID); err != nil {
 		// Only clear the pending rotation if the enqueue itself failed.
 		// If it's an APNS delivery error, the command was successfully enqueued
 		// and will be delivered when the device checks in.
 		var apnsErr *apple_mdm.APNSDeliveryError
 		if !errors.As(err, &apnsErr) {
-			_ = svc.ds.ClearRecoveryLockRotation(ctx, host.UUID)
+			if clearErr := svc.ds.ClearRecoveryLockRotation(ctx, host.UUID); clearErr != nil {
+				return ctxerr.Wrap(
+					ctx,
+					fmt.Errorf("enqueue recovery lock rotation failed: %w; additionally failed to clear pending rotation: %v", err, clearErr),
+					"enqueue recovery lock rotation command",
+				)
+			}
 		}
 		return ctxerr.Wrap(ctx, err, "enqueue recovery lock rotation command")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ee/server/service/hosts.go` around lines 652 - 660, When
RotateRecoveryLock(ctx, host.UUID, cmdUUID) fails and it's not an
APNSDeliveryError, don't ignore errors from svc.ds.ClearRecoveryLockRotation;
call ClearRecoveryLockRotation and if it returns an error, combine or wrap that
cleanup error with the original RotateRecoveryLock error and return the combined
error (instead of discarding the cleanup failure). Locate the block around
svc.mdmAppleCommander.RotateRecoveryLock and replace the current silent discard
of ClearRecoveryLockRotation with logic that captures its error (e.g., wrap both
errors or return a new error mentioning both RotateRecoveryLock and
ClearRecoveryLockRotation failures) while preserving the existing
APNSDeliveryError check via errors.As and apnsErr.

Comment on lines +7350 to +7356
	const stmt = `
		SELECT
			status,
			COALESCE(error_message, '') AS detail,
			encrypted_password IS NOT NULL AS password_available
		FROM host_recovery_key_passwords
		WHERE host_uuid = ? AND deleted = 0`
Contributor

⚠️ Potential issue | 🟠 Major

detail can be empty on rotation failures.

At Line 7353, detail only reads error_message, but Line 7720 writes rotation failures to pending_error_message. This can return status=failed with blank detail.

💡 Proposed fix
 const stmt = `
 		SELECT
 			status,
-			COALESCE(error_message, '') AS detail,
+			COALESCE(pending_error_message, error_message, '') AS detail,
 			encrypted_password IS NOT NULL AS password_available
 		FROM host_recovery_key_passwords
 		WHERE host_uuid = ? AND deleted = 0`

Also applies to: 7715-7721

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/apple_mdm.go` around lines 7350 - 7356, The SELECT in
the stmt constant reads only error_message so rotation failures recorded to
pending_error_message can yield an empty detail; change the projection to prefer
error_message but fall back to pending_error_message when error_message is empty
(for example use COALESCE(NULLIF(error_message, ''), pending_error_message, '')
AS detail) and apply the same change to the other similar query that reads
host_recovery_key_passwords so failed status returns a populated detail.
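
The proposed fix and the agent prompt above use two different projections, and they are not equivalent when both columns are populated. The sketch below models SQL `COALESCE` and `NULLIF` in plain Go to show where they diverge. The helper names (`coalesce`, `nullIf`, `detailPreferPending`, `detailPreferError`) are hypothetical, and a nil `*string` stands in for SQL NULL.

```go
package main

import "fmt"

// coalesce returns the first non-nil value, like SQL COALESCE.
func coalesce(vals ...*string) *string {
	for _, v := range vals {
		if v != nil {
			return v
		}
	}
	return nil
}

// nullIf returns nil when v equals s, like SQL NULLIF(v, s).
func nullIf(v *string, s string) *string {
	if v != nil && *v == s {
		return nil
	}
	return v
}

func str(s string) *string { return &s }

// detailPreferPending models COALESCE(pending_error_message, error_message, '').
func detailPreferPending(errorMessage, pendingErrorMessage *string) string {
	return *coalesce(pendingErrorMessage, errorMessage, str(""))
}

// detailPreferError models
// COALESCE(NULLIF(error_message, ''), pending_error_message, '').
func detailPreferError(errorMessage, pendingErrorMessage *string) string {
	return *coalesce(nullIf(errorMessage, ""), pendingErrorMessage, str(""))
}

func main() {
	// Only the rotation failure is recorded: both projections agree.
	fmt.Println(detailPreferPending(nil, str("rotation failed"))) // rotation failed
	fmt.Println(detailPreferError(nil, str("rotation failed")))   // rotation failed

	// Both columns are set: the projections pick different messages.
	fmt.Println(detailPreferPending(str("install failed"), str("rotation failed"))) // rotation failed
	fmt.Println(detailPreferError(str("install failed"), str("rotation failed")))   // install failed
}
```

Either form avoids a blank detail when only pending_error_message is set; which message should win when both are present is a decision for the PR author.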

Comment on lines +7621 to +7644
func (ds *Datastore) InitiateRecoveryLockRotation(ctx context.Context, hostUUID string, newPassword string) error {
	encryptedPassword, err := encrypt([]byte(newPassword), ds.serverPrivateKey)
	if err != nil {
		return ctxerr.Wrap(ctx, err, "encrypt pending recovery lock password")
	}

	// Set the pending password and mark status as pending.
	// Only allow rotation if:
	// - Has an existing password (encrypted_password IS NOT NULL)
	// - Operation type is 'install' (not removing the password)
	// - Current status is 'verified' or 'failed' (not 'pending' or NULL)
	// - No pending rotation already (pending_encrypted_password IS NULL)
	stmt := fmt.Sprintf(`
		UPDATE host_recovery_key_passwords
		SET pending_encrypted_password = ?,
			pending_error_message = NULL,
			status = '%s'
		WHERE host_uuid = ?
			AND deleted = 0
			AND encrypted_password IS NOT NULL
			AND operation_type = '%s'
			AND status IN ('%s', '%s')
			AND pending_encrypted_password IS NULL
	`, fleet.MDMDeliveryPending, fleet.MDMOperationTypeInstall, fleet.MDMDeliveryVerified, fleet.MDMDeliveryFailed)
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect all call sites for rotation lifecycle methods.
rg -n --type=go -C4 '\b(InitiateRecoveryLockRotation|CompleteRecoveryLockRotation|FailRecoveryLockRotation|ClearRecoveryLockRotation|HasPendingRecoveryLockRotation|GetRecoveryLockRotationStatus)\b'

# Inspect pending-rotation state handling in service/orchestration layers.
rg -n --type=go -C4 'pending_encrypted_password|pending_error_message'

Repository: fleetdm/fleet

Length of output: 50369


🏁 Script executed:

# Get FailRecoveryLockRotation and ClearRecoveryLockRotation implementations
sed -n '7715,7776p' server/datastore/mysql/apple_mdm.go

Repository: fleetdm/fleet

Length of output: 2230


🏁 Script executed:

# Search for where ResetRecoveryLockForRetry and related retry logic is called
rg -n --type=go 'ResetRecoveryLockForRetry|pending.*password.*retry|retry.*pending' -A 3 -B 3

Repository: fleetdm/fleet

Length of output: 15212


🏁 Script executed:

# Check apple_mdm.go service layer handling around line 7390 for rotation failures
sed -n '7370,7410p' server/service/apple_mdm.go

Repository: fleetdm/fleet

Length of output: 1466


🏁 Script executed:

# Check if there's any other code path that might clear pending password when status='failed'
rg -n --type=go -A 5 'status.*failed.*pending|pending.*status.*failed' server/datastore/mysql/apple_mdm.go | head -40

Repository: fleetdm/fleet

Length of output: 362


🏁 Script executed:

# Check all uses of InitiateRecoveryLockRotation to see if there's any retry/reset mechanism
rg -n --type=go -B 5 'InitiateRecoveryLockRotation' server/service/ ee/server/service/ | grep -A 5 -B 5 'FailRecoveryLockRotation\|ResetRecoveryLock'

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Check ee/server/service/hosts.go where InitiateRecoveryLockRotation is called
sed -n '595,670p' ee/server/service/hosts.go

Repository: fleetdm/fleet

Length of output: 2532


Failed-rotation retry is unreachable.

After FailRecoveryLockRotation sets status='failed', the pending password is preserved, but the rotation cannot be retried:

  1. The API check at ee/server/service/hosts.go line 620 blocks re-initiation when HasPendingRotation=true
  2. InitiateRecoveryLockRotation (line 7643) blocks when pending_encrypted_password IS NOT NULL
  3. ClearRecoveryLockRotation (line 7752) only applies when status='pending', not status='failed'
  4. Unlike clear operations, there is no ResetRecoveryLockForRetry call for rotation failures in the service layer (see line 7493 for clear-only usage)

Without a reset mechanism for failed rotations, users cannot retry after transient device failures.
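
The dead end described above can be reproduced with a small in-memory model. This is only a sketch: `rotationRow`, `initiate`, `fail`, and `resetFailedForRetry` are hypothetical names, not Fleet's datastore API, and the model covers only the status and pending-password conditions of the quoted WHERE clause (it ignores `encrypted_password`, `operation_type`, and `deleted`).

```go
package main

import (
	"errors"
	"fmt"
)

// rotationRow is a hypothetical stand-in for one host_recovery_key_passwords
// row; only the columns relevant to the finding are modeled.
type rotationRow struct {
	Status          string  // "verified", "pending", or "failed"
	PendingPassword *string // models pending_encrypted_password
	PendingError    *string // models pending_error_message
}

var errNotEligible = errors.New("row not eligible for rotation")

// initiate mirrors two conditions of the quoted WHERE clause: status must be
// verified or failed, and no pending password may already be stored.
func initiate(r *rotationRow, newPassword string) error {
	if (r.Status != "verified" && r.Status != "failed") || r.PendingPassword != nil {
		return errNotEligible
	}
	r.PendingPassword = &newPassword
	r.PendingError = nil
	r.Status = "pending"
	return nil
}

// fail mirrors the behavior described in the finding: status becomes failed
// while the pending password is preserved.
func fail(r *rotationRow, msg string) {
	r.Status = "failed"
	r.PendingError = &msg
}

// resetFailedForRetry is the hypothetical reset step: it discards the stale
// pending password and error so that initiate can match the row again.
func resetFailedForRetry(r *rotationRow) bool {
	if r.Status != "failed" || r.PendingPassword == nil {
		return false
	}
	r.PendingPassword = nil
	r.PendingError = nil
	return true
}

func main() {
	row := &rotationRow{Status: "verified"}
	_ = initiate(row, "first")
	fail(row, "device unreachable")

	fmt.Println("retry without reset:", initiate(row, "second"))
	resetFailedForRetry(row)
	fmt.Println("retry after reset:", initiate(row, "second"))
}
```

In this model the second `initiate` call is rejected until the reset step clears the leftover pending password, which is the gap the finding points at.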

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/apple_mdm.go` around lines 7621 - 7644, The retry path
is unreachable because FailRecoveryLockRotation leaves
pending_encrypted_password set and InitiateRecoveryLockRotation rejects rows
with pending_encrypted_password IS NOT NULL while ClearRecoveryLockRotation only
handles status='pending'; add a ResetRecoveryLockForRetry Datastore method that
clears pending_encrypted_password and pending_error_message and resets status
(or sets status to a retryable value) for rows where status='failed', then call
this ResetRecoveryLockForRetry from the service (hosts.go) before calling
InitiateRecoveryLockRotation when HasPendingRotation / status='failed' is
detected; alternatively, let InitiateRecoveryLockRotation overwrite a leftover
pending password when status='failed' by relaxing the pending_encrypted_password
IS NULL condition in its WHERE clause for that state. Prefer the reset method,
and update the service logic accordingly (references:
InitiateRecoveryLockRotation, FailRecoveryLockRotation,
ClearRecoveryLockRotation and the hosts.go service call sites).

Comment on lines +7359 to +7398
	// Check if this is a rotation (has pending password)
	hasPendingRotation, err := ds.HasPendingRecoveryLockRotation(ctx, hostUUID)
	if err != nil {
		return ctxerr.Wrap(ctx, err, "SetRecoveryLock handler: check pending rotation")
	}

	if hasPendingRotation {
		// This is a rotation result
		logger.DebugContext(ctx, "SetRecoveryLock rotation result received",
			"host_uuid", hostUUID,
			"command_uuid", results.UUID(),
			"status", status,
		)

		switch status {
		case fleet.MDMAppleStatusAcknowledged:
			// Rotation succeeded - move pending password to active
			if err := ds.CompleteRecoveryLockRotation(ctx, hostUUID); err != nil {
				return ctxerr.Wrap(ctx, err, "SetRecoveryLock handler: complete rotation")
			}
			logger.InfoContext(ctx, "RotateRecoveryLock acknowledged, password rotated",
				"host_uuid", hostUUID,
			)

		case fleet.MDMAppleStatusError, fleet.MDMAppleStatusCommandFormatError:
			errorMsg := apple_mdm.FmtErrorChain(rlResult.cmdResult.ErrorChain)
			if errorMsg == "" {
				errorMsg = "RotateRecoveryLock command failed"
			}
			if err := ds.FailRecoveryLockRotation(ctx, hostUUID, errorMsg); err != nil {
				return ctxerr.Wrap(ctx, err, "SetRecoveryLock handler: fail rotation")
			}
			logger.WarnContext(ctx, "RotateRecoveryLock command failed",
				"host_uuid", hostUUID,
				"error", errorMsg,
			)
		}

		return nil
	}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify how rotation attempts are recorded and whether command UUID is persisted/checked.
rg -n -C4 'InitiateRecoveryLockRotation|HasPendingRecoveryLockRotation|CompleteRecoveryLockRotation|FailRecoveryLockRotation|GetRecoveryLockRotationStatus' server

# Inspect SetRecoveryLock enqueue paths and whether a single in-flight rotation is strictly enforced.
rg -n -C4 'RotateRecoveryLock|SetRecoveryLock|RecoveryLock.*CommandUUID|pending_recovery_lock|recovery_lock_rotation' server

# Inspect datastore implementation details for rotation state shape (including command UUID fields).
fd 'apple_mdm.go' server/datastore | xargs -I{} rg -n -C4 'RecoveryLockRotation|pending|command_uuid|host_uuid' {}

Repository: fleetdm/fleet

Length of output: 50370


🏁 Script executed:

# Find HostRecoveryLockRotationStatus definition
rg -n 'type HostRecoveryLockRotationStatus' server --type go

# Check the struct fields
rg -n -A10 'type HostRecoveryLockRotationStatus struct' server

Repository: fleetdm/fleet

Length of output: 807


🏁 Script executed:

# Inspect InitiateRecoveryLockRotation implementation (around line 7621)
sed -n '7621,7686p' server/datastore/mysql/apple_mdm.go

Repository: fleetdm/fleet

Length of output: 2514


🏁 Script executed:

# Check if command UUID is stored anywhere in recovery lock rotation context
rg -n 'rotation.*command|command.*uuid.*rotation|pending_.*command' server/datastore/mysql/apple_mdm.go

Repository: fleetdm/fleet

Length of output: 122


🏁 Script executed:

# Find where rotation is initiated from - look for EnqueueCommand calls with RotateRecoveryLock
rg -n -B5 -A5 'RotateRecoveryLock.*EnqueueCommand|EnqueueCommand.*RotateRecoveryLock' server

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Look at the SetRecoveryLock result handler around line 7359
sed -n '7330,7400p' server/service/apple_mdm.go

Repository: fleetdm/fleet

Length of output: 3000


🏁 Script executed:

# Check if the result has a command UUID and how it's used
rg -n 'results\.UUID\(\)|CommandUUID|command.*uuid.*result' server/service/apple_mdm.go | head -30

Repository: fleetdm/fleet

Length of output: 2853


🏁 Script executed:

# Check how SetRecoveryLock command is enqueued - see if command UUID is returned/stored
rg -n -B3 -A3 'EnqueueCommand.*SetRecoveryLock|SetRecoveryLock.*EnqueueCommand' server/service

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Find where RotateRecoveryLock command is enqueued
rg -n 'RotateRecoveryLock' server/service --type go | grep -i enqueue | head -20

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Look for the command enqueue request path and any validation
rg -n -B5 -A10 'RotateRecoveryLockCommand|rotate.*recovery.*lock.*command' server/service --type go | head -80

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Check if there's API-level validation preventing new rotation while pending
rg -n 'InitiateRecoveryLockRotation' server/service --type go -B5 -A5

Repository: fleetdm/fleet

Length of output: 39


🏁 Script executed:

# Search for recovery lock rotation API endpoints or handlers
rg -n 'rotation|RotateRecoveryLock' server/service/apple_mdm.go | head -40

Repository: fleetdm/fleet

Length of output: 805


🏁 Script executed:

# Find where SetRecoveryLock command is enqueued
rg -n -B10 -A5 'SetRecoveryLockCmdName|SetRecoveryLock.*command' server/service/apple_mdm.go | head -100

Repository: fleetdm/fleet

Length of output: 3546


🏁 Script executed:

# Check the full context of command enqueue for SetRecoveryLock
rg -n 'EnqueueCommand' server/service/apple_mdm.go | head -20

Repository: fleetdm/fleet

Length of output: 191


Store and correlate command UUID before completing or failing rotation.

The handler processes SetRecoveryLock results containing a command UUID (via results.UUID()) but never persists or compares it against the pending rotation state. This creates a race condition where stale or out-of-order results can mutate the wrong rotation:

  1. Scenario: Stale ACK completes new rotation

    • Rotation A enqueued → A result arrives with ERROR → FailRecoveryLockRotation called
    • Rotation B enqueued (new pending password) → Old stale A ACK arrives
    • Handler calls CompleteRecoveryLockRotation for B with A's result

  2. Scenario: Stale ERROR fails new rotation

    • Rotation A enqueued → A result arrives with ACK → CompleteRecoveryLockRotation called
    • Rotation B enqueued (new pending password) → Old stale A ERROR arrives
    • Handler calls FailRecoveryLockRotation for B with A's error

While InitiateRecoveryLockRotation enforces single-flight (no concurrent rotations), this only prevents simultaneous attempts, not out-of-order delivery across rotation attempts.

Fix: Store command_uuid when initiating rotation, then compare results.UUID() before calling CompleteRecoveryLockRotation or FailRecoveryLockRotation.
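
The correlation check could look roughly like the sketch below. Everything here is hypothetical: `pendingRotation`, its `CommandUUID` field, and `decide` are illustrative names, the schema shown in this review has no such column, and the status strings are assumed to match the values behind the `fleet.MDMAppleStatus*` constants.

```go
package main

import "fmt"

// pendingRotation is a hypothetical record of the in-flight rotation. The
// CommandUUID field is the addition suggested above, not an existing column.
type pendingRotation struct {
	CommandUUID string
}

type action string

const (
	actionIgnore   action = "ignore"
	actionComplete action = "complete"
	actionFail     action = "fail"
)

// decide returns what the result handler should do with an incoming result.
// A result whose UUID does not match the pending rotation is ignored, so a
// stale result from an earlier attempt cannot complete or fail a newer one.
func decide(pending *pendingRotation, resultUUID, status string) action {
	if pending == nil || pending.CommandUUID != resultUUID {
		return actionIgnore
	}
	switch status {
	case "Acknowledged": // assumed value of the acknowledged status
		return actionComplete
	case "Error", "CommandFormatError": // assumed values of the error statuses
		return actionFail
	default:
		return actionIgnore
	}
}

func main() {
	pending := &pendingRotation{CommandUUID: "cmd-B"}
	fmt.Println(decide(pending, "cmd-A", "Acknowledged")) // stale result from rotation A
	fmt.Println(decide(pending, "cmd-B", "Acknowledged")) // result for the current rotation
	fmt.Println(decide(pending, "cmd-B", "Error"))
}
```

Keeping the decision in a pure function like this makes the stale-result cases easy to unit test separately from the datastore calls.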

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/service/apple_mdm.go` around lines 7359 - 7398, The handler currently
acts on any pending rotation without verifying the originating command; persist
the rotation's command UUID when creating a rotation (e.g., in
InitiateRecoveryLockRotation) and, here in the SetRecoveryLock result path,
fetch/compare that stored UUID against results.UUID() before calling
CompleteRecoveryLockRotation or FailRecoveryLockRotation; if they differ,
log/ignore the out-of-order result and return nil, otherwise proceed to
complete/fail as now. Ensure the datastore API used for pending rotation
(HasPendingRecoveryLockRotation / InitiateRecoveryLockRotation) is extended to
store or return the pending command_uuid so you can compare it to results.UUID()
prior to mutating rotation state.

Comment on lines +3804 to +3810
func (svc *Service) RotateRecoveryLockPassword(ctx context.Context, hostID uint) error {
// skipauth: No authorization check needed due to implementation returning
// only license error.
svc.authz.SkipAuthorization(ctx)

return fleet.ErrMissingLicense
}

⚠️ Potential issue | 🔴 Critical

Rotation is currently hard-disabled and authorization is bypassed.

At lines 3807 and 3809, the method skips authorization and always returns fleet.ErrMissingLicense. This makes the endpoint unusable even for licensed users and bypasses normal access control for a sensitive action. Enforce host-scoped write authorization, perform a conditional license check, and invoke the real rotation flow instead of returning unconditionally.
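
The ordering this comment asks for (authorize, then check the license, then rotate) might be sketched as follows. The `deps` struct and all of its fields are hypothetical and are not Fleet's service or authz API. The sketch also assumes the method is not a free-tier stub that a licensed implementation overrides elsewhere, which should be verified before applying the finding.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errForbidden      = errors.New("forbidden")
	errMissingLicense = errors.New("missing license")
)

// deps groups hypothetical collaborators; none of these names come from the
// Fleet codebase.
type deps struct {
	canWriteHost func(userID, hostID uint) bool
	isPremium    bool
	rotate       func(hostID uint) error
}

// rotateRecoveryLockPassword runs the checks in the order suggested above:
// authorization first, then the license check, then the rotation itself.
func rotateRecoveryLockPassword(d deps, userID, hostID uint) error {
	if !d.canWriteHost(userID, hostID) {
		return errForbidden
	}
	if !d.isPremium {
		return errMissingLicense
	}
	return d.rotate(hostID)
}

func main() {
	d := deps{
		canWriteHost: func(userID, hostID uint) bool { return userID == 1 },
		isPremium:    true,
		rotate:       func(hostID uint) error { return nil },
	}
	fmt.Println(rotateRecoveryLockPassword(d, 2, 10)) // unauthorized user
	fmt.Println(rotateRecoveryLockPassword(d, 1, 10)) // authorized and licensed
}
```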

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/service/hosts.go` around lines 3804 - 3810, Replace the unconditional
auth bypass and immediate fleet.ErrMissingLicense return in
RotateRecoveryLockPassword: remove svc.authz.SkipAuthorization(ctx), perform a
host-scoped write authorization check via the service authz (e.g. call the authz
Authorize method for the Host resource identified by hostID and the write
action), then perform a conditional license check and only return
fleet.ErrMissingLicense when the server/license state indicates rotation is not
allowed, and finally invoke the real rotation flow (call the existing rotation
implementation or extract the rotation logic into a helper and call it) to
perform the password rotation and return its error/result instead of returning
immediately.

@mostlikelee
Contributor Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Mar 17, 2026

✅ Actions performed

Full review triggered.

Contributor

@coderabbitai bot left a comment


♻️ Duplicate comments (1)
server/datastore/mysql/apple_mdm.go (1)

7350-7354: ⚠️ Potential issue | 🟠 Major

Use pending_error_message fallback when populating recovery-lock detail.

At Line 7353, detail only reads error_message, but rotation failures are recorded in pending_error_message (Line 7720). This can return status=failed with blank detail.

Proposed fix
 		SELECT
 			status,
-			COALESCE(error_message, '') AS detail,
+			COALESCE(NULLIF(error_message, ''), pending_error_message, '') AS detail,
 			encrypted_password IS NOT NULL AS password_available
 		FROM host_recovery_key_passwords
 		WHERE host_uuid = ? AND deleted = 0`
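
To see why the proposed projection wraps `error_message` in `NULLIF`, the two SQL functions can be mimicked in Go with nullable strings. This is a sketch of the SQL semantics only: `coalesce` and `nullIfEmpty` are hypothetical helpers, and the premise that `error_message` may hold an empty string rather than NULL is an assumption.

```go
package main

import "fmt"

// coalesce returns the first non-nil value, or "" when all are nil,
// mirroring SQL COALESCE(a, b, '').
func coalesce(vals ...*string) string {
	for _, v := range vals {
		if v != nil {
			return *v
		}
	}
	return ""
}

// nullIfEmpty mirrors SQL NULLIF(v, ''): an empty string becomes NULL (nil).
func nullIfEmpty(v *string) *string {
	if v != nil && *v == "" {
		return nil
	}
	return v
}

func main() {
	empty, pending := "", "RotateRecoveryLock command failed"

	// Plain COALESCE stops at the empty (but non-NULL) error_message.
	fmt.Printf("%q\n", coalesce(&empty, &pending))

	// With NULLIF, the empty value is skipped and the pending message is used.
	fmt.Printf("%q\n", coalesce(nullIfEmpty(&empty), &pending))
}
```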
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/datastore/mysql/apple_mdm.go` around lines 7350 - 7354, The SELECT
currently sets detail to only COALESCE(error_message, '') so rotation failures
recorded in pending_error_message are dropped; update the query in the const
stmt to use COALESCE(NULLIF(error_message, ''), pending_error_message, '') AS
detail so recovery-lock rows with status='failed' will include the
pending_error_message, then run tests that exercise the rotation failure path
(the code using stmt in apple_mdm.go) to confirm detail is populated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@server/datastore/mysql/apple_mdm.go`:
- Around line 7350-7354: The SELECT currently sets detail to only
COALESCE(error_message, '') so rotation failures recorded in
pending_error_message are dropped; update the query in the const stmt to use
COALESCE(NULLIF(error_message, ''), pending_error_message, '') AS detail so
recovery-lock rows with status='failed' will include the pending_error_message,
then run tests that exercise the rotation failure path (the code using stmt in
apple_mdm.go) to confirm detail is populated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f2a390cf-3759-43f1-a385-2306234c8007

📥 Commits

Reviewing files that changed from the base of the PR and between 3756a8e and 7980121.

📒 Files selected for processing (8)
  • frontend/interfaces/host.ts
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/HostActionsDropdown.tests.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/HostActionsDropdown.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostActionsDropdown/helpers.tsx
  • frontend/pages/hosts/details/HostDetailsPage/HostDetailsPage.tsx
  • server/datastore/mysql/apple_mdm.go
  • server/datastore/mysql/apple_mdm_test.go
  • server/fleet/hosts.go

@mostlikelee deleted the 41779-expedited branch on March 18, 2026, 12:01


Development

Successfully merging this pull request may close these issues.

Set Password (new): show password if available

2 participants