Fixed a COM deadlock on Windows that could cause orbit to become unresponsive during BitLocker encryption #40142

Merged
getvictor merged 4 commits into main from victor/38405-bitlocker-encryption
Feb 20, 2026
Conversation

@getvictor
Member

@getvictor getvictor commented Feb 20, 2026

Related issue: Resolves #38405

See issue for the root cause and fix description.

Checklist for submitter

  • Changes file added for user-visible changes in changes/, orbit/changes/ or ee/fleetd-chrome/changes.
    See Changes files for more information.

Testing

  • QA'd all new/changed functionality manually

fleetd/orbit/Fleet Desktop

  • Verified compatibility with the latest released version of Fleet (see Must rule)
  • If the change applies to only one platform, confirmed that runtime.GOOS is used as needed to isolate changes
  • Verified that fleetd runs on macOS, Linux and Windows
  • Verified auto-update works from the released version of component to the new version (see tools/tuf/test)

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Resolved BitLocker encryption deadlock on Windows systems during enforcement operations
  • Refactor

    • Improved osquery version tracking consistency across enrollment flows
    • Enhanced BitLocker operations handling on Windows with optimized threading

@getvictor
Member Author

@coderabbitai full review

@coderabbitai
Contributor

coderabbitai bot commented Feb 20, 2026

✅ Actions performed

Full review triggered.

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a critical COM deadlock issue on Windows that could cause orbit to become unresponsive during BitLocker encryption enforcement. The root cause was that BitLocker, MDM Bridge, and Windows Update were all sharing a single COM thread via the comshim library, and rapid ref count oscillations during BitLocker's multi-volume enumeration created a race condition that led to permanent deadlocks.

Changes:

  • Introduced a dedicated COMWorker for BitLocker that initializes COM once on a locked OS thread, bypassing the shared comshim singleton
  • Refactored BitLocker public API functions to be package-private *OnCOMThread variants exposed through COMWorker methods
  • Fixed a nil pointer panic when running orbit with --disable-updates by introducing an osqueryVersion variable

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| orbit/pkg/bitlocker/bitlocker_worker_windows.go | New COMWorker implementation that manages a dedicated OS thread with persistent COM initialization for all BitLocker operations |
| orbit/pkg/bitlocker/bitlocker_worker_notwindows.go | No-op COMWorker stub for non-Windows platforms |
| orbit/pkg/bitlocker/bitlocker_management_windows.go | Removed comshim usage, renamed public functions to *OnCOMThread variants for internal use via COMWorker |
| orbit/pkg/bitlocker/bitlocker_management_notwindows.go | Removed obsolete public function stubs (now handled by COMWorker stubs) |
| orbit/pkg/update/notifications.go | Updated middleware to accept COMWorker parameter and set function pointers from it; removed nil checks since functions are always set when middleware is registered |
| orbit/cmd/orbit/orbit.go | Created COMWorker on Windows with proper error handling and cleanup; fixed osqueryVersion nil pointer by introducing local variable set in both enable/disable-updates paths |
| orbit/changes/38405-bitlocker-encryption | Added user-facing changelog entry |

@coderabbitai
Contributor

coderabbitai bot commented Feb 20, 2026

Walkthrough

This pull request introduces a dedicated COM worker for BitLocker operations on Windows to resolve a COM deadlock. The changes create a new COMWorker type that runs BitLocker encryption/decryption/status operations on an isolated OS thread with its own COM initialization, bypassing the shared comshim singleton. Corresponding stub functions are removed from non-Windows builds. The BitLocker management functions are made private and renamed to indicate COM-thread usage. Additionally, a nil pointer panic in orbit.go is fixed by introducing separate osqueryVersion variable tracking. The Windows MDM BitLocker middleware is updated to receive the COM worker via dependency injection.
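The dependency injection mentioned at the end of the walkthrough might look roughly like the following. It is a sketch under assumptions: the interface `bitlockerOps`, the type `configReceiver`, the constructor, and all method signatures are hypothetical simplifications, and only the `exec...Fn` field naming and the three operation names echo identifiers quoted elsewhere in this PR.

```go
package main

import "fmt"

// bitlockerOps is a hypothetical interface covering the three operations the
// worker exposes. The signatures are simplified assumptions.
type bitlockerOps interface {
	GetEncryptionStatus() (string, error)
	EncryptVolume(volume string) (recoveryKey string, err error)
	DecryptVolume(volume string) error
}

// configReceiver is a hypothetical stand-in for the middleware. It keeps
// function fields so tests can replace individual operations.
type configReceiver struct {
	execGetEncryptionStatusFn func() (string, error)
	execEncryptVolumeFn       func(string) (string, error)
	execDecryptVolumeFn       func(string) error
}

// newConfigReceiver wires every function field from the injected worker, so
// the receiver never reaches for a package-level singleton.
func newConfigReceiver(ops bitlockerOps) *configReceiver {
	return &configReceiver{
		execGetEncryptionStatusFn: ops.GetEncryptionStatus,
		execEncryptVolumeFn:       ops.EncryptVolume,
		execDecryptVolumeFn:       ops.DecryptVolume,
	}
}

// fakeOps records calls and lets the example run on any platform.
type fakeOps struct{ calls []string }

func (f *fakeOps) GetEncryptionStatus() (string, error) {
	f.calls = append(f.calls, "status")
	return "decrypted", nil
}

func (f *fakeOps) EncryptVolume(volume string) (string, error) {
	f.calls = append(f.calls, "encrypt "+volume)
	return "fake-recovery-key", nil
}

func (f *fakeOps) DecryptVolume(volume string) error {
	f.calls = append(f.calls, "decrypt "+volume)
	return nil
}

func main() {
	ops := &fakeOps{}
	r := newConfigReceiver(ops)
	status, _ := r.execGetEncryptionStatusFn()
	key, _ := r.execEncryptVolumeFn("C:")
	fmt.Println(status, key, ops.calls)
}
```

Passing the worker in explicitly keeps its lifetime owned by the caller, which fits the description of orbit.go creating the worker and cleaning it up.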

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: fixing a COM deadlock during BitLocker encryption on Windows. |
| Linked Issues check | ✅ Passed | The code changes fully address the linked issue #38405: creates a dedicated COM worker goroutine for BitLocker to avoid deadlock, removes comshim integration, and fixes the osqueryVersion nil pointer panic. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: BitLocker COM worker implementation, removal of comshim, osqueryVersion tracking fix, and related refactoring directly address the stated objectives. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | PR description includes related issue, changes file confirmation, manual QA, and platform compatibility verification, covering key submission requirements. |

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
orbit/pkg/update/notifications.go (1)

562-575: Removal of nil guard is safe given the new wiring, but the function will now panic if the middleware is ever constructed without a COMWorker.

Previously, execGetEncryptionStatusFn (and the encrypt/decrypt counterparts) had nil fallbacks. Now they unconditionally invoke the function pointer. This is fine for production since the call site in orbit.go ensures a valid comWorker, but test code that constructs windowsMDMBitlockerConfigReceiver directly must always set these fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@orbit/pkg/update/notifications.go` around lines 562 - 575, The method
getEncryptionStatusForVolume on windowsMDMBitlockerConfigReceiver now directly
calls execGetEncryptionStatusFn and will panic if that function field is nil;
add a nil guard to check execGetEncryptionStatusFn (and similarly for the
encrypt/decrypt function fields mentioned in the review) before invoking and
return a clear error (e.g., "execGetEncryptionStatusFn not set") so tests or any
manually constructed windowsMDMBitlockerConfigReceiver don't cause a nil-pointer
panic, or alternatively ensure the constructor/initializer for
windowsMDMBitlockerConfigReceiver always assigns safe default no-op
implementations to these function fields.
orbit/pkg/bitlocker/bitlocker_worker_windows.go (1)

65-77: exec() will panic on send-to-closed-channel if called after Close().

If any public method (GetEncryptionStatus, EncryptVolume, DecryptVolume) is invoked after Close() has been called, the send on w.workCh at line 75 will panic. In the current usage pattern (orbit shutdown via defer), this is unlikely, but defensively you could recover from the panic or check the done channel before sending.

🛡️ Defensive option: guard exec against closed worker
 func (w *COMWorker) exec(fn func() (any, error)) comWorkResult {
+	select {
+	case <-w.done:
+		return comWorkResult{err: errors.New("COMWorker is closed")}
+	default:
+	}
 	ch := make(chan comWorkResult, 1)
-	w.workCh <- comWorkItem{fn: fn, result: ch}
-	return <-ch
+	select {
+	case w.workCh <- comWorkItem{fn: fn, result: ch}:
+		return <-ch
+	case <-w.done:
+		return comWorkResult{err: errors.New("COMWorker is closed")}
+	}
 }

Note: this also requires adding "errors" to the import block.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@orbit/pkg/bitlocker/bitlocker_worker_windows.go` around lines 65 - 77, The
exec method can panic if w.workCh is closed by Close(); update COMWorker.exec to
avoid sending on a closed channel by selecting on w.done (or a closed indicator)
before attempting to send: use a select that returns an appropriate error (e.g.,
via the errors package) if w.done is closed/canceled, otherwise send the
comWorkItem and wait for the result; ensure Close still closes workCh and
signals w.done so exec returns safely instead of panicking. Reference:
COMWorker.exec, COMWorker.Close, w.workCh and w.done (and add "errors" to
imports).
@codecov

codecov bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 15.78947% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.35%. Comparing base (06c192f) to head (5ed281b).
⚠️ Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| orbit/cmd/orbit/orbit.go | 0.00% | 6 Missing ⚠️ |
| orbit/pkg/bitlocker/bitlocker_worker_notwindows.go | 0.00% | 5 Missing ⚠️ |
| orbit/pkg/update/notifications.go | 37.50% | 5 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff            @@
##             main   #40142    +/-   ##
========================================
  Coverage   66.35%   66.35%            
========================================
  Files        2453     2449     -4     
  Lines      196515   196423    -92     
  Branches     8643     8520   -123     
========================================
- Hits       130388   130334    -54     
+ Misses      54317    54282    -35     
+ Partials    11810    11807     -3     
Flag Coverage Δ
backend 68.11% <15.78%> (-0.01%) ⬇️

Comment on lines -414 to -429
func GetRecoveryKeys(targetVolume string) (map[string]string, error) {
	// Connect to the volume
	vol, err := bitlockerConnect(targetVolume)
	if err != nil {
		return nil, fmt.Errorf("connecting to the volume: %w", err)
	}
	defer vol.bitlockerClose()

	// Get recovery keys
	keys, err := vol.getProtectorsKeys()
	if err != nil {
		return nil, fmt.Errorf("retreving protection keys: %w", err)
	}

	return keys, nil
}
Member Author

@getvictor getvictor Feb 20, 2026

Dead code that was never used. EncryptVolume returns the recovery key as part of the encryption process.

@getvictor getvictor marked this pull request as ready for review February 20, 2026 14:24
@getvictor getvictor requested a review from a team as a code owner February 20, 2026 14:24
@lucasmrod lucasmrod modified the milestone: fleetd-v1.53.0 Feb 20, 2026

func (w *COMWorker) exec(fn func() (any, error)) comWorkResult {
	ch := make(chan comWorkResult, 1)
	w.workCh <- comWorkItem{fn: fn, result: ch}
Contributor

Do we care about something trying to use this channel after calling Close()? Would that cause a panic?

Member Author

Good catch. Small chance, but theoretically possible. I made a fix.

getvictor added a commit that referenced this pull request Feb 20, 2026
**Related issue:** Resolves #40200

QA done as part of #40142 PR

# Checklist for submitter

- [x] Changes file added for user-visible changes in `changes/`,
`orbit/changes/` or `ee/fleetd-chrome/changes`.

## Testing

- [x] QA'd all new/changed functionality manually

## fleetd/orbit/Fleet Desktop

- [x] Verified compatibility with the latest released version of Fleet
(see [Must
rule](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/workflows/fleetd-development-and-release-strategy.md))
- [x] If the change applies to only one platform, confirmed that
`runtime.GOOS` is used as needed to isolate changes
- [x] Verified that fleetd runs on macOS, Linux and Windows
- [x] Verified auto-update works from the released version of component
to the new version (see [tools/tuf/test](../tools/tuf/test/README.md))
@getvictor getvictor requested a review from ksykulev February 20, 2026 19:30
lucasmrod pushed a commit that referenced this pull request Feb 20, 2026
@getvictor getvictor merged commit a6065c9 into main Feb 20, 2026
50 of 51 checks passed
@getvictor getvictor deleted the victor/38405-bitlocker-encryption branch February 20, 2026 20:56
Development

Successfully merging this pull request may close these issues.

Windows BitLocker encyption looping

4 participants