Conversation
…sponsive during BitLocker encryption
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical COM deadlock issue on Windows that could cause orbit to become unresponsive during BitLocker encryption enforcement. The root cause was that BitLocker, MDM Bridge, and Windows Update were all sharing a single COM thread via the comshim library, and rapid ref count oscillations during BitLocker's multi-volume enumeration created a race condition that led to permanent deadlocks.
Changes:
- Introduced a dedicated
COMWorkerfor BitLocker that initializes COM once on a locked OS thread, bypassing the sharedcomshimsingleton - Refactored BitLocker public API functions to be package-private
*OnCOMThreadvariants exposed throughCOMWorkermethods - Fixed a nil pointer panic when running orbit with
--disable-updatesby introducing anosqueryVersionvariable
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
orbit/pkg/bitlocker/bitlocker_worker_windows.go |
New COMWorker implementation that manages a dedicated OS thread with persistent COM initialization for all BitLocker operations |
orbit/pkg/bitlocker/bitlocker_worker_notwindows.go |
No-op COMWorker stub for non-Windows platforms |
orbit/pkg/bitlocker/bitlocker_management_windows.go |
Removed comshim usage, renamed public functions to *OnCOMThread variants for internal use via COMWorker |
orbit/pkg/bitlocker/bitlocker_management_notwindows.go |
Removed obsolete public function stubs (now handled by COMWorker stubs) |
orbit/pkg/update/notifications.go |
Updated middleware to accept COMWorker parameter and set function pointers from it; removed nil checks since functions are always set when middleware is registered |
orbit/cmd/orbit/orbit.go |
Created COMWorker on Windows with proper error handling and cleanup; fixed osqueryVersion nil pointer by introducing local variable set in both enable/disable-updates paths |
orbit/changes/38405-bitlocker-encryption |
Added user-facing changelog entry |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
WalkthroughThis pull request introduces a dedicated COM worker for BitLocker operations on Windows to resolve a COM deadlock. The changes create a new COMWorker type that runs BitLocker encryption/decryption/status operations on an isolated OS thread with its own COM initialization, bypassing the shared comshim singleton. Corresponding stub functions are removed from non-Windows builds. The BitLocker management functions are made private and renamed to indicate COM-thread usage. Additionally, a nil pointer panic in orbit.go is fixed by introducing separate osqueryVersion variable tracking. The Windows MDM BitLocker middleware is updated to receive the COM worker via dependency injection. 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
orbit/pkg/update/notifications.go (1)
562-575: Removal of nil guard is safe given the new wiring, but the function will now panic if the middleware is ever constructed without a COMWorker.Previously,
execGetEncryptionStatusFn(and the encrypt/decrypt counterparts) had nil fallbacks. Now they unconditionally invoke the function pointer. This is fine for production since the call site inorbit.goensures a validcomWorker, but test code that constructswindowsMDMBitlockerConfigReceiverdirectly must always set these fields.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@orbit/pkg/update/notifications.go` around lines 562 - 575, The method getEncryptionStatusForVolume on windowsMDMBitlockerConfigReceiver now directly calls execGetEncryptionStatusFn and will panic if that function field is nil; add a nil guard to check execGetEncryptionStatusFn (and similarly for the encrypt/decrypt function fields mentioned in the review) before invoking and return a clear error (e.g., "execGetEncryptionStatusFn not set") so tests or any manually constructed windowsMDMBitlockerConfigReceiver don't cause a nil-pointer panic, or alternatively ensure the constructor/initializer for windowsMDMBitlockerConfigReceiver always assigns safe default no-op implementations to these function fields.orbit/pkg/bitlocker/bitlocker_worker_windows.go (1)
65-77:exec()will panic on send-to-closed-channel if called afterClose().If any public method (
GetEncryptionStatus,EncryptVolume,DecryptVolume) is invoked afterClose()has been called, the send onw.workChat line 75 will panic. In the current usage pattern (orbit shutdown viadefer), this is unlikely, but defensively you could recover from the panic or check thedonechannel before sending.🛡️ Defensive option: guard exec against closed worker
func (w *COMWorker) exec(fn func() (any, error)) comWorkResult { + select { + case <-w.done: + return comWorkResult{err: errors.New("COMWorker is closed")} + default: + } ch := make(chan comWorkResult, 1) - w.workCh <- comWorkItem{fn: fn, result: ch} - return <-ch + select { + case w.workCh <- comWorkItem{fn: fn, result: ch}: + return <-ch + case <-w.done: + return comWorkResult{err: errors.New("COMWorker is closed")} + } }Note: this also requires adding
"errors"to the import block.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@orbit/pkg/bitlocker/bitlocker_worker_windows.go` around lines 65 - 77, The exec method can panic if w.workCh is closed by Close(); update COMWorker.exec to avoid sending on a closed channel by selecting on w.done (or a closed indicator) before attempting to send: use a select that returns an appropriate error (e.g., via the errors package) if w.done is closed/canceled, otherwise send the comWorkItem and wait for the result; ensure Close still closes workCh and signals w.done so exec returns safely instead of panicking. Reference: COMWorker.exec, COMWorker.Close, w.workCh and w.done (and add "errors" to imports).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@orbit/pkg/bitlocker/bitlocker_worker_windows.go`:
- Around line 65-77: The exec method can panic if w.workCh is closed by Close();
update COMWorker.exec to avoid sending on a closed channel by selecting on
w.done (or a closed indicator) before attempting to send: use a select that
returns an appropriate error (e.g., via the errors package) if w.done is
closed/canceled, otherwise send the comWorkItem and wait for the result; ensure
Close still closes workCh and signals w.done so exec returns safely instead of
panicking. Reference: COMWorker.exec, COMWorker.Close, w.workCh and w.done (and
add "errors" to imports).
In `@orbit/pkg/update/notifications.go`:
- Around line 562-575: The method getEncryptionStatusForVolume on
windowsMDMBitlockerConfigReceiver now directly calls execGetEncryptionStatusFn
and will panic if that function field is nil; add a nil guard to check
execGetEncryptionStatusFn (and similarly for the encrypt/decrypt function fields
mentioned in the review) before invoking and return a clear error (e.g.,
"execGetEncryptionStatusFn not set") so tests or any manually constructed
windowsMDMBitlockerConfigReceiver don't cause a nil-pointer panic, or
alternatively ensure the constructor/initializer for
windowsMDMBitlockerConfigReceiver always assigns safe default no-op
implementations to these function fields.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #40142 +/- ##
========================================
Coverage 66.35% 66.35%
========================================
Files 2453 2449 -4
Lines 196515 196423 -92
Branches 8643 8520 -123
========================================
- Hits 130388 130334 -54
+ Misses 54317 54282 -35
+ Partials 11810 11807 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| func GetRecoveryKeys(targetVolume string) (map[string]string, error) { | ||
| // Connect to the volume | ||
| vol, err := bitlockerConnect(targetVolume) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("connecting to the volume: %w", err) | ||
| } | ||
| defer vol.bitlockerClose() | ||
|
|
||
| // Get recovery keys | ||
| keys, err := vol.getProtectorsKeys() | ||
| if err != nil { | ||
| return nil, fmt.Errorf("retreving protection keys: %w", err) | ||
| } | ||
|
|
||
| return keys, nil | ||
| } |
There was a problem hiding this comment.
Dead code that was never used. EncryptVolume returns the recovery key as part of the encryption process.
|
|
||
| func (w *COMWorker) exec(fn func() (any, error)) comWorkResult { | ||
| ch := make(chan comWorkResult, 1) | ||
| w.workCh <- comWorkItem{fn: fn, result: ch} |
There was a problem hiding this comment.
Do we care about something trying to use this channel after calling Close()? Would that cause a panic?
There was a problem hiding this comment.
Good catch. Small chance, but theoretically possible. I made a fix.
<!-- Add the related story/sub-task/bug number, like Resolves #123, or remove if NA --> **Related issue:** Resolves #40200 QA done as part of #40142 PR # Checklist for submitter - [x] Changes file added for user-visible changes in `changes/`, `orbit/changes/` or `ee/fleetd-chrome/changes`. ## Testing - [x] QA'd all new/changed functionality manually ## fleetd/orbit/Fleet Desktop - [x] Verified compatibility with the latest released version of Fleet (see [Must rule](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/workflows/fleetd-development-and-release-strategy.md)) - [x] If the change applies to only one platform, confirmed that `runtime.GOOS` is used as needed to isolate changes - [x] Verified that fleetd runs on macOS, Linux and Windows - [x] Verified auto-update works from the released version of component to the new version (see [tools/tuf/test](../tools/tuf/test/README.md))
<!-- Add the related story/sub-task/bug number, like Resolves #123, or remove if NA --> **Related issue:** Resolves #40200 QA done as part of #40142 PR # Checklist for submitter - [x] Changes file added for user-visible changes in `changes/`, `orbit/changes/` or `ee/fleetd-chrome/changes`. ## Testing - [x] QA'd all new/changed functionality manually ## fleetd/orbit/Fleet Desktop - [x] Verified compatibility with the latest released version of Fleet (see [Must rule](https://github.com/fleetdm/fleet/blob/main/docs/Contributing/workflows/fleetd-development-and-release-strategy.md)) - [x] If the change applies to only one platform, confirmed that `runtime.GOOS` is used as needed to isolate changes - [x] Verified that fleetd runs on macOS, Linux and Windows - [x] Verified auto-update works from the released version of component to the new version (see [tools/tuf/test](../tools/tuf/test/README.md))
Related issue: Resolves #38405
See issue for the root cause and fix description.
Checklist for submitter
changes/,orbit/changes/oree/fleetd-chrome/changes.See Changes files for more information.
Testing
fleetd/orbit/Fleet Desktop
runtime.GOOSis used as needed to isolate changesSummary by CodeRabbit
Release Notes
Bug Fixes
Refactor