Description
customer-mozartia: Slack thread (contains logs).
Fleet version: v4.78.1
Web browser and operating system: N/A
💥 Actual behavior
Windows hosts enter an infinite BitLocker encryption/decryption loop after Fleet enrollment and disk encryption enforcement.
- Fleet UI shows Disk encryption: Off, OS settings: Pending.
- BitLocker starts encrypting the OS drive (C:), progresses to Verifying (or partially encrypts, e.g. 86.9%), eventually reverts/fails, loops back to Pending, and repeats indefinitely.
- Uninstalling Fleet or moving the host back to a team without encryption enforced stops the loop.
- Manual BitLocker encryption works without issues/looping.
- Reinstalling Windows via Recovery option (no full disk wipe/format) does not fix the issue.
- Partition layouts show non-standard elements (multiple/extra recovery partitions, occasional ReFS data volumes alongside NTFS OS; see More info below).
🛠️ To fix
Timebox TBD at estimation
🧑‍💻 Steps to reproduce
These steps:
- Have been confirmed to consistently lead to reproduction in multiple Fleet instances.
- Set up a Windows device with a non-standard partition map (see More info below).
- Enroll that host into Fleet.
- Assign the host to a team with disk encryption enforcement enabled.
🕯️ More info
Partition data examples:
- One case: C: NTFS + large D: ReFS + multiple NTFS/FAT32/recovery partitions.
- Another: All NTFS but 3+ recovery partitions + non-standard map.
Might be related to #37454.
Root cause
Three concurrent orbit subsystems share a single COM thread managed by the comshim library: BitLocker, MDM Bridge, and Windows Update. The comshim library maintains a global reference count: Add(1) increments it (initializing COM if the count was 0), and Done() decrements it (tearing down COM if it reaches 0).
BitLocker's GetEncryptionStatus() enumerates all logical volumes and queries each one's BitLocker status via WMI. On a VM with 5 drives, this means 5 sequential cycles of:
`comshim.Add(1)` → `bitlockerConnect` → WMI query → `bitlockerClose` → `comshim.Done()`
Each Done() can drop the ref count to 0, triggering COM teardown. The next Add(1) re-initializes COM. This rapid oscillation through zero, with MDM Bridge and Windows Update also calling Add/Done on their own schedules, creates a race condition. If a Done() triggers teardown while another goroutine's Add(1) is trying to re-initialize, they deadlock on comshim's internal mutex and COM's initialization locks.
This is not a transient timing collision (BitLocker only holds COM for 1-2 seconds). It's a structural deadlock caused by the teardown/re-init lifecycle race, which permanently blocks all participating threads.
Fix
Create a dedicated COM worker goroutine for BitLocker that bypasses comshim entirely.
The worker:
- Locks itself to an OS thread via `runtime.LockOSThread()`
- Calls `ole.CoInitializeEx(0, ole.COINIT_MULTITHREADED)` once at startup
- Processes BitLocker operations (encrypt, decrypt, status check) sequentially via a channel
- Keeps COM initialized for the lifetime of orbit: no ref count oscillation, no teardown races
MDM Bridge and Windows Update continue using comshim. Without BitLocker in the mix, they have minimal conflict risk.
Bonus fix
Also fixed a pre-existing nil pointer panic in `orbit.go` when running with `--disable-updates`: `updateRunner.OsqueryVersion` was accessed, but `updateRunner` is nil when updates are disabled. Fixed by introducing an `osqueryVersion` variable set in both code paths.