Fix: IDE: restore shared writes to ATA Command Block registers #1541

Merged
copy merged 1 commit into copy:master from felixrieseberg:ide-shared-registers
Apr 15, 2026

Conversation

@felixrieseberg
Contributor

This took me a while to figure out, so bear with me! I promise there's a reason this is so long.

What

Restores the dual master+slave writes for ATA Command Block registers (Features, Sector Count, LBA Low/Mid/High — ports 0x1F1-0x1F5) that 1b90d2e changed to write only current_interface.

Why

The 1b90d2e commit message asks: "It is not clear why these 6 functions simultaneously modify master and slave attributes instead of just modifying this.current_interface (as is the case everywhere else in the code). This patch changes that to use this.current_interface, experimental until there is some explanation."

The explanation is the hardware: the Command Block registers are channel registers, not device registers. There's one physical register file on the IDE cable interface — both drives are wired to the same I/O port lines and latch the same value on a write. Only the Device register's DEV bit (port 0x1F6 bit 4) selects which drive responds to a subsequent COMMAND write (port 0x1F7). The pre-1b90d2e7 code was modeling this correctly.
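
The shared-latch behaviour described above can be sketched roughly as follows. This is a minimal model, not the emulator's actual code: the `Device` and `Channel` classes, their field names, and the 8-bit register width are all assumptions made for illustration.

```javascript
// Hypothetical model of one IDE channel; class and field names are illustrative.
class Device {
    constructor(name) {
        this.name = name;
        this.sector_count = 0;
        this.lba_low = 0;
    }
}

class Channel {
    constructor() {
        this.master = new Device("master");
        this.slave = new Device("slave");
        this.current = this.master;
    }

    // Port 0x1F2: both devices sit on the same lines, so both latch the value.
    write_sector_count(value) {
        this.master.sector_count = value & 0xFF;
        this.slave.sector_count = value & 0xFF;
    }

    // Port 0x1F3: same shared behaviour.
    write_lba_low(value) {
        this.master.lba_low = value & 0xFF;
        this.slave.lba_low = value & 0xFF;
    }

    // Port 0x1F6: DEV (bit 4) only decides who answers the next command.
    write_device(value) {
        this.current = (value & 0x10) ? this.slave : this.master;
    }

    // Port 0x1F7: the selected device acts on what it has already latched.
    write_command(command) {
        const d = this.current;
        return { device: d.name, command, sector_count: d.sector_count, lba_low: d.lba_low };
    }
}

const channel = new Channel();
channel.write_sector_count(1);   // written while the master is still selected
channel.write_lba_low(0x20);
channel.write_device(0x10);      // now select the slave
const result = channel.write_command(0xEC);  // 0xEC = IDENTIFY DEVICE
// result.device is "slave", and it sees sector_count 1 and lba_low 0x20
```

Because the writes land on both devices, the order of "write registers" and "select device" no longer matters, which is the property the driver sequence relies on.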

What it broke

Windows 95/98's protected-mode IDE driver (ESDI_506.PDR), when the disk is large enough to require LBA mode (~528MB+), does this during slave detection:

  1. Write Sector Count, LBA Low/Mid/High
  2. Set DEV=1 (select slave) via port 0x1F6
  3. Issue IDENTIFY DEVICE

With single-interface writes, step 3 sees the slave's stale registers from before step 1. The 16-bit FIFO shift (reg = (reg << 8 | data) & 0xFFFF) makes the divergence worse on each write. The read never completes → IRQ never fires → boot hangs at the splash screen with the CPU spinning indefinitely.
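
To make the divergence concrete, here is a small sketch of the 16-bit shift applied with and without shared writes. Only the shift expression comes from the description above; the function names, the starting values of zero, and the sample byte sequence are assumptions.

```javascript
// 16-bit FIFO shift from the description: the previous byte moves to the high half.
function fifo_write(reg, data) {
    return ((reg << 8) | data) & 0xFFFF;
}

// Hypothetical replay of writes issued while the master is selected.
// With shared = true both devices latch every byte; with shared = false
// only the master (the current interface) does.
function sector_counts_after(writes, shared) {
    let master = 0;
    let slave = 0;
    for (const data of writes) {
        master = fifo_write(master, data);
        if (shared) {
            slave = fifo_write(slave, data);
        }
    }
    return { master, slave };
}

const writes = [0x00, 0x01];
const shared_result = sector_counts_after(writes, true);
const single_result = sector_counts_after(writes, false);
// shared_result: master 0x0001, slave 0x0001
// single_result: master 0x0001, slave 0x0000 (the slave never saw the writes)
```

In the single-interface case, a command issued after switching to the slave would run against a sector count of 0 rather than 1.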

This is the root cause of:

The 535MiB threshold matches the CHS→LBA transition. Images under that size don't hit this code path, which is why the copy.sh demo (450MB) and smaller test images haven't shown it.
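
For reference, the ~528MB figure is the classic 1024 × 16 × 63 CHS geometry limit. The sketch below works that number out and checks the two image sizes mentioned in this description against it. It assumes 512-byte sectors and that the driver switches to LBA exactly at this limit; the cutoff observed in practice may differ with the image's reported geometry. `needs_lba` is a hypothetical helper.

```javascript
// Classic BIOS/ATA CHS geometry limit (assumed to be the transition meant here).
const CYLINDERS = 1024;
const HEADS = 16;
const SECTORS_PER_TRACK = 63;
const BYTES_PER_SECTOR = 512;

const CHS_LIMIT_BYTES = CYLINDERS * HEADS * SECTORS_PER_TRACK * BYTES_PER_SECTOR;
// CHS_LIMIT_BYTES === 528482304, i.e. about 528 MB (504 MiB)

// Hypothetical helper: does an image of this size exceed what CHS can address?
function needs_lba(image_bytes) {
    return image_bytes > CHS_LIMIT_BYTES;
}

const demo_image = needs_lba(450 * 1000 * 1000);      // 450MB demo image: false
const bisect_image = needs_lba(1024 * 1024 * 1024);   // 1GB test image: true
```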

Found via

JS-only bisect against a 1GB Windows 95 OSR2 image (Closure-rebuild libv86.js at each commit, freeze the wasm to sidestep toolchain drift). 3c944a02 boots to desktop in ~30s; 1b90d2e7 hangs at the splash screen across 3/3 attempts.

Not changed

Port 0x1F6 (Device register) writes still go to current_interface only — that handler does the drive-select switch first, so it's already pointing at the right interface when it writes device_reg, is_lba, head. (Those last two arguably should also be shared, but the bisect shows they're not load-bearing for this fix.)
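
The ordering in that handler can be sketched as follows. The bit positions follow the ATA Device register layout (bit 4 = DEV, bit 6 = LBA, bits 3:0 = head or LBA bits 27:24); the function names and the plain-object channel are hypothetical, not the emulator's code.

```javascript
// Hypothetical decode of a Device register (port 0x1F6) write.
function decode_device_reg(value) {
    return {
        dev: (value >> 4) & 1,     // bit 4: 0 = master, 1 = slave
        is_lba: (value >> 6) & 1,  // bit 6: LBA addressing enabled
        head: value & 0x0F,        // bits 3:0: CHS head or LBA bits 27:24
    };
}

// Select first, then write: the per-device fields land on the new selection.
function write_device_reg(channel, value) {
    const fields = decode_device_reg(value);
    channel.current = fields.dev ? channel.slave : channel.master;
    channel.current.device_reg = value;
    channel.current.is_lba = fields.is_lba;
    channel.current.head = fields.head;
}

const ide = { master: {}, slave: {}, current: null };
ide.current = ide.master;
write_device_reg(ide, 0xF5);  // DEV=1, LBA=1, head=5
// ide.current is now ide.slave, and only ide.slave holds device_reg, is_lba, head
```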

1b90d2e changed writes to ports 0x1F1-0x1F5 (Features, Sector Count, LBA
Low/Mid/High) from writing both master and slave to writing only
current_interface. The commit message marked it experimental: 'It is not
clear why these 6 functions simultaneously modify master and slave
attributes... experimental until there is some explanation.'

The explanation: these are channel registers, not device registers. There
is one physical register file on the cable; both drives latch the same
values. Only the Device register's DEV bit (port 0x1F6 bit 4) selects
which drive responds to a subsequent COMMAND write — the other registers
are shared by definition.

Windows 95/98's ESDI_506.PDR depends on this when using LBA mode (disks
above ~528MB). It writes LBA registers, switches DEV, issues a command —
and the newly-selected drive sees stale registers from before the writes.
The IDE read never completes; the OS spins at the splash screen waiting
for an IRQ that never fires.

Found via JS-only bisect against a 1GB Windows 95 image: 3c944a0 boots,
1b90d2e hangs deterministically. Fixes copy#1462, fixes copy#1502.
@chschnell
Contributor

The explanation is the hardware: the Command Block registers are channel registers, not device registers. There's one physical register file on the IDE cable interface — both drives are wired to the same I/O port lines and latch the same value on a write. Only the Device register's DEV bit (port 0x1F6 bit 4) selects which drive responds to a subsequent COMMAND write (port 0x1F7). The pre-1b90d2e7 code was modeling this correctly.

I haven't tested this, but this argument (and its implications) makes sense and also explains the old state of the code (before 1b90d2e), nice catch!

I wish I had known this back then. My mistake was to assume that a device always gets selected before being otherwise addressed through these 6 registers.

@copy
Owner

copy commented Apr 15, 2026

Good analysis, thanks!

@copy copy merged commit e34e9c2 into copy:master Apr 15, 2026
3 checks passed
