An AmigaOS 4.1 Final Edition device driver for VirtIO SCSI disks in QEMU virtual machines.
This driver was developed with Claude AI (Anthropic) acting as the primary engineer — writing all C code, designing the architecture, debugging hardware-level issues, and navigating the AmigaOS 4.1 SDK. It stands as a practical demonstration of AI-assisted low-level systems programming on a niche, legacy platform with minimal training data.
Kyvos was used to develop and test this device driver.
virtioscsi.device exposes QEMU VirtIO SCSI virtual disks to AmigaOS 4.1 FE as standard trackdisk-compatible block devices. Once installed, AmigaOS treats them like any other hard disk: partitions are automatically discovered and mounted at boot, and filesystems (FFS2, SFS, etc.) work normally.
This driver supports both QEMU machine types:
- AmigaOne: uses VirtIO Legacy PCI (device 0x1004) via I/O port access
- Pegasos2: uses VirtIO 1.0 Modern PCI (device 0x1048) via MMIO with stwbrx/lwbrx inline assembly
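As a rough illustration of how the two transports might be told apart at probe time, the sketch below maps a PCI vendor/device pair to a transport type. The device IDs are the ones quoted above and 0x1AF4 is the VirtIO PCI vendor ID; the enum and the function name `pick_transport` are hypothetical and are not taken from the driver source.

```c
#include <stdint.h>

/* VirtIO PCI vendor ID (Red Hat / Qumranet). */
#define VIRTIO_PCI_VENDOR_ID      0x1AF4u
/* Device IDs quoted in this README. */
#define VIRTIO_SCSI_LEGACY_DEVID  0x1004u  /* legacy/transitional, I/O ports */
#define VIRTIO_SCSI_MODERN_DEVID  0x1048u  /* VirtIO 1.0 modern, MMIO */

/* Hypothetical transport tag; the real driver may model this differently. */
typedef enum {
    TRANSPORT_NONE = 0,
    TRANSPORT_LEGACY,
    TRANSPORT_MODERN
} virtio_transport_t;

/* Decide which transport to initialise for a given PCI function. */
virtio_transport_t pick_transport(uint16_t vendor, uint16_t device)
{
    if (vendor != VIRTIO_PCI_VENDOR_ID)
        return TRANSPORT_NONE;

    switch (device) {
    case VIRTIO_SCSI_LEGACY_DEVID: return TRANSPORT_LEGACY;
    case VIRTIO_SCSI_MODERN_DEVID: return TRANSPORT_MODERN;
    default:                       return TRANSPORT_NONE;
    }
}
```

A probe loop would call this once per enumerated PCI function and initialise the first match.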
- Dual VirtIO transport: Legacy PCI (AmigaOne) and Modern VirtIO 1.0 (Pegasos2), auto-detected at boot
- Interrupt-driven I/O: uses PCI INTx interrupts; no CPU-burning polling loops
- Async I/O: per-unit exec task with message port; BeginIO returns immediately for slow commands
- Multi-disk: discovers up to 8 SCSI targets at boot, each announced to mounter.library
- Automounting: all discovered partitions mount automatically without manual configuration
- Full trackdisk command set: CMD_READ, CMD_WRITE, CMD_UPDATE, TD_GETGEOMETRY, TD_FORMAT, TD_READ64, TD_WRITE64, NSD 64-bit commands, HD_SCSICMD, and more
- >2TB disk support: two-step geometry discovery. READ CAPACITY (10) is issued first; if the last LBA == 0xFFFFFFFF, the driver falls back to READ CAPACITY (16) for the 64-bit block count
- SCSI VPD pages: INQUIRY EVPD requests (page 0x00/0x80/0x83) answered locally with serial number and device ID
- Accurate SCSI error codes: sense key decoded and mapped to specific AmigaOS io_Error codes (TDERR_WriteProt, TDERR_DiskChanged, TDERR_BadSecHdr, etc.)
- 4K sector support: block size read from device via READ CAPACITY, not hardcoded
- DMA scatter-gather: uses AmigaOS 4.1 StartDMA/GetDMAList/EndDMA for correct VA→PA translation on the PPC MMU
- Pre-allocated DMA buffers: per-unit MEMF_SHARED request/response buffers with permanent DMA mappings eliminate per-I/O allocation overhead
- Bounce buffer ring: pre-pinned 4096-byte MEMF_SHARED bounce buffers per inflight slot; transfers ≤4096 bytes skip per-call StartDMA/EndDMA entirely
- Interrupt coalescing: used_event batching reduces ISR frequency under pipeline load (N in-flight completions → 1 ISR per burst)
- No deprecated APIs: uses only current AmigaOS 4.1 FE SDK functions (StartDMA not CachePreDMA, etc.)
- AmigaOS 4.1 Final Edition (PowerPC)
- QEMU with one of the supported machine types (amigaone or pegasos2)
Add the following to your existing QEMU command line to attach VirtIO SCSI disks.
AmigaOne uses the legacy/transitional VirtIO device (virtio-scsi-pci):
-device virtio-scsi-pci,id=scsi0 \
-drive file=virtioscsi1.img,if=none,id=vd0,format=raw \
-device scsi-hd,drive=vd0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
-drive file=virtioscsi2.img,if=none,id=vd1,format=raw \
-device scsi-hd,drive=vd1,bus=scsi0.0,channel=0,scsi-id=1,lun=1
Pegasos2 requires the non-transitional (modern-only) VirtIO device (virtio-scsi-pci-non-transitional):
-device virtio-scsi-pci-non-transitional,id=scsi0 \
-drive file=virtioscsi1.img,if=none,id=vd0,format=raw \
-device scsi-hd,drive=vd0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
-drive file=virtioscsi2.img,if=none,id=vd1,format=raw \
-device scsi-hd,drive=vd1,bus=scsi0.0,channel=0,scsi-id=1,lun=1
Replace virtioscsi1.img and virtioscsi2.img with your own hard drive image files. You can attach fewer or more drives by adjusting the -drive/-device scsi-hd pairs (up to 8 targets).
BBoot boots AmigaOS from a zip archive containing all Kickstart modules. To add virtioscsi.device:
- Add virtioscsi.device to the Kickstart/ folder inside your BBoot zip archive.
- Edit the Kicklayout file inside the zip archive and add the following line after the existing boot device driver entry (e.g. after MODULE Kickstart/a1ide.device.kmod for AmigaOne, or after MODULE Kickstart/peg2ide.device.kmod for Pegasos2):
MODULE Kickstart/virtioscsi.device
- Save the zip archive and boot with BBoot as normal.
If you are not using BBoot and have AmigaOS installed on a bootable disk:
- Copy virtioscsi.device to the SYS:Kickstart/ folder on your AmigaOS system disk.
- Edit the SYS:Kickstart/Kicklayout file and add the following line after the existing boot device driver entry (e.g. after MODULE Kickstart/a1ide.device.kmod for AmigaOne, or after MODULE Kickstart/peg2ide.device.kmod for Pegasos2):
MODULE Kickstart/virtioscsi.device
- Save and reboot. The driver will be resident in memory from the very start of the boot process.
Note: The driver has a resident priority of -60 so it initialises after mounter.library. Ensure mounter.library is also present in your Kickstart module set.
The project cross-compiles on Linux/WSL2 using the walkero/amigagccondocker:os4-gcc11 Docker image.
- Docker (or WSL2 + Docker Desktop on Windows)
- The AmigaOS 4.1 SDK is included in the Docker image
# From the project root:
docker run --rm -v $(pwd):/src -w /src walkero/amigagccondocker:os4-gcc11 make
# If $(pwd) expansion fails (non-interactive shell), use an absolute path:
docker run --rm -v /mnt/w/path/to/VirtualSCSIDevice:/src -w /src walkero/amigagccondocker:os4-gcc11 make
Output: build/virtioscsi.device
docker run --rm -v $(pwd):/src -w /src walkero/amigagccondocker:os4-gcc11 make CFLAGS="-O2 -Wall -I./include -fno-tree-loop-distribute-patterns -DDEBUG"
With DEBUG defined, the driver emits detailed serial/debug output via IExec->DebugPrintF() for every I/O operation, PCI discovery step, and VirtIO queue event.
docker run --rm -v $(pwd):/src -w /src walkero/amigagccondocker:os4-gcc11 make clean
src/
    device.c: resident tag, library base init
    Init.c: library open: PCI discovery, VirtIO init, unit discovery
    Open.c / Close.c: per-opener reference counting, unit task lifecycle
    Expunge.c: library cleanup
    BeginIO.c: I/O request dispatcher
    cmd_names.c: command name table (shared by BeginIO and NSCMD_DEVICEQUERY)
    scsi_cdb_helpers.c: CDB builders, geometry cache helper
    unit_discovery.c: SCSI INQUIRY scan, mounter.library announcement
    unit_task.c: per-unit exec task, pre-allocated DMA buffers
    exec_cmds/: CMD_READ, CMD_WRITE, TD_GETGEOMETRY, TD_IO64, etc.
    scsi_cmds/: SCSI INQUIRY, READ CAPACITY, READ/WRITE(10), etc.
    ns_cmds/: NSD NSCMD_DEVICEQUERY, NSCMD_TD_GETGEOMETRY64, etc.
    pci/: PCI bus enumeration, BAR mapping, modern cap detection
    virtio/: VirtIO queue management, IRQ handler, SCSI I/O engine
include/
    virtioscsi.h: library base and unit structs
    version.h: version/revision/build defines
    virtio/: VirtIO protocol headers, MMIO helpers
tests/
    test_virtioscsi.c: stress test (concurrent I/O, geometry, 64-bit offsets)
    test_modern.c: VirtIO 1.0 Modern device probe (Pegasos2 validation)
- Pegasos2 support: VirtIO 1.0 Modern PCI transport (device 0x1048) with MMIO via stwbrx/lwbrx inline assembly. Auto-detected at boot alongside legacy transport (device 0x1004) for AmigaOne.
- Modern VirtIO init: PCI capability chain walk detects COMMON/NOTIFY/ISR/DEVICE config regions. Full VirtIO 1.0 status handshake (Reset→ACK→DRIVER→FEATURES_OK→DRIVER_OK). Three-address queue setup (DESC/AVAIL/USED). Per-queue notify via MMIO. LE vring byte-swap wrappers for all descriptor/ring field accesses.
- Bug fixes: PCI Memory Space and Bus Master enable before MMIO access; NULL-safe BAR0 dereference in modern mode; modern-aware queue notify in DoIO path; reset polling after device reset.
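To illustrate what the LE vring byte-swap wrappers have to achieve on a big-endian PowerPC host, here is a portable sketch that reads and writes little-endian fields one byte at a time. It is not the driver's code: the driver is described as using stwbrx/lwbrx, which do the swap in a single load or store, and the function names below are hypothetical. Byte-wise access like this is only suitable for ring memory, not for device registers, which need a single access of the right width.

```c
#include <stdint.h>

/* Load a 16-bit little-endian field regardless of host endianness. */
uint16_t le16_load(const volatile uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Load a 32-bit little-endian field regardless of host endianness. */
uint32_t le32_load(const volatile uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Store a 16-bit value as little-endian bytes. */
void le16_store(volatile uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value as little-endian bytes. */
void le32_store(volatile uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)(v >> 24);
}
```

Because the helpers never reinterpret memory as a wider integer, they give the same result on big-endian and little-endian hosts.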
- Performance: MAX_INFLIGHT increased from 8 to 16. Each unit task can now sustain 16 simultaneous in-flight block I/O requests, doubling the pipeline depth and improving sequential throughput under burst loads.
- Compatibility: SCSI INQUIRY VPD pages implemented. EVPD requests for page 0x00 (Supported Pages), 0x80 (Unit Serial Number, "VIRTIOSCSI-T%lu"), and 0x83 (Device Identification) are now answered locally rather than forwarded to VirtIO where they may fail. Unsupported VPD page codes return CHECK CONDITION ILLEGAL REQUEST.
- Correctness: SCSI sense key decoded and mapped to specific AmigaOS io_Error codes: TDERR_WriteProt (DATA PROTECT), TDERR_DiskChanged (UNIT ATTENTION), TDERR_BadSecHdr (MEDIUM ERROR), TDERR_BadDriveType (NOT READY/HARDWARE ERROR), IOERR_NOCMD (ILLEGAL REQUEST). Previously all errors reported HFERR_BadStatus.
- Correctness: TD_GETDRIVETYPE handler in cmd_stubs.c documented: DRIVE_NEWSTYLE (0x44) is the correct value signalling 64-bit + NSD support.
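The sense-key mapping listed above could look roughly like the following. This is a sketch, not the driver's implementation: the function names are hypothetical, and the numeric error constants are local stand-ins assumed to match the AmigaOS SDK headers (only HFERR_BadStatus = 45 is confirmed elsewhere in this README). Falling back to HFERR_BadStatus for unlisted sense keys is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* SCSI sense keys (SPC). */
#define SK_NOT_READY        0x02
#define SK_MEDIUM_ERROR     0x03
#define SK_HARDWARE_ERROR   0x04
#define SK_ILLEGAL_REQUEST  0x05
#define SK_UNIT_ATTENTION   0x06
#define SK_DATA_PROTECT     0x07

/* Local stand-ins; values assumed to mirror the AmigaOS SDK headers. */
#define IOERR_NOCMD         (-3)
#define TDERR_BadSecHdr     27
#define TDERR_WriteProt     28
#define TDERR_DiskChanged   29
#define TDERR_BadDriveType  33
#define HFERR_BadStatus     45

/* Map a SCSI sense key to an io_Error value. */
int sense_key_to_io_error(uint8_t sense_key)
{
    switch (sense_key & 0x0F) {
    case SK_DATA_PROTECT:    return TDERR_WriteProt;
    case SK_UNIT_ATTENTION:  return TDERR_DiskChanged;
    case SK_MEDIUM_ERROR:    return TDERR_BadSecHdr;
    case SK_NOT_READY:
    case SK_HARDWARE_ERROR:  return TDERR_BadDriveType;
    case SK_ILLEGAL_REQUEST: return IOERR_NOCMD;
    default:                 return HFERR_BadStatus; /* assumed fallback */
    }
}

/* Extract the key from fixed-format sense data (response code 0x70/0x71),
 * where the sense key is the low nibble of byte 2. */
int fixed_sense_to_io_error(const uint8_t *sense, unsigned len)
{
    uint8_t code;

    if (sense == NULL || len < 3)
        return HFERR_BadStatus;

    code = sense[0] & 0x7F;
    if (code != 0x70 && code != 0x71)
        return HFERR_BadStatus; /* descriptor format not handled here */

    return sense_key_to_io_error(sense[2]);
}
```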
- Correctness: READ CAPACITY (16) fallback for disks ≥ 2TB. ensure_geometry_cached() now issues READ CAPACITY (10) first; if the returned last LBA is 0xFFFFFFFF, follows up with READ CAPACITY (16) (opcode 0x9E, service action 0x10) to get the true 64-bit block count. total_blocks is now uint64. dg_TotalSectors in TD_GETGEOMETRY response clamped to 0xFFFFFFFF for disks over 2TB.
- Version: bumped to v1.4 (DEVICE_REVISION 3 → 4).
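The two-step capacity discovery can be sketched as a pair of response parsers plus the clamp used for the 32-bit geometry field. This is a minimal illustration under the standard READ CAPACITY response layouts (big-endian last LBA followed by block length); the struct and function names are hypothetical and do not come from ensure_geometry_cached().

```c
#include <stdint.h>

/* Big-endian field readers for SCSI response buffers. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static uint64_t be64(const uint8_t *p)
{
    return ((uint64_t)be32(p) << 32) | be32(p + 4);
}

/* Hypothetical geometry cache. */
typedef struct {
    uint64_t total_blocks;
    uint32_t block_size;
} geometry_t;

/* Parse an 8-byte READ CAPACITY (10) response.
 * Returns 1 when the caller must follow up with READ CAPACITY (16). */
int parse_read_capacity10(const uint8_t resp[8], geometry_t *g)
{
    uint32_t last_lba = be32(resp);

    g->block_size = be32(resp + 4);
    if (last_lba == 0xFFFFFFFFu)
        return 1;                       /* capacity does not fit in 32 bits */
    g->total_blocks = (uint64_t)last_lba + 1;
    return 0;
}

/* Parse the first 12 bytes of a READ CAPACITY (16) response. */
void parse_read_capacity16(const uint8_t resp[12], geometry_t *g)
{
    g->total_blocks = be64(resp) + 1;
    g->block_size   = be32(resp + 8);
}

/* Clamp the 64-bit block count for a 32-bit geometry field. */
uint32_t clamp_total_sectors(uint64_t total_blocks)
{
    return total_blocks > 0xFFFFFFFFull ? 0xFFFFFFFFu : (uint32_t)total_blocks;
}
```

The block size comes from the response in both cases, which is also what makes 4K-sector images work without a hardcoded value.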
- Compatibility: ATA PASS-THROUGH stub (opcodes 0x85 / 0xA1) for S.M.A.R.T. tool support. SMART applications on AmigaOS 4 send ATA PASS-THROUGH commands via HD_SCSICMD (SAT layer) to retrieve ATA SMART data. Since VirtIO SCSI is not an ATA device, there is no real ATA layer to query, so the driver now returns a synthetic 512-byte ATA SMART Read Data block with plausible health attributes (all green, temperature=30°C, power-on hours=1) instead of HFERR_BadStatus (io_Error 45). Handles both the 16-byte (0x85, primary) and 12-byte (0xA1, fallback for older drivers) pass-through variants.
- Release build: Removed -DDEBUG from Makefile default CFLAGS. Release is now the default; debug builds available via make CFLAGS="... -DDEBUG".
- Performance: Interrupt coalescing via used_event batching. Two-layer design: VirtQueue_GetBuf() baseline writes used_event = last_used_idx on every call (keeps polling→IRQ handoff safe). VirtIOSCSI_Harvest() overrides with last_used_idx + (occupied-1) when ≥2 inflight slots are occupied, coalescing N pipeline completions into 1 ISR per burst. Under peak pipeline load (8 in-flight) this reduces ISR frequency 8× vs. the previous per-completion interrupt. Idle pipeline: baseline fires on the very next completion, with no added latency.
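The used_event choice described above reduces to a small calculation. The sketch below assumes the usual EVENT_IDX rule, where the device interrupts once its used index moves past used_event; the function name is hypothetical and the real logic lives in VirtQueue_GetBuf() and VirtIOSCSI_Harvest().

```c
#include <stdint.h>

/* Pick the used_event value to publish after harvesting completions.
 *
 * occupied < 2 : baseline, interrupt on the very next completion.
 * occupied >= 2: defer the interrupt until all 'occupied' requests
 *                have completed, so one ISR covers the whole burst.
 *
 * Ring indices are free-running 16-bit counters, so the addition is
 * allowed to wrap. */
uint16_t next_used_event(uint16_t last_used_idx, unsigned occupied)
{
    if (occupied >= 2)
        return (uint16_t)(last_used_idx + (occupied - 1));
    return last_used_idx;
}
```

With 8 slots occupied and last_used_idx at 10, the published value is 17, so the device stays quiet for completions 10 through 16 and interrupts when it finishes the eighth.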
- Performance: Bounce buffer ring (zero-overhead I/O for transfers ≤4096 bytes). One 4096-byte MEMF_SHARED buffer per inflight slot, permanently DMA-mapped at unit open. VirtIOSCSI_Submit() detects transfers within the bounce size and uses the pre-pinned buffer directly, eliminating StartDMA/EndDMA on every small-block read or write. Read data is copied bounce→user in VirtIOSCSI_Harvest() before ReplyMsg. Large transfers (>4096 bytes) use the direct DMA path unchanged.
- Performance: Deferred kick (batch QUEUE_NOTIFY for burst I/O). VirtIOSCSI_Submit() no longer calls VirtQueue_Kick() per request. The unit task's dispatch loop drains the entire message port queue first, then calls VirtIOSCSI_Kick() once. N queued requests → 1 PCI write instead of N. Single-request workloads are unchanged.
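The deferred-kick pattern can be modelled without any hardware. The sketch below uses a hypothetical fake_queue_t with counters standing in for the avail ring and the QUEUE_NOTIFY register write; it only shows the control flow (submit everything, then notify once), not the driver's actual dispatch loop.

```c
/* Hypothetical stand-in for a virtqueue: counts work instead of doing it. */
typedef struct {
    unsigned submitted;  /* requests placed on the avail ring */
    unsigned notifies;   /* QUEUE_NOTIFY writes issued */
} fake_queue_t;

/* Place one request on the ring without notifying the device. */
void fake_submit(fake_queue_t *q)
{
    q->submitted++;
}

/* Model a single QUEUE_NOTIFY write. */
void fake_kick(fake_queue_t *q)
{
    q->notifies++;
}

/* Drain 'pending' queued requests, then notify the device once.
 * No notify is sent when nothing was submitted. */
void dispatch_batch(fake_queue_t *q, unsigned pending)
{
    unsigned queued = 0;

    while (pending > 0) {
        fake_submit(q);
        queued++;
        pending--;
    }

    if (queued > 0)
        fake_kick(q);
}
```

A burst of five requests costs one notify, and a single request still costs exactly one, which matches the note that single-request workloads are unchanged.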
- Bug fix (release-build Heisenbug): VirtIOSCSI_Harvest() was discarding DoIO cookies. When Harvest won the GetBuf race against a concurrent VirtIOSCSI_DoIO() on the other unit, the cookie appeared unmatched (DoIO cookies are not registered in inflight[]) and was silently dropped. DoIO then timed out after 50 retries, causing its filesystem to fail to mount. Fix: added doio_pending_cookie/doio_pending_written fields to VirtIOUSCSIDevUnit. Harvest identifies DoIO cookies by matching unit->req_buf, stashes them under io_lock, and signals the unit; DoIO's drain loop checks this stash first each iteration.
- Bug fix (release-build Heisenbug): The inner GetBuf drain loop in VirtIOSCSI_DoIO() was missing a break after finding its own completion cookie. Without it, the loop continued calling GetBuf and could dequeue the other unit's pipeline completion, but then mishandled the lock state, silently dropping the IORequest. In debug builds this race was suppressed by the timing overhead of DPRINTF calls; in release builds it reliably caused the second drive (DH8/FastFileSystem) to fail to mount. Fix: add break immediately after cookie = c. Both drives now appear correctly in release builds.
- Stability fix: Cross-unit VirtIO completion harvest. When unit 0's VirtIOSCSI_Harvest() dequeues a completion cookie belonging to unit 1's pipeline inflight[] slot (or vice versa), it now searches libBase->units[] globally to find the true owner and replies the IORequest correctly. Previously these completions were silently dropped, causing one drive's filesystem to hang permanently and only one drive to appear on Workbench.
- Both drives (SmartFilesystem DH7, FastFileSystem DH8) now appear and operate correctly on Workbench.
- Stability fix: Serialise all VirtQueue_GetBuf() calls with io_lock. Both unit tasks share VQ2 and the ISR wakes both on any completion. Without the lock, two concurrent GetBuf() calls race on last_used_idx, each seeing the other's cookie as unmatched and dropping the IORequest permanently. The lock is held around GetBuf only; released before ReplyMsg, re-acquired at the bottom of the loop.
- Stability fix: Disabled VIRTIO_F_EVENT_IDX kick suppression permanently. QEMU legacy VirtIO never writes avail_event into the used ring, so the field stays 0 forever, causing all kicks after the first to be suppressed by the (1 < 1) condition. VirtQueue_Kick now always sends QUEUE_NOTIFY unconditionally. The interrupt-suppression half of EVENT_IDX (driver writes used_event after each GetBuf) is retained and working.
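The (1 < 1) condition mentioned above falls out of the standard EVENT_IDX test from the VirtIO specification. The sketch below reproduces that test in isolation; the function mirrors the specification's vring_need_event helper and is not copied from this driver.

```c
#include <stdint.h>

/* EVENT_IDX notification test from the VirtIO specification:
 * notify only if event_idx lies in the half-open window [old_idx, new_idx).
 * All arithmetic is modulo 2^16 because ring indices are free-running. */
int vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

With avail_event stuck at 0, the first kick (old index 0, new index 1) evaluates 0 < 1 and is sent. The second kick (old index 1, new index 2) evaluates 1 < 1 and is suppressed, and every later kick fails the same way, so the device never learns about the new requests.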
- Performance: pipelined block I/O, with up to 8 simultaneous VirtIO SCSI requests per unit (MAX_INFLIGHT=8). Block I/O commands (CMD_READ, CMD_WRITE, TD_READ64, TD_WRITE64, NSCMD variants) are submitted asynchronously via VirtIOSCSI_Submit() and replied by VirtIOSCSI_Harvest() on ISR signal. HD_SCSICMD and geometry commands remain synchronous.
- Performance: per-unit ISR signal is now persistent (allocated once at unit task startup, not per-request). The interrupt handler signals the unit task on any VirtIO completion without per-call setup overhead.
- Pre-allocated DMA buffers extended from 1 to 8 per-unit slots (one per inflight request). Slot 0 is aliased for the synchronous DoIO path for backwards compatibility.
- Fallback to synchronous DoIO when all inflight slots are full (ensures forward progress under high concurrency without queueing complexity).
- Performance: READ(16)/WRITE(16) support for disks >2TB. The 64-bit I/O paths (TD_READ64, TD_WRITE64, NSCMD variants) now use READ(16)/WRITE(16) CDBs when the computed LBA exceeds 0xFFFFFFFF, allowing correct access to images larger than ~2.1TB at 512B sectors.
- Performance: VIRTIO_F_INDIRECT_DESC (bit 28) negotiated and implemented. A single vring descriptor now points to a MEMF_SHARED indirect table containing the full scatter-gather chain, eliminating the MAX_SG_ENTRIES=64 limit on transfer size.
- Performance: VIRTIO_F_EVENT_IDX (bit 29) re-enabled and fixed. Root cause of the previous kick-suppression bug was last_kick_avail_idx initialised to 0, causing the second kick's suppression check to always fire. Fixed by initialising to 0xFFFF so the first comparison always passes until the device writes a real avail_event value.
- Performance: response buffer reset reduced from 108-byte volatile loop to 3 targeted volatile stores (only response, status, residual need resetting between retries; remaining fields are written by device or never read in practice).
- Performance: MAX_SG_ENTRIES increased from 32 to 64, raising the maximum DMA scatter-gather transfer from ~96KB to ~240KB at 4KB page granularity.
- Performance: polling fallback reduced from 5,000,000 to 500,000 iterations. The interrupt path handles all normal I/O; the polling path is only a safety net during discovery or if AllocSignal fails.
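The READ(16) switch for LBAs above 0xFFFFFFFF listed in this group of changes can be sketched as a CDB builder that picks the 10-byte or 16-byte form. The opcodes and field layouts follow the SCSI block command set; the function name is hypothetical, and also switching to READ(16) when the block count exceeds the 16-bit READ(10) length field is an assumption added here, not something the changelog states.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCSI_READ_10  0x28
#define SCSI_READ_16  0x88

/* Fill 'cdb' with a READ(10) or READ(16) command and return its length.
 * READ(10): 32-bit LBA in bytes 2..5, 16-bit length in bytes 7..8.
 * READ(16): 64-bit LBA in bytes 2..9, 32-bit length in bytes 10..13.
 * All multi-byte fields are big-endian. */
size_t build_read_cdb(uint8_t cdb[16], uint64_t lba, uint32_t blocks)
{
    int i;

    memset(cdb, 0, 16);

    if (lba > 0xFFFFFFFFull || blocks > 0xFFFFu) {
        cdb[0] = SCSI_READ_16;
        for (i = 0; i < 8; i++)
            cdb[2 + i] = (uint8_t)(lba >> (56 - 8 * i));
        for (i = 0; i < 4; i++)
            cdb[10 + i] = (uint8_t)(blocks >> (24 - 8 * i));
        return 16;
    }

    cdb[0] = SCSI_READ_10;
    for (i = 0; i < 4; i++)
        cdb[2 + i] = (uint8_t)(lba >> (24 - 8 * i));
    cdb[7] = (uint8_t)(blocks >> 8);
    cdb[8] = (uint8_t)(blocks & 0xFF);
    return 10;
}
```

A WRITE builder would be identical apart from the opcodes (0x2A and 0x8A).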
- Fixed VIRTIO_F_EVENT_IDX kick-suppression bug: avail_event is zero at init time, causing all I/O after the first request to be silently dropped. EVENT_IDX disabled until the basic I/O path is proven stable.
- Release build (DEBUG disabled by default).
- Interrupt-driven I/O (Phase 5): replaced 5M-iteration polling loop with PCI INTx interrupt handler via pciDevice->MapInterrupt() + IExec->AddIntServer(). Falls back to polling if interrupt installation fails.
- Async I/O (Phase 6): per-unit exec task with message port. BeginIO posts slow commands and returns immediately; the unit task executes them and calls ReplyMsg. AbortIO removes pending requests from the port queue.
- Performance (Phase 7): pre-allocated per-unit MEMF_SHARED req/resp DMA buffers, so there is no per-request allocation or DMA setup on the I/O hot path. zero_shared() volatile word loop for non-cacheable buffer zeroing (safe alternative to ClearMem/SetMem on MEMF_SHARED).
- Replaced deprecated CachePreDMA/CachePostDMA with StartDMA/GetDMAList/EndDMA throughout.
- Added -fno-tree-loop-distribute-patterns to Makefile: prevents GCC 11 -O2 from replacing fill loops with memset() calls (newlib not linked in device drivers).
- Added VIRTIO_F_EVENT_IDX negotiation in virtio_init.c (later disabled; see build 1042).
- CDB helpers (Phase 1): make_read10_cdb(), make_write10_cdb(), unpack_io64_offset(), ensure_geometry_cached(); eliminated copy-paste across 6 files. Block size now read from device (4K sector support).
- Stub consolidation (Phase 2): merged cmd_td_changestate.c, cmd_td_protstatus.c, cmd_td_getdrivetype.c, cmd_success.c into cmd_stubs.c; merged scsi_read_10.c + scsi_write_10.c into scsi_rw_10.c.
- Init split (Phase 3): extracted SCSI INQUIRY scan and mounter announcement into unit_discovery.c.
- BeginIO cleanup (Phase 4): extracted GetCommandName() and supported_commands[] into cmd_names.c; ns_devicequery.c references the shared table.
- Multi-unit automounting: INQUIRY scan of up to 8 targets at Init, announced to mounter.library.
- Resolved Workbench boot hang: removed manual AddDevice(), rely on RTF_AUTOINIT.
- I/O semaphore (io_lock) protecting VirtIO queue submit window against concurrent access.
- TD_GETDRIVETYPE returns DRIVE_NEWSTYLE for correct 64-bit geometry handling.
- TD_GETNUMTRACKS fallback of 32768 prevents "empty drive" errors before geometry is cached.
- Migrated DMA API: CachePreDMA → StartDMA/GetDMAList/EndDMA (scatter-gather).
- Full 64-bit command coverage: TD_READ64, TD_WRITE64, all ETD_* and NSCMD_ETD_* variants.
- CMD_UPDATE/CMD_FLUSH wired to SCSI SYNCHRONIZE CACHE(10).
- Initial working driver: PCI discovery, VirtIO legacy init, real SCSI I/O (INQUIRY, READ CAPACITY, READ(10), WRITE(10)).
- Single-disk, single-partition operation.