Kernel fails to boot on 86box IBM-PC (1982), works IBM-XT onwards #102

mateuszviste · 2024-08-21T21:16:21Z

Tested under 86Box with an emulated 8086 PC. When booting from a 720K floppy disk, the kernel displays a warning as shown on the screenshot, and the boot stops (freezes).

same story with a different IDE controller:

VM configuration:

mateuszviste · 2024-08-21T21:19:12Z

it works when switching the VM to an IBM PC XT model. Everything else is the same (same 720K floppy, same RAM amount, same HDD controller...).

boeckmann · 2024-08-21T21:44:31Z

Yes, I can confirm. The same happens with the 360K floppy image.

boeckmann · 2024-08-22T15:55:27Z

@ecm-pushbx is there a chance of getting lDebug to run on a 256K machine? I made a 360K boot disk with instsect.com and copied lDebug onto it. It bails out with an "out of memory" message. Screenshot and a 360K floppy image attached, with an uncompressed single-file EDR protocol kernel named drbio.sys (because this one does not relocate).

ldeb360.img.zip

ecm-pushbx · 2024-08-22T16:58:45Z

I built a special-purpose binary which works with as little as 192 KiB of memory. (Still fails if only 128 KiB are available.) It's in https://pushbx.org/ecm/test/20240822/

Here's the history of a test I did in qemu (with an EBDA of size 1 KiB):

r v0 = word [0:413]
r word [0:413] = #192
m v0<<6:0 l #1024 #192<<6:0
dw 0:40E l 2
h v0<<6
r word [0:40E] = #192<<6
dw 0:40E l 2
boot protocol ldos smalll.sys
g

Here's the SmallL executable showing its own resident size, again with a 1 KiB EBDA.

 -h word [word [0:413]+1<<6:8]
 1580  decimal: 5504
 -h as paras word [word [0:413]+1<<6:8]
 00015800   86.0 KiB
 -

Same but independent of EBDA, using the DPR variable:

 -h as paras word [dpr-1:8]
 00015800   86.0 KiB
 -
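For reference, the two conversions used in the sessions above can be sketched in Python. This is only an illustration of the arithmetic, not debugger code; the function names are made up for the example.

```python
PARAGRAPH_BYTES = 16  # one real-mode paragraph is 16 bytes


def kib_to_segment(kib: int) -> int:
    """Segment value at which `kib` KiB of memory ends (kib * 1024 / 16)."""
    return kib << 6


def paras_to_kib(paras: int) -> float:
    """Size in KiB of a block that is `paras` paragraphs long."""
    return paras * PARAGRAPH_BYTES / 1024


# "#192<<6" in the debugger session: new top of memory for 192 KiB
print(hex(kib_to_segment(192)))  # 0x3000
# "h as paras ..." on the resident size word 1580h
print(paras_to_kib(0x1580))      # 86.0
```

The word at 0:413 holds the conventional memory size in KiB, which is why shifting it left by 6 gives a segment.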

You can use your existing installer as in instsect.com A: /F1=smalll.sys as the boot loading appears to work.

Here's how I created the build:

./makec (to compile mktables)
./mktables 8086 (build tables without 186+ level instructions)
INICOMP_METHOD=none ./mak.sh -D_EXTENSIONS=0 -D_BOOT_ENV_SIZE=256 -D_REGSHIGHLIGHT=0 -D_GETLINEHIGHLIGHT=0 -D_REGSLINEBREAK=0 -D_REGSREADABLEFLAGS=0 -D_MS_N_COMPAT=0 -D_MS_0RANGE_COMPAT=0 -D_MS_PROMPT_COMPAT=0 -D_MS_MNEMON_COMPAT=0 -D_VXCHG=0 -D_ALTVID=0 -D_INDOS_PROMPT=0 -D_MCB=0 -D_MMXSUPP=0 -D_DT=0 -D_DSTRINGS=0 -D_DTOP=0 -D_CLEAR=0 -D_RSEPARATE=0 -D_VDD=0 -D_EXTHELP=0 -D_APPLICATION=0 -D_DEVICE=0 -D_RE_BUFFER_SIZE=256 -D_NUM_B_BP=4 -D_NUM_G_BP=4 -D_NUM_B_WHEN_BYTES=256 -D_DELAY_BEFORE_BP=0 -D_HISTORY_SEPARATE_FIXED=0 -D_40COLUMNS=0 -D_DETECT95LX=0 -D_DOSEMU=0 -D_DISASM_32BIT=0 -D_ONLYNON386=1 -D_AREAS=0

ecm-pushbx · 2024-08-22T17:03:59Z

This is the smallest it can get easily. But it disables some parts you may want to use such as support for Extensions for lDebug.

ecm-pushbx · 2024-08-22T17:19:06Z

The mak script creates a binary named ldebugu.com but despite the name this is built with -D_APPLICATION=0 -D_DEVICE=0 so it can only be loaded in bootloaded mode. From your description I gather that's what you want.

boeckmann · 2024-08-22T17:29:47Z

Thanks :) I will probably try this at the weekend, when my mind has the capacity to deal with this :)
I am trying to load the EDR kernel under lDebug, with only 256K available. This might not fit (I have not yet figured out how much additional memory the kernel requires apart from itself). Thinking about it, I had better try the uncompressed dual-file drbio.sys, as the drdos.sys part does not seem to be involved in the trouble going on when booted under the IBM-PC VM type.

boeckmann · 2024-08-22T17:40:46Z

edrdos/drbio/config.equ

Line 38 in 525b418

MOVE_DOWN equ 1800h ; start relocated code 96K down

256K - 86k(ldebug) - 96k=74k

Hopefully sufficient that there is not some self-overwriting going on when the kernel relocates (dual-file drbio.sys may work)...
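The budget above, spelled out as a small Python sketch. The 86 KiB figure is the SmallL resident size reported earlier in the thread; the variable names are only for illustration, and the sketch ignores anything else that may occupy low memory (IVT, BDA, EBDA).

```python
MOVE_DOWN_PARAS = 0x1800  # MOVE_DOWN from drbio/config.equ, in paragraphs

total_kib = 256                                # RAM of the test machine
ldebug_kib = 86                                # resident size of the SmallL build
move_down_kib = MOVE_DOWN_PARAS * 16 // 1024   # 1800h paragraphs = 96 KiB

remaining_kib = total_kib - ldebug_kib - move_down_kib
print(move_down_kib, remaining_kib)  # 96 74
```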

ecm-pushbx · 2024-08-22T17:43:04Z

Be sure to use boot protocol edrdos . // if you don't have a drdos.sys file, or boot protocol freedos segment=70 drbio.sys

boeckmann · 2024-08-23T09:18:13Z

That looks very promising. I can boot EDR on an IBM-XT vm under lDebug with 256k RAM (dual-file, uncompressed kernel).

But guess what: the kernel also boots perfectly fine on an IBM-PC VM under lDebug. Another case where the observed object behaves differently under observation :/

boeckmann · 2024-08-23T09:35:20Z

@ecm-pushbx if I boot the kernel without setting the carry flag on int 3, so that the kernel intercepts interrupts 0, 1 and 3 by itself, is there still anything lDebug does in the background, or is it simply a "space waster" after the kernel has been booted?

If it does nothing anymore, the problem may be in the handover chain from BIOS -> bootloader -> kernel, with some value not expected by the kernel...

boeckmann · 2024-08-23T09:40:57Z

Addition to the previous post: under the assumption that no breakpoints etc. are set by the user...

ecm-pushbx · 2024-08-23T09:59:47Z

> @ecm-pushbx if I boot the kernel without setting the carry flag on int 3, so that the kernel intercepts interrupts 0, 1 and 3 by itself, is there still anything lDebug does in the background, or is it simply a "space waster" after the kernel has been booted?

Pretty much, yes. It will still respond to int 18h, int 19h, or int 06h (not on HP 95LX) if these are either not hooked or restored at a later point.

> If it does nothing anymore, the problem may be in the handover chain from BIOS -> bootloader -> kernel, with some value not expected by the kernel...

You can prepare a boot sector file to chainload, eg using instsect as in instsect A: /B=bootsect.dos /BO to save the current boot sector loader into a file. Then in the booted lDebug run boot protocol chain and you can trace the original boot loader. This may help pin down a problem between the loader and the kernel, or it may not.

If it doesn't I'd suggest you write a small bootloader that just displays a hexdump of itself (including BPB) and all register and flags values at its entry, so that you can recreate the same when tracing with lDebug.

ecm-pushbx · 2024-08-23T10:19:50Z

If it makes any difference (possible!) you can boot lDebug (off fdb or hda1, or off fda then switch the diskette image loaded into fda) and run boot fda (same as boot protocol sector fda) which will have lDebug imitate what it assumes the ROM-BIOS would do to load the boot sector, including to read the sector off the actual unit.

ecm-pushbx · 2024-08-23T11:41:51Z

> If it doesn't I'd suggest you write a small bootloader that just displays a hexdump of itself (including BPB) and all register and flags values at its entry, so that you can recreate the same when tracing with lDebug.

Added in https://pushbx.org/ecm/test/20240823/

This is a small test loader that can be installed using instsect A: /S12=test12.bin (only the informational FAT type string is FAT12-specific). It was built using nasm -I ../lmacros/ -D_FAT12 testboot.asm -o test12.bin from the latest revision of https://hg.pushbx.org/ecm/ldosboot.exp/rev/4123bf41df1f

It starts out displaying register values, then will prompt a few times for a keypress to display more of the boot sector hexdump. After the last data from the sector has been displayed it will wait for another keypress and then run an int 19h.

boeckmann · 2024-08-23T12:50:53Z

Does lDebug have some heuristics on when to skip showing instructions? It sometimes "misses" displaying instructions (although they are executed). For example, in the following screenshot a LEA instruction that sets DI to 100h is missing. It is executed but not displayed. To my understanding it should be displayed (I entered a "p" and then simply pressed return multiple times to step through the following instructions)... See also the increased IP.

Layer 8 error?

Another one. Missing a pop ds and a mov si,... :

ecm-pushbx · 2024-08-23T13:03:02Z

Try running r dao or= 80 to set the Debugger Assembler option: "80 Disassembler: NEC V20 repeat rules (for segregs)". This will make the debugger repeat disassembly on T/TP/P/G/R register dumps if the first instruction writes to any segreg. So it will still execute them in a single trace/proceed step but the debugger will disassemble the following instruction.

It seems likely that this is the cause of your problem as in all three cases the prior instruction was a pop es or pop ds. The next instruction being run immediately on a single trace step is an effect of the interrupt lockout, which is only really needed for writing ss so that another instruction immediately after which writes sp will always be run together with the write to ss (as the 8088/8086/186/286 didn't have lss sp). However, the NEC V20/V30 and the VM you use both seem to apply the lockout to any mov/pop to a segment register.

ecm-pushbx · 2024-08-23T13:09:59Z

For context of what instructions cause the repetition with DAO 80:

pop ds, pop es, https://hg.pushbx.org/ecm/ldebug/file/9316c0cfe06a/source/uu.asm#l382
mov to or from any segment register, https://hg.pushbx.org/ecm/ldebug/file/9316c0cfe06a/source/uu.asm#l2069 and https://hg.pushbx.org/ecm/ldebug/file/9316c0cfe06a/source/uu.asm#l3539

boeckmann · 2024-08-23T13:52:52Z

Interestingly the bootsector provided by FreeDOS SYS fails with message ".Error!" if chainloaded via boot protocol chain. It works better if booted via boot fda.

boeckmann · 2024-08-23T14:03:25Z

Kernel boots fine via boot protocol edrdos even if I match the register values to these provided by the FreeDOS bootloader upon kernel start.

boeckmann · 2024-08-23T14:13:07Z

Kernel boots also fine if the FreeDOS bootloader is executed via lDebug boot fda.

ecm-pushbx · 2024-08-23T14:20:16Z

> Interestingly the bootsector provided by FreeDOS SYS fails with message ".Error!" if chainloaded via boot protocol chain. It works better if booted via boot fda.

Perhaps sector reads can sometimes fail? That'd at least explain why using the boot protocol edrdos command always seems to work, and would also explain the FreeDOS loader erroring out.

ecm-pushbx · 2024-08-23T14:24:10Z

> Interestingly the bootsector provided by FreeDOS SYS fails with message ".Error!" if chainloaded via boot protocol chain. It works better if booted via boot fda.

Can you provide a diskette image with all needed files and list the 86box settings to reproduce? I may try to debug this on the desktop at home, I think I have 86box on there (running on an amd64 Debian Linux host).

boeckmann · 2024-08-23T15:00:36Z

Sure. I attach two images (zipped) to this post. The first one, sysbs.img, has the FreeDOS bootloader installed into sector 0, which at least manages to boot into the kernel. The second one, chainld.img, boots lDebug and includes the boot sector from the first image as bootsect.dos in the root dir. This fails if I load it with boot protocol chain bootsect.dos, outputting the following. See @mateuszviste's first post for the machine configuration (screenshot). Note: I left the speed at 4.77 MHz (changing it did not make any difference).

Here the INT13 fails. Register values look good though:

boeckmann · 2024-08-23T15:05:24Z

The exact same chainld.img works with pcjs at https://www.pcjs.org/machines/pcx86/ibm/5150/cga/

However, booting the kernel still crashes (in another way) when booted directly via FreeDOS bootsect.

boeckmann · 2024-08-23T15:10:35Z

And kernel boots under the pcjs XT type https://www.pcjs.org/machines/pcx86/ibm/5160/cga/ with the FreeDOS loader.

ecm-pushbx · 2024-08-23T15:21:09Z

Just as a hint, you can boot protocol chain without the filename which will default to bootsect.dos

boeckmann · 2024-08-23T17:06:42Z

I found something out: the int 3 in init.asm triggers this:

edrdos/drbio/init.asm

Line 978 in 525b418

int 3

If I comment it out it seems to work as expected. This also explains why it works under the debugger. It works now in 86box and with pcjs.

I now will have a look at why exactly this fails.

boeckmann · 2024-08-23T17:41:06Z

Looks like the INT 3 vector contains no sensible value on the original IBM-PC. Under PCjs it is set to 0:0. Can not confirm yet that this is also true for 86box, as it behaves differently (both work with the INT 3 commented out). Is it possible to extend your test12 test program to also dump the first bytes of the IVT?

boeckmann · 2024-08-23T17:43:51Z

I am wondering whether the INT 3 call should rather be guarded by a DEBUG build flag, and thus be disabled for normal builds.

ecm-pushbx · 2024-08-23T19:12:37Z

> Looks like the INT 3 vector contains no sensible value on the original IBM-PC. Under PCjs it is set to 0:0. Can not confirm yet that this is also true for 86box, as it behaves differently (both work with the INT 3 commented out).

Good catch!

> Is it possible to extend your test12 test program to also dump the first bytes of the IVT?

Yes, I updated it to dump the first 32 IVT entries (int 00h to 1Fh) in segment:offset format in https://hg.pushbx.org/ecm/ldosboot.exp/file/1c89531daf16/testboot.asm I also updated the binary in https://pushbx.org/ecm/test/20240823/
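To illustrate what such a dump contains, here is a minimal Python sketch that decodes IVT entries from a copy of the first bytes of memory. Each vector is 4 bytes at linear address 4 * n, stored as offset word then segment word. This is not the testboot.asm code; the sample memory contents and the F000:F065 value are made up for the example.

```python
import struct


def parse_ivt(memory, count=32):
    """Return a list of (segment, offset) for int 00h .. int (count-1)."""
    entries = []
    for n in range(count):
        # little-endian: offset word first, then segment word
        offset, segment = struct.unpack_from("<HH", memory, n * 4)
        entries.append((segment, offset))
    return entries


def format_ivt(entries):
    """Format entries as 'int NNh = SSSS:OOOO' lines."""
    return [f"int {n:02X}h = {seg:04X}:{off:04X}"
            for n, (seg, off) in enumerate(entries)]


# Hypothetical memory image: all vectors zero except int 10h
memory = bytearray(32 * 4)
struct.pack_into("<HH", memory, 0x10 * 4, 0xF065, 0xF000)

lines = format_ivt(parse_ivt(memory))
print(lines[0x03])  # int 03h = 0000:0000
print(lines[0x10])  # int 10h = F000:F065
```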

> I am wondering whether the INT 3 call should rather be guarded by a DEBUG build flag, and thus be disabled for normal builds.

This is what we did in FreeDOS: https://github.com/FDOS/kernel/blob/1cc00e194dd969d30c78775c67a1df44af307abf/kernel/kernel.asm#L80 The check debugger option is disabled by default, skipping the int3 breakpoint in the init. EDR-DOS doesn't have a CONFIG or lCFG or comparable patchable block yet, so it is unclear what to do. Add one? Or just disable the check at build time? Or validate the int 3 handler address by default before calling it, ie checking that it isn't segment = 0, isn't offset = FFFFh, and points at a linear address >= top of LMA (taking into account int 12h or RPL reserved area) and < 10_0000h?
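The validation idea could look roughly like this, sketched in Python rather than kernel assembly. The function name and the lma_top_kib parameter (the int 12h result, already adjusted for any RPL reserved area) are assumptions made for the example, not existing kernel code.

```python
def int3_vector_plausible(segment, offset, lma_top_kib):
    """Heuristic: does the int 3 vector look like it points at a debugger?"""
    if segment == 0:        # eg the 0:0 vector seen under PCjs
        return False
    if offset == 0xFFFF:    # handler cannot start at the last byte of a segment
        return False
    linear = (segment << 4) + offset
    lma_top = lma_top_kib * 1024
    # a bootloaded debugger resides above the reported top of the LMA,
    # ROM handlers are above that too, and the HMA is excluded
    return lma_top <= linear < 0x100000


print(int3_vector_plausible(0x0000, 0x0000, 170))  # False
print(int3_vector_plausible(0x2B00, 0x0100, 170))  # True
print(int3_vector_plausible(0x0070, 0x0100, 170))  # False, below top of LMA
```

The example values assume a 256 KiB machine where an 86 KiB debugger lowered the int 12h size to 170 KiB.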

ecm-pushbx · 2024-08-23T19:16:19Z

Additional possible check, lDebug's int 3 handler always uses a standard IISP header: https://hg.pushbx.org/ecm/ldebug/file/9316c0cfe06a/source/run.asm#l6098 Testing for this would tie the check closer to lDebug but would make it more resilient against false positives.
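A sketch of such a test in Python, assuming the usual 18-byte IISP header layout as I recall it from the IBM Interrupt Sharing Protocol: a short jump EBh 10h, a dword with the next handler, the signature word 424Bh ("KB"), an EOI flag byte, a short jump to a reset routine, and 7 reserved bytes. The function name is hypothetical, and the byte values should be checked against the linked run.asm before relying on them.

```python
import struct

IISP_HEADER_SIZE = 18
IISP_ENTRY_JUMP = b"\xEB\x10"  # jmp short over the remaining 16 header bytes
IISP_SIGNATURE = b"KB"         # word 424Bh stored little-endian


def looks_like_iisp(handler):
    """True if the bytes at the handler's entry start with an IISP header."""
    return (len(handler) >= IISP_HEADER_SIZE
            and handler[0:2] == IISP_ENTRY_JUMP
            and handler[6:8] == IISP_SIGNATURE)


# Hypothetical handler bytes: header with a null next-handler pointer
header = (IISP_ENTRY_JUMP + struct.pack("<HH", 0, 0) + IISP_SIGNATURE
          + b"\x00" + b"\xEB\x00" + bytes(7))
print(looks_like_iisp(header))         # True
print(looks_like_iisp(b"\xCF" * 18))   # False, a run of iret opcodes
```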

ecm-pushbx · 2024-08-23T19:57:19Z

I just updated patchpro to allow it to recognise lCFG blocks on any dword boundary within a file's first 8 KiB, rather than only paragraph boundaries.

This could be useful to place an lCFG block (currently 32 bytes in size) near the beginning of SvarDOS flavoured kernels. I'm considering putting it in place of two device driver headers of 18 bytes each to have it stay in the initial part of the kernel file. The kernel would then later in its init overwrite the lCFG block with the device headers copied from a temporary location.

lDOS flavoured kernels can store an lCFG block either in the uncompressed header of inicomp (already implemented) or in drkernpl's beginning (not yet). It can be passed to the kernel on the stack (also not yet).

ecm-pushbx · 2024-08-23T20:00:04Z

For now I'd suggest to just go with the build option but we can revisit this at a later time.

boeckmann · 2024-08-23T20:01:08Z

Yes, the build option is the safest option for the moment. Then I have time to figure out what an lCFG block is, and what patchini is for :)

ecm-pushbx · 2024-08-23T20:07:23Z

=) lCFG block is my alternative to the FreeDOS CONFIG block. FreeDOS CONFIG must be at offset 0 in the file for now, which I didn't like. In designing the lCFG block I also took care to add a bitmap of supported bytes. Like the CONFIG block each configuration item is "identified" only by its position in the block. Unlike the CONFIG block, lCFG's bitmap means the kernel can advertise support for every single byte (up to 64 bytes) individually. For FreeDOS, the CONFIG block only stores a single "length" of how many bytes are supported, so you can't really advertise that an earlier positioned byte isn't supported by a particular kernel.
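The difference between the two schemes can be shown with a conceptual Python sketch. It does not reproduce the real lCFG or CONFIG layouts; the bit ordering (bit n stands for byte n) and the function names are assumptions for illustration only.

```python
def supported_by_length(length, index):
    """CONFIG-style: every byte below the advertised length counts as supported."""
    return 0 <= index < length


def supported_by_bitmap(bitmap, index):
    """lCFG-style: one bit per byte, up to 64 bytes (assumed: bit n = byte n)."""
    return 0 <= index < 64 and bool((bitmap >> index) & 1)


# A kernel that supports items 0 and 2 but not item 1:
bitmap = 0b101
print([supported_by_bitmap(bitmap, i) for i in range(3)])  # [True, False, True]
# With only a length it has to advertise 3, which wrongly includes item 1:
print([supported_by_length(3, i) for i in range(3)])       # [True, True, True]
```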

lCFG blocks are only used by lDOS inicomp so far, and 3 bytes are used. Each byte has the same meaning, one each for application mode, device mode, and bootloaded operation. The byte value indicates what style of depack progress display to use.

ecm-pushbx · 2024-08-23T20:10:35Z

patchpro is the canonical example of accessing the lCFG block, so as to display or set the progress display variant for lDOS inicomp. The other tool in patchini, patchqry, is unrelated to lCFG blocks but rather deals with the query patch to patch the behaviour of lDOS iniload / a loaded kernel in bootloaded mode.

boeckmann · 2024-08-23T20:12:56Z

> Yes, I updated it to dump the first 32 IVT entries (int 00h to 1Fh) in segment:offset format in https://hg.pushbx.org/ecm/ldosboot.exp/file/1c89531daf16/testboot.asm I also updated the binary in https://pushbx.org/ecm/test/20240823/

Nice! This is really helpful to get an overview of the system state right after boot!

boeckmann · 2024-08-23T20:16:33Z

There is also the kernflg which could be used to enable these kind of things:

edrdos/drbio/init.asm

Lines 203 to 208 in 525b418

    
; kernel flags:
;   bit 0: set if assembled for compression
;   bit 1: set if assembled for single-file kernel
;   bit 7: set after kernel was processed by COMPBIOS and COMPKERN
kernflg		db	(SINGLEFILE shl 1) + COMPRESSED

There are five bits left. Could be used as a middle way between the build time flag and your more sophisticated solution.

boeckmann · 2024-08-23T22:10:53Z

For the time being, I decided in favour of the kernflg solution. Byte 5, bit 2 has to be set to enable debugger interception. A little tool would come in handy. But instead of implementing this, a more general solution like the one @ecm-pushbx proposed is desirable, because currently the config space is one byte... Not that urgent right now...
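Until there is a proper tool, the patch could be sketched like this in Python. It assumes "byte 5" means the zero-based file offset 5 of the kernel file, which should be verified against the actual kernel image first; the constant and function names are made up for the example.

```python
KERNFLG_OFFSET = 5            # assumption: zero-based offset of kernflg in the file
KERNFLG_COMPRESSED = 1 << 0   # bit 0: assembled for compression
KERNFLG_SINGLEFILE = 1 << 1   # bit 1: assembled for single-file kernel
KERNFLG_DEBUGGER = 1 << 2     # bit 2: enable debugger interception (int 3)
KERNFLG_PROCESSED = 1 << 7    # bit 7: processed by COMPBIOS and COMPKERN


def set_debugger_flag(image, enable=True):
    """Return a copy of the kernel image with kernflg bit 2 set or cleared."""
    patched = bytearray(image)
    if enable:
        patched[KERNFLG_OFFSET] |= KERNFLG_DEBUGGER
    else:
        patched[KERNFLG_OFFSET] &= ~KERNFLG_DEBUGGER & 0xFF
    return bytes(patched)


# Hypothetical 16-byte image of a single-file, uncompressed kernel
image = bytes(5) + bytes([KERNFLG_SINGLEFILE]) + bytes(10)
patched = set_debugger_flag(image)
print(hex(patched[KERNFLG_OFFSET]))                            # 0x6
print(hex(set_debugger_flag(patched, False)[KERNFLG_OFFSET]))  # 0x2
```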

However, the bug causing this issue should be gone. So closing this.

ecm-pushbx · 2024-09-03T11:13:56Z

> Looks like the INT 3 vector contains no sensible value on the original IBM-PC. Under PCjs it is set to 0:0. Can not confirm yet that this is also true for 86box, as it behaves differently (both work with the INT 3 commented out).

Is the vector 0:0 for the affected 86box machines too? I will prepare a patch that allows setting a "check only if vector appears valid" flag in the check debugger byte of the lCFG block.

ecm-pushbx · 2024-09-03T12:11:09Z

> Is the vector 0:0 for the affected 86box machines too? I will prepare a patch that allows setting a "check only if vector appears valid" flag in the check debugger byte of the lCFG block.

Added in https://hg.pushbx.org/ecm/edrdos/rev/1e453d972df2

boeckmann changed the title ~~Warning: can't find boot partition~~ Kernel fails to boot on 86box IBM-PC (1982), works IBM-XT onwards Aug 21, 2024

boeckmann added the "bug" label Aug 21, 2024

boeckmann closed this as completed Aug 23, 2024

ecm-pushbx mentioned this issue Sep 2, 2024

drbio / drdos module ported to NASM - differences to SvarDOS repo #104

Open

ecm-pushbx mentioned this issue Sep 7, 2024

unable to find COMMAND.COM when booting from secondary disk + wrong current drive set #111

Closed
