Support installation to non-C: drive #89

Closed
mateuszviste opened this issue Jul 30, 2024 · 36 comments
Labels
install The SvarDOS installer

Comments

@mateuszviste
Collaborator

The EDR kernel assigns drive letters in an order that is independent of the ACTIVE (bootable) flag on partitions. This means that it is possible to install SvarDOS on a drive that is not C:.

Install should support that, and preset pkg paths accordingly.

@mateuszviste mateuszviste added the install The SvarDOS installer label Jul 30, 2024
@boeckmann
Collaborator

FDISK got DR-DOS drive letter assignment support with commit 1c9271f

@mateuszviste
Collaborator Author

SvarDOS/edrdos#94 makes this feature more challenging than expected. Currently the installer can't assume the system drive after boot will be C:, but it also can't assume the drive letters won't change if there is more than one disk.

@mateuszviste
Collaborator Author

a possible solution would be for the installer to have two "stages". First stage would be about SYSing the destination drive, copying all SVP packages to it and rebooting. Second stage would be booting from the destination drive and running install.exe again just to detect the definitive bootdrive letter and generate all configurations accordingly (and then installing all SVP packages).

@mateuszviste
Collaborator Author

Current bleeding edge is able to list available drives (C,D,E or F) and install on any of them. It's functional, but still needs some polish:

  • drive selection screen needs more human-friendly wording
  • NLS (keyb & mode) directives in final autoexec.bat are malformed

Also, I'm not sure if FDISK /MBR is necessary after a successful SYS (maybe SYS is already putting proper bootcode in the MBR? to be checked).

Installation from a USB drive is likely broken, needs to be checked. I have never tested it with EDR-DOS, so maybe it won't even boot.

@boeckmann
Collaborator

> Also, I'm not sure if FDISK /MBR is necessary after a successful SYS (maybe SYS is already putting proper bootcode in the MBR? to be checked).

To my knowledge, SYS never touches the MBR but only the volume boot record. So this is probably needed unless the MBR already contains proper code.

@mateuszviste
Collaborator Author

> To my knowledge, SYS never touches the MBR but only the volume boot record.

Cool. So SYS + FDISK it is.

This makes me think about another issue with the new installer: it displays the list of DOS non-removable drives it detects, but it does not know anything else about the drives. If the user chooses a drive that belongs to the secondary HDD, the installer will still execute FDISK /MBR on the primary HDD, which is most probably not what we want. Also, I'm not sure the system boots if the user chooses a drive letter that is a logical partition. Plus the problem of a possibly missing ACTIVE flag that rofl0r described in #71.

@boeckmann
Collaborator

> This makes me think about another issue with the new installer: it displays the list of DOS non-removable drives it detects, but it does not know anything else about the drives. If the user chooses a drive that belongs to the secondary HDD, the installer will still execute FDISK /MBR on the primary HDD, which is most probably not what we want. Also, I'm not sure the system boots if the user chooses a drive letter that is a logical partition. Plus the problem of a possibly missing ACTIVE flag that rofl0r described in #71.

The system most probably will NOT boot from a logical drive, because the EMBR will not contain the code to do so, and logical partitions cannot be marked as active by FDISK.

Rofl0r's problem with the active flag could be solved by extending FDISK to allow activation by DOS drive letter instead of partition number. Whether the logical partitions could be hidden in the installer by querying DOS is something I have to research.

@ecm-pushbx

lDOS altmbr (based on syslinux's) can boot any logical partition, if the subsequent loaders also support such operation.

@ecm-pushbx

@mateuszviste
Collaborator Author

My plan for now is to fetch the MBR of all disks via INT 13h, AH=02h, parse their 4 partition entries to learn which one is active and get the starting sector of each partition, and then read the boot sector of every partition to extract its serial number. So I get a list of triplets: diskid (0x80, 0x81, ...), ACTIVE_FLAG, serial

Then, I ask DOS for the serial number of every drive it knows (INT 21h, AH=69h) and I match this with the earlier list. This way I get all the information I need:

  • drive (DOS) letter
  • the disk it belongs to (needed to call FDISK)
  • partition type = primary
  • its ACTIVE flag (either to warn the user or maybe even to set it myself)

This solves all the issues. And parsing the list of 4 partitions in a MBR is not very difficult.

The only risk is that two partitions might have the same serial... so maybe I should fingerprint something more. Extreme check would be to match over a hash of the entire boot sector, but reading this via DOS is more complex than just getting the serial (int 25h).
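A minimal sketch of the MBR-parsing step described above, assuming the 512-byte sector has already been read into memory (for instance through INT 13h, AH=02h). The `mbr_part` record and the `mbr_parse` name are hypothetical and only serve the illustration; the offsets (table at 1BEh, 16-byte entries, signature at 1FEh) are the standard MBR layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_TABLE_OFF   0x1BE   /* first of the 4 partition entries */
#define MBR_ENTRY_SIZE  16
#define MBR_SIG_OFF     0x1FE   /* bytes 55h AAh, i.e. the word AA55h */

/* Hypothetical record: one per primary partition entry. */
struct mbr_part {
  uint8_t  active;     /* 1 if the boot indicator byte is 80h */
  uint8_t  type;       /* partition type byte, 0 = unused entry */
  uint32_t lba_start;  /* first sector, counted from the start of the disk */
  uint32_t lba_count;  /* size in sectors */
};

/* Read a 32-bit little-endian value without relying on host byte order. */
static uint32_t rd32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills out[0..3] from a raw MBR sector. Returns 0 on success, or -1 if the
 * sector lacks the AA55h signature (disk most likely not partitioned yet). */
static int mbr_parse(const uint8_t *sector, struct mbr_part out[4]) {
  int i;
  assert(sector != NULL && out != NULL);
  if (sector[MBR_SIG_OFF] != 0x55 || sector[MBR_SIG_OFF + 1] != 0xAA) return -1;
  for (i = 0; i < 4; i++) {
    const uint8_t *e = sector + MBR_TABLE_OFF + i * MBR_ENTRY_SIZE;
    out[i].active = (uint8_t)(e[0] == 0x80);
    out[i].type = e[4];
    out[i].lba_start = rd32le(e + 8);
    out[i].lba_count = rd32le(e + 12);
  }
  return 0;
}
```

The caller would run this once per BIOS disk (0x80, 0x81, ...) and keep the disk id next to each record.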

@mateuszviste
Collaborator Author

my approach of matching drive letters to MBR entries based on the comparison of boot sectors obtained through BIOS vs DOS fails horribly on unformatted drives because the boot sector of such partition can be anything (in my tests on VMs the partitions were zeroed, so my code matched every letter to every partition...).

Perhaps the installer should propose every unformatted drive as an option for installation, ask to format it, and only then be able to tell whether or not it is okay to install SvarDOS on it.

This could be a passably robust approach. Another would be to figure out where EDR keeps its letters-to-disk-location information and read it there. But then the installer would be bound to EDR (and possibly to a specific version of it).

@boeckmann
Collaborator

I will have to think about it. The kernel's UDSC structure should provide all the info you need except the logical/non-logical info. I have not tested this myself yet, but you should be able to get the root of the UDSC chain via INT2F,0803 in EDR:

https://github.com/SvarDOS/edrdos/blob/8b1c69ca12c500af2e54dda6d94e242a3da2abf3/drbio/disk.asm#L514-L520

There is one UDSC entry per volume. RUNIT is the physical drive number, DRIVE is the logical drive number. The UDSC structure is described here:

https://github.com/SvarDOS/edrdos/blob/8b1c69ca12c500af2e54dda6d94e242a3da2abf3/drbio/udsc.equ#L38-L54

You should get the partition offset by looking into the BPB hidden field. You can use this to match with the start offset of the entries in the partition table of the drive specified by RUNIT in the UDSC, IN THEORY :)

INT2F,0803 should also work with FreeDOS, but the structure might be different:

https://github.com/FDOS/kernel/blob/1cc00e194dd969d30c78775c67a1df44af307abf/hdr/device.h#L206-L240

@ecm-pushbx

@mateuszviste
Collaborator Author

Thanks guys for your feedback. I did glance at int 2Fh,AX=0803h last evening but skipped it as I did not see how to match it against MBR entries. I also investigated the BPB route (through int 21h/AH=32h) but did not find anything MBR-related there either (what I was specifically looking for is the disk sector where the partition starts). And that's when I thought about doing this "brute force" bootsector matching.

What is this "BPB hidden field" that you mentioned?

@ecm-pushbx

Not a guy...

The BPB hidden field is part of the BPB. In the case of a primary partition, or, to support the boot loader, this should encode the sector number of the partition beginning. Most boot sector loaders will access their file systems using the (hardcoded) hidden sectors in their sector. (lDOS boot naturally is more flexible.)

The hidden sectors must be set to 0 for diskette file systems, including "super floppy" formats (larger than the usual 1440 KiB or 2880 KiB types). For primary partitions it gives the LBA of the partition within the whole unit. For logical partitions it may or may not be accurate: as these traditionally weren't bootable, the argument from boot loader use isn't as certain.

You can query the BPB from a block device driver. (Walk the DPB / DDSC chain to associate DOS drives with the block device driver and the driver's unit number.) This will contain the hidden sectors, but again it is unclear whether these are always the total-unit LBA number or something else. And the int 13h unit number is not contained in the BPB, but in the UPB / UDSC / DDT which is a structure internal to the block device driver. The UPB chain is what you get from 2F.0803.

For an example, EDR-DOS's bdosldr queries for the BPB of the block device unit and then compares the int 13h unit number and hidden sectors. In EDR-DOS the hidden sectors are always the unit-absolute LBA number. This is also true of the lDOS fork of MS-DOS v4 now.

In FreeDOS I think you currently would have to read the "ddt_offset" field: https://github.com/FDOS/kernel/blob/1cc00e194dd969d30c78775c67a1df44af307abf/hdr/device.h#L239

@mateuszviste
Collaborator Author

"guys" is gender neutral these days (or so I've been told). Sorry if you felt offended, was of course not my intention. :-)

Thanks for your clarifications. When you say "BPB" it seems you really mean "DDT" (or "UDSC" as Bernd calls it)? This is what confused me - Bernd also referred to "hidden BPB fields" earlier so I thought this is about some undocumented fields in the BPB.

Now I understand this is still about the struct returned by int 2Fh,AX=0803h. This struct includes a 32-bit "number of hidden sectors" at offset 11h, and so this is supposed to be "the number of disk sectors existing before the partition", is that right? If so, that would definitely solve my problem, since I would "only" have to translate the CHS geometry found in the MBR to a number of sectors to match the entries (or use the LBA address of first sector from MBR, but this field might not be always set properly). I will investigate this.

@boeckmann
Collaborator

The offset 11h is for the "hidden sectors" inside the BPB. The UDSC includes a BPB at offset 6, so it is at offset 17h from the UDSC start. As @ecm-pushbx noted, this value is always absolute in EDR and stores the partition offset from the start of the disk. You can use this to match partitions with the partition table stored in the MBR. For this you had better ignore the CHS values and rely only on the LBA values stored in the partition table. On FreeDOS you have to use another offset. There it is stored in the 32-bit ddt_offset.

It gets a little more complicated if you follow the EMBR chain to match for logical partitions, because LBA values are not always absolute in these tables, if I recall correctly. But I have to look up how the algorithm for this works...

@ecm-pushbx

> Thanks for your clarifications. When you say "BPB" it seems you really mean "DDT" (or "UDSC" as Bernd calls it)? This is what confused me - Bernd also referred to "hidden BPB fields" earlier so I thought this is about some undocumented fields in the BPB.

The hidden sectors field is part of a BPB, at offset 11h in the BPB or 1Ch in the boot sector. One (actually two) field(s) of the UPB / UDSC / DDT is a BPB, so therefore the UPB has a hidden sectors field within the UPB's BPB.

The hidden sector field is also present in the actual on-disk BPB that's always stored in the boot sector of a file system, where the BPB is at offset 0Bh (which gives 11h + 0Bh = 1Ch for the hidden sectors field within the boot sector). But as mentioned this is also not perfectly reliable.
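The offset arithmetic above can be written down as a short sketch. It assumes the caller already holds a raw copy of either a boot sector or an EDR UDSC; the two accessor names are hypothetical, and the UDSC offset of 6 for the embedded BPB is taken from the discussion above (it applies to EDR, not necessarily to other kernels).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BS_BPB_OFF      0x0B  /* BPB position inside a boot sector */
#define UDSC_BPB_OFF    0x06  /* BPB position inside an EDR UDSC (per this thread) */
#define BPB_HIDDEN_OFF  0x11  /* 32-bit hidden sectors field, relative to the BPB */

/* Read a 32-bit little-endian value without relying on host byte order. */
static uint32_t rd32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hidden sectors as stored on disk: 0Bh + 11h = 1Ch in the boot sector. */
static uint32_t bootsector_hidden(const uint8_t *bootsector) {
  assert(bootsector != NULL);
  return rd32le(bootsector + BS_BPB_OFF + BPB_HIDDEN_OFF);
}

/* Hidden sectors as kept by the kernel: 06h + 11h = 17h in the UDSC. */
static uint32_t udsc_hidden(const uint8_t *udsc) {
  assert(udsc != NULL);
  return rd32le(udsc + UDSC_BPB_OFF + BPB_HIDDEN_OFF);
}
```

Both accessors return the same kind of value, but only the in-kernel one is described in this thread as always absolute under EDR.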

> Now I understand this is still about the struct returned by int 2Fh,AX=0803h. This struct includes a 32-bit "number of hidden sectors" at offset 11h, and so this is supposed to be "the number of disk sectors existing before the partition", is that right?

No, it differs. The hidden-sectors-in-a-BPB-in-a-UPB may or may not be the true LBA number of the partition start. In MS-DOS v4 to v6 (excluding the v4 lDOS fork) for example, primary partitions do have the true LBA there but logical partitions do not, they instead have a relative offset from a container extended partition. This may be true of MS-DOS v7 in CHS access mode as well.

> If so, that would definitely solve my problem, since I would "only" have to translate the CHS geometry found in the MBR to a number of sectors to match the entries (or use the LBA address of first sector from MBR, but this field might not be always set properly). I will investigate this.

Au contraire, the CHS fields are actually less reliable than the LBA fields. Some OSes do depend on the CHS fields but I consider that bad form.

Replying to @boeckmann :

> The offset 11h is for the "hidden sectors" inside BPB. The UDSC includes a BPB at offset 6, so it is at offset 17h from the UDSC start. As @ecm-pushbx noted, this value is always absolute in EDR and stores the partition offset from the start of the disk. You can use this to match partitions with the partition table stored in the MBR.

Yes, but you should also check the int 13h unit number from the UDSC. Otherwise you may confuse hdb1 for hda1 if they have the same absolute LBA number on their respective units.
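A minimal sketch of that matching rule, assuming the hidden sectors value is absolute (as described for EDR above). Both structs are hypothetical simplifications: `dos_volume` stands for what would be extracted from one UDSC entry, `disk_part` for one primary entry read from a disk's MBR. The key point is that the comparison uses the pair (int 13h unit, start LBA), never the LBA alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: what the installer keeps from one UDSC entry. */
struct dos_volume {
  char     letter;     /* DOS drive letter, e.g. 'C' */
  uint8_t  bios_unit;  /* int 13h unit: 0x80, 0x81, ... */
  uint32_t hidden;     /* hidden sectors, assumed absolute */
};

/* Hypothetical: one primary partition entry found in a disk's MBR. */
struct disk_part {
  uint8_t  bios_unit;  /* disk the MBR was read from */
  uint8_t  index;      /* 0..3, position in the partition table */
  uint32_t lba_start;  /* start LBA from the partition table */
};

/* Returns the MBR entry backing volume v, or NULL if v does not sit on a
 * primary partition (for example a logical drive). */
static const struct disk_part *match_volume(const struct dos_volume *v,
                                            const struct disk_part *parts,
                                            size_t count) {
  size_t i;
  assert(v != NULL && (parts != NULL || count == 0));
  for (i = 0; i < count; i++) {
    if (parts[i].bios_unit == v->bios_unit && parts[i].lba_start == v->hidden) {
      return &parts[i];
    }
  }
  return NULL;
}
```

A volume for which this returns NULL would simply not be offered as a boot candidate.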

> For this you better ignore the CHS values and only rely on the LBA values stored in the partition table. On FreeDOS you have to use another offset. There it is stored in the 32 bit ddt_offset.

Yes.

> It gets a little more complicated if you follow the EMBR chain to match for logical partitions. Because LBA values are not always absolute in these tables, if I recall correctly. But have to look it up how the algorithm for this is...

Yes, the start LBA of a logical partition in an EPBR partition table is relative to the container Extended Partition, that is you have to add the EPBR absolute LBA to the relative start LBA to gain the absolute LBA of the file system on that logical partition.

However, the start LBA of another Extended Partition in an EPBR is relative to the outermost Extended Partition, which for deeply nested chains is not equal to the innermost EPBR's absolute LBA.

You may want to study lDOS examples of this in:

@ecm-pushbx

For the differently relative LBAs in EPBRs particularly look at: https://hg.pushbx.org/ecm/ldosmbr/file/998133cbf221/altmbr.asm#l208

base is the current (innermost) EPBR, or zero for the MBR. When a logical partition is selected, its start LBA in the EPBR partition table is added to base to get an absolute LBA start number. root is either zero (for the MBR again) or equal to the outermost EPBR's absolute LBA start. When a nested extended partition is to be read, its relative LBA start is added to root to gain an absolute LBA start number.
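The base/root rule can be sketched in C as follows. This is not the altmbr code, only an illustration of the same arithmetic: it walks an in-memory disk image instead of calling int 13h, starts from an extended partition already located in the MBR (so root equals that partition's absolute start), and treats types 05h and 0Fh as extended. The function name and the 64-hop loop guard are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value without relying on host byte order. */
static uint32_t rd32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static int is_extended(uint8_t type) { return type == 0x05 || type == 0x0F; }

/* Collects the absolute start LBA of each logical partition into out[].
 * Returns how many were stored, or -1 on a bad or out-of-range EPBR. */
static int walk_logicals(const uint8_t *image, uint32_t nsectors,
                         uint32_t ext_start, uint32_t *out, int max) {
  uint32_t root = ext_start;  /* outermost extended partition */
  uint32_t base = ext_start;  /* EPBR currently being read */
  int n = 0, hops, i;

  assert(image != NULL && out != NULL);
  for (hops = 0; hops < 64; hops++) {      /* guard against looping chains */
    const uint8_t *sec;
    uint32_t next = 0;
    int have_next = 0;

    if (base >= nsectors) return -1;
    sec = image + (size_t)base * 512;
    if (sec[510] != 0x55 || sec[511] != 0xAA) return -1;

    for (i = 0; i < 4; i++) {
      const uint8_t *e = sec + 0x1BE + i * 16;
      uint32_t rel = rd32le(e + 8);
      if (e[4] == 0) continue;             /* unused entry */
      if (is_extended(e[4])) {
        /* link to the next EPBR: relative to the outermost extended partition */
        if (!have_next) { next = root + rel; have_next = 1; }
      } else if (n < max) {
        /* logical partition: relative to the EPBR that describes it */
        out[n++] = base + rel;
      }
    }
    if (!have_next) break;
    base = next;
  }
  return n;
}
```

Adding the link offset to base instead of root would still work for the first hop, which is why the mistake is easy to miss on disks with only one or two logical drives.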

@mateuszviste
Collaborator Author

> Au contraire, the CHS fields are actually less reliable than the LBA fields. Some OSes do depend on the CHS fields but I consider that bad form.

This depends, I guess. But in any case, I'm afraid I do not have much choice - because I assume these fields (CHS/LBA) are written at the time the partition was created, so depending on which tool was used, on which computer and in which era, the LBA information may or may not be present. My idea was to use the LBA value whenever the partition type says "FAT16/32 LBA", and CHS for other cases. Is that wrong?

@boeckmann
Collaborator

In my opinion it is more dangerous to use the CHS values than to ignore them. Different PCs / BIOSes may disagree about the drive's CHS geometry, and then the CHS values in the MBR are garbage. On the other hand, it is almost certain that the LBA values are there and correct. You may use the CHS as a last resort if the LBA values are zero and the partition type is a known one. But I would be careful even with this.
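That policy (LBA first, CHS only as a guarded last resort) could look like the sketch below. The function names are hypothetical, the list of "known" types is an assumed short list of common FAT type bytes, and the geometry (heads, sectors per track) must come from the caller, which is exactly the part that different BIOSes may disagree on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value without relying on host byte order. */
static uint32_t rd32le(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumed list of FAT partition types; not exhaustive. */
static int is_fat_type(uint8_t t) {
  return t == 0x01 || t == 0x04 || t == 0x06 ||
         t == 0x0B || t == 0x0C || t == 0x0E;
}

/* Convert the packed 3-byte CHS start of an MBR entry to an LBA:
 * LBA = (C * heads + H) * sectors_per_track + (S - 1). */
static uint32_t chs_to_lba(const uint8_t *chs, uint32_t heads, uint32_t spt) {
  uint32_t h = chs[0];
  uint32_t s = chs[1] & 0x3F;                            /* sectors are 1-based */
  uint32_t c = ((uint32_t)(chs[1] & 0xC0) << 2) | chs[2];
  return (c * heads + h) * spt + (s - 1);
}

/* entry points at one 16-byte partition table entry.
 * Returns 0 if *lba came from the LBA field, 1 if it came from the CHS
 * fallback, and -1 if the entry is unused or cannot be trusted. */
static int entry_start_lba(const uint8_t *entry, uint32_t heads, uint32_t spt,
                           uint32_t *lba) {
  uint32_t start;
  assert(entry != NULL && lba != NULL);
  if (entry[4] == 0) return -1;                          /* unused entry */
  start = rd32le(entry + 8);
  if (start != 0) { *lba = start; return 0; }
  if (!is_fat_type(entry[4]) || (entry[2] & 0x3F) == 0) return -1;
  *lba = chs_to_lba(entry + 1, heads, spt);
  return 1;
}
```

A caller might log a warning whenever the return value is 1, since the result then depends on the geometry guess.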

@boeckmann
Collaborator

And another source of problems: partitions spanning beyond the 8.4 GB limit contain truncated CHS values, or special marker values.

@mateuszviste
Collaborator Author

Yes I know about the 8G limit, but then I imagine the fs type in MBR is set to "LBA-aware".
Anyhow - not every BIOS knows about LBA. When a BIOS knows only CHS, what does EDR do? Does it still rely on the LBA value, and convert it to CHS when reading/writing from/to HDD?

@boeckmann
Collaborator

Another problem: logical partitions are not flagged as FAT16 LBA if created under MS-DOS, but as FAT16.

@mateuszviste
Collaborator Author

logical partitions are (fortunately) not a concern for me :) all I'm interested in are "boot candidates", ie. the 4 MBR entries. And I do not even need to actually read or write, it's just a matter of matching "something" to the UDSC value so I can pin-point "MBR entry no 2 is drive C:"

@boeckmann
Collaborator

EDR only deals with the LBA values, as the FreeDOS kernel does, to my knowledge.

@ecm-pushbx

Original MS-DOS does use the CHS for some things but I don't see a reason for not using LBA throughout the stack until you actually call into the ROM-BIOS (ie int 13.02 or 13.03).

@boeckmann
Collaborator

> logical partitions are (fortunately) not a concern for me :) all I'm interested in are "boot candidates", ie. the 4 MBR entries. And I do not even need to actually read or write, it's just a matter of matching "something" to the UDSC value so I can pin-point "MBR entry no 2 is drive C:"

OK, so this is for filtering out the logical partitions and presenting the user a way to select the primary partition to install onto (and activate it)? For the second part there is a FDISK issue: FDOS/fdisk#81

@mateuszviste
Collaborator Author

understood. so let's forget about CHS, I will rely on LBA only and see how it goes. Thanks again both of you for your valuable insights!

> Ok, so for filtering out the logical partitions and present the user a way to select the primary partition to install onto (and activate it)? For the second part there is a FDISK issue: FDOS/fdisk#81

Yes, but activating a partition is trivial (compared to the UDSC/BPB stuff) :) so I can do it in the installer, at least as a short term solution. I'm already parsing the MBR anyway. When FDISK gets smarter then of course I can change it.

@mateuszviste
Collaborator Author

I confirm the int 2Fh,AX=0803h method of matching drives works. Current bleeding edge relies on it. It's much better and much simpler than my previous bootsector-hashing comparison. The installer still needs some polish, but it's functional. There is still one risk of mistake, though: if a primary partition happens to be located at the same absolute offset as the relative offset of a logical drive on the same disk, the installer might confuse the two and format the wrong one if the logical drive happens to be first in the UDSC DOS chain (because both would have the same value in the LBA start field).
There is a flag at offset 47h of the BPB that is supposed to provide the information about the drive being a primary/logical drive, but RBIL says it is broken on MS-DOS 5. Didn't check yet on EDR.

Besides that, a couple of boring/non-challenging things still left to do:

  • drive selection screen needs more human-friendly wording
  • NLS (keyb & mode) directives in final autoexec.bat are malformed
  • from-USB installation to be checked
  • make the selected drive ACTIVE (and remove the ACTIVE flag from all other MBR entries)
  • warn about the disk not having a proper 0xAA55 sig so the user knows it must be partitioned first
  • read MBRs of all disks (0x80, 0x81, 0x82, 0x83) - currently it is limited to 0x80 only
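Two of the items above (setting ACTIVE on the chosen entry while clearing it on the others, and refusing a disk without the 0xAA55 signature) fit in a few lines. This is a sketch on an in-memory copy of the MBR sector; `mbr_set_active` is a hypothetical name, and writing the sector back to disk is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_TABLE_OFF   0x1BE   /* first of the 4 partition entries */
#define MBR_ENTRY_SIZE  16
#define MBR_SIG_OFF     0x1FE   /* bytes 55h AAh, i.e. the word AA55h */

/* Marks entry `index` (0..3) as the only ACTIVE one in the given MBR sector.
 * Returns 0 on success, -1 if the index is out of range, the sector has no
 * AA55h signature (disk must be partitioned first) or the entry is unused. */
static int mbr_set_active(uint8_t *sector, int index) {
  int i;
  assert(sector != NULL);
  if (index < 0 || index > 3) return -1;
  if (sector[MBR_SIG_OFF] != 0x55 || sector[MBR_SIG_OFF + 1] != 0xAA) return -1;
  if (sector[MBR_TABLE_OFF + index * MBR_ENTRY_SIZE + 4] == 0) return -1;
  for (i = 0; i < 4; i++) {
    /* byte 0 of each entry is the boot indicator: 80h = active, 00h = not */
    sector[MBR_TABLE_OFF + i * MBR_ENTRY_SIZE] = (uint8_t)((i == index) ? 0x80 : 0x00);
  }
  return 0;
}
```

A -1 return caused by the missing signature is the case where the installer would tell the user to partition the disk first.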

@boeckmann
Collaborator

> There is still one risk of mistake, though: if a primary partition happens to be located at the same absolute offset as the relative offset of a logical drive on the same disk, the installer might confuse the two and format the wrong one if the logical drive happens to be first in the UDSC DOS chain (because both would have the same value in the LBA start field).

Nope, this will not be a problem, at least under EDR, as everything stored in the UDSC is absolute, including the start sectors of the logicals :). Keep in mind: under FreeDOS you should use the ddt_offset field instead. I do not know offhand at what offset this field is stored, but it is different from the one you are currently using.

@mateuszviste
Collaborator Author

> will not be a problem, at least under EDR, as everything stored in the UDSC is absolute.

Ah, perfect then. I did not understand this and thought the logical drives are kept as relative. Awesome.

> Keep in mind: under FreeDOS you should use the ddt_offset field instead. I do not know by mind at what offset this field is stored, but it is different then the one you are currently using.

The current offset happens to be the same as described by RBIL, it's cool. I'm not planning on going back to FreeDOS, so having the installer being EDR-only is totally fine. :-)

@ecm-pushbx

> I confirm the int 2Fh,AX=0803h method of matching drives works. Current bleeding edge relies on it. It's much better and much simpler than my previous bootsector-hashing comparison. The installer still needs some polish, but it's functional. There is still one risk of mistake, though: if a primary partition happens to be located at the same absolute offset as the relative offset of a logical drive on the same disk, the installer might confuse the two and format the wrong one if the logical drive happens to be first in the UDSC DOS chain (because both would have the same value in the LBA start field). There is a flag at offset 47h of the BPB that is supposed to provide the information about the drive being a primary/logical drive, but RBIL says it is broken on MS-DOS 5. Didn't check yet on EDR.

You mean this: https://fd.lod.bz/rbil/interrup/dos_kernel/2f0803.html#4327

---fixed media---
 47h	WORD	partition (FFFFh = primary, 0001h = extended)
		always 0001h for DOS 5+
 49h	WORD	absolute cylinder number of partition's start on physical drive
		(FFFFh if primary partition in DOS 4.x)

Three notes:

  1. This is part of the UPB, not any BPB.
  2. The offset can differ if the kernel supports FAT32.
  3. The "always 0001h" bit for MS-DOS v5 doesn't mean it is broken exactly. If the UPB is for a hard disk, and word [47h] is == 0001h, then word [49h] is assumed to refer to the cylinder of an extended partition, or 0.

It is added to the cylinder value between the LBA-to-CHS calculation and actually handing the tuple in cx:dx off to the ROM-BIOS. (That does mean the extended partition must be on a cylinder boundary to work as expected.)

The hidden sectors field in a BPB in a UPB then gives a "relative" start value for the logical partition, in principle relative to the cylinder in word [49h]. In MS-DOS v7 partitions accessed using CHS behave the same, whereas partitions using LBA are more like EDR-DOS.

@ecm-pushbx

So on MS-DOS v5 dword [47h] is 0000_0001h if the partition is a primary partition.

@mateuszviste
Collaborator Author

Current bleeding edge takes care of setting the ACTIVE flag on the partition that SvarDOS is being installed on.

left to do:

  • drive selection screen needs some human-friendly wording
  • NLS (keyb & mode) directives in final autoexec.bat are malformed

@mateuszviste
Collaborator Author

everything seems to work now. changes pushed to bleeding edge. closing this issue.

roytam1 pushed a commit to roytam1/SvarDOS that referenced this issue Sep 12, 2024
…e very good hints from Bernd and ECM - SvarDOS/bugz#89

git-svn-id: svn://svn.svardos.org/svardos@1940 911cea91-c70f-4353-bd03-772f58fe8c9d