Add support for multiple sectors per cluster in GhostFAT by kaetemi · Pull Request #81 · adafruit/tinyuf2

kaetemi · 2021-03-06T06:51:53Z

Add support for multiple sectors per cluster in GhostFAT to enable larger flash sizes.

…rger flash sizes

kaetemi · 2021-03-07T11:05:18Z

Affects the size of the FAT table (each entry points to a cluster, not to a sector), so the FAT table is smaller with a larger sector per cluster value. Affects the generation of FAT table entries (clusters instead of sectors.)

Affects the size of the text files. Each file takes a cluster at minimum, not a sector. This affects the offset of the current.uf2 file.

Writing is not affected.

hathach

to be honest, I am not FAT expert. This file is single handed by @henrygab to increase the capacity from 8MB to 32MB. Which is already large enough for bootloader. The changes look great to me and it shouldn't affect existing port with CFG_UF2_SECTORS_PER_CLUSTER =1 Therefore it is good for a merge.

Out of curiosity, what is your setup/platform that requires a much larger storage for firmware ?

kaetemi · 2021-03-09T04:44:06Z

For flashing Bridgetek EVE graphics controller (as peripheral).

It includes both program and graphics data. Support is up to 256MB flash.

hathach · 2021-03-09T04:52:45Z

For flashing Bridgetek EVE graphics controller (as peripheral).

It includes both program and graphics data. Support is up to 256MB flash.

One last question, have you tried to read back the CURRENT.UF2 and use that file to flash the Bridgetek peripheral (with a bit of modified like date, verion). It is like an sanity check to make sure both read/write behave correctly.

kaetemi · 2021-03-09T04:55:42Z

Yep, data between repeated read and write matches. Text and html file look good too.

hathach

Superb !! Thank you very much for the PR. It greatly increase the virtual disk capacity. I hope more user will find it useful since it is really neat but rarely see in mcu world.

kaetemi · 2021-03-09T05:12:52Z

FAT16 is relatively simple.

Header (the struct with all the size values)
FAT (just a big array of uint16, one for each cluster - a cluster is just a combined bunch of physical sectors to reduce the size of this array (and speed up allocation) - the value of the uint16 is either empty, EOF, or pointing to the next cluster of the file)
Root directory (just an array of file entries, one or more sectors, can be smaller than a cluster)
Data (the clusters, may also contain directories, a file is minimum one cluster since that's the allocation granularity)

hathach · 2021-03-09T05:32:37Z

Thank you for detail explanation, I also understand the basic but it is still complicated though 🤯

henrygab

@kaetemi, Excellent work.

@hathach, there are a number of changes I would recommend. Some of them strongly.

Let me know if any questions.

henrygab · 2021-03-10T18:05:31Z

 // NOTE: MS specification explicitly allows FAT to be larger than necessary
-#define BPB_SECTORS_PER_FAT       ( (BPB_TOTAL_SECTORS / FAT_ENTRIES_PER_SECTOR) + \
-                                   ((BPB_TOTAL_SECTORS % FAT_ENTRIES_PER_SECTOR) ? 1 : 0))
+#define TOTAL_CLUSTERS_ROUND_UP   ((BPB_TOTAL_SECTORS + BPB_SECTORS_PER_CLUSTER - 1) / BPB_SECTORS_PER_CLUSTER)


In the TOTAL_CLUSTERS_ROUND_UP macro, your changes do not protect against integer overflow. As this code is forked for use in many embedded systems, it's a good practice to code against this.
For example, prior BPB_SECTORS_PER_FAT macro splits the +1 case into its own clause, rather than adding the divisor - 1.
As this has zero runtime effect, it is recommended.
(It also helps raise awareness in system-level programmers of this common cause of security vulnerabilities.)

Interesting, thanks.

henrygab · 2021-03-10T18:08:04Z

-#define BPB_SECTORS_PER_FAT       ( (BPB_TOTAL_SECTORS / FAT_ENTRIES_PER_SECTOR) + \
-                                   ((BPB_TOTAL_SECTORS % FAT_ENTRIES_PER_SECTOR) ? 1 : 0))
+#define TOTAL_CLUSTERS_ROUND_UP   ((BPB_TOTAL_SECTORS + BPB_SECTORS_PER_CLUSTER - 1) / BPB_SECTORS_PER_CLUSTER)
+#define BPB_SECTORS_PER_FAT       ((TOTAL_CLUSTERS_ROUND_UP + FAT_ENTRIES_PER_SECTOR - 1) / FAT_ENTRIES_PER_SECTOR)


This rewrite BPB_SECTORS_PER_FAT removes handling of integer overflow in the macro.

As an alternative, add the following lines somewhere in this file:

STATIC_ASSERT( BPB_TOTAL_SECTORS + BPB_SECTORS_PER_CLUSTER > BPB_TOTAL_SECTORS ); STATIC_ASSERT( BPB_TOTAL_SECTORS + BPB_SECTORS_PER_CLUSTER - 1 > BPB_TOTAL_SECTORS );

henrygab · 2021-03-10T18:09:36Z

@@ -167,7 +166,7 @@ STATIC_ASSERT( CLUSTER_COUNT >= 0x1015 && CLUSTER_COUNT < 0xFFD5 );
 #define UF2_SIZE           (UF2_SECTORS * BPB_SECTOR_SIZE)

 #define UF2_FIRST_SECTOR   ((NUM_FILES + 1) * BPB_SECTORS_PER_CLUSTER) // WARNING -- code presumes each non-UF2 file content fits in single sector


This warning should be updated to reflect that the code now presumes that each non-UF2 file contents fit in a single cluster, rather than a single sector.

henrygab · 2021-03-10T18:11:29Z


 #define UF2_FIRST_SECTOR   ((NUM_FILES + 1) * BPB_SECTORS_PER_CLUSTER) // WARNING -- code presumes each non-UF2 file content fits in single sector
-#define UF2_LAST_SECTOR    ((UF2_FIRST_SECTOR + UF2_SECTORS - 1) * BPB_SECTORS_PER_CLUSTER)
+#define UF2_LAST_SECTOR    (UF2_FIRST_SECTOR + UF2_SECTORS - 1)


First, I apologize. The confusing naming convention here should have been fixed long ago.

At least this line of the commit is not recommended for long-term maintainability, as it will likely (continue to) cause confusion and bugs later, at least because it now adds another meaning to "sector" (sector offset within the data volume portion).

The confusion is rooted in the three different offsets commonly used: sector number (aka LBA) vs. cluster number vs. cluster index (and the related "off by two" errors the two cluster offsets cause).

Cluster Number vs. Cluster Index (why +2?)

I strongly recommend standardizing on the following:

Use the term "sector" when dealing with LBAs relative to the media as a whole.

Use the term "cluster_number" when dealing with the DirEntry calculations, or accessing the FAT by array index.

Use the term "cluster_index" when treating the clusters of the data area as an array, such as when converting from a sector, to finding the corresponding entry in the FAT table for that sector.

Note that, as used above, the first data sector's "cluster number" is 2, while the first data sector's "cluster index" is zero. This creates a whole slew of potential places where an off-by-two error can be introduced, when converting from cluster number to sector/LBA, but it's something that must just be accepted as part of the FAT file system.

Preferably, the entire code base would standardize on using exactly ONE of cluster_number or cluster_index. Alternatively, the entire code base would require the defines / variable name clearly indicate at each point whether it's the cluster number or cluster index. Either way, the use of helper functions to convert is highly recommended.

Again, you walked into code where these pre-existing defines failed to follow the common-sense rule. This likely resulted in your reading them as being a sector offset, rather than cluster index, and thus your removing the multiplication by the sectors per cluster.

Old Name New Name Meaning

UF2_SECTORS UF2_SECTOR_COUNT The count of sectors used by the UF2 file

n/a UF2_CLUSTER_COUNT The count of clusters used by the UF2 file

UF2_SIZE UF2_BYTE_COUNT The count of valid bytes to expose for the UF2 file

UF2_FIRST_SECTOR UF2_FIRST_CLUSTER_NUMBER The first cluster number used by the UF2 file

UF2_LAST_SECTOR UF2_LAST_CLUSTER_NUMBER The last cluster number used by the UF2 file

UF2_FIRST_SECTOR UF2_FIRST_CLUSTER_INDEX The first cluster index used by the UF2 file

UF2_LAST_SECTOR UF2_LAST_CLUSTER_INDEX The last cluster index used by the UF2 file

Using this style of consistent naming convention for these constants will significantly help with code clarity.

It's defined as UF2_FIRST_SECTOR + UF2_SECTORS - 1.
The multiplication multiplies UF2_SECTORS (the number of sectors) by sectors per cluster. That does not make sense.

henrygab · 2021-03-10T20:45:17Z

    {
      // Generate the FAT chain for the firmware "file"
-      uint32_t v = (sectionIdx * FAT_ENTRIES_PER_SECTOR) + i;
+      uint32_t v = (sectionIdx * FAT_ENTRIES_PER_SECTOR * BPB_SECTORS_PER_CLUSTER) + (i * BPB_SECTORS_PER_CLUSTER);


Comments to help refresh my memory

Here's my recollection:

All parts of this function rely on the memory buffer having been zero'd prior to entry to this function.

This top-level else if ( block_no < FS_START_ROOTDIR_SECTOR ) is the code that dynamically generates a single sector's worth of FAT data.

Lines 258..267 only apply if reading the first sector of either FAT.

At line 261, sets FAT[0]'s low 8 bits to match BPB

At lines 265-266, it sets all other bytes of FAT[0], FAT[1] (the two "special" FAT entries), AND all other bytes for the static (single-cluster) files to 0xFFFF (end-of-cluster-chain ... in other words, single cluster per file). I should probably split these lines so the code is more clear.

The loop at lines 269-278 loops for the entire FAT table in this sector, to fill in additional non-zero entries.
The value v is intended to be the FAT Cluster Index ... the array offset if looking at one entire FAT as a single array.

At line 274, it checks if v is part of the UF2 file. ???DANGER??? -- need to check this math, it seems off?

At line 276, based on being part of the UF2 file, the FAT entry data is generated.

Yes, this cheats a little, because GhostFAT presumes and generates a fully contiguous file. Thus, the "next cluster number" calculation is simply the next cluster, until reaching the last cluster (when it's set to an end-of-cluster mark).

Consider changing the name of v to fatArrayIndex

Consider changing the name of sectionIndex to sectionRelativeSector

Math at 274 looks right. Sector v is aligned to the cluster boundary (so it's rounded down, essentially), which means that should always be less or equal to the actual last sector.

But yes, as you specified before, it'll be simpler working with the clusters rather than sectors there.

henrygab · 2021-03-10T22:28:44Z

-      uint32_t v = (sectionIdx * FAT_ENTRIES_PER_SECTOR) + i;
+      uint32_t v = (sectionIdx * FAT_ENTRIES_PER_SECTOR * BPB_SECTORS_PER_CLUSTER) + (i * BPB_SECTORS_PER_CLUSTER);

      if ( UF2_FIRST_SECTOR <= v && v <= UF2_LAST_SECTOR )


Because v is the fatArrayIndex, this should use UF2_FIRST_CLUSTER_NUMBER and UF2_LAST_CLUSTER_NUMBER. Those are not defined, but recommended in prior comment.

henrygab · 2021-03-10T22:34:31Z

      if ( UF2_FIRST_SECTOR <= v && v <= UF2_LAST_SECTOR )
      {
-        ((uint16_t*) (void*) data)[i] = v == UF2_LAST_SECTOR ? 0xffff : v + 1;
+        ((uint16_t*) (void*) data)[i] = v == UF2_LAST_SECTOR ? 0xffff : (v / BPB_SECTORS_PER_CLUSTER) + 1;


Reverting this line is recommended. Your changes were based on the very confusing #define names used in the prior code, coupled with your commit's change to the semantic meaning of two of the #defines above.

Comments to help refresh my memory

Presuming the above change to use UF2_FIRST_CLUSTER_NUMBER and UF2_LAST_CLUSTER_NUMBER....

Here, v is the fatArrayIndex being modified by the particular loop iteration. The array index inside the FAT uses the "cluster number" as its index.

The correct values to insert here are the cluster number.
This appears to be setting the FAT entry at index v

Revert this line, keeping the simpler:

((uint16_t*) (void*) data)[i] = v == UF2_LAST_CLUSTER_NUMBER ? 0xffff : v + 1;

henrygab · 2021-03-10T22:43:35Z

  {
    sectionIdx -= FS_START_CLUSTERS_SECTOR;
-    if ( sectionIdx < NUM_FILES - 1 )
+    if ( sectionIdx < (NUM_FILES - 1) * BPB_SECTORS_PER_CLUSTER )


Comments to help refresh my memory

sectionIdx is renamed to sectionRelativeSector

The root directory in FAT16 is NOT part of the data volume

Therefore, if only having files in root as ghostfat does, the first data sectors are for the file contents.

Yes, good!

Please update the static assertion to reflect that each file in ghostfat must fit within a single cluster, not a single sector (e.g., ~lines 142 here)

henrygab · 2021-03-10T22:46:11Z

+#define CFG_UF2_FLASH_SIZE          (4*1024*1024)    // TODO absolute max flash size
+#define CFG_UF2_NUM_BLOCKS          (0x10109)        // just under 32MB
+#define CFG_UF2_SECTORS_PER_CLUSTER (1)
+// #define CFG_UF2_FLASH_SIZE          (256*1024*1024)  // TODO absolute max flash size


Remove the commented-out defines. Instead, if your intent is to define a highest value tested / supported, just make that a comment in the file.

kaetemi · 2021-03-10T23:57:37Z

@henrygab With the standardized terminology it'll be clearer, yes. Thanks for the suggestions. I'll do a cleanup tomorrow.

henrygab · 2021-03-11T02:25:37Z

@kaetemi -- I actually put together a PR. After updating some names, I found at least three bugs.

Check out #85 to see most of the changes, broken into many individual, tiny commits for easier review.

henrygab · 2021-03-11T06:09:46Z

@kaetemi ... I listed the wrong PR... mine WIP PR is #85. Would appreciate if you could try out the changes from my PR. The changes pass the automated builds.

FYI, I think your changes had some unexpected results, including effectively duplicating the small files' data (first sector per cluster) on all the other sectors of that same cluster. Not visible from the file system, but visible from a dump of the volume.

I think my PR addresses all the above items. Let me know if you agree. Unfortunately, I don't currently have my setup to build/test the changes "live" on a device. In the past, I've added code to manually exercise the ghostfat code, and dump the results to a disk image. I then used 010 editor (a hex editor with ability to parse structures) to look at FAT volumes, and found it helpful.

I lost that wrapper code. Maybe, if @hathach agrees, I can create a proper stub that generates well-defined patterns of data via board_flash_read() and dumps the full volume... with the goal of allowing automated testing of this ghostfat code (by generating images and running custom comparison that excludes directory entry date/time stamp changes).

hathach · 2021-03-11T06:57:56Z

I lost that wrapper code. Maybe, if @hathach agrees, I can create a proper stub that generates well-defined patterns of data via board_flash_read() and dumps the full volume... with the goal of allowing automated testing of this ghostfat code (by generating images and running custom comparison that excludes directory entry date/time stamp changes).

Thank you @henrygab for taking a deep look at this PR, I am totally clueless. For the stub, sure, you could just put it somewhere in the ghostfat as well if you are not familiar with the file org of the repo. This is more like an generic version across MCU based on nRF52. We are trying to standardize and make bootloader project easier for users/makers.

#if 0
// stub test code
#endif

Later on we could move it to its own tests if we manage to add unit test suite support to this repo as well.

Add support for multiple sectors per cluster in GhostFAT to enable la…

760e1e5

…rger flash sizes

hathach reviewed Mar 9, 2021

View reviewed changes

hathach approved these changes Mar 9, 2021

View reviewed changes

hathach merged commit 9cc3c03 into adafruit:master Mar 9, 2021

henrygab reviewed Mar 10, 2021

View reviewed changes

		@@ -167,7 +166,7 @@ STATIC_ASSERT( CLUSTER_COUNT >= 0x1015 && CLUSTER_COUNT < 0xFFD5 );
		#define UF2_SIZE (UF2_SECTORS * BPB_SECTOR_SIZE)

		#define UF2_FIRST_SECTOR ((NUM_FILES + 1) * BPB_SECTORS_PER_CLUSTER) // WARNING -- code presumes each non-UF2 file content fits in single sector

Old Name	New Name	Meaning
`UF2_SECTORS`	`UF2_SECTOR_COUNT`	The count of sectors used by the UF2 file
n/a	`UF2_CLUSTER_COUNT`	The count of clusters used by the UF2 file
`UF2_SIZE`	`UF2_BYTE_COUNT`	The count of valid bytes to expose for the UF2 file
`UF2_FIRST_SECTOR`	`UF2_FIRST_CLUSTER_NUMBER`	The first cluster number used by the UF2 file
`UF2_LAST_SECTOR`	`UF2_LAST_CLUSTER_NUMBER`	The last cluster number used by the UF2 file
`UF2_FIRST_SECTOR`	`UF2_FIRST_CLUSTER_INDEX`	The first cluster index used by the UF2 file
`UF2_LAST_SECTOR`	`UF2_LAST_CLUSTER_INDEX`	The last cluster index used by the UF2 file

Conversation

kaetemi commented Mar 6, 2021

Uh oh!

kaetemi commented Mar 7, 2021

Uh oh!

hathach left a comment

Choose a reason for hiding this comment

Uh oh!

kaetemi commented Mar 9, 2021

Uh oh!

hathach commented Mar 9, 2021

Uh oh!

kaetemi commented Mar 9, 2021

Uh oh!

hathach left a comment

Choose a reason for hiding this comment

Uh oh!

kaetemi commented Mar 9, 2021

Uh oh!

hathach commented Mar 9, 2021

Uh oh!

henrygab left a comment

Choose a reason for hiding this comment

Uh oh!

henrygab Mar 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

henrygab Mar 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kaetemi commented Mar 10, 2021

Uh oh!

henrygab commented Mar 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

henrygab commented Mar 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hathach commented Mar 11, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

henrygab Mar 10, 2021 •

edited

Loading

henrygab Mar 10, 2021 •

edited

Loading

henrygab commented Mar 11, 2021 •

edited

Loading

henrygab commented Mar 11, 2021 •

edited

Loading