# MDFS (ModularFS) - Overview

MDFS (ModularFS) is ModuOS’ native on-disk filesystem. It exists to give ModuOS a *simple, controllable, writable* filesystem that can evolve alongside the kernel and userland, without depending on complex third‑party specs.

> **Inspiration:** MDFS is primarily inspired by **EXT2** (inode + block bitmap + inode table layout) and **exFAT** (directory entry sets / multi-record directory entries + checksum).

---

## Version History

### Version 3 (Current)

- **Quadruple indirect pointers**: Added `indirect4` field to inodes
- **Maximum file size**: Increased from 512 GB (v2) to **256 TB** (512⁴ blocks × 4 KB = 2⁴⁸ bytes)
- **Maximum volume size**: 64 ZiB, about 75 ZB (2⁶⁴ blocks × 4 KB) - unchanged from v2
- **Backwards compatibility**: v3 can mount and read v2 filesystems
- **New filesystems**: mkfs creates v3 format by default

### Version 2 (Legacy)

- **Triple indirect pointers**: Support for files up to 512 GB
- **CRC32 checksums**: Added to superblocks and directory entries for data integrity
- **ACL support**: Access Control Lists for fine-grained permissions
- **High-performance caching**: Block and inode caching for improved performance
- **Improved stability**: Better error handling and recovery

### Version 1 (Historical)

- **Initial release**: Basic filesystem functionality
- **Double indirect pointers**: Support for files up to 1 GB
- **Simple directory structure**: Flat directory entries with basic metadata
- **No checksums**: Data integrity not guaranteed
- **No ACL support**: Basic Unix permissions only (owner/group/other)
- **Limited caching**: Minimal performance optimizations
- **Deprecated**: Not supported by current kernel (use v2 or v3)

---

## What is MDFS?

## Reference code (copy/paste)

This section includes the *essential* on-disk structs and algorithms, so implementers can quickly understand MDFS without hunting through the source.
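The size limits quoted in the version history follow from the pointer layout: a 4096-byte block holds 4096 / 8 = 512 eight-byte block pointers, so each extra level of indirection multiplies the number of reachable blocks by 512. The sketch below works through that arithmetic. It is not taken from the MDFS sources: `mdfs_level_blocks()`, `mdfs_level_for_block()`, and `mdfs_max_file_bytes()` are hypothetical helper names, and the constants are redefined locally (assuming 4096-byte blocks, 12 direct pointers, and 64-bit pointers) so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE   4096u                           /* assumed equal to MDFS_BLOCK_SIZE */
#define NUM_DIRECT   12u                             /* assumed equal to MDFS_MAX_DIRECT */
#define PTRS_PER_BLK (BLOCK_SIZE / sizeof(uint64_t)) /* 512 pointers per indirect block */

/* Data blocks reachable through ONE pointer at the given level:
   level 0 = direct pointer, levels 1..4 = single..quadruple indirect. */
uint64_t mdfs_level_blocks(unsigned level)
{
    assert(level <= 4);
    uint64_t n = 1;
    for (unsigned i = 0; i < level; i++)
        n *= PTRS_PER_BLK;
    return n;
}

/* Which pointer level serves logical file block `index`?
   Returns 0 for direct, 1..4 for the indirect levels, -1 if out of range. */
int mdfs_level_for_block(uint64_t index)
{
    if (index < NUM_DIRECT)
        return 0;
    index -= NUM_DIRECT;
    for (unsigned level = 1; level <= 4; level++) {
        uint64_t span = mdfs_level_blocks(level);
        if (index < span)
            return (int)level;
        index -= span;   /* skip past the blocks covered by this level */
    }
    return -1;
}

/* Largest file size in bytes when indirection up to `max_level` exists
   (2 for a v1-style inode, 3 for v2, 4 for v3). */
uint64_t mdfs_max_file_bytes(unsigned max_level)
{
    uint64_t blocks = NUM_DIRECT;
    for (unsigned level = 1; level <= max_level; level++)
        blocks += mdfs_level_blocks(level);
    return blocks * BLOCK_SIZE;
}
```

With these parameters the triple-indirect level alone covers 512³ × 4 KiB = 512 GiB, and the quadruple-indirect level covers 512⁴ × 4 KiB = 2⁴⁸ bytes (256 TiB). A real driver would presumably use the same subtraction walk to decide which of `direct[]`, `indirect1` through `indirect4` to follow for a given file offset.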
### Core constants + structs

Copied from `include/moduos/fs/MDFS/mdfs.h`:

```c
#define MDFS_MAGIC 0x5346444Du /* 'MDFS' little-endian */
#define MDFS_VERSION 3
#define MDFS_BLOCK_SIZE 4096u
#define MDFS_INODE_SIZE 256u
#define MDFS_MAX_DIRECT 12u
#define MDFS_MAX_NAME 255u
#define MDFS_DIR_REC_SIZE 32u

#define MDFS_DIRREC_PRIMARY 1u
#define MDFS_DIRREC_NAME 2u

#define MDFS_DIRFLAG_VALID 0x01u
#define MDFS_DIRFLAG_DELETED 0x02u

typedef struct __attribute__((packed)) {
    uint16_t mode;       // 0x4000 dir, 0x8000 file
    uint16_t _pad0;
    uint32_t uid;
    uint32_t gid;
    uint64_t size_bytes;
    uint32_t link_count;
    uint32_t flags;
    uint64_t direct[MDFS_MAX_DIRECT];
    uint64_t indirect1;  // single indirect
    uint64_t indirect2;  // double indirect
    uint64_t indirect3;  // triple indirect
    uint64_t indirect4;  // quadruple indirect (v3 - adds 512^4 blocks, about 256 TB)
    uint8_t _pad[MDFS_INODE_SIZE - 2 - 2 - 4 - 4 - 8 - 4 - 4 - (8*MDFS_MAX_DIRECT) - 8 - 8 - 8 - 8];
} mdfs_inode_t;

typedef struct __attribute__((packed)) {
    uint8_t rec_type;      // MDFS_DIRREC_PRIMARY
    uint8_t flags;         // MDFS_DIRFLAG_*
    uint8_t entry_type;    // 1=file,2=dir
    uint8_t record_count;  // total records in entry set (including this primary)
    uint32_t inode;
    uint16_t name_len;     // UTF-8 bytes
    uint16_t _rsv0;
    uint32_t checksum;     // CRC32 over entry set with this field zero
    uint8_t _pad[32 - 1 - 1 - 1 - 1 - 4 - 2 - 2 - 4];
} mdfs_dir_primary_t;

typedef struct __attribute__((packed)) {
    uint8_t rec_type;      // MDFS_DIRREC_NAME
    uint8_t name_bytes[31];
} mdfs_dir_name_t;

typedef struct __attribute__((packed)) {
    uint32_t magic;
    uint32_t version;
    uint32_t block_size;
    uint32_t _reserved0;

    uint64_t total_blocks;
    uint64_t free_blocks;
    uint64_t total_inodes;
    uint64_t free_inodes;

    uint64_t block_bitmap_start;
    uint64_t block_bitmap_blocks;
    uint64_t inode_bitmap_start;
    uint64_t inode_bitmap_blocks;
    uint64_t inode_table_start;
    uint64_t inode_table_blocks;

    uint64_t root_inode;
    uint8_t uuid[16];
    uint32_t features;
    uint32_t checksum; /* CRC32 over superblock with this field zero */

    /* NOTE: the fields above occupy 128 bytes (4 x u32, 11 x u64, uuid,
       features, checksum), but the expression below subtracts only 96,
       so the struct as written is 4128 bytes, not one 4096-byte block. */
    uint8_t pad[MDFS_BLOCK_SIZE - (4*4) - (6*8) - (1*8) - 16 - 4 - 4];
} mdfs_superblock_t;
```

### CRC32 reference implementation

Copied from `src/fs/MDFS/mdfs_disk.c`:

```c
uint32_t mdfs_crc32_buf(const void *data, size_t len) {
    const uint8_t *p = (const uint8_t*)data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        /* Run the low byte of (crc ^ input) through 8 shift/xor steps. */
        uint32_t x = (crc ^ p[i]) & 0xFFu;
        for (int b = 0; b < 8; b++) {
            x = (x >> 1) ^ (0xEDB88320u & (-(int)(x & 1u)));
        }
        crc = (crc >> 8) ^ x;
    }
    return ~crc;
}
```

MDFS (currently **version 3**) is a block-based filesystem with:

- **NTFS-like ACL permissions** (fine-grained access control with allow/deny rules)
- **High-performance caching** (block cache, inode cache, buffer pool - 10-100x faster!)
- A fixed **4 KiB block size** (`MDFS_BLOCK_SIZE = 4096`)
- An **inode table** with fixed-size inodes (`MDFS_INODE_SIZE = 256`)
- **Block and inode bitmaps** for allocation
- A **directory format based on entry sets** (similar to exFAT)
- Basic integrity checks via **CRC32** on superblocks and directory entry sets

MDFS is designed for hobby OS use:

- Simple to implement in kernel space
- Easy to debug on raw disks/images
- Good enough for real read/write workloads (apps, config, user home)

---

## Why was it made?

ModuOS needs a writable filesystem that:

1. Works well with the ModuOS VFS and device model (`vDrive`)
2. Is not constrained by external compatibility requirements
3. Avoids the complexity of journaling and advanced features (for now)
4. Supports long file names and a robust directory structure

FAT is widely supported but has design limitations and edge cases. EXT2 is a solid inspiration but still requires careful parsing and has compatibility concerns.

MDFS aims to keep the parts that are *useful for an OS project* while staying small and easy to evolve.
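One place where that small surface shows is checksumming: both the superblock and directory entry sets carry a CRC32 computed over the structure with its own `checksum` field set to zero. The following is a minimal sketch of that pattern, not the MDFS implementation. `crc32_buf()` is a compact restatement of the same bitwise CRC32, and `checksum_with_field_zeroed()`, `checksum_seal()`, and `checksum_verify()` are hypothetical helpers. It also assumes the checksum is stored in host byte order on a little-endian machine, which the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32 (reflected polynomial 0xEDB88320), restated here so the
   snippet is self-contained. */
uint32_t crc32_buf(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical helper: CRC32 of `buf` with the 4-byte field at `field_off`
   treated as zero. The original field contents are restored afterwards. */
uint32_t checksum_with_field_zeroed(void *buf, size_t len, size_t field_off)
{
    assert(field_off + sizeof(uint32_t) <= len);
    uint8_t *p = (uint8_t *)buf;
    uint32_t saved;
    memcpy(&saved, p + field_off, sizeof saved);
    memset(p + field_off, 0, sizeof saved);
    uint32_t crc = crc32_buf(p, len);
    memcpy(p + field_off, &saved, sizeof saved);
    return crc;
}

/* Hypothetical helper: compute the checksum and store it in the field
   (host byte order, an assumption of this sketch). */
void checksum_seal(void *buf, size_t len, size_t field_off)
{
    uint32_t crc = checksum_with_field_zeroed(buf, len, field_off);
    memcpy((uint8_t *)buf + field_off, &crc, sizeof crc);
}

/* Hypothetical helper: returns 1 if the stored checksum matches, else 0. */
int checksum_verify(void *buf, size_t len, size_t field_off)
{
    uint32_t stored;
    memcpy(&stored, (const uint8_t *)buf + field_off, sizeof stored);
    return stored == checksum_with_field_zeroed(buf, len, field_off);
}
```

For a superblock the caller would pass something like `offsetof(mdfs_superblock_t, checksum)` as `field_off`. For a directory entry, the buffer would be the whole entry set (`record_count` × 32 bytes) with the offset of `checksum` inside the primary record. Which exact byte range the kernel feeds to the CRC is not specified here, so `len` is left to the caller.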
---

## High-level on-disk layout (EXT2-inspired)

MDFS uses a straightforward “fixed regions” layout described by the superblock:

- **Block 0**: reserved
- **Block 1**: superblock
- **Block 2**: backup superblock
- **Block bitmap**: tracks used/free blocks
- **Inode bitmap**: tracks used/free inodes
- **Inode table**: array of fixed-size inodes
- **Data blocks**: file and directory contents

The exact start/size of each region is stored in the superblock:

- `block_bitmap_start`, `block_bitmap_blocks`
- `inode_bitmap_start`, `inode_bitmap_blocks`
- `inode_table_start`, `inode_table_blocks`
- `root_inode`

Source: `include/moduos/fs/MDFS/mdfs.h` (`mdfs_superblock_t`)

---

## Inodes (EXT2-style, simplified)

Each inode (`mdfs_inode_t`) includes:

- `mode`: directory vs file (0x4000 dir, 0x8000 file)
- `uid`, `gid`
- `size_bytes`
- `link_count`
- `direct[12]`: **12 direct block pointers**
- `indirect1`, `indirect2`, `indirect3`, `indirect4`: single/double/triple/quadruple-indirect block pointers for large files

MDFS v3 supports very large files via quadruple indirect pointers (up to 256 TB per file).

### File Size Limits

With 4 KB blocks and 512 pointers per indirect block (4096 bytes / 8-byte pointers):

| Pointer Type | Capacity | Max Size |
|--------------|----------|----------|
| Direct (12 blocks) | 12 × 4 KB | 48 KB |
| Single indirect | 512 × 4 KB | 2 MB |
| Double indirect | 512² × 4 KB | 1 GB |
| Triple indirect | 512³ × 4 KB | 512 GB |
| **Quadruple indirect (v3)** | **512⁴ × 4 KB** | **256 TB** |

Maximum volume size: **64 ZiB** (about 75 ZB) - 2⁶⁴ blocks × 4 KB

---

## Directories (exFAT-style entry sets)

MDFS v2 directories store entries as sets of **32-byte records** (`MDFS_DIR_REC_SIZE = 32`).

A directory entry is not a single struct; it’s an **entry set**:

1. **Primary record** (`mdfs_dir_primary_t`)
   - includes inode number, type (file/dir), name length
   - includes `record_count` (how many records make up this entry)
   - includes `checksum` (CRC32 of the full entry set)
2. One or more **Name records** (`mdfs_dir_name_t`)
   - each stores **31 bytes** of UTF‑8 filename payload
   - long names are stored across multiple name records

This approach gives MDFS:

- Long file name support (up to 255 bytes)
- More robust validation (checksum)
- A future path to storing extra metadata via additional record types

Source: `include/moduos/fs/MDFS/mdfs.h`, `src/fs/MDFS/mdfs_dir.c`

---

## Allocation (bitmaps)

Allocation is done with bitmap scans using **multi-block bitmaps** (`*_bitmap_blocks`), so MDFS volumes can scale up to the MBR (~2 TB) limit (with 4 KiB blocks).

- Allocate inode: scan inode bitmap for a free bit, set it
- Allocate block: scan block bitmap for a free bit, set it

This is EXT2-like and easy to reason about.

---

## Integrity: CRC32

MDFS uses CRC32 (IEEE polynomial `0xEDB88320`) for:

- **Superblock checksum** (`mdfs_superblock_t.checksum`)
- **Directory entry set checksum** (`mdfs_dir_primary_t.checksum`)

These are “best effort” integrity checks (not a journal):

- They help detect corruption
- They do not guarantee crash consistency

Sources: `src/fs/MDFS/mdfs_disk.c`, `src/fs/MDFS/mdfs_dir.c`

---

## Current feature set / limitations

Implemented (v2):

- Mount / mkfs
- Read/write files by path
- Offset-aware writes (FD layer can append/stream writes)
- Large files via direct + single/double/triple indirect blocks
- Multi-block block/inode bitmaps (large volumes)
- Directory list/lookup/add/remove (entry sets)
- `mkdir`, `rmdir`, `unlink`

Not yet (planned / future):

- Journaling / crash consistency
- Permissions enforcement
- Timestamps
- Extent-based allocation

Already implemented (v2, in-kernel):

- Write-behind inode caching for streaming writes (`mdfs_write_file_at_by_inode()` + `mdfs_flush_inode()`)
- Allocation batching / preallocation for aligned sequential writes (batched bitmap alloc + `mdfs_set_block_ptr_range()`)

---

## How to replicate / port MDFS

If you want to implement MDFS in another OS (or as a Linux/Windows/BSD driver/library), see:

- [MDFS – How to Replicate / Port](MDFS-How-To-Replicate.md)

A ready-made corpus image generator is also available:

- `tools/mdfs_make_test_image.py` (produces `dist/mdfs_test.img`)

---

## Where to find the code

- Headers: `include/moduos/fs/MDFS/`
- Implementation: `src/fs/MDFS/`

Key files:

- `mdfs.h` – on-disk structures and exported API
- `mdfs_disk.c` – block and inode IO + CRC32
- `mdfs_dir.c` – v2 directory entry-set logic
- `mdfs_api.c` – VFS-facing helpers (path lookup, read/write, mkdir/rmdir/unlink)

---

## Why EXT2 + exFAT as inspiration?

- **EXT2** provides a proven mental model for a “classic” Unix-like filesystem:
  - inodes
  - bitmaps
  - inode table
  - blocks
- **exFAT**’s directory entry sets provide a pragmatic way to do:
  - long filenames
  - checksum-based validation
  - structured directory metadata without huge complexity

MDFS mixes these ideas into a filesystem tailored for ModuOS.
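To make the exFAT-style half of that mix concrete, the sketch below packs a file name into an entry set: one 32-byte primary record followed by as many 31-byte name records as the name needs. It is an illustration under stated assumptions rather than the code in `mdfs_dir.c`. The function names `entry_set_records()` and `entry_set_build()` are hypothetical, the primary record layout is restated locally, multi-byte fields are written in host byte order (assumed little-endian), unused name bytes are assumed to be zero-filled, and a live entry is assumed to carry the "valid" flag `0x01`. The checksum field is left at zero for the caller to fill in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_SIZE     32u   /* assumed equal to MDFS_DIR_REC_SIZE */
#define NAME_PER_REC 31u   /* name payload bytes per name record */
#define MAX_NAME     255u  /* assumed equal to MDFS_MAX_NAME */

/* Local restatement of the primary record layout described above. */
typedef struct __attribute__((packed)) {
    uint8_t  rec_type, flags, entry_type, record_count;
    uint32_t inode;
    uint16_t name_len;
    uint16_t _rsv0;
    uint32_t checksum;
    uint8_t  _pad[16];
} dir_primary_t;

_Static_assert(sizeof(dir_primary_t) == REC_SIZE, "primary record must be 32 bytes");

/* Records needed for a name: 1 primary + ceil(name_len / 31) name records.
   Returns 0 for an empty or over-long name. */
unsigned entry_set_records(size_t name_len)
{
    if (name_len == 0 || name_len > MAX_NAME)
        return 0;
    return 1u + (unsigned)((name_len + NAME_PER_REC - 1) / NAME_PER_REC);
}

/* Writes an entry set for `name` into `out`. Returns the number of bytes
   written, or 0 if the name is invalid or `out_cap` is too small. */
size_t entry_set_build(uint8_t *out, size_t out_cap, const char *name,
                       uint32_t inode, uint8_t entry_type)
{
    size_t name_len = strlen(name);
    unsigned records = entry_set_records(name_len);
    size_t total = (size_t)records * REC_SIZE;
    if (records == 0 || total > out_cap)
        return 0;

    memset(out, 0, total);              /* zero padding and unused name bytes */

    dir_primary_t pri;
    memset(&pri, 0, sizeof pri);
    pri.rec_type     = 1;               /* primary record type */
    pri.flags        = 0x01;            /* "valid" flag (assumed for live entries) */
    pri.entry_type   = entry_type;      /* 1 = file, 2 = dir */
    pri.record_count = (uint8_t)records;
    pri.inode        = inode;
    pri.name_len     = (uint16_t)name_len;
    memcpy(out, &pri, sizeof pri);      /* checksum stays zero for now */

    for (unsigned r = 1; r < records; r++) {
        uint8_t *rec = out + (size_t)r * REC_SIZE;
        size_t off   = (size_t)(r - 1) * NAME_PER_REC;
        size_t chunk = name_len - off < NAME_PER_REC ? name_len - off : NAME_PER_REC;
        rec[0] = 2;                     /* name record type */
        memcpy(rec + 1, name + off, chunk);
    }
    return total;
}
```

A 9-byte name such as `hello.txt` fits in two records (64 bytes), while a maximum-length 255-byte name needs 1 + 9 = 10 records (320 bytes), which still fits in the 8-bit `record_count` field. After building the set, a writer would compute the CRC32 over all `record_count` records and store it in the primary record's `checksum` field.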