-
Notifications
You must be signed in to change notification settings - Fork 0
Performance Issues
Problem: Every inode read/write allocates and frees a 4KB block
// mdfs_disk_write_inode() - called for EVERY inode update
uint8_t *blk = (uint8_t*)kmalloc(MDFS_BLOCK_SIZE); // 4KB allocation
if (mdfs_disk_read_block(..., blk) != VDRIVE_SUCCESS) { kfree(blk); return -2; }
memcpy(blk + off, in, sizeof(*in)); // Only writing 256 bytes!
if (mdfs_disk_write_block(..., blk) != VDRIVE_SUCCESS) { kfree(blk); return -3; }
kfree(blk); // Immediate freeImpact:
- Every file operation does 2-4 inode reads/writes
- Each read/write allocates 4KB just to access 256 bytes
- Memory allocator overhead on every operation
- Heap fragmentation over time
Solution: Inode caching or pre-allocated buffers
Problem: Writing a single inode requires:
- Read entire 4KB block from disk
- Modify 256 bytes
- Write entire 4KB block back
Impact:
- 2x disk I/O (read + write) for single inode update
- On slow storage (IDE), this can take 10-20ms per inode
Solution: Write-behind caching or inode buffer pool
Problem: Block/inode allocation scans from start every time
for (uint64_t b = start; b < total; b++) {
// Read bitmap block
// Check if bit is free
// If free, allocate and write back
}Impact:
- For near-full filesystem: O(n) where n = total blocks
- Each iteration reads a bitmap block (4KB)
- Creating 100 files could scan gigabytes of bitmaps
Solution: Free block hint, bitmap caching, or allocation groups
Problem: Every block read goes directly to disk, no cache
Impact:
- Reading same directory multiple times = multiple disk reads
- Metadata blocks (inodes, bitmaps) re-read constantly
- Directory listings very slow
Solution: LRU block cache (even 64KB would help massively)
Problem: Every write is synchronous to disk
Impact:
- Small writes (1 byte) trigger full 4KB disk write
- No write coalescing
- Very slow for sequential writes
Solution: Write-back buffer with periodic flush
Problem: Software CRC32 with inner loop
for (size_t i = 0; i < len; i++) {
uint32_t x = (crc ^ p[i]) & 0xFFu;
for (int b = 0; b < 8; b++) { // 8 iterations per byte!
x = (x >> 1) ^ (0xEDB88320u & (-(int)(x & 1u)));
}
crc = (crc >> 8) ^ x;
}Impact:
- For 4KB block: 4096 * 8 = 32,768 loop iterations
- Moderate CPU overhead
Solution: Table-based CRC32 (256-entry lookup table)
| Operation | Disk Reads | Disk Writes | Mallocs | Time (estimate) |
|---|---|---|---|---|
| Create file | 6-8 | 6-8 | 6-8 | 60-160ms |
| Read inode | 1 | 0 | 1 | 10ms |
| Write inode | 1 | 1 | 1 | 20ms |
| Allocate block | 1-1000+ | 1 | 1 | 10ms-10s |
| Read 4KB file | 2-3 | 0 | 2-3 | 20-30ms |
| Write 4KB file | 3-4 | 3-4 | 3-4 | 30-80ms |
Note: Times based on typical IDE drive (10ms seek + 5ms read/write)
-
Creating 100 files on 90% full disk
- Bitmap scans: 100 * average_scan_time
- Could take minutes on large filesystems
-
Reading directory with 50 files
- No caching: Re-reads same directory block 50 times
- 50 * 10ms = 500ms just for one listing
-
Writing log file (1000 x 100 byte writes)
- Each write: read inode + write inode + write block
- 1000 * 40ms = 40 seconds for 100KB of data
Implement simple LRU cache:
- Size: 64-256 blocks (256KB - 1MB)
- Cache inodes, bitmaps, frequently accessed data
- Expected speedup: 10-100x for metadata operations
typedef struct {
uint64_t block_no;
uint32_t fs_handle;
uint8_t data[MDFS_BLOCK_SIZE];
uint32_t dirty;
uint32_t lru_counter;
} block_cache_entry_t;
#define CACHE_SIZE 64
static block_cache_entry_t g_cache[CACHE_SIZE];Keep hot inodes in memory:
- Cache 32-64 most recently used inodes
- Avoid read-modify-write for every update
- Expected speedup: 5-10x for file operations
Track last allocated block/inode:
- Start search from last allocation
- Avoid scanning from beginning every time
- Expected speedup: 10-100x for allocation
typedef struct {
// ... existing fields ...
uint64_t last_allocated_block;
uint32_t last_allocated_inode;
} mdfs_fs_t;Buffer writes and flush periodically:
- Reduces sync writes
- Allows write coalescing
- Expected speedup: 2-5x for write-heavy workloads
Pre-computed lookup table:
- 256-entry table = 1KB memory
- Expected speedup: 8x for CRC calculations
static uint8_t g_mdfs_temp_buffers[4][MDFS_BLOCK_SIZE];
static int g_mdfs_temp_in_use[4] = {0};
static uint8_t* mdfs_alloc_temp_buffer(void) {
for (int i = 0; i < 4; i++) {
if (!g_mdfs_temp_in_use[i]) {
g_mdfs_temp_in_use[i] = 1;
return g_mdfs_temp_buffers[i];
}
}
return kmalloc(MDFS_BLOCK_SIZE); // Fallback
}
static void mdfs_free_temp_buffer(uint8_t *buf) {
for (int i = 0; i < 4; i++) {
if (buf == g_mdfs_temp_buffers[i]) {
g_mdfs_temp_in_use[i] = 0;
return;
}
}
kfree(buf); // Was fallback allocation
}Benefit: Eliminates most kmalloc/kfree overhead
// In mdfs_fs_t, add:
uint64_t alloc_hint_block;
uint32_t alloc_hint_inode;
// In mdfs_alloc_block(), start from hint:
for (uint64_t b = fs->alloc_hint_block; b < total; b++) {
// ... search ...
if (found) {
fs->alloc_hint_block = b + 1; // Update hint
return 0;
}
}Benefit: O(1) allocation when disk not fragmented
Add performance counters:
typedef struct {
uint64_t disk_reads;
uint64_t disk_writes;
uint64_t cache_hits;
uint64_t cache_misses;
uint64_t malloc_count;
} mdfs_stats_t;Track:
- Disk I/O count
- Cache hit/miss ratio
- Memory allocation count
- Operation latency
- Overview - MDFS architecture
- ACL Permissions - Access control