Skip to content

🔴 CRITICAL: Security and Robustness Fixes Required Before Excel Tables Implementation #27

@sbroenne

Description

@sbroenne

🚨 CRITICAL ISSUES - MUST FIX BEFORE ISSUE #24

These critical security and robustness issues were identified during senior architect review of the Excel Tables implementation plan (Issue #24). They must be addressed before proceeding with Excel Tables development.

Severity: 🔴 CRITICAL
Priority: P0 - Blocker
Blocks: Issue #24 (Excel Tables Implementation)


Issue 1: Path Traversal Vulnerability 🔴

Problem

TableCommands implementation does not use PathValidator for file path validation, exposing path traversal vulnerability.

Current Code (Vulnerable)

public TableListResult List(string filePath)
{
    if (!File.Exists(filePath))
        return new TableListResult { /* error */ };
    
    // No path validation - VULNERABLE!
}

Required Fix

using Sbroenne.ExcelMcp.Core.Security;

public TableListResult List(string filePath)
{
    // CRITICAL: Validate path to prevent traversal attacks
    filePath = PathValidator.ValidateExistingFile(filePath, nameof(filePath));
    
    var result = new TableListResult { FilePath = filePath, Action = "list-tables" };
    // Rest of implementation...
}

Impact

  • Security Risk: HIGH
  • Attack Vector: Path traversal (e.g., ../../etc/passwd, C:\Windows\System32\config\SAM)
  • Affected Methods: List(), Create(), Rename(), Delete(), GetInfo()

Acceptance Criteria

  • ✅ All 5 TableCommands methods use PathValidator.ValidateExistingFile()
  • ✅ PathValidator called FIRST before any file operations
  • ✅ Unit tests verify path validation (invalid paths rejected)

Issue 2: Null Reference - HeaderRowRange/DataBodyRange 🔴

Problem

HeaderRowRange and DataBodyRange can be NULL when tables have no headers or no data, causing NullReferenceException.

Current Code (Vulnerable)

headerRowRange = listObject.HeaderRowRange;  // CAN BE NULL!
dataBodyRange = listObject.DataBodyRange;    // CAN BE NULL!

var tableInfo = new TableInfo
{
    DataRowCount = dataBodyRange.Rows.Count,  // CRASH if null!
    ColumnCount = headerRowRange.Columns.Count  // CRASH if null!
};

Required Fix

// Safe access with null checks
headerRowRange = listObject.ShowHeaders ? listObject.HeaderRowRange : null;
dataBodyRange = listObject.DataBodyRange;  // Can legitimately be null

var tableInfo = new TableInfo
{
    DataRowCount = dataBodyRange?.Rows.Count ?? 0,  // Safe
    ColumnCount = headerRowRange?.Columns.Count ?? 0  // Safe
};

// Get column names only if header exists
if (headerRowRange != null && listObject.ShowHeaders)
{
    dynamic? columns = listObject.ListColumns;
    try
    {
        for (int k = 1; k <= columns.Count; k++)
        {
            dynamic? column = null;
            try
            {
                column = columns.Item(k);
                tableInfo.ColumnNames.Add(column.Name);
            }
            finally
            {
                ComUtilities.Release(ref column);
            }
        }
    }
    finally
    {
        ComUtilities.Release(ref columns);
    }
}

Impact

  • Reliability Risk: HIGH
  • Failure Mode: NullReferenceException crashes application
  • Affected Methods: List(), GetInfo()

Acceptance Criteria

  • ✅ All HeaderRowRange accesses have null checks
  • ✅ All DataBodyRange accesses have null checks
  • ✅ Integration tests with tables having no data (header-only tables)
  • ✅ Integration tests with tables having ShowHeaders=false

Issue 3: COM Cleanup After Unlist() 🔴

Problem

ListObject.Unlist() removes the table but doesn't release the COM reference, causing memory leaks and potential use-after-free bugs.

Current Code (Vulnerable)

// Delete operation
listObject.Unlist();  // Removes table but listObject still points to dead COM object

// Later code might try to access listObject - CRASH or leak!

Required Fix

// Delete operation with proper cleanup
try
{
    listObject = listObjects.Item(j);
    if (listObject.Name == tableName)
    {
        // Step 1: Unlist (converts to range)
        listObject.Unlist();
        
        // Step 2: CRITICAL - Release COM reference
        ComUtilities.Release(ref listObject);
        
        // Step 3: Explicit null to prevent use-after-free
        listObject = null;
        
        found = true;
        break;
    }
}
finally
{
    // Final cleanup (handles case where Unlist() not called)
    if (listObject != null)
    {
        ComUtilities.Release(ref listObject);
    }
}

Impact

  • Reliability Risk: MEDIUM-HIGH
  • Failure Mode: Excel.exe process leaks, memory leaks
  • Affected Methods: Delete()

Acceptance Criteria

  • Delete() releases COM object after Unlist()
  • ✅ OnDemand tests verify no Excel.exe process leaks after delete
  • ✅ Integration tests verify delete operation completes cleanly

Issue 4: Table Name Validation (Injection Prevention) 🔴

Problem

Current validation only checks for spaces, missing comprehensive validation per Excel naming rules.

Current Code (Inadequate)

if (tableName.Contains(' '))
    return new OperationResult { Success = false, ErrorMessage = "No spaces allowed" };

Required Fix

private static (bool isValid, string? errorMessage) ValidateTableName(string tableName)
{
    // Null/empty check
    if (string.IsNullOrWhiteSpace(tableName))
        return (false, "Table name cannot be empty");
    
    // Length check (Excel limit)
    if (tableName.Length > 255)
        return (false, $"Table name too long: {tableName.Length} characters (max 255)");
    
    // No spaces
    if (tableName.Contains(' '))
        return (false, "Table name cannot contain spaces");
    
    // Must start with letter or underscore
    if (!char.IsLetter(tableName[0]) && tableName[0] != '_')
        return (false, "Table name must start with a letter or underscore");
    
    // Can only contain letters, numbers, underscores, periods
    foreach (char c in tableName)
    {
        if (!char.IsLetterOrDigit(c) && c != '_' && c != '.')
            return (false, $"Table name contains invalid character: '{c}'");
    }
    
    // Reserved names check
    string[] reserved = { "Print_Area", "Print_Titles", "_FilterDatabase" };
    if (reserved.Contains(tableName, StringComparer.OrdinalIgnoreCase))
        return (false, $"'{tableName}' is a reserved name");
    
    return (true, null);
}

// Usage in Create():
var (isValid, errorMessage) = ValidateTableName(tableName);
if (!isValid)
    return new OperationResult { Success = false, ErrorMessage = errorMessage, FilePath = filePath, Action = "create-table" };

Impact

  • Security Risk: MEDIUM
  • Attack Vector: Name injection, Excel formula injection
  • Affected Methods: Create(), Rename()

Acceptance Criteria

  • ✅ Comprehensive table name validation implemented
  • ✅ Unit tests for all validation rules (empty, too long, invalid chars, reserved names)
  • ✅ Error messages are clear and actionable

Additional Recommendations (Not Blocking)

Range Validation

Add validation to prevent invalid/dangerous range operations:

private static (bool isValid, string? errorMessage) ValidateRange(dynamic rangeObj)
{
    try
    {
        int rowCount = rangeObj.Rows.Count;
        int colCount = rangeObj.Columns.Count;
        long cellCount = (long)rowCount * colCount;
        
        // Prevent DOS with huge ranges
        if (cellCount > 1_000_000)
            return (false, $"Range too large: {cellCount:N0} cells (max 1,000,000)");
        
        if (rowCount < 1 || colCount < 1)
            return (false, "Range must contain at least one cell");
        
        return (true, null);
    }
    catch (Exception ex)
    {
        return (false, $"Invalid range: {ex.Message}");
    }
}

COMException Handling

Wrap range access to provide clear error messages:

try
{
    rangeObj = sheet.Range[range];
}
catch (COMException ex) when (ex.HResult == -2147352567)
{
    result.Success = false;
    result.ErrorMessage = $"Invalid range '{range}': Range does not exist or is malformed";
    return 1;
}

Implementation Checklist

Phase 1: Security Fixes (Critical)

  • Add PathValidator to all TableCommands methods
  • Add comprehensive table name validation
  • Add unit tests for path validation
  • Add unit tests for name validation

Phase 2: Robustness Fixes (Critical)

  • Add null checks for HeaderRowRange/DataBodyRange
  • Fix COM cleanup after Unlist()
  • Add integration tests for null scenarios
  • Add OnDemand tests for process cleanup

Phase 3: Validation (Recommended)

  • Add range size validation
  • Add COMException handling for range access
  • Add integration tests for edge cases

Testing Requirements

Security Tests

[Fact]
public void List_WithPathTraversal_ThrowsArgumentException()
{
    var commands = new TableCommands();
    Assert.Throws<ArgumentException>(() => commands.List("../../etc/passwd"));
}

[Fact]
public void Create_WithInvalidTableName_ReturnsError()
{
    var commands = new TableCommands();
    
    // Test various invalid names
    var result1 = commands.Create("test.xlsx", "Sheet1", "Invalid Name", "A1:B2");  // Space
    var result2 = commands.Create("test.xlsx", "Sheet1", "123Start", "A1:B2");  // Starts with number
    var result3 = commands.Create("test.xlsx", "Sheet1", "Print_Area", "A1:B2");  // Reserved
    
    Assert.False(result1.Success);
    Assert.False(result2.Success);
    Assert.False(result3.Success);
}

Robustness Tests

[Fact]
public void GetInfo_HeaderOnlyTable_ReturnsZeroDataRows()
{
    // Create table with header but no data
    var result = _commands.GetInfo(_testFile, "HeaderOnlyTable");
    
    Assert.True(result.Success);
    Assert.Equal(0, result.Table.DataRowCount);  // Should not crash
}

[Fact]
public void Delete_Table_NoProcessLeak()
{
    // Create and delete table
    _commands.Create(_testFile, "Sheet1", "TempTable", "A1:B10");
    _commands.Delete(_testFile, "TempTable");
    
    // Verify no Excel.exe leak (OnDemand test)
    // Force GC and check process count
}

Acceptance Criteria (Overall)

This issue is complete when:

  1. ✅ All 4 critical issues fixed
  2. ✅ All security tests pass
  3. ✅ All robustness tests pass
  4. ✅ Code review completed
  5. ✅ Zero build warnings/errors
  6. ✅ OnDemand tests verify no COM leaks
  7. ✅ Documentation updated with security notes

Related Issues


Estimated Effort

  • Security Fixes: 3-4 hours
  • Robustness Fixes: 2-3 hours
  • Testing: 3-4 hours
  • Total: 8-11 hours

This work MUST be completed before starting Issue #24 Excel Tables implementation.

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions