Skip to content

SEC-004: No Null Pointer Checks in C++ FFI Bridge — Segfault on Invalid Handles #73

@s2x

Description

@s2x

| Severity | High |
| File | ffi/zvec_ffi.cc (throughout) |
| Impact | Denial-of-service via segfault; potential memory corruption |
| CVSS | 7.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H) |
| Status | Open |

Problem

All FFI functions in zvec_ffi.cc accept opaque handle types (zvec_collection_t, zvec_doc_t, zvec_schema_t, etc.) and immediately static_cast<> them to C++ object pointers without null checks. If an invalid or post-destroy handle is passed, the result is a null pointer dereference or use-after-free, causing a segfault.

The PHP layer uses checkClosed() to prevent this, but:

  • If destroy() fails (the C++ error is non-fatal but the handle is dangling), $closed stays false and the handle is invalid
  • Direct FFI access bypasses checkClosed() entirely
  • Any bug in closed-state tracking exposes a segfault
  • Third-party PHP code calling FFI directly is unprotected

Affected Code

Every function in ffi/zvec_ffi.cc that takes a handle parameter follows this pattern:

// ffi/zvec_ffi.cc:525-533 (typical pattern)
void zvec_collection_free(zvec_collection_t coll) {
    auto* raw = static_cast<Collection*>(coll);  // No null check
    for (auto it = g_collections.begin(); it != g_collections.end(); ++it) {
        if (it->get() == raw) {
            g_collections.erase(it);
            return;
        }
    }
}

// ffi/zvec_ffi.cc:1136-1138
int zvec_doc_get_operator(zvec_doc_t doc) {
    return static_cast<int>(static_cast<Doc*>(doc)->get_operator());  // No null check
}

All handle-accepting functions in the file are affected (50+ functions total):

  • zvec_collection_* (17+ functions)
  • zvec_doc_* (40+ functions)
  • zvec_schema_* (5 functions)
  • zvec_vector_query_* (10+ functions)
  • zvec_group_by_vector_query_* (5+ functions)

Attack Scenario

  1. Application calls $collection->destroy(), which succeeds (handle becomes dangling)
  2. Due to a bug in PHP state tracking, $closed is not set to true
  3. Application calls $collection->insert() on the dangling handle
  4. The C++ layer dereferences the freed pointer → segfault or memory corruption
  5. An attacker who can trigger this sequence can cause denial of service, or in some cases, exploit use-after-free for code execution

Solution

Add null-pointer checks at the C++ level. For destructor/free functions, treat null as a no-op (idempotent). For accessor functions, return an error status.

// ffi/zvec_ffi.cc — revised collection_free
void zvec_collection_free(zvec_collection_t coll) {
    if (!coll) return;  // Idempotent on null
    auto* raw = static_cast<Collection*>(coll);
    for (auto it = g_collections.begin(); it != g_collections.end(); ++it) {
        if (it->get() == raw) {
            g_collections.erase(it);
            return;
        }
    }
}

// ffi/zvec_ffi.cc — revised doc_get_operator
int zvec_doc_get_operator(zvec_doc_t doc) {
    if (!doc) return 0;  // Return default on null
    return static_cast<int>(static_cast<Doc*>(doc)->get_operator());
}

// ffi/zvec_ffi.cc — revised doc_get_string (returns status via out parameter)
int zvec_doc_get_string(zvec_doc_t doc, const char* field, const char** out) {
    if (!doc || !field || !out) return 0;
    auto result = static_cast<Doc*>(doc)->get_field<std::string>(field);
    if (result.ok()) {
        g_string_buf = result.value();
        *out = g_string_buf.c_str();
        return 1;
    }
    return 0;
}

Pattern for free/destroy functions — null is a no-op:

void zvec_*_free(zvec_*_t handle) {
    if (!handle) return;
    // ... existing logic ...
}

Pattern for accessor functions — null returns error/default:

int zvec_*_get_*(zvec_*_t handle, ...) {
    if (!handle) return 0;  // Error/default
    // ... existing logic ...
}

Rationale

  • Defense-in-depth: Protects against bugs in PHP-layer state tracking, direct FFI access, and future misuse
  • Minimal code change: Each function gets a single guard clause at the top
  • Zero performance impact: Branch predictor handles the always-not-null case in normal usage
  • Idiomatic C: Null pointer checks are standard practice in C APIs

Implementation Steps

  1. In ffi/zvec_ffi.cc, add if (!coll) return; to: zvec_collection_free, zvec_collection_flush, zvec_collection_optimize, zvec_collection_destroy, zvec_collection_num_docs, and all other zvec_collection_* functions with handle parameters
  2. In ffi/zvec_ffi.cc, add if (!doc) return 0; (or appropriate default) to all zvec_doc_* accessor functions
  3. In ffi/zvec_ffi.cc, add if (!schema) return; to all zvec_schema_* functions
  4. In ffi/zvec_ffi.cc, add if (!query) return; to all zvec_vector_query_* functions
  5. In ffi/zvec_ffi.cc, add if (!query) return; to all zvec_group_by_vector_query_* functions
  6. Rebuild FFI library: ./build_ffi.sh

Verification

  1. Write a test that calls zvec_collection_free(nullptr) via FFI — should not segfault
  2. Write a test that calls zvec_doc_get_int64(nullptr, "field", &out) — should return 0 (error)
  3. Run existing test suite to ensure no regressions: php run-tests.php tests/

Acceptance Criteria

  • Unit tests: tests/test_null_handle_collection.phpt — call zvec_collection_free(nullptr), zvec_collection_flush(nullptr), zvec_collection_destroy(nullptr) via FFI — verify no segfault; tests/test_null_handle_doc.phpt — call zvec_doc_get_int64(nullptr, ...), zvec_doc_get_string(nullptr, ...), zvec_doc_get_pk(nullptr) — verify return 0/null without crash; tests/test_null_handle_schema.phpt — call zvec_schema_free(nullptr) — verify no-op
  • Functional tests: tests/test_null_handle_after_destroy.phpt — create collection, call destroy(), then call insert() on the dangling handle — verify the C++ null check catches it (return error, not segfault)
  • Documentation: ffi/zvec_ffi.h — document that all handle-accepting functions are null-safe; SECURITY.md — document defense-in-depth null pointer protection
  • Changelog: CHANGELOG.md entry under ### Security — "Add null-pointer guards to all FFI C++ functions to prevent segfaults on invalid handles"

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions