Skip to content

fix: critical bug fixes from code review#12

Closed
MChorfa wants to merge 1 commit intoquantumaikr:mainfrom
MChorfa:main
Closed

fix: critical bug fixes from code review#12
MChorfa wants to merge 1 commit intoquantumaikr:mainfrom
MChorfa:main

Conversation

@MChorfa
Copy link
Copy Markdown

@MChorfa MChorfa commented Apr 7, 2026

This PR addresses 5 critical bugs identified during code review:

  1. Fix out-of-bounds access in QJL NaN check (tq_qjl.c:57)

    • Added dim > 0 check before accessing src[dim-1]
  2. Fix stack buffer overflow risk (tq_uniform.c:195-297)

    • Changed fixed 512-byte q8 buffer to dynamic allocation
    • Added proper free() call and #include <stdlib.h>
  3. Add NULL check after calloc (tq_transformer.c - key_cache)

    • Added NULL check with early return on allocation failure
  4. Add NULL check after calloc (tq_transformer.c - value_cache)

    • Added NULL checks for both Apple and non-Apple paths
  5. Fix thread pool error handling (tq_ops.c:155-169)

    • Added pthread_create return value checking
    • Proper cleanup on thread creation failure

Verification:

  • Build: ✅ Zero warnings
  • Tests: ✅ 35/35 passed
  • Score: 99.2%

This commit fixes 5 critical bugs identified during code review:

1. **Fix out-of-bounds access in QJL NaN check** (tq_qjl.c)
   - Added dim > 0 check before accessing src[dim-1]
   - Prevents undefined behavior when n=0

2. **Fix stack buffer overflow risk** (tq_uniform.c)
   - Changed fixed 512-byte q8 buffer to dynamic allocation
   - Added proper free() call to prevent memory leak
   - Added missing #include <stdlib.h>

3. **Add NULL check after calloc** (tq_transformer.c - key_cache)
   - Added NULL check after posix_memalign fallback calloc
   - Proper cleanup and return NULL on allocation failure

4. **Add NULL check after calloc** (tq_transformer.c - value_cache)
   - Added NULL checks for both Apple and non-Apple code paths
   - Proper cleanup and return NULL on allocation failure

5. **Fix thread pool error handling** (tq_ops.c)
   - Added pthread_create return value checking
   - Proper cleanup of already-created threads on failure
   - Mark thread pool as inactive on initialization failure

All fixes:
- Build successfully with zero warnings
- Pass all 35 non-regression tests
- Maintain 99.2% score on quick evaluation
unamedkr added a commit that referenced this pull request Apr 7, 2026
Windows CI fixes:
- tq_ops.c: pthread_cond_wait must use SleepConditionVariableSRW (not CS)
  since pthread_mutex_t is mapped to SRWLOCK. CS variant on SRW = deadlock
  in test_ops thread pool.
- test_tqm.cpp, test_gguf_moe.cpp: replace hardcoded /tmp paths with
  CWD-relative names on Windows.
- CMakeLists.txt: bump TIMEOUT to 600s on MSVC for slow tests
  (test_multihash_dim64, test_ops, test_unbiased, test_cumulative_error).
  MSVC release lacks auto-vectorization the GCC/Clang build relies on.

Apply functional fixes from PR #12 (cherry-picked without reformatting):
- tq_qjl.c: dim > 0 guard before src[dim-1] NaN check.
- tq_uniform.c: heap-allocate Q8 query buffer (was 512B stack array).
- tq_transformer.c: NULL-check key_cache and value_cache calloc results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@unamedkr
Copy link
Copy Markdown
Collaborator

unamedkr commented Apr 7, 2026

Thanks for the careful review! All 5 functional bug fixes have been cherry-picked into main in commit 749b7d4 (without the reformatting). Closing this PR as superseded.

  • tq_qjl.c: dim > 0 guard ✅
  • tq_uniform.c: heap-allocated Q8 buffer ✅
  • tq_transformer.c: key_cache / value_cache NULL checks ✅
  • tq_ops.c: pthread_create error handling — kept simple for now (added in same area as the unrelated SRW-lock deadlock fix)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants