test: add comprehensive unit tests for vermicelli and noodle accelerators #380

Merged
markos merged 4 commits into VectorCamp:develop from AhnLab-OSSG:unittests
Apr 7, 2026

@byeonguk-jeong

Summary

Add extensive unit tests for vermicelli and noodle accelerator functions to improve coverage and catch edge-case regressions.

Tests Added

vermicelli (~750 lines, new file)

  • Single-byte small buffer (1–31 bytes) match/no-match
  • Exact VECTORSIZE boundary and VECTORSIZE+1 tail path
  • Per-position match sweep (forward and reverse)
  • Non-alphabetic character matching
  • Negated vermicelli (NVermicelli) small buffers and nocase
  • Reverse and reverse-negated vermicelli
  • Double vermicelli: small buffers, partial match at end, nocase
  • Reverse double vermicelli: no-match, small buffers, multiple matches
  • Double vermicelli masked: small buffers and partial-bit masking
  • Alignment stress test (varying offset × varying length)
  • Cross-SIMD-boundary double match sweep
  • Forward + reverse consistency verification
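The per-position sweep can be sketched as follows. This is a minimal, self-contained illustration and not the code in the new test file: `refVermicelli` is a hypothetical scalar reference that assumes the forward scanner returns a pointer to the first matching byte, or `buf_end` when nothing matches, and `sweepPositions` is a hypothetical helper that plants a single match at every offset in turn.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference (assumption, not the Vectorscan function):
// returns a pointer to the first byte equal to c, optionally ignoring ASCII
// case, or buf_end when there is no match.
static const std::uint8_t *refVermicelli(std::uint8_t c, bool nocase,
                                         const std::uint8_t *buf,
                                         const std::uint8_t *buf_end) {
    assert(buf <= buf_end);
    // Fold ASCII lowercase to uppercase when nocase is requested.
    auto fold = [nocase](std::uint8_t b) -> std::uint8_t {
        return (nocase && b >= 'a' && b <= 'z')
                   ? static_cast<std::uint8_t>(b - 0x20)
                   : b;
    };
    for (const std::uint8_t *p = buf; p < buf_end; ++p) {
        if (fold(*p) == fold(c)) {
            return p;
        }
    }
    return buf_end;
}

// Plant exactly one 'a' at each position of a len-byte buffer of 'b' and
// check that the scanner reports that position. Returns the number of
// positions where the reported pointer was wrong.
template <typename Scan>
static std::size_t sweepPositions(Scan scan, std::size_t len) {
    std::size_t failures = 0;
    for (std::size_t pos = 0; pos < len; ++pos) {
        std::vector<std::uint8_t> buf(len, 'b');
        buf[pos] = 'a';
        const std::uint8_t *r = scan('a', false, buf.data(), buf.data() + len);
        if (r != buf.data() + pos) {
            ++failures;
        }
    }
    return failures;
}
```

Running the sweep with a length just above the vector width (for example 33) exercises both the vectorized body and the tail path in one loop, which is the reason for sweeping every offset rather than testing a few hand-picked ones.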

noodle (~413 lines)

  • Early termination via callback (single and double char)
  • No-match scenarios for single and double patterns
  • Empty and minimal-length buffer handling
  • Large buffer scanning (multi-vector iteration)
  • Case-insensitive matching for single and double patterns
  • Unaligned buffer scanning
  • Various alignment boundary conditions
  • All-match dense buffers for single and double patterns
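The early-termination case can be illustrated with a small stand-in scanner. Everything below is hypothetical and only mirrors the idea being tested: `scanLiteral`, `ScanAction`, and `firstMatches` are names invented for this sketch, and the assumption that the callback receives the end offset of each match is a simplification of the real noodle interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback verdict: keep scanning or stop immediately.
enum class ScanAction { Continue, Stop };

// Hypothetical literal scanner (assumption, not the noodle API). Calls cb
// with the end offset of every occurrence of needle in hay and stops as soon
// as cb returns Stop. Returns true when the scan was terminated early.
static bool scanLiteral(const std::string &hay, const std::string &needle,
                        const std::function<ScanAction(std::size_t)> &cb) {
    if (needle.empty() || hay.size() < needle.size()) {
        return false; // empty pattern or buffer too short: nothing to report
    }
    for (std::size_t i = 0; i + needle.size() <= hay.size(); ++i) {
        if (hay.compare(i, needle.size(), needle) == 0) {
            if (cb(i + needle.size() - 1) == ScanAction::Stop) {
                return true;
            }
        }
    }
    return false;
}

// Collect at most limit match offsets by asking the scanner to stop once
// the limit has been reached.
static std::vector<std::size_t> firstMatches(const std::string &hay,
                                             const std::string &needle,
                                             std::size_t limit) {
    std::vector<std::size_t> out;
    scanLiteral(hay, needle, [&](std::size_t end) {
        out.push_back(end);
        return out.size() >= limit ? ScanAction::Stop : ScanAction::Continue;
    });
    return out;
}
```

A test in this style checks two things: that no callback fires after the stop request, and that the scanner reports the early exit to its caller.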

gtest: uninitialized variable warning

  • Initialize dummy variable in StackGrowsDown() to suppress -Wuninitialized.

sheng: DFA state transition table size mismatch

  • Add missing sentinel element to state transition vectors for both 16-state and 32-state DFA test configurations. The alpha_size includes a sentinel entry, so each state's next vector must have alpha_size elements.
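The size invariant behind this fix can be sketched with a toy DFA. The types below (`ToyState`, `ToyDfa`) and helpers are hypothetical and are not the structures used by the sheng tests; the choice of state 0 as the sentinel target is likewise an assumption made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy structures (assumption, not the real DFA types).
struct ToyState {
    std::vector<std::uint16_t> next; // one target state per alphabet entry
};

struct ToyDfa {
    std::size_t alpha_size = 0; // number of real symbols plus one sentinel
    std::vector<ToyState> states;
};

// Build a DFA whose states each carry n_symbols transitions plus the
// sentinel entry, so every next vector has exactly alpha_size elements.
static ToyDfa makeToyDfa(std::size_t n_states, std::size_t n_symbols) {
    assert(n_states > 0);
    ToyDfa dfa;
    dfa.alpha_size = n_symbols + 1; // +1 for the sentinel entry
    dfa.states.resize(n_states);
    for (std::size_t s = 0; s < n_states; ++s) {
        std::vector<std::uint16_t> &next = dfa.states[s].next;
        for (std::size_t sym = 0; sym < n_symbols; ++sym) {
            next.push_back(static_cast<std::uint16_t>((s + 1) % n_states));
        }
        next.push_back(0); // sentinel at index alpha_size - 1 (assumed target)
    }
    return dfa;
}

// Check the invariant: every state has alpha_size transitions and every
// transition points at an existing state.
static bool transitionsWellFormed(const ToyDfa &dfa) {
    for (const ToyState &st : dfa.states) {
        if (st.next.size() != dfa.alpha_size) {
            return false;
        }
        for (std::uint16_t target : st.next) {
            if (target >= dfa.states.size()) {
                return false;
            }
        }
    }
    return true;
}
```

Dropping the last element from any state's `next` vector makes `transitionsWellFormed` fail, which is the kind of mismatch the missing sentinel produced in the 16-state and 32-state configurations.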

@AhnLab-OSS @AhnLab-OSSG

@markos

markos commented Apr 3, 2026

There are conflicts; could you please rebase against develop?

@byeonguk-jeong
Author

> There are conflicts; could you please rebase against develop?

Done :)

@markos

markos commented Apr 6, 2026

Several SVE2-related tests have failed, perhaps because the fixes were in a separate PR (which is now merged). Could you please look into them, or just rebase against current develop to see if they persist? Thanks again.

@byeonguk-jeong
Author

> Several SVE2-related tests have failed, perhaps because the fixes were in a separate PR (which is now merged). Could you please look into them, or just rebase against current develop to see if they persist? Thanks again.

Okay, I just rebased it.

@byeonguk-jeong
Author

byeonguk-jeong commented Apr 7, 2026

> Several SVE2-related tests have failed, perhaps because the fixes were in a separate PR (which is now merged). Could you please look into them, or just rebase against current develop to see if they persist? Thanks again.

It seems that vermicelliDoubleMaskedExec() is implemented only with NEON, not SVE, and it requires at least 16 bytes of input.
run_accel() always calls this function with a buffer larger than 16 bytes, so it is safe for now. Eventually, however, it should be reimplemented with SVE, or at least given a fallback for buffers smaller than 16 bytes.
For now, I will comment out the unit test for the small-buffer case.
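One possible shape for such a fallback is a plain scalar loop. The sketch below is only an illustration under stated assumptions: `doubleMaskedScalar` is a hypothetical helper, not part of Vectorscan, and it assumes a match means `(buf[i] & m1) == c1` and `(buf[i + 1] & m2) == c2`, returning a pointer to the first byte of the pair or `buf_end` when there is none. It deliberately ignores any special handling of a partial match on the final byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar fallback (assumption, not the Vectorscan code).
// Scans for the first adjacent byte pair that matches (c1, c2) under the
// masks (m1, m2). Safe for any buffer length, including 0 and 1 bytes.
static const std::uint8_t *doubleMaskedScalar(std::uint8_t c1, std::uint8_t c2,
                                              std::uint8_t m1, std::uint8_t m2,
                                              const std::uint8_t *buf,
                                              const std::uint8_t *buf_end) {
    assert(buf <= buf_end);
    // A pair needs two bytes, so stop while at least two bytes remain.
    for (const std::uint8_t *p = buf; buf_end - p >= 2; ++p) {
        if ((p[0] & m1) == c1 && (p[1] & m2) == c2) {
            return p;
        }
    }
    return buf_end; // no complete pair matched
}
```

A caller could route buffers with `buf_end - buf < 16` to a helper like this and keep the NEON path for everything else, which would also let the small-buffer unit test be re-enabled.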

Add 50 new tests in vermicelli_extra.cpp covering:
- Small buffer paths (1 byte to VECTORSIZE-1)
- Exact VECTORSIZE and VECTORSIZE+1 boundary cases
- Per-position match sweep for all 7 vermicelli functions
- Reverse double vermicelli NoMatch (previously missing)
- Forward/reverse consistency checks
- Alignment stress tests
- Double vermicelli SIMD cross-boundary sweep
- Masked double vermicelli with partial bit masks
- Non-alphabetic character matching

Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>

Initialize 'dummy' variable in StackLowerThanAddress() to zero to
avoid potential undefined behavior and compiler warning.

Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>

Add missing sentinel element to state transition vectors for both
16-state and 32-state DFA test configurations. The alpha_size includes
a sentinel entry at index alpha_size-1, so each state's next vector
must have alpha_size elements.

Fixes: d032540 ("Add sheng tests")

Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>

Add tests covering edge cases and broader scenarios:
- Early termination via callback (single and double char)
- No-match scenarios for single and double patterns
- Empty and minimal-length buffer handling
- Large buffer scanning (multi-vector iteration)
- Case-insensitive matching for single and double patterns
- Unaligned buffer scanning
- Various alignment boundary conditions
- All-match dense buffers for single and double patterns

Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>

@markos markos left a comment


Nice set of unit tests! Again, thank you for your contribution!

@markos markos merged commit 6c1b351 into VectorCamp:develop Apr 7, 2026
108 of 111 checks passed
@byeonguk-jeong byeonguk-jeong deleted the unittests branch April 7, 2026 12:38
2 participants