fix(cdx): size a character index key from the field width, not the trimmed first record by Admnwk · Pull Request #68 · FiveTechSoft/OpenADS

Admnwk · 2026-06-24T21:13:22Z

Problem

A character index key is fixed-width on disk, but AdsCreateIndex61 derived the
key length from the trimmed value of the first record. When the first row is
short and later rows share a longer prefix, every later key is truncated to the
first row's width and collapses onto the same key — distinct values become
indistinguishable and a seek lands on the wrong record, both inside the index
and for native FoxPro/Clipper readers of the bag.

Reproduction

A NOME C(30) table whose first row is "ANA" and whose later rows are
"ANABELA CARDOSO" / "ANABELA FERREIRA": before the fix all three keys are
stored as the 3-byte "ANA", so seek("ANABELA FERREIRA") returns record 1
instead of record 3.

Fix

Derive the key width from the declared field length for a bare character field;
fall back to the untrimmed first-record key width for a composite
expression; keep the 32-char default only for an empty table with nothing to
probe. (Numeric CDX/NTX key widths are unchanged.)

Test

Adds abi_cdx_char_keylen_test, mirroring the existing NTX numeric-key interop
test: it builds the index through the ACE ABI and asserts every stored key is
the full field width and that a seek for a long, prefix-sharing name finds the
right record. The change is additive; the full suite stays green at 739/739.

🤖 Generated with Claude Code

…immed first record A character index key is fixed-width on disk, but AdsCreateIndex61 derived the key length from the *trimmed* value of the first record. When the first row is short and later rows share a longer prefix (e.g. "ANA", then "ANABELA CARDOSO" and "ANABELA FERREIRA"), every later key was truncated to the first row's width and collapsed onto the same key: distinct names became indistinguishable and a seek landed on the wrong record -- both inside the index and for native FoxPro/Clipper readers of the bag. Derive the width from the declared field length for a bare character field; fall back to the untrimmed first-record key width for a composite expression; keep the 32-char default only for an empty table with nothing to probe. Adds abi_cdx_char_keylen_test, mirroring the NTX numeric-key interop test: it asserts every stored key is the full field width and that a seek for a long, prefix-sharing name finds the right record. Full suite 739/739. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019Um1qrC4vrHD4rA643CjAJ

gemini-code-assist

Code Review

This pull request modifies how the on-disk key length is determined for character keys during index creation, preferring the declared field width of bare character fields over the trimmed value of the first record to prevent key truncation and collisions. A new unit test is added to verify this fix. The reviewer pointed out a significant issue where composite expressions will always result in a key length of 254 bytes due to padding in evaluate_index_expr, which wastes disk space and degrades performance.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-24T21:15:26Z

+                if (auto k = openads::engine::evaluate_index_expr(*t, expr, 254)) {
+                    std::size_t n = std::move(k).value().size();  // untrimmed
+                    if (n > 0)
+                        klen = static_cast<std::uint16_t>(
+                            std::min<std::size_t>(n, 254));


Mismatch in Key Length Determination for Composite Expressions\n\nThere is a subtle but significant issue with the fallback logic for composite expressions here.\n\n1. evaluate_index_expr Always Pads:\n The call openads::engine::evaluate_index_expr(*t, expr, 254) explicitly passes 254 as the key_len. Inside evaluate_index_expr, the returned string is always padded or resized to exactly key_len (254). Therefore, std::move(k).value().size() will always be 254, and klen will always be set to 254 for any composite expression. This results in unnecessarily large, fixed-width index keys on disk (254 bytes), wasting significant disk space and memory, and degrading performance.\n\n2. read_field Trims Trailing Spaces:\n Even if evaluate_index_expr did not pad the result, t->read_field trims trailing spaces from character fields. For an expression like UPPER(NOME) where the first record is "ANA", the evaluator only sees "ANA" (length 3). Thus, the evaluated length would still be 3, which would re-introduce the truncation bug for subsequent longer records (e.g., "ANABELA").\n\n### Suggested Solutions:\nTo correctly determine the untrimmed key length of a composite expression without defaulting to 254 or truncating:\n- Option A (Architectural): Introduce a way to evaluate index expressions using the declared/padded field lengths instead of trimmed values (e.g., an option in read_field or a separate evaluation path).\n- Option B (Heuristic/Fallback): If a robust evaluation is too complex, consider using a more reasonable default fallback length (e.g., 128 or max(32, trimmed_len)) or document that composite expressions will default to 254 bytes.

…ld fix Record PR #68 (character index key sized from field width, not the trimmed first record) and the <cstdint> include that unblocked the ninja-clang -Werror CI job. Bump fallback project VERSION to 1.2.3. Suite now 739/739. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(cdx): don't pin composite index keys to the 254-byte probe width (follow-up to #68)

gemini-code-assist Bot reviewed Jun 24, 2026

View reviewed changes

Merge branch 'main' into fix/cdx-char-key-length

9b16af2

FiveTechSoft merged commit 377a413 into FiveTechSoft:main Jun 24, 2026
5 checks passed

Admnwk mentioned this pull request Jun 24, 2026

fix(cdx): don't pin composite index keys to the 254-byte probe width (follow-up to #68) #69

Merged

FiveTechSoft added a commit that referenced this pull request Jun 26, 2026

Merge pull request #69 from Admnwk/fix/cdx-composite-key-length

fab8ff7

fix(cdx): don't pin composite index keys to the 254-byte probe width (follow-up to #68)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cdx): size a character index key from the field width, not the trimmed first record#68

fix(cdx): size a character index key from the field width, not the trimmed first record#68
FiveTechSoft merged 2 commits into
FiveTechSoft:mainfrom
Admnwk:fix/cdx-char-key-length

Admnwk commented Jun 24, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Admnwk commented Jun 24, 2026

Problem

Reproduction

Fix

Test

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants