Skip to content

Python: Fix Redis FT.CREATE PREFIX sending one prefix per character of collec…#13895

Closed
daric93 wants to merge 6 commits intomicrosoft:mainfrom
daric93:fix/python-redis-ft-create-prefix-13894
Closed

Python: Fix Redis FT.CREATE PREFIX sending one prefix per character of collec…#13895
daric93 wants to merge 6 commits intomicrosoft:mainfrom
daric93:fix/python-redis-ft-create-prefix-13894

Conversation

@daric93
Copy link
Copy Markdown

@daric93 daric93 commented Apr 20, 2026

Fixes #13894.

The Python Redis connector was passing a plain string to redis-py's IndexDefinition(prefix=...). redis-py treats that argument as an iterable of prefixes and iterates the string character-by-character. As a result, FT.CREATE was registering one prefix per character of the collection name instead of one prefix equal to "<collection_name>:".

Observed on the wire for collection redis_json_list_data_model:

FT.CREATE redis_json_list_data_model ON JSON
  PREFIX 27 r e d i s _ j s o n _ l i s t _ d a t a _ m o d e l :
  SCORE 1.0 SCHEMA ...

Expected:

FT.CREATE redis_json_list_data_model ON JSON
  PREFIX 1 redis_json_list_data_model:
  SCORE 1.0 SCHEMA ...

RediSearch on Redis Stack silently accepts the malformed command, so the bug has been latent for single-collection users.

Contribution Checklist

…tion name

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@daric93 daric93 requested a review from a team as a code owner April 20, 2026 21:52
@moonbox3 moonbox3 added the python Pull requests for the Python Semantic Kernel label Apr 20, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 94%

✓ Correctness

The change from prefix=f"{self.collection_name}:" (a string) to prefix=[f"{self.collection_name}:"] (a list) is a correct bug fix. The upstream IndexDefinition._append_prefix method (redis-py) calls len(prefix) to emit the PREFIX count argument and then iterates for p in prefix. When a bare string is passed, len() returns the character count and iteration yields individual characters, producing a malformed Redis command. Wrapping it in a list ensures the count is 1 and the single prefix string is emitted correctly.

✓ Security Reliability

This is a correct and important bug fix. The IndexDefinition._append_prefix() method (in redis-py) expects an iterable of prefix strings (typically a list), not a bare string. When a bare string is passed, len() returns the character count and iteration yields individual characters, producing a malformed Redis FT.CREATE command (e.g., PREFIX 7 m y n a m e : instead of PREFIX 1 myname:). Changing to [f"{self.collection_name}:"] fixes this. The legacy memory store at redis_memory_store.py:132 has the same latent bug with prefix=f"{collection_name}:" but is outside this diff's scope. No security or reliability concerns with the change itself.

✓ Test Coverage

The change from a string to a list for the prefix parameter in IndexDefinition is a bug fix (redis-py expects a list for prefix), but the existing unit test test_create_index does not verify the arguments passed to IndexDefinition or create_index. Since create_index is fully mocked, the test only checks that no exception is raised — it never asserts that prefix is a list. This means the fix has no regression coverage; reverting it would not cause any test to fail.

✓ Design Approach

The change correctly fixes a bug where a bare string was passed to IndexDefinition(prefix=...). The redis-py IndexDefinition._append_prefix method calls len(prefix) (expecting a list length) and then iterates over prefix element-by-element. Passing a string like "test:" caused the count to be the string length (e.g. 5) and iterated character-by-character, producing a malformed Redis FT.CREATE command. Wrapping the prefix in a list [f"{self.collection_name}:"] is the correct fix per the library's API contract.

Suggestions

  • The legacy redis_memory_store.py (line 132) has the same bug: IndexDefinition(prefix=f"{collection_name}:", ...) should also be wrapped in a list. Consider fixing it in this PR for consistency.
  • The test_create_index test (line 294) should assert that create_index was called with an IndexDefinition whose prefix is a list (e.g., ["test_collection:"]). Currently the mock makes zero assertions on its arguments, so this fix has no regression protection. Additionally, test_create_index_manual (line 298) passes prefix="test:" (a bare string); consider updating it to a list to match the production code and document the expected format.

Automated review by daric93's agents

Comment thread python/semantic_kernel/connectors/redis.py
@github-actions github-actions Bot changed the title Fix Redis FT.CREATE PREFIX sending one prefix per character of collec… Python: Fix Redis FT.CREATE PREFIX sending one prefix per character of collec… Apr 20, 2026
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@daric93
Copy link
Copy Markdown
Author

daric93 commented Apr 21, 2026

@microsoft-github-policy-service agree

@microsoft-github-policy-service agree

@daric93 daric93 requested a review from moonbox3 April 21, 2026 16:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Redis Search index creation bug in the Python Redis vector store connector where FT.CREATE received one PREFIX entry per character of the collection name (due to passing a plain string into redis-py’s IndexDefinition(prefix=...)).

Changes:

  • Update RedisCollection.ensure_collection_exists() to pass prefix as a list ([f"{collection_name}:"]) when constructing IndexDefinition.
  • Add a unit test asserting the generated FT.CREATE ... PREFIX arguments represent a single prefix, not a character-split list.
  • Update the manual IndexDefinition(prefix=...) usage in unit tests to use a list.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
python/semantic_kernel/connectors/redis.py Fixes index definition creation to pass prefix as a list to redis-py.
python/tests/unit/connectors/memory/test_redis_store.py Adds/updates unit tests to validate correct PREFIX argument formatting and manual IndexDefinition usage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 280 to 283
index_definition = IndexDefinition(
prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type]
prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type]
)
await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs)
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is at least one other Redis connector in this repo that still constructs IndexDefinition(prefix=f"{collection_name}:") (e.g., python/semantic_kernel/connectors/memory_stores/redis/redis_memory_store.py:132). That code path will exhibit the same character-by-character PREFIX bug against redis-py. Consider updating that call to pass a list (and/or adding coverage) so the fix is consistent across Redis connectors.

Copilot uses AI. Check for mistakes.
daric93 added a commit to daric93/microsoft-semantic-kernel that referenced this pull request Apr 29, 2026
…e prefix

Combines fixes from microsoft#13895, microsoft#13899, and microsoft#13905 into a single PR.

Fixes microsoft#13894 — FT.CREATE PREFIX sent one prefix per character of
collection name. IndexDefinition(prefix=string) iterates the string
character-by-character; changed to prefix=[string] in both the main
connector and the legacy memory_stores connector.

Fixes microsoft#13896 — Vector search (FT.SEARCH) broken due to two issues:
1. redisvl 0.5+ changed process_results() to expect IndexSchema, not
   StorageType. Construct a minimal IndexSchema before calling it.
2. RedisHashsetCollection._deserialize_store_models_to_dicts called
   buffer_to_array() unconditionally for vector fields. When
   include_vectors=False (default for search), the field is absent,
   causing KeyError. Added guard: if field.name in rec.
Updated redisvl pin from ~=0.4 to >=0.5.

Fixes microsoft#13904 — RedisJsonCollection._inner_delete did not call
_get_redis_key() to prefix keys (already committed on this branch).

Test updates:
- Removed _SEARCH_XFAIL markers (search now works)
- Fixed exception types in connection failure handling per Copilot
  review (VectorStoreInitializationException, not
  MemoryConnectorConnectionException)
- Added unit tests for prefix-is-list, vector field guard, and
  IndexSchema in process_results
@daric93 daric93 closed this Apr 29, 2026
@daric93
Copy link
Copy Markdown
Author

daric93 commented Apr 29, 2026

Closing — this fix has been combined into #13905 along with the other Redis connector bug fixes (#13896, #13904) so the integration tests can validate everything end-to-end.
Sorry for confusion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Pull requests for the Python Semantic Kernel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: Bug: Redis connector sends malformed FT.CREATE PREFIX (one prefix per character of collection name)

3 participants