Python: Fix Redis FT.CREATE PREFIX sending one prefix per character of collec…#13895
Python: Fix Redis FT.CREATE PREFIX sending one prefix per character of collec…#13895daric93 wants to merge 6 commits intomicrosoft:mainfrom
Conversation
…tion name Signed-off-by: Daria Korenieva <daric2612@gmail.com>
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 94%
✓ Correctness
The change from
prefix=f"{self.collection_name}:"(a string) toprefix=[f"{self.collection_name}:"](a list) is a correct bug fix. The upstreamIndexDefinition._append_prefixmethod (redis-py) callslen(prefix)to emit the PREFIX count argument and then iteratesfor p in prefix. When a bare string is passed,len()returns the character count and iteration yields individual characters, producing a malformed Redis command. Wrapping it in a list ensures the count is 1 and the single prefix string is emitted correctly.
✓ Security Reliability
This is a correct and important bug fix. The
IndexDefinition._append_prefix()method (in redis-py) expects an iterable of prefix strings (typically a list), not a bare string. When a bare string is passed,len()returns the character count and iteration yields individual characters, producing a malformed Redis FT.CREATE command (e.g.,PREFIX 7 m y n a m e :instead ofPREFIX 1 myname:). Changing to[f"{self.collection_name}:"]fixes this. The legacy memory store atredis_memory_store.py:132has the same latent bug withprefix=f"{collection_name}:"but is outside this diff's scope. No security or reliability concerns with the change itself.
✓ Test Coverage
The change from a string to a list for the
prefixparameter inIndexDefinitionis a bug fix (redis-py expects a list for prefix), but the existing unit testtest_create_indexdoes not verify the arguments passed toIndexDefinitionorcreate_index. Sincecreate_indexis fully mocked, the test only checks that no exception is raised — it never asserts thatprefixis a list. This means the fix has no regression coverage; reverting it would not cause any test to fail.
✓ Design Approach
The change correctly fixes a bug where a bare string was passed to
IndexDefinition(prefix=...). The redis-pyIndexDefinition._append_prefixmethod callslen(prefix)(expecting a list length) and then iterates overprefixelement-by-element. Passing a string like"test:"caused the count to be the string length (e.g. 5) and iterated character-by-character, producing a malformed Redis FT.CREATE command. Wrapping the prefix in a list[f"{self.collection_name}:"]is the correct fix per the library's API contract.
Suggestions
- The legacy
redis_memory_store.py(line 132) has the same bug:IndexDefinition(prefix=f"{collection_name}:", ...)should also be wrapped in a list. Consider fixing it in this PR for consistency. - The
test_create_indextest (line 294) should assert thatcreate_indexwas called with anIndexDefinitionwhoseprefixis a list (e.g.,["test_collection:"]). Currently the mock makes zero assertions on its arguments, so this fix has no regression protection. Additionally,test_create_index_manual(line 298) passesprefix="test:"(a bare string); consider updating it to a list to match the production code and document the expected format.
Automated review by daric93's agents
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Pull request overview
Fixes a Redis Search index creation bug in the Python Redis vector store connector where FT.CREATE received one PREFIX entry per character of the collection name (due to passing a plain string into redis-py’s IndexDefinition(prefix=...)).
Changes:
- Update
RedisCollection.ensure_collection_exists()to passprefixas a list ([f"{collection_name}:"]) when constructingIndexDefinition. - Add a unit test asserting the generated
FT.CREATE ... PREFIXarguments represent a single prefix, not a character-split list. - Update the manual
IndexDefinition(prefix=...)usage in unit tests to use a list.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/semantic_kernel/connectors/redis.py | Fixes index definition creation to pass prefix as a list to redis-py. |
| python/tests/unit/connectors/memory/test_redis_store.py | Adds/updates unit tests to validate correct PREFIX argument formatting and manual IndexDefinition usage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| index_definition = IndexDefinition( | ||
| prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type] | ||
| ) | ||
| await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs) |
There was a problem hiding this comment.
There is at least one other Redis connector in this repo that still constructs IndexDefinition(prefix=f"{collection_name}:") (e.g., python/semantic_kernel/connectors/memory_stores/redis/redis_memory_store.py:132). That code path will exhibit the same character-by-character PREFIX bug against redis-py. Consider updating that call to pass a list (and/or adding coverage) so the fix is consistent across Redis connectors.
…e prefix Combines fixes from microsoft#13895, microsoft#13899, and microsoft#13905 into a single PR. Fixes microsoft#13894 — FT.CREATE PREFIX sent one prefix per character of collection name. IndexDefinition(prefix=string) iterates the string character-by-character; changed to prefix=[string] in both the main connector and the legacy memory_stores connector. Fixes microsoft#13896 — Vector search (FT.SEARCH) broken due to two issues: 1. redisvl 0.5+ changed process_results() to expect IndexSchema, not StorageType. Construct a minimal IndexSchema before calling it. 2. RedisHashsetCollection._deserialize_store_models_to_dicts called buffer_to_array() unconditionally for vector fields. When include_vectors=False (default for search), the field is absent, causing KeyError. Added guard: if field.name in rec. Updated redisvl pin from ~=0.4 to >=0.5. Fixes microsoft#13904 — RedisJsonCollection._inner_delete did not call _get_redis_key() to prefix keys (already committed on this branch). Test updates: - Removed _SEARCH_XFAIL markers (search now works) - Fixed exception types in connection failure handling per Copilot review (VectorStoreInitializationException, not MemoryConnectorConnectionException) - Added unit tests for prefix-is-list, vector field guard, and IndexSchema in process_results
Fixes #13894.
The Python Redis connector was passing a plain string to
redis-py'sIndexDefinition(prefix=...).redis-pytreats that argument as an iterable of prefixes and iterates the string character-by-character. As a result,FT.CREATEwas registering one prefix per character of the collection name instead of one prefix equal to"<collection_name>:".Observed on the wire for collection
redis_json_list_data_model:Expected:
RediSearch on Redis Stack silently accepts the malformed command, so the bug has been latent for single-collection users.
Contribution Checklist