Skip to content

Python: fix(redis): wrap IndexDefinition prefix in list; use _get_redis_key in JSON delete#13970

Open
nileshpatil6 wants to merge 3 commits intomicrosoft:mainfrom
nileshpatil6:fix/redis-connector-prefix-and-delete-key
Open

Python: fix(redis): wrap IndexDefinition prefix in list; use _get_redis_key in JSON delete#13970
nileshpatil6 wants to merge 3 commits intomicrosoft:mainfrom
nileshpatil6:fix/redis-connector-prefix-and-delete-key

Conversation

@nileshpatil6
Copy link
Copy Markdown

Summary

Two bugs in python/semantic_kernel/connectors/redis.py:

Bug 1 — IndexDefinition prefix iterates character-by-character (fixes #13894)

ensure_collection_exists passed a bare str as the prefix argument:

# before
index_definition = IndexDefinition(
    prefix=f"{self.collection_name}:", index_type=...
)

redis-py's IndexDefinition accepts an iterable of prefixes. When passed a plain string it iterates character by character, generating a PREFIX N c o l l e c t i o n ... clause instead of PREFIX 1 collection_name:. This causes the index to match nothing (or incorrectly).

# after
index_definition = IndexDefinition(
    prefix=[f"{self.collection_name}:"], index_type=...
)

Bug 2 — RedisJsonCollection._inner_delete silently no-ops (fixes #13904)

_inner_delete passed the raw caller key to json().delete():

# before
async def _inner_delete(self, keys: Sequence[str], **kwargs: Any) -> None:
    await asyncio.gather(*[self.redis_database.json().delete(key, **kwargs) for key in keys])

But _single_upsert stores keys as {collection_name}:{key} (via _get_redis_key). The delete always targeted a non-existent key and returned 0 without raising. RedisHashsetCollection._inner_delete already uses _get_redis_key correctly.

# after
async def _inner_delete(self, keys: Sequence[str], **kwargs: Any) -> None:
    await asyncio.gather(*[self.redis_database.json().delete(self._get_redis_key(key), **kwargs) for key in keys])

Testing

Both fixes are straightforward and can be verified with a live Redis instance by:

  1. Creating a RedisJsonCollection, upserting a record, then deleting it and confirming it's gone.
  2. Checking FT.INFO <collection_name> to confirm the PREFIX clause is 1 <collection_name>: after ensure_collection_exists.

…y in JSON delete

Two bugs in the Redis connector:

1. ensure_collection_exists passed prefix as a bare str to IndexDefinition.
   redis-py iterates over it character by character, producing a PREFIX
   clause with one entry per character instead of a single collection prefix.
   Fix: wrap in a list.

2. RedisJsonCollection._inner_delete called json().delete(key) with the raw
   caller-supplied key, but upsert stores keys as {collection_name}:{key}.
   The delete always targeted a non-existent key and silently returned 0.
   Fix: use self._get_redis_key(key), consistent with RedisHashsetCollection.

Fixes microsoft#13894, microsoft#13904
Copilot AI review requested due to automatic review settings May 8, 2026 10:43
@nileshpatil6 nileshpatil6 requested a review from a team as a code owner May 8, 2026 10:43
@moonbox3 moonbox3 added the python Pull requests for the Python Semantic Kernel label May 8, 2026
@github-actions github-actions Bot changed the title fix(redis): wrap IndexDefinition prefix in list; use _get_redis_key in JSON delete Python: fix(redis): wrap IndexDefinition prefix in list; use _get_redis_key in JSON delete May 8, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two Redis connector bugs in the Python vector store implementation (RedisCollection / RedisJsonCollection) to ensure RediSearch indexes are created with the intended key prefix and JSON deletions target the actual stored keys.

Changes:

  • Wrap IndexDefinition.prefix in a list so redis-py doesn’t iterate the prefix string character-by-character.
  • Apply _get_redis_key() when deleting JSON records so deletes work when key prefixing is enabled.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 279 to 283
fields = _definition_to_redis_fields(self.definition, self.collection_type)
index_definition = IndexDefinition(
prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type]
prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type]
)
await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs)
Comment on lines 280 to 283
index_definition = IndexDefinition(
prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type]
prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type]
)
await self.redis_database.ft(self.collection_name).create_index(fields, definition=index_definition, **kwargs)
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 94%

✓ Correctness

Both fixes are correct and well-motivated. Bug 1 wraps the IndexDefinition prefix string in a list, which is required because redis-py's IndexDefinition iterates over the prefix argument — a bare string would be iterated character-by-character. Bug 2 adds the missing _get_redis_key() call in RedisJsonCollection._inner_delete, making it consistent with RedisHashsetCollection._inner_delete (line 581) and RedisJsonCollection._inner_get (line 699), both of which already use _get_redis_key to prepend the collection name prefix.

✓ Security Reliability

Both changes are correct and well-scoped bug fixes. Fix 1 wraps the IndexDefinition prefix in a list to prevent character-by-character iteration of the string — consistent with redis-py's expected API. Fix 2 adds the missing _get_redis_key() call in RedisJsonCollection._inner_delete, making it consistent with _inner_get (line 699) and the hashset collection's _inner_delete (line 581). No security or reliability concerns are introduced by these changes.

✓ Test Coverage

Both fixes are correct and address real bugs, but neither has unit test coverage that would actually catch a regression. Bug 1 (IndexDefinition prefix as list): test_create_index mocks create_index entirely, so the IndexDefinition is constructed but never inspected — the test passes whether prefix is a string or a list. Bug 2 (JSON delete with _get_redis_key): test_delete only uses non-prefix collections (prefix_collection_name_to_key_names=False), making _get_redis_key a no-op. The fix is only meaningful with prefixing enabled, but no such test exists for delete. Compare with test_get, which does parametrize over prefix=True/False for both collection types. Both fixes should have corresponding tests that verify the corrected behavior.

✗ Design Approach

The JSON delete change follows the existing key-normalization pattern correctly, but the index-definition fix is still incomplete: it wraps the prefix in a list, yet the auto-created index prefix remains unconditional and does not match the default write path when prefix_collection_name_to_key_names is left at its default False.

Flagged Issues

  • python/semantic_kernel/connectors/redis.py:281 hard-codes an index prefix of collection_name: even though both concrete collections default prefix_collection_name_to_key_names=False (redis.py:524, redis.py:652) and store keys via _get_redis_key, which returns the raw key when that flag is false (redis.py:250-253, :599, :723). The default upsert writes id1 while the index only matches test:-prefixed keys. The auto-generated index definition should derive its prefix from prefix_collection_name_to_key_names instead of always forcing collection_name:.

Automated review by nileshpatil6's agents

@override
async def _inner_delete(self, keys: Sequence[str], **kwargs: Any) -> None:
await asyncio.gather(*[self.redis_database.json().delete(key, **kwargs) for key in keys])
await asyncio.gather(*[self.redis_database.json().delete(self._get_redis_key(key), **kwargs) for key in keys])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage. This fix is correct — RedisHashsetCollection._inner_delete (line 581) already uses _get_redis_key. However, the existing test_delete (test_redis_store.py:274) only uses non-prefix collections (collection_hash/collection_json), where _get_redis_key returns the key unchanged. The test passes identically with the old and new code. Please add a test with collection_with_prefix_json that asserts json().delete() receives 'test:id1', not 'id1'.

fields = _definition_to_redis_fields(self.definition, self.collection_type)
index_definition = IndexDefinition(
prefix=f"{self.collection_name}:", index_type=INDEX_TYPE_MAP[self.collection_type]
prefix=[f"{self.collection_name}:"], index_type=INDEX_TYPE_MAP[self.collection_type]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Design mismatch & missing test coverage. Wrapping the prefix in a list fixes the redis-py API usage, but the index prefix is still unconditionally set to collection_name:. Both RedisHashsetCollection and RedisJsonCollection default prefix_collection_name_to_key_names=False (redis.py:524, redis.py:652), so _get_redis_key returns the raw key (redis.py:250-253). Default upsert writes id1 while the index only matches test:-prefixed keys. The prefix should be derived from prefix_collection_name_to_key_names. Additionally, test_create_index (test_redis_store.py:294) mocks create_index without inspecting its arguments — consider asserting on the mock's call_args to verify the definition was constructed with the correct prefix.

…x_collection_name_to_key_names

When prefix_collection_name_to_key_names=False (the default for both
RedisHashsetCollection and RedisJsonCollection), _get_redis_key() returns
the raw key, so upsert stores e.g. "id1". The previous fix still
unconditionally set prefix=["collection_name:"] in ensure_collection_exists,
meaning the RediSearch index only matched prefixed keys while records were
stored under raw keys — vector search would return nothing by default.

Fix: only set the prefix when prefix_collection_name_to_key_names=True;
otherwise create the IndexDefinition without a prefix so the index matches
all keys regardless of naming scheme.
…elete key

- test_create_index_respects_prefix_flag: asserts that ensure_collection_exists
  passes a list prefix ["test:"] to create_index when
  prefix_collection_name_to_key_names=True, and no prefix when False.
- test_delete_json_with_prefix: asserts that RedisJsonCollection._inner_delete
  calls json().delete() with the prefixed key "test:id1" when
  prefix_collection_name_to_key_names=True.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Pull requests for the Python Semantic Kernel

Projects

None yet

3 participants