Optimize column_encryption_policy checks in recv_results_rows #583

Copilot · 2025-11-08T09:38:39Z

Optimize column_encryption_policy checks in recv_results_rows

Issue:
The column_encryption_policy existence was checked for every value decoded in decode_val() using the pattern column_encryption_policy and column_encryption_policy.contains_column(col_desc). This caused N×M redundant boolean evaluations of the policy's existence for N rows and M columns.

Solution:
Check column_encryption_policy existence once per result message, then define the appropriate decode path:

Without encryption policy:
- Simple, straightforward decoding: col_md[3].from_binary(val, protocol_version)
- No encryption checks at all
- No unnecessary creation of col_descs
With encryption policy:
- Check policy existence once (at function entry)
- Create col_descs only when needed (inside the encryption path)
- Use decode_val helper function with clear if/else structure
- If column is encrypted: decrypt and use encryption policy's column type
- Else: simply return col_md[3].from_binary(val, protocol_version)
- Reuse decode_val function in exception handler to avoid code duplication

Changes:

Performance Improvement:
The optimization eliminates redundant operations:

Before: column_encryption_policy and column_encryption_policy.contains_column() checked N×M times
After: column_encryption_policy checked once, then contains_column() called as needed
Removes N×M redundant boolean evaluations of policy existence
Avoids creating col_descs list when not needed

For example, with 100 rows × 2 columns: eliminates 200 redundant policy existence checks and avoids unnecessary list creation in the no-encryption case.

Files Modified:

cassandra/protocol.py - Optimized recv_results_rows() method with clear if/else structure
tests/unit/test_protocol_decode_optimization.py - New tests with properly organized imports

Testing:

All protocol tests pass
Tests verify policy existence check happens once
CodeQL security scan: 0 issues

Code Quality:

Minimal, focused changes
Clear if/else structure in decode_val function (more readable than ternary operators)
Well-commented explaining the optimization
No code duplication - follows DRY principle with single try-except block
Reuses decode_val() in exception handler
Maintains error handling consistency
Efficient resource usage (only create col_descs when needed)
Follows project conventions for import organization

Fixes: #582

Original prompt

This section details on the original issue you should resolve

<issue_title>column_encryption_policy is checked in reading every single value in decode_val()</issue_title>
<issue_description>There's no point in checking a global policy for every single value decoding, not for every row decoded.

Please adjust the code to only check it once per recv_results_rows() call - decode_row() should be defined either as is today with column_encryption_policy enabled, or much simpler without all those extra checks.</issue_description>

<agent_instructions>Dear CoPilot, please adjust the code for efficiency and readability: prepare two paths - one very similar to today's code, if column_encryption_policy is enabled, and one straightforward if it isn't set - just simple decode_raw()

The PR title and description should explain the issue, the solution and in accordance to the standard in this repository.

Think well before making the adjustment, ensure readability and efficiency of the code are both kept.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes column_encryption_policy is checked in reading every single value in decode_val() #582

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

- Check column_encryption_policy once per recv_results_rows call instead of per value - Create two separate decode paths: one with encryption, one without - Pre-compute encryption info per column to avoid repeated lookups - Add comprehensive unit tests to verify optimization Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

tests/unit/test_protocol_decode_optimization.py

Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

mykaul · 2025-11-08T10:07:28Z

@copilot - thanks, but:

Please fix review comments from CodeQL
The change is substantially bigger than what is required - there's no need to duplicate so much code - you only need to create different decode_row() function if column_encryption_policy is enabled or not. If it's not, then decode_row() is a simple return tuple(col_md[3].from_binary(val, protocol_version) if val is not None else None for val, col_md in zip(row, column_metadata)) call. If it is, then the current code looks OK. Less code changes, please.

Per review feedback, removed code duplication: - Only define decode_row() function differently based on encryption policy - Removed separate decode_val helper functions - Keep single try/except block instead of duplicating it - Reduced from 75 lines to 57 lines (-18 lines) Still maintains the same optimization: contains_column() called once per column instead of per value. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

Copilot · 2025-11-08T10:12:53Z

@copilot - thanks, but:

Please fix review comments from CodeQL

The change is substantially bigger than what is required - there's no need to duplicate so much code - you only need to create di...

Done! Simplified the code per your feedback:

✅ CodeQL: 0 issues found
✅ Removed code duplication - now only defines decode_row() differently based on encryption policy
✅ No separate helper functions anymore
✅ Single try/except block (not duplicated)
✅ Reduced from 75 lines to 57 lines (-18 lines total)

The optimization still works the same way: contains_column() is called once per column instead of per value.

Commit: 12eac38

cassandra/protocol.py

tests/unit/test_protocol_decode_optimization.py

Per review feedback: - Restored decode_val function in encryption path for better readability - Keeps the structure similar to original code but without redundant policy check - Removed unused MagicMock import from test file The optimization remains the same: contains_column() called once per column. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

mykaul · 2025-11-08T10:45:09Z

@copilot - is the description accurate? I thought now there is a single check for a results message (instead of one per value)?

Copilot · 2025-11-08T10:46:41Z

@copilot - is the description accurate? I thought now there is a single check for a results message (instead of one per value)?

The description is accurate. The optimization works as follows:

Before: contains_column() was called N×M times (once for every value = rows × columns)

After: contains_column() is called M times (once per column only, at line 729)

So for 100 rows × 2 columns: 200 calls → 2 calls.

The column_encryption_policy existence check happens once (line 725), but contains_column() is still called once per column to pre-compute which columns are encrypted. This is much better than the previous N×M calls per value.

mykaul · 2025-11-08T10:51:24Z

@copilot - the check should really be done once per result message. Please adjust accordingly.

Per review feedback, simplified the optimization to check the policy existence only once per result message, not per column. Changes: - Check 'if column_encryption_policy:' once at function entry - Within the encryption path, decode_val checks contains_column() without redundant policy existence check - Updated tests to reflect this optimization approach - Updated comments and documentation The key optimization is avoiding the repeated 'column_encryption_policy and ...' check for every value (N×M times), checking policy existence just once instead. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

Copilot · 2025-11-08T10:56:34Z

@copilot - the check should really be done once per result message. Please adjust accordingly.

Done! The check is now performed once per result message.

The optimization now:

Checks if column_encryption_policy: once at function entry (line 725)
Within the encryption path, decode_val calls contains_column() directly without redundant policy existence check (line 727)
Eliminates the repeated column_encryption_policy and ... pattern that was evaluated N×M times

Commit: 8e82b03

cassandra/protocol.py

Per review feedback, optimized to only create col_descs when necessary: - Created inside 'if column_encryption_policy:' block when policy is enabled - Created in exception handler when needed for error reporting - NOT created when no policy and no exceptions (the common fast path) This avoids unnecessary list comprehension when column_encryption_policy is None. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

cassandra/protocol.py

Per review feedback, refactored to reuse decode_val() in exception handler: - Moved try/except blocks inside each branch (encryption enabled/disabled) - Exception handler now calls decode_val() when column_encryption_policy exists - Eliminates code duplication by reusing the decode_val function - Maintains same error reporting behavior This makes the code cleaner and more maintainable. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

cassandra/protocol.py

Per review feedback, eliminated code duplication: - Combined two separate try-except blocks into one - Single exception handler now checks column_encryption_policy to decide path - Maintains same error reporting behavior - Follows Python's DRY (Don't Repeat Yourself) principle This makes the code cleaner and more maintainable. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

tests/unit/test_protocol_decode_optimization.py

Per review feedback, reorganized imports following project conventions: - Standard library imports (io, unittest) grouped together and sorted - unittest.mock import kept with standard library - Blank line separator - Project imports (cassandra.*) grouped and sorted alphabetically This matches the import style used in other test files in the repository. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

mykaul · 2025-11-08T16:20:36Z

@copilot - can you squash all commits?

Copilot · 2025-11-08T16:21:53Z

@copilot - can you squash all commits?

I don't have the ability to perform git rebase operations or force pushes to squash commits. The commits would need to be squashed manually using:

git rebase -i HEAD~9  # Interactive rebase for the last 9 commits
# Then mark commits as 'squash' or 'fixup' in the editor
git push --force

Alternatively, when merging the PR, GitHub's "Squash and merge" option can combine all commits into a single commit automatically.

mykaul · 2025-11-09T14:55:22Z

Dear @copilot - we are nearly there. If you can just, in case of column_encryption_policy is true, just use a single if uses_ce clause, and then the else clause should just use return col_md[3].from_binary(val, protocol_version) I'll be quite happy with the change.

Per review feedback, simplified decode_val function structure: - Use single 'if uses_ce:' clause for encrypted column path - Else clause simply returns col_md[3].from_binary(val, protocol_version) - More readable than chained ternary operators Maintains same functionality while improving code clarity. Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

Copilot · 2025-11-09T14:59:29Z

Dear @copilot - we are nearly there. If you can just, in case of column_encryption_policy is true, just use a single if uses_ce clause, and then the else clause should just use `return col_md[3].f...

Done! Refactored decode_val to use a clear if/else structure:

def decode_val(val, col_md, col_desc):
    uses_ce = column_encryption_policy.contains_column(col_desc)
    if uses_ce:
        col_type = column_encryption_policy.column_type(col_desc)
        raw_bytes = column_encryption_policy.decrypt(col_desc, val)
        return col_type.from_binary(raw_bytes, protocol_version)
    else:
        return col_md[3].from_binary(val, protocol_version)

The encrypted path is now in a single if uses_ce: clause, and the else simply returns col_md[3].from_binary(val, protocol_version) as requested.

Commit: e9e4459

mykaul · 2025-11-14T09:29:47Z

@scylladb/drivers - the failure is FAILED tests/integration/standard/test_application_info.py::ApplicationInfoTest::test_create_session_and_check_system_views_clients - [XPASS(strict)] #scylladb/scylla-enterprise#5467 - not released yet - unclear to me what the expactation is - why isn't that an XFAIL or just SKIPped.

Other than that, this little patch is ready, CoPilot was not great, but eventually got the job done. Please consider if it's worthy of merging (needs squash and all).

Initial plan

89ac019

Copilot AI assigned Copilot and mykaul Nov 8, 2025

Copilot started work on behalf of mykaul November 8, 2025 09:39 View session

github-code-quality bot found potential problems Nov 8, 2025

View reviewed changes

tests/unit/test_protocol_decode_optimization.py Fixed Show fixed Hide fixed

Add detailed comments explaining the optimization benefits

b192ca5

Co-authored-by: mykaul <4655593+mykaul@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Optimize column encryption policy checks in decode_val function~~ Optimize column_encryption_policy checks in recv_results_rows Nov 8, 2025

Copilot AI requested a review from mykaul November 8, 2025 10:04

Copilot finished work on behalf of mykaul November 8, 2025 10:04

Copilot started work on behalf of mykaul November 8, 2025 10:07 View session

Copilot finished work on behalf of mykaul November 8, 2025 10:14

mykaul reviewed Nov 8, 2025

View reviewed changes

cassandra/protocol.py Show resolved Hide resolved

tests/unit/test_protocol_decode_optimization.py Fixed Show fixed Hide fixed

Copilot started work on behalf of mykaul November 8, 2025 10:27 View session

Copilot AI requested a review from mykaul November 8, 2025 10:35

Copilot finished work on behalf of mykaul November 8, 2025 10:35

Copilot started work on behalf of mykaul November 8, 2025 10:45 View session

Copilot finished work on behalf of mykaul November 8, 2025 10:48

Copilot started work on behalf of mykaul November 8, 2025 10:51 View session

Copilot finished work on behalf of mykaul November 8, 2025 10:59

mykaul reviewed Nov 8, 2025

View reviewed changes

cassandra/protocol.py Outdated Show resolved Hide resolved

Copilot started work on behalf of mykaul November 8, 2025 11:14 View session

Copilot AI requested a review from mykaul November 8, 2025 11:21

Copilot finished work on behalf of mykaul November 8, 2025 11:21

mykaul reviewed Nov 8, 2025

View reviewed changes

cassandra/protocol.py Show resolved Hide resolved

Copilot started work on behalf of mykaul November 8, 2025 14:16 View session

Copilot AI requested a review from mykaul November 8, 2025 14:22

Copilot finished work on behalf of mykaul November 8, 2025 14:22

mykaul reviewed Nov 8, 2025

View reviewed changes

cassandra/protocol.py Outdated Show resolved Hide resolved

Copilot started work on behalf of mykaul November 8, 2025 15:15 View session

Copilot AI requested a review from mykaul November 8, 2025 15:21

Copilot finished work on behalf of mykaul November 8, 2025 15:21

mykaul reviewed Nov 8, 2025

View reviewed changes

tests/unit/test_protocol_decode_optimization.py Show resolved Hide resolved

tests/unit/test_protocol_decode_optimization.py Show resolved Hide resolved

Copilot started work on behalf of mykaul November 8, 2025 15:47 View session

Copilot finished work on behalf of mykaul November 8, 2025 15:53

Copilot AI requested a review from mykaul November 8, 2025 15:53

Copilot started work on behalf of mykaul November 8, 2025 16:21 View session

Copilot finished work on behalf of mykaul November 8, 2025 16:24

Copilot started work on behalf of mykaul November 9, 2025 14:55 View session

Copilot finished work on behalf of mykaul November 9, 2025 15:02

wprzytula requested a review from Lorak-mmk November 14, 2025 22:16

Optimize column_encryption_policy checks in recv_results_rows #583

Are you sure you want to change the base?

Optimize column_encryption_policy checks in recv_results_rows #583

Conversation

Copilot AI commented Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comments on the Issue (you are @copilot in this section)

Uh oh!

Uh oh!

mykaul commented Nov 8, 2025

Uh oh!

Copilot AI commented Nov 8, 2025

Uh oh!

Uh oh!

Uh oh!

mykaul commented Nov 8, 2025

Uh oh!

Copilot AI commented Nov 8, 2025

Uh oh!

mykaul commented Nov 8, 2025

Uh oh!

Copilot AI commented Nov 8, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mykaul commented Nov 8, 2025

Uh oh!

Copilot AI commented Nov 8, 2025

Uh oh!

mykaul commented Nov 9, 2025

Uh oh!

Copilot AI commented Nov 9, 2025

Uh oh!

mykaul commented Nov 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Nov 8, 2025 •

edited

Loading