perf: pre-allocate values list in BoundStatement.bind() #12
Open
Replace the empty list + repeated `append()` with a pre-allocated list and index assignment in `BoundStatement.bind()`.

For protocol v4+, the list is initialized to `[UNSET_VALUE] * col_meta_len`, eliminating the separate trailing `UNSET_VALUE` padding loop entirely. For protocol v3, the list is initialized to `[None] * value_len`. This avoids repeated list resizing and reduces Python bytecode overhead per bound value (index assignment vs method lookup + call for `append`).

The routing key validation for `UNSET_VALUE` is preserved: explicit `UNSET_VALUE` binds are checked inline, and implicitly padded trailing columns are validated in a separate loop after the main bind loop.

Part of: scylladb#751
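The pre-allocation scheme described above can be sketched in isolation. This is a simplified illustration, not the driver's code: `bind_values`, its `routing_key_indexes` and `serialize` parameters, the error messages, and the `UNSET_VALUE` sentinel defined here are all hypothetical stand-ins chosen for the example.

```python
# Simplified, standalone sketch of the pre-allocation idea. Every name here
# (bind_values, routing_key_indexes, serialize) is a hypothetical stand-in,
# not the driver's actual code.
UNSET_VALUE = object()  # stand-in sentinel for the driver's UNSET_VALUE


def bind_values(values, col_meta_len, proto_version,
                routing_key_indexes=frozenset(),
                serialize=lambda v: str(v).encode()):
    value_len = len(values)
    if value_len > col_meta_len:
        raise ValueError("too many values: got %d, expected at most %d"
                         % (value_len, col_meta_len))

    if proto_version >= 4:
        # Every slot starts as UNSET_VALUE, so trailing columns need no padding loop.
        result = [UNSET_VALUE] * col_meta_len
    else:
        # Protocol v3 has no UNSET_VALUE; size the list to the provided values only.
        result = [None] * value_len

    for i, value in enumerate(values):
        if value is None:
            result[i] = None
        elif value is UNSET_VALUE:
            if proto_version < 4:
                raise ValueError("UNSET_VALUE requires protocol v4+")
            # Explicit UNSET_VALUE binds are validated inline.
            if i in routing_key_indexes:
                raise ValueError("cannot bind UNSET_VALUE to routing key column %d" % i)
            # The slot already holds UNSET_VALUE from the pre-allocation.
        else:
            result[i] = serialize(value)

    if proto_version >= 4:
        # Implicitly padded trailing columns are validated in a separate loop.
        for i in range(value_len, col_meta_len):
            if i in routing_key_indexes:
                raise ValueError("cannot leave routing key column %d unset" % i)

    return result
```

Under these assumptions, binding two values against four columns on protocol v4 returns a four-element list whose last two slots are `UNSET_VALUE`, while the same call on protocol v3 returns a two-element list.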
Summary
- Pre-allocate the `values` list in `BoundStatement.bind()` using `[UNSET_VALUE] * col_meta_len` (proto v4+) or `[None] * value_len` (proto v3) instead of starting with an empty list and calling `.append()` per value
- Eliminate the separate trailing `UNSET_VALUE` padding loop (previously `_append_unset_value()` called in a `for _ in range(diff)` loop)
- Use index assignment (`result[i] = col_bytes`) instead of method lookup + call (`self.values.append(col_bytes)`)

Motivation
Each `.append()` call involves a Python method lookup and a function call, plus potential list resizing when capacity is exceeded. For prepared statements with many columns (common in LWT queries), this overhead is measurable. Pre-allocating with a known size and using index assignment avoids both the method dispatch overhead and all list resizing.

This is part of the LWT prepared statement performance improvement effort documented in scylladb#751 (optimization B5).
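One way to check the claim locally is a small `timeit` comparison of the two fill strategies. The sketch below is not part of this PR: the function names, the default of 20 columns, and the iteration count are assumptions picked for illustration, and the measured difference will vary with the interpreter version and list size.

```python
import timeit


def fill_with_append(items):
    # Baseline: start empty and append each value (method lookup + call per item).
    out = []
    for item in items:
        out.append(item)
    return out


def fill_preallocated(items):
    # Pre-allocate to the known final size, then assign by index.
    out = [None] * len(items)
    for i, item in enumerate(items):
        out[i] = item
    return out


def compare(n_columns=20, number=20000):
    # Returns total seconds spent in each strategy over `number` runs.
    items = list(range(n_columns))
    return {
        "append": timeit.timeit(lambda: fill_with_append(items), number=number),
        "preallocated": timeit.timeit(lambda: fill_preallocated(items), number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%-13s %.4f s" % (name, seconds))
```

Both functions return the same list, so any timing difference comes from how the list is filled rather than from its contents.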
Changes
`cassandra/query.py` - `BoundStatement.bind()`:
- Pre-allocate the `result` list with its known final size instead of `self.values = []`
- Replace `self.values.append(...)` with `result[i] = ...` using `enumerate(zip(...))`
- Proto v4+: `result = [UNSET_VALUE] * col_meta_len` (trailing unbound columns are already padded)
- Proto v3: `result = [None] * value_len` (only provided values)
- Remove the trailing padding loop (the repeated calls to `_append_unset_value()`)
- Assign `self.values = result` at the end (single attribute write)

Testing
All existing tests pass:
- `tests/unit/test_parameter_binding.py`: 37/37 passed (V3, V4, V5 protocol versions)
- `tests/unit/test_query.py`: 6/6 passed
- `tests/unit/test_resultset.py`: 14/14 passed