FIX: Binary data padding + tests #218
… bewithgaurav/fix_blank_columns
… bewithgaurav/fix_executemany
…icrosoft/mssql-python into bewithgaurav/fix_executemany
Pull Request Overview
This PR fixes binary data handling in the mssql-python library, specifically addressing issues with empty strings and binary data that previously caused assertion failures. The fix improves UTF-16 conversion on Unix platforms and ensures proper handling of zero-length data.
- Fixes assertion failures when handling empty strings and binary data
- Improves UTF-16 string conversion for Unix platforms (Linux/macOS)
- Updates SQL type mapping to use VARBINARY instead of BINARY for better compatibility
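To illustrate why the switch from BINARY to VARBINARY matters, here is a minimal pure-Python simulation of the two storage behaviours. SQL Server right-pads values stored in a fixed-length BINARY(n) column with 0x00 bytes, while VARBINARY(n) keeps the value at its original length. The function names `store_as_binary` and `store_as_varbinary` are hypothetical and exist only for this sketch; they are not part of mssql-python.

```python
def store_as_binary(value: bytes, n: int) -> bytes:
    """Simulate a BINARY(n) column: right-pad the value with 0x00 up to n bytes."""
    if len(value) > n:
        raise ValueError("value is longer than the column width")
    return bytes(value).ljust(n, b"\x00")


def store_as_varbinary(value: bytes, n: int) -> bytes:
    """Simulate a VARBINARY(n) column: keep the value at its original length."""
    if len(value) > n:
        raise ValueError("value is longer than the column width")
    return bytes(value)


payload = b"\x01\x02"
padded = store_as_binary(payload, 4)       # trailing zero bytes are added
unpadded = store_as_varbinary(payload, 4)  # value comes back unchanged
empty = store_as_varbinary(b"", 4)         # zero-length data stays zero-length
```

With fixed-length storage, a test comparing the fetched value against the inserted value would have to account for the padding; with variable-length storage it can compare the two directly.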
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/test_004_cursor.py | Adds comprehensive test cases for empty string/binary handling and fixes existing test expectations |
| mssql_python/pybind/ddbc_bindings.cpp | Fixes core data handling logic for empty strings/binary and improves UTF-16 conversion |
| mssql_python/cursor.py | Updates SQL type mapping to use VARBINARY and ensures minimum column sizes |
Left a few comments
Work Item / Issue Reference
Summary
This pull request improves how binary data (Python `bytes` and `bytearray`) is handled in the MSSQL Python driver, especially for edge cases like empty values, mixed types, and large binaries. It also significantly expands the test suite to cover these scenarios and documents current driver limitations regarding parameter and fetch buffer sizes.
Binary Data Handling Improvements
- `_map_sql_type` in `cursor.py` now always uses `VARBINARY` for Python `bytes`/`bytearray`, avoiding storage waste and ensuring correct handling of variable-length data. This removes previous logic that sometimes used fixed-length `BINARY` and simplifies type mapping.
- `_select_best_sample_value` now correctly handles columns that contain only binary types (`bytes` or `bytearray`).
Test Suite Enhancements for Binary Data
- The `test_longvarbinary` test now expects the correct number of rows and no longer assumes zero-padding in returned binary data.
- … `bytes`.
- … `bytes` and `bytearray` types in the same column, confirming consistent storage and retrieval as `bytes`.
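The type-mapping behaviour described above (always VARBINARY, a minimum column size, and `bytes` and `bytearray` treated alike) can be sketched roughly as follows. This is an illustration under stated assumptions, not the driver's actual `_map_sql_type`: the helper name `map_binary_param`, its return shape, and the minimum size of 1 are all hypothetical.

```python
from typing import Tuple, Union

BinaryLike = Union[bytes, bytearray]


def map_binary_param(value: BinaryLike) -> Tuple[str, int, bytes]:
    """Hypothetical helper: map a Python binary value to (sql_type, column_size, payload)."""
    if not isinstance(value, (bytes, bytearray)):
        raise TypeError("expected bytes or bytearray")
    payload = bytes(value)              # normalise bytearray to bytes
    column_size = max(len(payload), 1)  # assumed minimum size so empty values stay valid
    return "VARBINARY", column_size, payload


# Mixed bytes and bytearray values, plus an empty value, as in the scenarios above.
rows = [b"\x01\x02\x03", bytearray(b"\x01\x02\x03"), b""]
mapped = [map_binary_param(v) for v in rows]
```

In this sketch the `bytes` and `bytearray` inputs map to identical results, and the empty value still receives a non-zero column size.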