Correctly calculate string length based on encoded length #15
The current implementation calculates the length of the string to send by simply calling len() on the string. However, the correct value (the one kdb expects) is the length of the encoded string in bytes.
In latin-1 and ASCII, these two are the same. However, for extended characters encoded as UTF-8, a single character in a Python string may take two or more bytes once encoded. Since QConnection now supports passing an alternate encoding parameter, e.g. utf-8, this should be supported here as well; otherwise qPython reports an error when trying to send extended characters.
E.g.:
qcon = qpython.qconnection.QConnection(host = Host, port = Port, username = User, password = Pass, encoding = 'UTF-8')
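The mismatch between the two lengths can be seen without a kdb connection. The following is a minimal sketch using only the Python standard library; `encoded_length` is a hypothetical helper written for illustration and is not part of qPython's API.

```python
def encoded_length(text: str, encoding: str = "latin-1") -> int:
    """Return the number of bytes `text` occupies once encoded.

    Hypothetical helper for illustration only, not a qPython function.
    """
    return len(text.encode(encoding))


ascii_text = "hello"
accented = "h\u00e9llo"  # "héllo": é is one character
euro = "\u20ac"          # "€": one character

# For pure ASCII text, character count and byte count agree.
print(len(ascii_text), encoded_length(ascii_text, "utf-8"))  # 5 5

# In latin-1 every character is one byte, so the counts still agree.
print(len(accented), encoded_length(accented, "latin-1"))    # 5 5

# In UTF-8, é takes two bytes and € takes three, so len() undercounts.
print(len(accented), encoded_length(accented, "utf-8"))      # 5 6
print(len(euro), encoded_length(euro, "utf-8"))              # 1 3
```

Any length field written ahead of the string payload would need the second number in each pair, which is why the length should be taken after encoding rather than before.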
This PR replicates the one from exxeleron#77
In addition, this change has been in use in sublime-q since 2020.