Handle invalid utf-8 byte sequences in sql summarizer and DB statement #896
Great!
What about `statement`? Do we need to handle binary there too?
Not unless we call methods on the string. Simply setting it and passing it along in the context won't matter. I'll double-check, though.
We eventually convert it to JSON, which will fail, I think. We catch those failures, but maybe we want to handle it up front instead?
💔 Tests Failed
Yes, you're right. You'd get this with invalid UTF-8 bytes when serializing to JSON:
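A minimal sketch of that failure mode, and of handling it up front, using only Ruby's standard `json` library and `String#scrub`. This is not the output originally posted here. The helper name `scrub_statement` and the `?` replacement character are assumptions for illustration, not the agent's code.

```ruby
require 'json'

# Hypothetical helper (not from this PR): replace invalid byte sequences
# before the string is stored, so later JSON serialization cannot fail on them.
def scrub_statement(statement)
  statement.scrub('?')
end

# \xFF can never appear in well-formed UTF-8.
invalid = "SELECT \xFF FROM users"

begin
  JSON.generate(statement: invalid)
rescue JSON::GeneratorError => e
  # Serializing the raw string fails because of the invalid byte.
  puts "serialization failed: #{e.class}"
end

# After scrubbing, serialization succeeds.
puts JSON.generate(statement: scrub_statement(invalid))
```

Scrubbing at the point where the statement is set keeps the later JSON step from having to rescue the error.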
LGTM 👍
Remember the changelog entry.
Fixes #895
Note that we must do this when the `Sql::Signature` object is created, as opposed to in the `Sql::Tokenizer`, because if the string cannot be parsed, the first character is returned via `@sql.split`. That will fail if the first character is an invalid UTF-8 byte sequence.
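The ordering argument above can be sketched roughly as follows. `SketchSignature` and its `fallback` method are hypothetical stand-ins, not the agent's actual `Sql::Signature`. The sketch also assumes that `String#scrub` with a `?` replacement is an acceptable way to handle the bytes, and that the fallback takes the first whitespace-separated piece of the statement.

```ruby
# Hypothetical stand-in for the signature object; not the agent's real class.
class SketchSignature
  def initialize(sql)
    # Scrub once at construction, so every later string operation
    # (tokenizing or the split-based fallback) sees valid UTF-8.
    @sql = sql.scrub('?')
  end

  # Assumed fallback for statements that cannot be parsed:
  # return the first whitespace-separated piece of the statement.
  def fallback
    @sql.split.first
  end
end

signature = SketchSignature.new("\xFF not really sql")
puts signature.fallback
```

If the scrubbing happened only inside the tokenizer, the fallback path would still operate on the raw string, which is the case the description warns about.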