Skip to content

Conversation

@jalaziz
Copy link
Contributor

@jalaziz jalaziz commented Dec 29, 2021

Commit Message

Fix UnicodeDecodeError in aioredis integration

Escape non-unicode bytes when decoding aioredis args. This prevents the
UnicodeDecodeError while still offering some semblance of a
human-readable string for tracing.

fixes #3088

Checklist

  • Added to the correct milestone.
  • Tests provided or description of manual testing performed is included in the code or PR.
  • Library documentation is updated.
  • Corp site documentation is updated (link to the PR).

@jalaziz jalaziz requested a review from a team as a code owner December 29, 2021 13:47
Escape non-unicode bytes when decoding aioredis args. This prevents the
`UnicodeDecodeError` while still offering some semblance of a
human-readable string for tracing.
Copy link
Contributor

@majorgreys majorgreys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching the issue and submitting this fix @jalaziz!

Elsewhere in the code, we replace errors with the replacement character . Can we do the same here?

And, it would be great to include tests for the issue you found with how channels-redis calls the patched functions. It can be synthetic test where we check args. Or pulling out the decoding code and adding unit tests for such a function.

@jalaziz
Copy link
Contributor Author

jalaziz commented Dec 29, 2021

Elsewhere in the code, we replace errors with the replacement character . Can we do the same here?

@majorgreys I had thought about that, but that's lossy. I figured capturing the actual bytes may be more useful when debugging using traces.

HOWEVER, for the pipeline commands, the args are simply passed to format_command_args which effectively simply calls str on all the args. I was thinking that might be another option. In the case of bytes, it will return something like "b'abc\x80a.

Another option would be to use format_command_args and do the "safe" decoding there. The only real benefit would be dropping the b prefix for UTF-8 strings.

And, it would be great to include tests for the issue you found with how channels-redis calls the patched functions. It can be synthetic test where we check args. Or pulling out the decoding code and adding unit tests for such a function.

Will do, was looking at adding tests yesterday but it was 5am and I wanted to get the PR out first to get feedback 😄

@majorgreys
Copy link
Contributor

@majorgreys I had thought about that, but that's lossy. I figured capturing the actual bytes may be more useful when debugging using traces.

I agree that having the unicode input string escaped with a backslash has a benefit when looking at trace data. For the purposes of your fix here, your choice is fine!

Another option would be to use format_command_args and do the "safe" decoding there.

That does seem like an option but we'd then have a utility function in the redis integration that must support the arguments other integrations need to handle. Still, the decoding of args could be handled instead by not calling stringify at

cmd = stringify(arg)
and instead using ensure_text as we do elsewhere.

This discussion over whether the change for this fix should happen in the aioredis integration patching code or the utility function it uses from the redis integration is one I'd like to have @Kyle-Verhoog and @brettlangdon's thoughts on as well.

@brettlangdon
Copy link
Member

This discussion over whether the change for this fix should happen in the aioredis integration patching code or the utility function it uses from the redis integration is one I'd like to have @Kyle-Verhoog and @brettlangdon's thoughts on as well.

Centralizing the fix and not just fixing it for aioredis would be best. Since this should be a problem for other redis clients as well.

@jalaziz
Copy link
Contributor Author

jalaziz commented Dec 29, 2021

Added tests.

Happy to centralize the handling for this, but where should it live and should it use ensure_text instead of stringify?

@majorgreys
Copy link
Contributor

@jalaziz Thanks for adding the tests!

You'd want to use ddtrace.internal.compat.ensure_text which just refers to six.ensure_text which will do the necessary coercion rather than stringify which refers to the six.text_type.

@codecov-commenter
Copy link

Codecov Report

Merging #3089 (232ce84) into master (425964a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3089   +/-   ##
=======================================
  Coverage   84.55%   84.55%           
=======================================
  Files         638      638           
  Lines       46289    46314   +25     
=======================================
+ Hits        39138    39163   +25     
  Misses       7151     7151           
Impacted Files Coverage Δ
ddtrace/contrib/aioredis/patch.py 93.16% <100.00%> (ø)
tests/contrib/aioredis/test_aioredis.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 425964a...232ce84. Read the comment docs.

@jalaziz
Copy link
Contributor Author

jalaziz commented Dec 31, 2021

@jalaziz Thanks for adding the tests!

You'd want to use ddtrace.internal.compat.ensure_text which just refers to six.ensure_text which will do the necessary coercion rather than stringify which refers to the six.text_type.

@majorgreys Turns out ensure_text does not fully solve the problem. It will fail to handle numbers. The latest commit uses ensure_text first then falls back to stringify if that fails. It's a bit ugly, but at least it gets rid of the calls to decode in the aioredis patch.

Copy link
Member

@brettlangdon brettlangdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments, otherwise lgtm

@brettlangdon
Copy link
Member

We do need to add a changelog entry:

pip install reno
reno new <some-slug-name>

This would be a "fixes:" entry.

@jalaziz jalaziz force-pushed the aioredis-decoding branch from cb1a6c5 to 64cd8a3 Compare January 5, 2022 22:01
@jalaziz
Copy link
Contributor Author

jalaziz commented Jan 5, 2022

We do need to add a changelog entry:

Done!

@jalaziz
Copy link
Contributor Author

jalaziz commented Jan 6, 2022

Seems like random integration checks are failing, but not quite sure why.

@brettlangdon
Copy link
Member

@Mergifyio backport 0.57

@mergify
Copy link
Contributor

mergify bot commented Jan 6, 2022

backport 0.57

🟠 Waiting for conditions to match

  • merged [:pushpin: backport requirement]

@jalaziz
Copy link
Contributor Author

jalaziz commented Jan 7, 2022

@brettlangdon Anything else needed to merge this? Should I squash?

@brettlangdon
Copy link
Member

@jalaziz we need one more approving review. No worries on squashing, we squash merge all PRs.

@mergify mergify bot merged commit 92390b2 into DataDog:master Jan 7, 2022
@mergify
Copy link
Contributor

mergify bot commented Jan 7, 2022

backport 0.57

✅ Backports have been created

@jalaziz jalaziz deleted the aioredis-decoding branch January 7, 2022 16:56
@brettlangdon
Copy link
Member

@jalaziz this is now merged / backported into the 0.57 release branch. I am going to wait until Monday to do a bug fix release since we have a few other aioredis open PRs, want to see if we can get any of those merged as well.

(I'll probably do a bug fix release Monday no matter what, bug fix releases are cheap we won't hold this fix for long)

@jalaziz
Copy link
Contributor Author

jalaziz commented Jan 7, 2022

@jalaziz this is now merged / backported into the 0.57 release branch. I am going to wait until Monday to do a bug fix release since we have a few other aioredis open PRs, want to see if we can get any of those merged as well.

(I'll probably do a bug fix release Monday no matter what, bug fix releases are cheap we won't hold this fix for long)

Thanks! I saw the other PRs (subscribed to yours). Would indeed be good to get them all in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

UnicodeDecodeError with aioredis tracing enabled

5 participants