Skip to content

fix: fix the bytes encode/decode for redis cache#153

Merged
gontarzpawel merged 4 commits intoContentSquare:masterfrom
wangxinalex:fix/redis-cache-bytes-encode
Apr 3, 2022
Merged

fix: fix the bytes encode/decode for redis cache#153
gontarzpawel merged 4 commits intoContentSquare:masterfrom
wangxinalex:fix/redis-cache-bytes-encode

Conversation

@wangxinalex
Copy link
Contributor

Using string(data) to convert the byte array to string introduces error in json marshal/unmarshal,
hence causes error when returning cached response from redis.

The reason is Unmarshal function in
encode/json would replace invalid UTF-8 or invalid UTF-16 pairs with U+FFFD, therefore the
payload string in redisCachePayload will actually change after json marshal/unmarshal since the
byte array may contain invalid UTF-8/UTF-16 byte, the length of payload will thereby change,
resulting the http server to find the declared length in header Content-Length mismatches the
actual length of payload.

The fix is to base64-encode/decode the byte array to string, thereby
eliminates invalid UTF-8/UTF-16 bytes.

@wangxinalex
Copy link
Contributor Author

A example to reproduce the json marshal/unmarshal bug.
https://gist.github.com/wangxinalex/8924e649192bd39527313cf49b4125a5

@gontarzpawel
Copy link
Contributor

Hello @wangxinalex! Thank you for contribution.

I'd like to understand couple of aspects:

  • what was the payload returned from Clickhouse that made you find out the invalid utf-8/16 bytes? would it be possible to provide a failing test fixed by your change?
  • it seems that json encoder can be configured to not replace invalid bytes: SetEscapeHTML(false).

@wangxinalex
Copy link
Contributor Author

wangxinalex commented Mar 31, 2022 via email

Copy link
Contributor

@gontarzpawel gontarzpawel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your clear explanation!

I've launched tests locally to verify your change does not break anything and there's one test failing.

make test

truncated output

....
--- FAIL: TestServe (0.47s)
    --- FAIL: TestServe/http_requests_with_caching_in_redis_ (0.01s)
        main_test.go:369: result from cache query is wrong: {"l":4,"t":"text/plain; charset=utf-8","enc":"","payload":"T2suCg=="}

Could you adapt it to your change please?
It'd be also beneficial if we could have the failing example that you provided, added as unit test. Could you also do that?

FYI we're working on adding CI step to verify tests.

EDIT:
rebased your PR on master please to have CI (github actions) enabled

@pull-request-size pull-request-size bot added size/M and removed size/S labels Mar 31, 2022
@wangxinalex
Copy link
Contributor Author

Dear @Garnek20, the failing case is adapted and a new test case is added for the changed behavior.

Using `string(data)` to convert the byte array to string introduces error in json marshal/unmarshal,
hence causes error when returning cached response from redis.

The reason is `Unmarshal` function in
`encode/json` would replace invalid UTF-8 or invalid UTF-16 pairs with `U+FFFD`, therefore the
`payload` string in `redisCachePayload` will actually change after json marshal/unmarshal since the
byte array may contain invalid UTF-8/UTF-16 byte, the length of payload will thereby change,
resulting the http server to find the declared length in header `Content-Length` mismatches the
actual length of payload.

The fix is to base64-encode/decode the byte array to string, thereby
eliminates invalid UTF-8/UTF-16 bytes.
add test cases for base64 encode/decode the cached value
@wangxinalex wangxinalex force-pushed the fix/redis-cache-bytes-encode branch from 8226e08 to 648df1b Compare March 31, 2022 15:58
@wangxinalex
Copy link
Contributor Author

@Garnek20 It seems that --- FAIL: TestReverseProxy_ServeHTTP1/queue_overflow_for_user (0.12s)
case failed in ci workflow, but this case passes on my local machine, do you have any ideas?

@wangxinalex
Copy link
Contributor Author

@Garnek20 My best guess is

time.Sleep(time.Millisecond * 5)
the cpu of ci runner makes the time.Sleep(time.Millisecond * 5) to sleep longer than it should be, so as to suppress the queue_overflow_error. So my suggestion is maybe should minimize the sleep time and try again.

…s ci

minimize the waiting time between two consecutive requests
Copy link
Contributor

@gontarzpawel gontarzpawel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding the test! One last comment and we will be ready to merge it 🙂

@wangxinalex wangxinalex requested a review from gontarzpawel April 3, 2022 06:56
Copy link
Contributor

@gontarzpawel gontarzpawel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @wangxinalex for you contribution!

@gontarzpawel gontarzpawel merged commit 6cfac12 into ContentSquare:master Apr 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

2 participants