
When reusing the decoder, gojay does not flush the buffer #108

Open
xaionaro opened this issue Apr 2, 2019 · 12 comments
Assignees
Labels
duplicate This issue or pull request already exists

Comments

@xaionaro

xaionaro commented Apr 2, 2019

Hello. I've tried to replace the standard JSON decoder with gojay's. The result:

  • It works much faster
  • RAM leaks: within 10 minutes, a simple application grew to 9 GB of RAM (RSS), while with the standard JSON decoder it uses only 40 MB.

Here's the code:

https://github.com/trafficstars/put2ch/blob/79f2e1d100bbd181df3cb54e8199995d5f868ded/input_raw_json.go

@xaionaro xaionaro changed the title gojay consumes a lot of RAM gojay consumes a lot of RAM (leaks) Apr 2, 2019
@francoispqt
Owner

Gojay is used in production on a highly instrumented system handling thousands of requests per second. In the context where it's used, it does not leak memory; if it did, we would have noticed. It may be linked to something you're doing specifically. Let me try to reproduce what you are seeing.

@francoispqt
Owner

francoispqt commented Apr 2, 2019

So the reason is that you are reusing the same decoder. It's actually a duplicate of this issue: #79

At the time I suggested adding a reset method, but the OP implemented his own solution. As I want gojay to stick closer to the behaviour of the standard API, I will implement a reset after each decode so that the buffer doesn't grow forever when you reuse the Decoder. In the meantime, you can create a new decoder for each loop iteration and test.

@xaionaro
Author

xaionaro commented Apr 2, 2019

@francoispqt I cannot create a new decoder because I have a byte stream of JSON messages, and dropping the old decoder would crop part of the next message (I already tried that before creating the issue).

@francoispqt
Owner

Yes, of course, because it reads the bytes available in the reader into the decoder's internal buffer.

So basically, what we'd need to do is, every time an item is read, drop the bytes already read by resetting the cursor; this is actually what the stream API already does. But you're not using the stream API. I will implement that in the regular API and evaluate the performance impact.

@xaionaro
Author

xaionaro commented Apr 2, 2019

But you're not using the stream api. I will implement that in the regular API and evaluate the perf impact.

In my quick research, I did not understand how to use the stream API for my case. If that is the API I was supposed to use, then don't waste time on me: I will just try to use it.

So do I understand correctly that I should just use the stream API? :)

@francoispqt
Owner

Yes, the best option for you would be to use the stream API: it will push messages to a channel whenever one is available, so you just have to read from that channel.

But if you want, I have just implemented the reset to reuse the Decoder, because I do think there is a use case for that, and I want it to be consistent with the standard encoding/json. I have tested it and it works correctly with no impact on performance; I've added a test as well. You can try it in the feature/reset-decoder branch.

If it's OK for you, I will merge it to master and tag a new release tomorrow.

@xaionaro
Author

xaionaro commented Apr 2, 2019

I will test it tomorrow, thank you very much :)

@francoispqt francoispqt added the duplicate This issue or pull request already exists label Apr 3, 2019
@francoispqt francoispqt self-assigned this Apr 3, 2019
@francoispqt francoispqt changed the title gojay consumes a lot of RAM (leaks) When reusing the decoder, gojay does not flush the buffer Apr 3, 2019
@xaionaro
Author

xaionaro commented Apr 3, 2019

Well, I've switched to the branch and it did not help.

gojay# git log | head -2
commit 1cb50f7b876bd5abaa4ffa66c9e28cbb0c3547fe
Author: francoispqt <francois@parquet.ninja>
# grep RSS /proc/`pgrep logger`/status
VmRSS:   3023848 kB                                                                                  

I will try to use the stream API. Anyway, thanks :)

@francoispqt
Owner

Are you running it in a unit test (mocking your UDP connection), or are you really testing your stream?

@xaionaro
Author

xaionaro commented Apr 3, 2019

Are you running it in a unit test (mocking your UDP connection), or are you really testing your stream?

The second one: I have a testing environment with real traffic, so I just run my application and watch how it behaves.

$ go version
go version go1.12 linux/amd64

@francoispqt
Owner

OK. Let me set up an environment quickly. It shouldn't grow like that. I know it doesn't with the stream API, as we are using it; you can try the example in examples/websocket, just run go run main.go and you'll see that memory doesn't grow.

@francoispqt
Owner

francoispqt commented Apr 3, 2019

Ok, so I have pushed updates to the branch feature/reset-decoder.

I've added an example which replicates what you are doing, you can find it here: https://github.com/francoispqt/gojay/blob/9df89bce7ea024f84eb4d44efb460bbf755e7b58/examples/reuse-decoder/main.go

You can run the example with: go run main.go
The memory usage does not grow. Please let me know if it works in your case.
