-
Notifications
You must be signed in to change notification settings - Fork 427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[discuss] Return a request reference instead of an opaque record #47
Comments
Are memory overhead and performance degradation due to message passing significant? If not, advantages are too good to ignore them. |
I would like to call out for caution here for the way the GC handles big data. Since it will propably mean sending big binaries around it can lead to keeping a lot of things in memory - I faced exactly that issue [2] when sending chunks read from hackney over messages to another process. Worst case this could mean that the entire streamed body stays in memory until the hackney streaming process dies (and GC's obviouslly). @dieswaytoofast has an excelent blog post [1] on that issue (that saved my the behind when I ran into this issue) and I highly recommend reading before this change. I would opt for keeping both around, let people choose what's right for them like other network sockets work with either async or non async giving different ways to handle reading from the socket. That gives the advantage of staying in line with gen_tcp/udp at least from the philosophy. [1] http://dieswaytoofast.blogspot.de/2012/12/erlang-binaries-and-garbage-collection.html |
Adding 2 links around this topic to keep a trace somewhere: http://jlouisramblings.blogspot.fr/2013/10/embrace-copying.html from @jlouis and https://blog.heroku.com/archives/2013/11/7/logplex-down-the-rabbit-hole from @ferd |
Actually, when you're not creating an async request, hackney return:
or when you stream a body:
Each succesive call in hackney return a new
Client
opaque record,. For example when you fetch a body:It has the advantage that at this point no process is created to handle the request, a client record is basically wrapping a socket to code and decode data sent and receiced from the remote HTTP server. So there is no message passing.
But the recent async feature is now returning a stream reference instead of a client record:
A new supervised stream process is created when the response start and you receive the messages containing this stream reference. Until the response start, the request is still sent like the synchronous requests without message passing.
What I would like to discuss is the possibility to only return a stream reference instead of an opaque client record. A new process will be created each time to handle a request. And the request body will be streamed by sending some messages to it.
The main advantages of this change would be:
With the main disadvantage of having to use the message passing to send to your request PId which will use more memory and slow down a little the process.
Any feedback about this would be really appreciated. Let me know :)
The text was updated successfully, but these errors were encountered: