Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Allocating buffer to send a response #198

Closed
kazu-yamamoto opened this Issue Nov 15, 2013 · 9 comments

Comments

Projects
None yet
2 participants
Contributor

kazu-yamamoto commented Nov 15, 2013

sendRsp has one FIXME:

-- FIXME perhaps alloca a buffer per thread and reuse that in all
-- functions below. Should lessen greatly the GC burden (I hope)

Now we have a mechanism to allocate/free a buffer for receiving per connection: allocateRecvBuffer and freeRecvBuffer. @snoyberg I think you can reuse them for sending.

These code uses malloc()/free(). This means that GC of GHC does not care about them. Recent malloc() has an arena per core. If we use the same size both for receiving and sending, malloc()/free() uses an arena like a free list for buffers. That is, allocation is done in a very efficient way.

@snoyberg would you try this? I think this improvement is important for Yesod applications which produce dynamic contents.

snoyberg added a commit to snoyberg/blaze-builder that referenced this issue Nov 15, 2013

Expose the Buffer constructor.
This is necessary for an optimization in Warp, where we already have a
pool of raw buffers available which I'd like to reuse for blaze-builder.
Relevant issue: yesodweb/wai#198

@snoyberg snoyberg referenced this issue in meiersi/blaze-builder Nov 15, 2013

Merged

Expose the Buffer constructor. #13

Owner

snoyberg commented Nov 15, 2013

I've pushed a change to make this happen, though it depends on a pull request I've sent against blaze-builder: meiersi/blaze-builder#13. Two other questions:

  • Do you think it would be more efficient to use the same logic for RspBuilder as well?
  • The default buffer size is 4096, which is a number I completely made up. Does increasing that value affect performance?
Contributor

kazu-yamamoto commented Nov 15, 2013

First of all, it seems to me that this patch does not implement what I suggested. This patch allocates a buffer per response, not per connection. I will fix it.

A.1) RspBuilder uses toByteStringIO which allocates a buffer by mallocByteString. mallocByteString is heavy because it depends on a global lock. If we can use the per-connection buffer, it would be nice.

A.2) It depends on WAI applications. If they typically return contents whose size is less than or equal to 4,096, increasing the buffer size has no effect. Moreover, this buffer is split into 1500 bytes packets on network. It is hard to consider the best size of buffer. So, I think we need not to consider it at this moment. Let's just pick up 4,096. If users think this is a bottleneck, we can revisit this later.

Contributor

kazu-yamamoto commented Nov 15, 2013

Please review this: 06355b1

Contributor

kazu-yamamoto commented Nov 15, 2013

One concern is that if an application sends static files only, this write buffer is overhead. It would be nice if we dynamically allocates a write buffer on demand. But I don't know how to ensure to free the write buffer when the connection is finished.

Owner

snoyberg commented Nov 15, 2013

Thanks for that fix, you're right, I misread your original comment.

Is there a reason we can't use the same buffer for both receiving and sending? It seems like the two should never overlap. That would reduce the overall of this feature to a call to newForeignPtr_.

Contributor

kazu-yamamoto commented Nov 15, 2013

Good point. I also thought about this but I was not sure whether or not they conflict. Probably, you are right. I will fix and let's see what will happen.

BTW. If we can ask to change toByteStringIO so as to take a Buffer parameter, we can share the buffer on RspBuilder.

Contributor

kazu-yamamoto commented Nov 16, 2013

Done in e9470ca.

I don't understand why we can remove newForeignPtr_. So, please review this patch, too.

Owner

snoyberg commented Nov 16, 2013

Looks like what I'd expect. What I meant about newForeignPtr_ is that the only extra overhead this feature would be introducing versus the previous code was that we'd be making a call to newForeignPtr_ when allocating the Buffer.

Contributor

kazu-yamamoto commented Nov 17, 2013

Thanks. Let's close.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment