Rack::ETag middleware doesn't play nice with Live streaming when commit! is called before write #14358

Open
rosenfeld opened this Issue Mar 12, 2014 · 41 comments

Contributor

rosenfeld commented Mar 12, 2014

This is similar to #9713, except that the code there doesn't fix the issue I'm experiencing.

Currently that middleware will only skip the body digest (and not call body.each) if the Cache-Control header is set to no-cache:

https://github.com/rack/rack/blob/master/lib/rack/etag.rb#L56

So, unless you set that header explicitly in the controller's action, streaming won't work out of the box. I spent several hours yesterday digging through several gem sources (railties, actionpack, activesupport, rack) to find out what was preventing streaming from working in my application, so I would appreciate it if Rails could make this work transparently if possible.

My first proposal is for Rails to automatically add Cache-Control: no-cache on the first commit! call on the response object, so that Rack::ETag will skip the digest. This should work since the action will wait for the first commit before returning to the dispatch method.

If that's not possible for some reason, then it would be great if we could flag actions which are supposed to be used with live streaming. Something like:

def my_action
  # This would set the Cache-Control header and do whatever else is needed
  # to make streaming just work.
  enable_streaming!
  # ...
end
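As a sketch of what the first proposal could look like, here is a toy response object (illustrative names only; this is not the real ActionDispatch::Response, just a minimal model of the idea):

```ruby
# Toy stand-in for the response object; NOT actual Rails internals.
class ToyResponse
  attr_reader :headers

  def initialize
    @headers = {}
    @committed = false
  end

  def committed?
    @committed
  end

  def commit!
    @committed = true # in Rails this would flush the headers to the client
  end
end

# The proposal: ensure Cache-Control is set before the headers go out,
# so Rack::ETag later sees no-cache and skips digesting the body.
module EagerNoCacheOnCommit
  def commit!
    headers['Cache-Control'] ||= 'no-cache' unless committed?
    super
  end
end

response = ToyResponse.new.extend(EagerNoCacheOnCommit)
response.commit!
puts response.headers['Cache-Control'] # => no-cache
```

Under this model an explicit early commit! would still leave the header in place for Rack::ETag to see.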

@robin850 robin850 added the actionpack label Mar 16, 2014

judofyr commented Jul 4, 2014

Rails should probably send streaming responses with Cache-Control: no-cache anyway, because exceptions will be rendered as a 200 OK response.

Contributor

rosenfeld commented Jul 7, 2014

Sending streaming responses with Cache-Control: no-cache will break on IE8 for HTTPS requests:

http://support.microsoft.com/kb/323308

Contributor

rosenfeld commented Jul 8, 2014

When I first noticed this bug I only needed streamed responses in a small side application which didn't need to support ETag, so I simply disabled that middleware to fix the issue in that JRuby on Rails application. But now I need live streaming support in the main application, and disabling the ETag middleware is not an option. So, if others are affected by this bug, here's the workaround we're using in our application (config/application.rb):

# Inserted after Rack::ETag (so it sits inside it in the stack), meaning its
# response handling runs first: force Cache-Control to no-cache for
# live-streamed responses so Rack::ETag skips the body digest, stashing the
# original value in a temporary header.
class LiveStreamPreventETagAfter
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    if body.respond_to?(:stream) && ActionController::Live::Buffer === body.stream
      headers['Cache-Control-Old'] = headers['Cache-Control']
      headers['Cache-Control'] = 'no-cache'
    end
    [status, headers, body]
  end
end

# Inserted before Rack::ETag (so it wraps it in the stack), meaning its
# response handling runs after Rack::ETag has finished: restore the original
# Cache-Control value, which we need for IE8 over HTTPS.
class LiveStreamPreventETagBefore
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    if body.respond_to?(:stream) && ActionController::Live::Buffer === body.stream
      if (old = headers.delete 'Cache-Control-Old')
        headers['Cache-Control'] = old
      else
        headers.delete 'Cache-Control'
      end
    end
    [status, headers, body]
  end
end
#...
    config.middleware.insert_before Rack::ETag, LiveStreamPreventETagBefore
    config.middleware.insert_after Rack::ETag, LiveStreamPreventETagAfter

Since we have to support IE8, simply setting Cache-Control to no-cache is not enough for some actions that add the Content-Disposition: attachment header, due to the IE8 bug mentioned above. That's why we set it back to the previous value after the Rack::ETag middleware has finished processing.

It would be great, though, if Rails could provide an officially supported way for middlewares to detect whether the response is streamed or not. It could be a thread-local or maybe a custom header...

Owner

matthewd commented Jul 8, 2014

As far as I can see, the expected behaviour is that streaming will set no-cache automatically: http://api.rubyonrails.org/classes/ActionController/Streaming.html#module-ActionController::Streaming-label-Middlewares

And while AC::Live doesn't seem to document the behaviour, it does likewise: https://github.com/rails/rails/blob/v4.0.0/actionpack/lib/action_controller/metal/live.rb#L41-45

Contributor

rosenfeld commented Jul 8, 2014

But the write will happen in a separate thread, after the middleware is run.

Contributor

rosenfeld commented Jul 8, 2014

What about support for IE8? If Cache-Control is set to no-cache and the Content-Disposition header is set to attachment in the action, it won't work. Here's another link, complementing the one from MS I mentioned above:

https://code.google.com/p/dompdf/issues/detail?id=235

Owner

matthewd commented Jul 8, 2014

Hmmm... can you please create an example that shows the header not getting set?

The question of IE8 behaviour seems only tangential. A change to help with that doesn't sound eligible for backporting, either.

Contributor

rosenfeld commented Jul 8, 2014

I'll try to find some time this week to create a sample application. Should I use latest released Rails or master?

Contributor

rosenfeld commented Jul 8, 2014

With regard to IE8 support, depending on how things are implemented in Rails it may not be possible to find a workaround for supporting IE8 when required... I don't think it's OK for Rails to simply ignore IE8 bugs...

Owner

matthewd commented Jul 8, 2014

As far as I can see, it would be a bug for either version to fail to set the header, so whichever you can show that behaviour on, would be great.

I'm happy to discuss a change to work around IE8's misbehaviour, but it doesn't seem like it belongs on an issue titled "Rack::ETag middleware doesn't play nice with Live streaming"... so a separate new issue seems appropriate. And it would be a feature, not a bug/regression, so that one would only be a candidate for fixing on master.

@rosenfeld rosenfeld changed the title from Rack::ETag middleware doesn't play nice with Live streaming to Rack::ETag middleware doesn't play nice with Live streaming when commit! is called before write Jul 8, 2014

Contributor

rosenfeld commented Jul 9, 2014

Ha, while I was simplifying my sample app I found out that this problem is not that generic but happens due to our specific use case. I changed the issue title to reflect this.

Some actions take quite a few seconds before they start sending data to the client. To improve the UX we call response.commit! as the first thing in the action, before any of the initial processing required by the streaming, so the first write happens after response.commit!. When I removed this initial commit for the sample application, I noticed streaming started working, which allowed me to narrow down the details of this bug. You can try this:

class MainController < ActionController::Base

  include ActionController::Live

  def test
    response.headers['Content-Type'] = 'text/html'
    response.commit! # it works if you comment this line
    response.stream.write "test1\n"
    response.commit!
    sleep 1
    response.stream.write "test2\n"
    response.commit!
    response.stream.close
  rescue IOError
    # ignore closed stream errors from curl -I
  end
end

I know that the reason for the await_commit before returning from process is to allow Rails to redirect to the error page if something bad happens before streaming starts, but for some of our actions this behavior is annoying because it makes the UX worse. So we disable it by promptly committing before any processing. Since Cache-Control: no-cache is only set on the first write, and only when that write happens before any commit, this causes the ETag middleware to act on the action's response. Does that make sense?

Contributor

rosenfeld commented Jul 9, 2014

Tomorrow I'll try to reproduce the IE8 bug and create a new issue for it if I can confirm it. Do you still need a sample application for this issue, or is my last comment enough for you to reproduce it?

Owner

matthewd commented Jul 9, 2014

Ah okay, that makes sense.

So, the workaround would be to use response.stream.write "" instead of the initial response.commit!.
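The difference between the two orderings can be modeled with a toy stream, assuming write only sets Cache-Control when the response hasn't been committed yet (illustrative classes, not the actual ActionController::Live internals):

```ruby
# Toy model of Live's current first-write behavior (simplified,
# hypothetical names; the real logic lives in ActionController::Live).
class ToyLiveStream
  attr_reader :headers

  def initialize
    @headers = {}
    @committed = false
  end

  def commit!
    @committed = true # headers are flushed exactly as they are now
  end

  def write(chunk)
    unless @committed
      @headers['Cache-Control'] ||= 'no-cache' # only set on the first write
      commit!
    end
    chunk
  end
end

# commit! before any write: Cache-Control is never set, Rack::ETag digests.
a = ToyLiveStream.new
a.commit!
a.write "data"
a.headers['Cache-Control'] # => nil

# write "" first: the header is set before the headers go out.
b = ToyLiveStream.new
b.write ""
b.headers['Cache-Control'] # => "no-cache"
```

Under this model, committing first leaves Cache-Control unset, which is exactly the window in which Rack::ETag buffers the body.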

@tenderlove is commit! intended to be public API? sending! and sent! sure don't sound like they should be, and they all lack documentation, but they're not :nodoc:. (ref)

Contributor

rosenfeld commented Jul 9, 2014

What's the reason for write calling commit! before filling the buffer rather than after?

isaiah commented Jul 24, 2014

Rails live streaming is built upon the rack.hijack API; Rack middlewares would not be able to touch the response. You can find the sample code here.

Off topic, why do you explicitly commit the response? According to the document, "Calling write or close on the response stream will cause the response object to be committed".

Owner

matthewd commented Jul 24, 2014

Rails live stream is built upon rack.hijack API

No, it's not. Though it (probably) eventually should be.

Contributor

rosenfeld commented Jul 24, 2014

@isaiah first let me thank you for reminding me about that thread on the rack.hijack API. I'll answer your questions in a minute but let me first answer another question in rails-core list.

Contributor

rosenfeld commented Jul 24, 2014

Hmm... never mind. I just noticed rack.hijack? is always true for all Rails requests, and I couldn't find anything different in env that would allow a middleware to know whether the response is streamed or not.

Contributor

rosenfeld commented Jul 24, 2014

Off topic, why do you explicitly commit the response? According to the document, "Calling write or close on the response stream will cause the response object to be committed".

Well, let me address those statements separately. First, the reason I commit first is that for some actions the first chunk can take a few seconds of processing before it's sent. I want to give the user a better experience, so I call commit! to send the headers instantly. There are 3 goals for this:

1 - if the user opts to select the download destination they can immediately start looking for the destination directory while the server is still processing the request;
2 - they get the feeling the request was handled very quickly and that download speed is what's preventing the file from being ready ;)
3 - I don't have to worry about adding some kind of spinner/wait message so the user knows their request is still being processed.

Now about your other statement: for some reason write is implemented by first committing and then writing to the buffer, which means the last chunk will stay buffered rather than sent until the next write/commit!/close. I don't understand the reason for this behavior of write.

Owner

matthewd commented Jul 24, 2014

commit! doesn't mean "send something from the buffer", it means "we've finished building the headers; send them, and move on to a blocking read of the body buffer".
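This blocking read of the body buffer can be sketched with a queue-backed toy body (illustrative only, not the real ActionController::Live implementation):

```ruby
require 'thread'

# Toy model: after commit!, the server thread does a blocking read of the
# body buffer; each write enqueues a chunk. Names are illustrative.
class ToyLiveBody
  def initialize
    @queue = Queue.new
  end

  def write(chunk)
    @queue << chunk
  end

  def close
    @queue << :eof
  end

  # What the server does after commit!: block popping chunks until close.
  def each
    while (chunk = @queue.pop) != :eof
      yield chunk
    end
  end
end

body = ToyLiveBody.new
writer = Thread.new do
  body.write "test1\n"
  body.write "test2\n"
  body.close
end

chunks = []
body.each { |c| chunks << c } # blocks until the writer closes the stream
writer.join
chunks # => ["test1\n", "test2\n"]
```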

Contributor

rosenfeld commented Jul 24, 2014

Ah, OK, that makes sense. I just feel it's weird to call write '' instead of write! for the first call. Also, I'm not sure it wouldn't change the output of a generated Excel file...

isaiah commented Jul 24, 2014

@matthewd I think you are confusing live streaming with http streaming, which is triggered by

render stream: true

@rosenfeld rack.hijack? indicates whether the Rack server running the Rails application supports the rack.hijack API. To see whether the socket was hijacked, you need to read the status code; in this case it would be -1. There is a great blog post about this: http://blog.phusion.nl/2013/01/23/the-new-rack-socket-hijacking-api/.

Please try the script provided in my gist; it works perfectly.

Contributor

rosenfeld commented Jul 24, 2014

I always get status 200 in Rails from a Rack middleware regardless of using Live Streaming or not.

Contributor

rosenfeld commented Jul 24, 2014

I think you are confusing live streaming with http streaming

Both are about http streaming from my understanding. The difference is that ActionController::Live will allow me to use response.stream to send the chunks. Am I missing something?

Contributor

rosenfeld commented Jul 24, 2014

I'm using JRuby and Apache POI with a streamed buffer API to generate the Excel output so I don't think I would be able to use the render stream: true variant for this case...

Owner

matthewd commented Jul 24, 2014

@isaiah rack.hijack doesn't occur anywhere in the Rails repository. My recollection is that when we tried to make AC::Live use it, we ran into issues.

@rosenfeld I'm sure it won't affect your Excel file -- it's zero bytes. But I won't argue that it's not ugly. 😄

Contributor

rosenfeld commented Jul 24, 2014

@isaiah I won't comment on your gist because notifications don't work in gists, so let me discuss it here if you don't mind:

Rack::Etag has no effect on the streaming request at all!

I didn't understand your statement. You seem to have disabled Rack::ETag in your gist example:

https://gist.github.com/isaiah/c02d0aed6f2eefe6037b#file-rails-ru-L21-L22

Also, you're not calling response.commit! before the first call to response.stream.write. Try including the Rack::ETag middleware in your example app and calling response.commit! as the first thing in your controller's action and let me know if it works as intended.

Contributor

rosenfeld commented Jul 24, 2014

I think this discussion is relevant to this issue, so let me link it here:

https://groups.google.com/forum/#!topic/rubyonrails-core/4ej7ztw0oBM

If AC::Live could follow some Rack convention for identifying streamed/chunked responses it would be very helpful. Maybe by implementing rack.hijack, maybe by using the -1 status if that is what Rack recommends. Even some Rails-specific flag would be progress if there's no consensus on how to let Rack middlewares know the response is a streamed one.

isaiah commented Jul 24, 2014

@matthewd Indeed, now it uses a different trick: a separate thread.

@rosenfeld Though you can still use it as an alternative to live streaming:

# config.ru using a full hijack: take over the socket and write the raw
# HTTP response ourselves, bypassing every Rack middleware (including
# Rack::ETag). The -1 status tells the server the response was hijacked.
app = lambda do |env|
  io = env["rack.hijack"].call
  io.write "HTTP/1.1 200\r\n\r\nBLAH\r\n"
  10.times { io.write "BLAH\r\n"; sleep 0.2 }
  io.close
  [-1, {}, []]
end

run app
Contributor

rosenfeld commented Jul 24, 2014

@isaiah I appreciate it, but my actions must go through the regular filters (authentication, logging, etc.), and after all I already have a workaround for my application. I'm just reporting the issue in the hope it can be fixed in a future Rails release.

isaiah commented Jul 24, 2014

@rosenfeld If I put response.commit! before response.stream.write, it hangs. I just found that if you add response.stream.write("") before your first write, it sends the headers without the ETag. Maybe that's a better hack for you.

This issue has been automatically marked as stale because it has not been commented on for at least three months.

The resources of the Rails team are limited, and so we are asking for your help.

If you can still reproduce this error on the 4-1-stable or 4-0-stable branches or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions.

@rails-bot rails-bot added the stale label Nov 19, 2014

Contributor

rosenfeld commented Nov 24, 2014

This is still a bug in 4.1.8.

Owner

rafaelfranca commented Nov 24, 2014

Yeah, it seems to be a bug. Thanks, I've pinned the issue. Pinned issues will be kept open for 6 months; if they are not solved in this timeframe our bot will close them automatically.

@rafaelfranca rafaelfranca added pinned and removed stale labels Nov 24, 2014

Contributor

rosenfeld commented Nov 24, 2014

I really don't understand this policy of closing confirmed bugs just because no one fixed them in a long while... :-(

Owner

rafaelfranca commented Nov 24, 2014

It is easy. Keeping issues open just for the sake of keeping them open will not fix the bugs, and it just makes it harder to find bugs on the tracker.

I don't like closing them any more than you dislike seeing them closed, but it is the only way to keep this tracker sane for our contributors.

There is also a hidden reason. Usually the person most interested in fixing a bug is the reporter. The Rails team usually fixes other people's bugs, but we don't have time to fix all of them, so we post a warning in the issues and the reporters can take some action, like trying to fix their own bugs. After all, it is open source and we are not paid to solve other people's problems; we do it because we want to, but it is impossible to fix everything.

Contributor

rosenfeld commented Nov 24, 2014

I'm by no means expecting anyone to fix this bug in a reasonable timeframe. I know how open source works, but this is the first OS project I know of that adopts such a policy. Usually big projects use real issue-tracking systems like Redmine, Jira, or in-house tools, which can deal with issues like this by lowering their priority after a while, or by providing other features that keep them from polluting the tickets listed for people who are willing to fix some tickets but don't want to keep seeing the same bugs they've already decided not to fix themselves. There's nothing wrong with not being willing to fix bugs that affect others. I'm just surprised by the decision to close known, unfixed bugs just because no one has worked on them for a while. Maybe two years from now someone else will be affected by this bug and willing to fix it. Or someone willing to help fix Rails bugs, with an interest in ActionController or data streaming, might want to focus on bugs like this one.

I believe most of the pain Rails contributors feel in dealing with Rails bugs comes from the fact that GitHub, while awesome at lots of things, is not a proper issue-tracking system for big projects. After all, the bugs will still exist, and closing them doesn't add any real benefit in my opinion. If the problem is that it becomes hard to track so many open bugs on GitHub, a better solution would be to improve the issue-tracking system rather than limiting the number of open bugs...

Owner

rafaelfranca commented Nov 24, 2014

I'd use this same policy in any issue tracker. We have been running the project like this for a long time and I believe we are doing very well. Closing issues doesn't prevent people who hit the same issue from fixing it; they will just open a new issue, or comment on the closed one. This is perfectly fine for us.

Contributor

rosenfeld commented Nov 24, 2014

But it prevents existing bugs from potentially being fixed in some eventual community-driven bug-triage process, and this has already happened with Rails a few times...

Owner

rafaelfranca commented Nov 24, 2014

@rosenfeld you gave me a good idea. I could use pinned issues to drive bug triages. I believe it is a good way to deal with these long running issues. I'll try to organize something 👍

Contributor

rosenfeld commented Nov 24, 2014

👍
