Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP basic auth convert response body to string is hanging #152

Closed
JuanitoFatas opened this issue Aug 17, 2014 · 21 comments
Closed

HTTP basic auth convert response body to string is hanging #152

JuanitoFatas opened this issue Aug 17, 2014 · 21 comments
Assignees
Labels
Milestone

Comments

@JuanitoFatas
Copy link
Contributor

Hello!

I am doing a basic auth, when I get the response body back. And convert to string. It hangs.

HTTP.auth(:basic, authenticate_params).get(FEED_URL)

returns

=> #<HTTP::Response/1.0 200 OK headers=#<HTTP::Headers {"Date"=>"Sun, 17 Aug 2014 04:21:49 GMT", "Server"=>"Apache", "X-Powered-By"=>"PHP/5.3.10-1ubuntu3.12", "Set-Cookie"=>"symfony=4h9fj1r2vvkb2fn0h12bkulgs6; path=/", "Expires"=>"Thu, 19 Nov 1981 08:52:00 GMT", "Cache-Control"=>"no-store, no-cache, must-revalidate, post-check=0, pre-check=0", "Pragma"=>"no-cache", "Vary"=>"Accept-Encoding,User-Agent", "P3p"=>"CP=\"NOI CURa OUR NOR UNI\"", "Connection"=>"close", "Content-Type"=>"application/rss+xml; charset=UTF-8"}>>

And HTTP.auth(:basic, authenticate_params).get(FEED_URL).body

returns

=> #<HTTP::Response::Body:3fc442389c6c @streaming=false>

And

HTTP.auth(:basic, authenticate_params).get(FEED_URL).body.to_s

Hangs at to_s.

Any advices? Thank you!

The relevant code, please see here.

@JuanitoFatas
Copy link
Contributor Author

I find it stucks in this while loop: https://github.com/tarcieri/http/blob/master/lib/http/response/body.rb#L40-L42

@ixti
Copy link
Member

ixti commented Aug 17, 2014

What http version is it? There was a bug with chunked responses (fixed in master and in 0.6.2).
If you still experience this issue even with 0.6.2 or master let me know, but I believe it should be OK.

PS You don't need to get body explicitly, just call #to_s on response directly.

@JuanitoFatas
Copy link
Contributor Author

Thanks for reply. Sorry I was using 0.6.1. Update to 0.6.2 still hangs (already change to HTTP.auth(:basic, authenticate_params).get(FEED_URL).to_s. It stucks at:

while (chunk = @client.readpartial)
  @contents << chunk
end

@parser.finished? seems always be false.

No one calls on_message_complete.

def on_message_complete
  @finished = true
end

@ixti
Copy link
Member

ixti commented Aug 17, 2014

on_message_complete is a callback called by underlying http_parser. Will need to see server-client exchange (headers, chunks received) in order to understand where the bug might come from.

UPDATE: Actually I just need raw data that is consumed from the socket as is.

@JuanitoFatas
Copy link
Contributor Author

@ixti Thanks! Did not know the http_parser, cool!!!

I have put the data in this gist. https://gist.github.com/JuanitoFatas/b21a7f8e7c8523deb3b1

Thanks!!!

@ixti
Copy link
Member

ixti commented Aug 18, 2014

Sorry I was a bit unclear :D I wanted to see the full data stream sent to socket. In other words somthing like this:

require "http/response/parser"

module HTTP
  class Response
    class Parser
      def add(data)
        puts data.inspect
        @parser << data
      end
      alias_method :<<, :add
    end
  end
end

I'll try to do that myself once I'll get some time (need to get some kind of trial access to that server).

@Oliviergg
Copy link

Hi,
I have the same problem when trying to get data from a URL. I don't understand the origin of the problem.

My sample code :

url = "http://188.165.196.212:8000/index.html"
user_agent = "Mozilla/5.0 (iPad; CPU OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) CriOS/30.0.1599.12 Mobile/11A465 Safari/8536.25"
accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
headers = {
            "User-Agent" => user_agent,
            accept:accept,
            "DNT" => 1,
            "Cache-Control" => "max-age=0",
            "Accept-Encoding" => "gzip,deflate,sdch",
            "Accept-Language"=> "en-US,en;q=0.8,fr;q=0.6",
            }
r = HTTP.with(headers).get(url)
r.to_s # Hang and ruby process consums 100% of CPU

With the indication you give : here the sockets data :

Reloading...
"HTTP/1.0 200 OK\r\ncontent-type:text/html\r\n\r\n<HTML><HEAD><meta http-equiv=\"Content-Language\" content=\"en-us\"><meta http-equiv=\"Content-Type\" content=\"text/html; charset=windows-1252\"><meta http-equiv=\"Pragma\" content=\"no-cache\"><meta http-equiv=\"Expires\" content=\"Mon, 01 Jan 1990 12:00:00 GMT\"><title>SHOUTcast Administrator</title><style type=\"text/css\"><!--a:link {color: blue; font-family:Arial, Helvetica; font-size:9pt;}a:visited {color: blue; font-family:Arial, Helvetica; font-size:9pt;}a:hover {color: red; font-family:Arial, Helvetica; font-size:9pt; }.default {color: White; font-family:Arial, Helvetica; font-size:9pt; font-weight: normal}.ST {color: White; font-family:Arial, Helvetica; font-size:8pt; font-weight: normal}.logoText {color: red; font-family: Arial Black, Helvetica, sans-serif; font-size: 25pt; font-weight: normal; letter-spacing : -2.5px;}.flagText {color: blue; font-family: webdings; font-size: 36pt; font-weight: normal; }.ltv {color: blue; font-family: Arial, Helvetica, sans-serif; font-size: 9pt; font-weight: normal;}.tnl {color: black; font-family: Arial, Helvetica, sans-serif; font-size: 10pt; font-weight: bold; text-decoration: none;}--></style></HEAD><BODY topmargin=0 leftmargin=0 marginheight=0 marginwidth=0 bgcolor=#000000 text=#EEEEEE link=#001155 vlink=#001155 alink=#FF0000><font class=default><table width=100% border=0 cellpadding=0 cellspacing=0><tr><td height=50><font class=flagText>U</font><fo"
=> #<HTTP::Response/1.0 200 OK headers=#<HTTP::Headers {"Content-Type"=>"text/html"}>>
[2] pry(main)> _.to_s
 "nt class=logoText>&nbsp;SHOUTcast D.N.A.S. Status</font></td></tr><tr><td height=14 align=right><font class=ltv><a id=ltv href=\"http://www.shoutcast.com/\">SHOUTcast Server Version 1.9.8/Linux</a></font></td></tr><tr><td bgcolor=#DDDDDD height=20 align=center><table width=100% border=0 cellpadding=0 cellspacing=0><tr><td align=center><font class=tnl><a id=tnl href=\"index.html\">Status</a></font></td><td align=center><font class=tnl>&nbsp;|&nbsp;</font></td><td align=center><font class=tnl><a id=tnl href=\"played.html\">Song History</a></font></td><td align=center><font class=tnl>&nbsp;|&nbsp;</font></td><td align=center><font class=tnl><a id=tnl href=\"listen.pls\">Listen</a></font></td><td align=center><font class=tnl>&nbsp;|&nbsp;</font></td><td align=center><font class=tnl><a id=tnl href=\"home.html\">Stream URL</a></font></td><td align=center><font class=tnl>&nbsp;|&nbsp;</font></td><td align=center><font class=tnl><a id=tnl href=\"admin.cgi\">Admin Login</a></font></td></tr></table></td></tr></table><br><table cellpadding=5 cellspacing=0 border=0 width=100%><tr><td bgcolor=#000025 colspan=2 align=center><font class=ST>Current Stream Information</font></td></tr></table><table cellpadding=2 cellspacing=0 border=0 align=center><tr><td width=100 nowrap><font class=default>Server Status: </font></td><td><font class=default><b>Server is currently up and public.</b></td></tr><tr><td width=100 nowrap><font class=default>Stream Status: </font></td><td><font class=default><b>Stream is up at 128 kbps with <B>22 of 1000 listeners (18 unique)</b></b></td></tr><tr><td width=100 nowrap><font class=default>Listener Peak: </font></td><td><font class=default><b>274</b></td></tr><tr><td width=100 nowrap><font class=default>Average Listen Time: </font></td><td><font class=default><b>2h&nbsp;37m&nbsp;29s</b></td></tr><tr><td width=100 nowrap><font class=default>Stream Title: </font></td><td><font class=default><b>ATLANTICA</b></td></tr><tr><td width=100 nowrap><font class=default>Content Type: </font></td><td><font class=default><b>audio/mpeg</b></td></tr><tr><td width=100 nowrap><font class=default>Stream Genre: </font></td><td><font class=default><b>Rock</b></td></tr><tr><td width=100 nowrap><font class=default>Stream URL: </font></td><td><font class=default><b><a href=\"http://www.atlantica-radio.fr\">http://www.atlantica-radio.fr</a></b></td></tr><tr><td width=100 nowrap><font class=default>Stream AIM: </font></td><td><font class=default><b><a href=\"aim:goim?screenname=N/A\">N/A</a></b></td></tr><tr><td width=100 nowrap><font class=default>Stream IRC: </font></td><td><font class=default><b><a href=\"http://www.shoutcast.com/chat.phtml?dc=N%2FA\">N/A</a></b></td></tr><tr><td width=100 nowrap><font class=default>Current Song: </font></td><td><font class=default><b>Avicii - Lay Me Down</b></td></tr></table><br><table cellpadding=0 cellspacing=0 border=0 width=100%>\t<tr><td bgcolor=#DDDDDD  nowrap colspan=5 align=center><table cellspacing=0 cellpadding=0 border=0><tr><td><font class=ltv>Written by Stephen 'Tag Loomis, Tom Pepper and Justin Frankel</font></td></tr></table></td></tr><tr><td nowrap colspan=5 align=center><font class=ST><b><a href=\"http://www.shoutcast.com/disclaimer.phtml\">Copyright Nullsoft Inc</a><a href=\"/llamacookie\">.</a> 1998-2004</b></font></td></tr></table></font></body></html>"

Nothing written after this. In Body.to_s chunk always == ""

module HTTP
  class Response
    class Body

      # Eagerly consume the entire body as a string
      def to_s
        return @contents if @contents
        fail StateError, 'body is being streamed' unless @streaming.nil?

        begin
          @streaming = false
          @contents = ''
          binding.pry
          while (chunk = @client.readpartial) 
# @client.readpartial always return ""
            @contents << chunk 
          end
        rescue
          @contents = nil
          raise
        end

        @contents
      end

    end
  end
end

@ixti
Copy link
Member

ixti commented Aug 27, 2014

Now that's something interesting.
@JuanitoFatas can you provide headers you receive?:

require "yaml"
puts YAML.dump response.headers.to_h

@Oliviergg
Copy link

It's seem to me that http_parser never call on_message_complete.But http_parser core is in C. I'm not able to try to debug this to understand.
Perhaps, there is a problem on server side ? Some persistant connection or Socket ?

Just a very quick and dirty monkey patch that work for me.

module HTTP
  class Response
    class Parser

      def chunk
        chunk, @chunk = @chunk, nil
        @finished = true if chunk.nil?
        chunk
      end

    end
  end
end

@tarcieri
Copy link
Member

Ugh. I think we should probably dig into http_parser and see what's up

@ixti
Copy link
Member

ixti commented Aug 28, 2014

Yeah, I suspect issue is that content is neither chunked nor headers tells content length.
At least in case of headers which @Oliviergg provided it seems to be a reason. As in this case http_parser will not be able to determine the end. So we need to track this situation and handle that manually.

@ixti ixti added the Bug label Aug 28, 2014
@JuanitoFatas
Copy link
Contributor Author

@ixti The headers I get:

> puts YAML.dump response.headers.to_h
---
Date: Thu, 28 Aug 2014 01:42:49 GMT
Server: Apache
X-Powered-By: PHP/5.3.10-1ubuntu3.13
Set-Cookie: symfony=20hvqk591qdlgr0akgf3s01r53; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding,User-Agent
P3p: CP="NOI CURa OUR NOR UNI"
Connection: close
Content-Type: application/rss+xml; charset=UTF-8
=> nil

😀

@ixti
Copy link
Member

ixti commented Sep 15, 2014

As I said before, issue is exactly when HTTP Gem (and underlying http_parser) can't determine that response finished. Here's a simple example of a server that makes HTTP Gem run into infinitive loop:

require "socket"

server = TCPServer.new 8080

loop do
  Thread.start server.accept do |client|
    client << "HTTP/1.1 200 OK\r\n"
    client << "Content-Type: text/plain\r\n"
    client << "\r\n"
    client << "Hello!"
  end
end

So, running HTTP.get against that server will allow you to get headers but not the body:

HTTP.get("http://localhost:8080/").headers
# => #<HTTP::Headers {"Content-Type"=>"text/plain"}>

HTTP.get("http://localhost:8080/").to_s
# freeze...

I don't think that HTTP Gem should handle this situation by default. For example Firefox is getting into infinitive loop as well. But we can have some kind of optional no more data timeout. Which will look something like this (just an idea, not a real code):

def readpartial(...)
  chunk = @socket.read(...)

  if chunk
    @no_data_at = nil
    return chunk
  end

  now = Time.now.to_i
  @no_data_at ||= now

  fail_at = @no_data_at + @options[:wait_for_data].to_i
  fail NoDataTimeot if fail_at < now

  nil
end

Again this should not be handled by default at all. Only if explicitly specified in options. And it should be configurable, what to do on NoDataTimeout -- treat response as finished, or fail loudly. Of course, in case of failing loudly I would expose response with state as it was on the moment of exception as exception's property:

begin
  puts HTTP.response_timeout(:fail_after => 10).get("http://defective-server.tld").to_s
rescue HTTP::NoDataTimeout => e
  puts "Shit happens. Can't wait any longer... What we have so far is:"
  puts e.response.to_s
end

@chrchr
Copy link

chrchr commented Oct 1, 2014

Content-Length being absent isn't a terribly unusual condition. The correct handling is typically to terminate the body when the connection is closed. That's what Oliviergg's monkey patch does.

@tarcieri
Copy link
Member

tarcieri commented Oct 1, 2014

Unfortunately the Content-Length handling seems to break with every rewrite we've done of the internals 😉 You're correct about that behavior and it'd be great if there were a test case capturing it. I'm kind of surprised there isn't

@ixti
Copy link
Member

ixti commented Oct 3, 2014

The problem is that (!) this bug happens when connection is not closed by server. Like in the sample I have shared before. If you will change that broken server code to have socket.close in the end, the HTTP Gem will handle that correctly. So what we really need is an optional timeout.

@ixti
Copy link
Member

ixti commented Nov 12, 2014

Everybody affected by this issue, please, check if it was addressed with #163/#166 (in master and 0-6-stable branches). If so, I would like to close this issue.

@ixti ixti self-assigned this Nov 12, 2014
@JuanitoFatas
Copy link
Contributor Author

Will verify this weekend, thank you @ixti !!! ❤️

@sferik
Copy link
Contributor

sferik commented Mar 27, 2015

@JuanitoFatas Any update on this?

@JuanitoFatas
Copy link
Contributor Author

Any update on this?

Just verified, now works like a charm. Thanks for the reminder.

@sferik sferik added this to the v0.7 milestone Mar 27, 2015
@sferik
Copy link
Contributor

sferik commented Mar 27, 2015

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants