Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

undefined method 'headers' for nil:NilClass #178

Closed
canavese opened this Issue Jun 21, 2012 · 21 comments

Comments

Projects
None yet
3 participants

I am getting the error "undefined method 'headers' for nil:NilClass" when trying to re-use a cassette on a particular request. Apparently there is something wrong with the second request in the cassette (the first request succeeds). I have many other requests working with VCR, so the issue seems specific to this request.

Note that when I run my test without the cassette, the request works fine, my test passes, and the cassette is generated. Running the test again (using the existing cassette) fails.

My environment:

  • VCR 2.2.2 (but problem also occurred on 2.2.0 and 2.2.1)
  • Mechanize 2.5 (but problem also occurred on 2.5.1)
  • FakeWeb 1.3.0.

Here is a gist with the cassette, VCR log, and backtrace. I have anonymized it a little:
https://gist.github.com/2968371

It appears that VCR (or mechanize) is trying to fulfill the request twice, and not finding it the second time. Again, it works fine when the cassette is generated. And my code is only requesting the page once. Note that the returned HTML is a mess with multiple blocks, but I'm not sure how that would cause this behavior.

Let me know what else I can do to help track this down.

Thanks,

Paul

Owner

myronmarston commented Jun 22, 2012

Thanks for reporting this. I'll try to take a look at it tomorrow.

Member

Jacobkg commented Jun 23, 2012

I've been trying to reproduce this issue in a toy example. Can you give us the test code that is triggering this problem?

Member

Jacobkg commented Jun 23, 2012

I seem to be able to reproduce the same exception with this code:

require 'rubygems'
require 'vcr'
require 'fakeweb'
require 'mechanize'

VCR.configure do |c|
  c.cassette_library_dir = 'fixtures'
  c.hook_into :fakeweb
  c.debug_logger = File.open('issue.log', 'w')
end

describe "Trying to repro the Issue" do
  it "should fail?" do
    VCR.use_cassette('undefined_method_headers') do
      Mechanize.new.get("https://ecare.testdomain.com/MyAccount/acct_summary.asp")
    end
  end
end

Glad you could reproduce it. Unfortunately, this particular case is logging in to a private external system that I don't have control over. But hopefully you can track down the issue based on what I have provided. I'd be happy to test any workarounds/fixes

Member

Jacobkg commented Jun 23, 2012

I have reduced the cassette down to this:

--- 
http_interactions: 
- request: 
    method: get
    uri: https://ecare.testdomainz.com/foo.php
    body: 
      encoding: US-ASCII
      string: ""
    headers: 
  response: 
    status: 
      code: 200
      message: OK
    headers: 
      content-length: 
      - "1"
    body: 
      encoding: US-ASCII
      string: ab

    http_version: "1.1"
  recorded_at: Thu, 21 Jun 2012 20:24:43 GMT
recorded_with: VCR 2.2.2

If I change the response string to "a", so it matches the content-length of 1, then my test passes. If I make it "" or "ab", then it fails with that header exception.

Owner

myronmarston commented Jun 23, 2012

Hmmm...that's starting to sound like #43 all over again. That one was due to a bug in syck that would cause it to modify strings as they are serialized (see this comment and following).

Is your ruby compiled with psych support?

Also, you might try using the update_content_length_header option as a work around.

Member

Jacobkg commented Jun 23, 2012

Machinist uses net-http-persistent, which automatically retries requests on failure (perhaps in the same thread). Could this be a factor?

Member

Jacobkg commented Jun 23, 2012

Two pieces of the puzzle:

  1. net-http-persistent causes a retry of the request in the same thread on a failure (in this case I think caused by the wrong content length

  2. Fakeweb caches the request_type as :stubbed_by_vcr in an instance variable so the instead of the second request getting a VCR::Errors::UnhandledHTTPRequestError, we get a nil response variable.

Owner

myronmarston commented Jun 23, 2012

@Jacobkg -- nice slething. I'll have to play with the example you posted a bit (hopefully, later this weekend) so I can see what's going on in more detail before I can recommend any ways to fix this. Feel free to come up with something on your own, though.

@canavese -- have you tried using the :psych serializer or the :update_content_length_header option as work arounds?

Hey guys, thanks for jumping on this. Don't have too much time to fuss around with this before Monday, but tried a few things quickly.

Using the update_content_length_header with the use_vcr_cassette RSpec macro didn't seem to work. I'm not sure if that's supposed to affect the generated cassette or how a cassette is interpreted, but I tried regenerating the cassette with that option to be sure.

I did try manually deleting the content-length headers from the cassette, and that did result in the test passing.

Owner

myronmarston commented Jun 24, 2012

Using the update_content_length_header with the use_vcr_cassette RSpec macro didn't seem to work. I'm not sure if that's supposed to affect the generated cassette or how a cassette is interpreted, but I tried regenerating the cassette with that option to be sure.

It should work...I tried it with the example @Jacobkg posted and it worked just fine. Your statement makes me wonder if there's more going on here than we've discovered so far.

BTW, it only affects play back, not recording, so you didn't need to re-record the cassette, but it was a good idea to check that anyway.

@Jacobkg -- any ideas how to fix this? This is a pretty crazy bug!

Member

Jacobkg commented Jun 24, 2012

When I run this test file (with :update_content_length_header => true) against the exact cassette from the gist, I get no errors:

require 'rubygems'
require 'vcr'
require 'fakeweb'
require 'mechanize'

VCR.configure do |c|
  c.cassette_library_dir = 'fixtures'
  c.hook_into :fakeweb
end

describe "Trying to repro the Issue" do
  it "should fail?" do
    VCR.use_cassette('undefined_method_headers', :update_content_length_header => true) do
      agent = Mechanize.new
      agent.get("https://ecare.testdomain.com/MyAccount/acct_summary.asp")
    end
  end
end

If I set update_content_length_header to false, then I get the undefined method 'headers' exception.

I took another look, and I had somehow set update_content_length_header to false :-(. So setting it to true indeed works for me. So that solves my immediate problems although, obviously, it would be nice if the workaround was not necessary. Let me know if I can help further in debugging the root problem.

Owner

myronmarston commented Jun 25, 2012

I took another look, and I had somehow set update_content_length_header to false :-(. So setting it to true indeed works for me. So that solves my immediate problems although, obviously, it would be nice if the workaround was not necessary. Let me know if I can help further in debugging the root problem.

Glad to know the option works for you :).

Re: the root problem--I mentioned this above, but there's a bug in syck that can cause it to sometimes truncate whitespace when serializing strings. Usually it's not a big deal (HTML/XML/JSON are not whitespace sensitive) but it can change the content length. Are you using syck? You can try using the psych or json serializers instead.

Even though the underlying problem is not due to a VCR bug, it exposes a bug in VCR--at least, I consider it to be so since the error is a NoMethodError deep in VCR's internals. So I still want to fix or improve this somehow, but it's not clear to me how to do that just yet...

Ok, so forcing psych also avoids the problem. I am on ruby 1.9.3, but just discovered that I had an override of the default YAML serializer for legacy reasons. So removing that also avoids the problem. So I'm all good now ;-)

In terms of helping others avoid this in the future, could you detect if syck is being used and if so default update_content_length_header to true? Aside from being a little confusing to explain those defaults, that would make sense to me...

Owner

myronmarston commented Jun 25, 2012

In terms of helping others avoid this in the future, could you detect if syck is being used and if so default update_content_length_header to true? Aside from being a little confusing to explain those defaults, that would make sense to me...

It's an interesting idea, but I'm concerned that it might lead to false positive tests. Consider the case where the server has a bug that causes it to mis-report the content length. If the client uses the content length (as mechanize does) then you'll have to work around it somehow. If VCR defaulted :update_content_length_header, it could potentially lead to a false positive test, where the test passes (because VCR is "correcting" the content length header) but in production the code fails because it does not have that correction happening.

Member

Jacobkg commented Jun 26, 2012

@myronmarston Do you have a sense for what you think the behavior should be? For webmock, what happens is you get a VCR::Errors::UnhandledHTTPRequestError: (since we are not matching requests multiple times).

The bug seems to be in the way that the Fakeweb request handler implements set_typed_request_for_after_hook. It stores the request type in the @__vcr_request_type instance variable on the request, which it then uses to set the default request type when you initialize a new Fakeweb request handler. If you could elaborate on why this happens (and why for fakeweb but not webmock or the others) that would be helpful. This is the code, from fakeweb.rb

if ([:@__vcr_request_type, "@__vcr_request_type"] & request.instance_variables).any?
  @request_type = request.instance_variable_get(:@__vcr_request_type)
else
  @request_type = nil
end
Owner

myronmarston commented Jun 26, 2012

@Jacobkg -- that dates back to this commit. Net::HTTP has some weird behavior where it will call itself recursively, so with the way we hook into it, our handler can be called twice for a single request. Due to how the http interaction list works, it's important that it's only queried per request, so we have to cache the result in the request itself so we can re-use it on the recursive invocation of the handler.

Member

Jacobkg commented Jun 27, 2012

I added this test to fakeweb_spec and it failed with the expected error:

it "Make request twice against cassette using the same http request object" do
      uri = URI.parse("http://localhost:#{VCR::SinatraApp.port}/foo")
      http = Net::HTTP.new(uri.host, uri.port)
      VCR.use_cassette("new_cassette", :record => :once) do
        request = Net::HTTP::Get.new(uri.request_uri)
        response = http.request(request)
      end

      VCR.use_cassette("new_cassette", :record => :once) do
        request = Net::HTTP::Get.new(uri.request_uri)
        response = http.request(request)
        response = http.request(request)
      end
    end
Member

Jacobkg commented Jun 28, 2012

I guess I don't really understand why we want to avoid querying the httpinteractionlist for a second time when we recurse. As far as I can tell, the recursing only happens when we are recording, which means that the httpinteractionlist will just be returning that there is no matching request. I couldn't find any reason why doing that multiple times would be a problem. Is it for performance reasons or is there something else?

Owner

myronmarston commented Jun 28, 2012

I guess I don't really understand why we want to avoid querying the httpinteractionlist for a second time when we recurse.

HTTPInteractionList#response_for has a side effect (it "consumes" the HTTP interaction so that it is no longer available to be used again)...which means that calling it twice for a single HTTP request will cause bugs. Beyond that, I think it makes things easier to reason about (conceptually, each request should only ask for a response once!), and probably improves performance, since asking the HTTPInteractionList for a match is an O(N) operation--obviously, we don't want to ask it over and over again for the same information for a single HTTP request. In practice, I imagine querying it a second time will not show a noticeable perf impact. Also, with the debug logging VCR supports, I think it would be confusing for end-users to read the log and see that one HTTP request is asking for a response multiple times.

As far as I can tell, the recursing only happens when we are recording, which means that the httpinteractionlist will just be returning that there is no matching request.

You may be right; this may not be an issue in practice (particularly with code changes that have happened since I first wrote that logic). I need to dig into it more to think through the implications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment