Add a failing test case around cursors and secondary reads. #134

Merged
merged 1 commit into from

2 participants

@nelhage

If we do a secondary read that is large enough to require sending a
GETMORE, and then do another query before the GETMORE, the secondary
connection gets unpinned, and the GETMORE gets sent to the wrong
server, resulting in CURSOR_NOT_FOUND, even though the cursor still
exists on the server that was initially queried.

(I don't know if there is a way to mark a test that is expected to fail for now; I expect this shouldn't be merged as-is, but a pull request seemed like the right way to submit this, since I'd like to see some sort of test merged once the bug is fixed)
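
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyServer`, `ToyClient`, `interleaved_read`, and the `pin_cursor` flag are hypothetical names invented for illustration and are not part of the driver. It assumes the client either remembers which server opened a cursor, or simply sends the GETMORE to whichever server was used most recently.

```ruby
# Toy model only: none of these classes exist in the Ruby driver.
class ToyServer
  def initialize(name, docs)
    @name = name
    @docs = docs
    @cursors = {}   # cursor id => offset of the next unread document
    @opened = 0
  end

  # Open a cursor; return [cursor_id, first_batch].
  def query(batch_size)
    @opened += 1
    id = "#{@name}:#{@opened}"   # ids are unique per server in this model
    @cursors[id] = batch_size
    [id, @docs.first(batch_size)]
  end

  # Return the next batch, or raise if this server never opened the cursor.
  def get_more(cursor_id, batch_size)
    offset = @cursors.fetch(cursor_id) { raise "CURSOR_NOT_FOUND on #{@name}" }
    @cursors[cursor_id] = offset + batch_size
    @docs[offset, batch_size] || []
  end
end

class ToyClient
  def initialize(primary, secondary, pin_cursor)
    @primary = primary
    @secondary = secondary
    @pin_cursor = pin_cursor   # hypothetical switch: remember the cursor's server?
    @last_server = nil
  end

  def open(read, batch_size)
    server = (read == :secondary) ? @secondary : @primary
    @last_server = server
    id, batch = server.query(batch_size)
    { :id => id, :server => server, :batch => batch }
  end

  # Without pinning, the GETMORE follows the most recently used server.
  def get_more(cursor, batch_size)
    target = @pin_cursor ? cursor[:server] : @last_server
    target.get_more(cursor[:id], batch_size)
  end
end

# Secondary read of 200 documents in batches of 100, with a primary
# query issued between the initial query and the GETMORE.
def interleaved_read(pin_cursor)
  docs = (0...200).to_a
  client = ToyClient.new(ToyServer.new("primary", docs),
                         ToyServer.new("secondary", docs),
                         pin_cursor)
  cursor = client.open(:secondary, 100)
  seen = cursor[:batch].dup
  client.open(:primary, 1)
  seen + client.get_more(cursor, 100)
end
```

In this model, `interleaved_read(true)` reads all 200 documents, while `interleaved_read(false)` raises because the GETMORE reaches a server that never opened the cursor.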

@nelhage nelhage Add a failing test case around cursors and secondary reads.
d5e8222
@brandonblack brandonblack merged commit 04da941 into from
@brandonblack

@nelhage Thanks.

I went ahead and merged this because it looks like it makes sense and the test isn't actually failing for me. :-/ That said, let's track the issue you're having and see if we can get to the bottom of it.

We track all of our issues here:
https://jira.mongodb.org/browse/RUBY

Can you create a ticket? That's the best way to keep it on our radar. You can just reference this pull request in the ticket created.

@nelhage

I've opened https://jira.mongodb.org/browse/RUBY-505

Interesting that this isn't failing for you. It is failing consistently for me, like so:

[nelhage@anarchique:~/stripe/mongo-ruby-driver]$ rake test:replica_set TESTS=test/replica_set/query_test.rb  TESTOPTS='-ntest_secondary_getmore'
Loaded suite /home/nelhage/.rbenv/versions/1.8.7-p370/lib/ruby/gems/1.8/gems/rake-10.0.1/lib/rake/rake_test_loader
Started
all output going to: data/mongods-3003/mongods.log
all output going to: data/mongods-3004/mongods.log
all output going to: data/mongods-3000/mongods.log
all output going to: data/mongods-3001/mongods.log
all output going to: data/mongods-3002/mongods.log
E
===============================================================================
Error: test_secondary_getmore(ReplicaSetQueryTest)
Mongo::OperationFailure: Query response returned CURSOR_NOT_FOUND. Either an invalid cursor was specified, or the cursor may have timed out on the server.
./lib/mongo/networking.rb:192:in `check_response_flags'
     189:
     190:     def check_response_flags(flags)
     191:       if flags & Mongo::Constants::REPLY_CURSOR_NOT_FOUND != 0
  => 192:         raise Mongo::OperationFailure, "Query response returned CURSOR_NOT_FOUND. " +
     193:           "Either an invalid cursor was specified, or the cursor may have timed out on the server."
     194:       elsif flags & Mongo::Constants::REPLY_QUERY_FAILURE != 0
     195:         # Getting odd failures when a exception is raised here.
./lib/mongo/networking.rb:185:in `receive_response_header'
./lib/mongo/networking.rb:152:in `receive'
./lib/mongo/networking.rb:118:in `receive_message'
./lib/mongo/cursor.rb:532:in `send_get_more'
./lib/mongo/cursor.rb:466:in `refresh'
./lib/mongo/cursor.rb:128:in `next'
./lib/mongo/cursor.rb:289:in `each'
/home/nelhage/stripe/mongo-ruby-driver/test/replica_set/query_test.rb:60:in `test_secondary_getmore'
./lib/mongo/collection.rb:267:in `find'
/home/nelhage/stripe/mongo-ruby-driver/test/replica_set/query_test.rb:59:in `test_secondary_getmore'
/home/nelhage/.rbenv/versions/1.8.7-p370/lib/ruby/gems/1.8/gems/mocha-0.12.7/lib/mocha/integration/test_unit/gem_version_230_to_252.rb:25:in `run'
===============================================================================


Finished in 87.151103 seconds.

1 tests, 0 assertions, 0 failures, 1 errors, 0 pendings, 0 omissions, 0 notifications
0% passed

Out of curiosity, can you get it to fail if you increase the "200" to something like 1000? I am not positive, but it may be non-deterministic, in that it may check out a random secondary connection, which means it will work half the time.

@brandonblack

@nelhage Awesome. Thanks for opening that ticket.

So I just ran it again, and I was able to see the failure. I was using 1.9.3 when I initially merged this. It still passes under 1.9.3 with no issues, but it looks like it fails quite consistently on 1.8.x.

We have a release coming out in the next few days (a big one) and I don't think I'll be able to dig into this beforehand, but we'll get to the bottom of it shortly.

@brandonblack brandonblack referenced this pull request from a commit
@brandonblack brandonblack disabling failing test under 1.8.x for #134 for now. will circle back…
… on this post 1.8 release
c339df3
Commits on Nov 18, 2012
  1. @nelhage

    Add a failing test case around cursors and secondary reads.

    nelhage authored
Showing with 21 additions and 0 deletions.
  1. +21 −0 test/replica_set/query_test.rb
21 test/replica_set/query_test.rb
@@ -44,6 +44,27 @@ def test_query
     end
   end
 
+  # Create a large collection and do a secondary query that returns
+  # enough records to require sending a GETMORE. In between opening
+  # the cursor and sending the GETMORE, do a :primary query. Confirm
+  # that the cursor reading from the secondary continues to talk to
+  # the secondary, rather than trying to read the cursor from the
+  # primary, where it does not exist.
+  def test_secondary_getmore
+    200.times do |i|
+      @coll.save({:a => i}, :safe => {:w => 3})
+    end
+    as = []
+    # Set an explicit batch size, in case the default ever changes.
+    @coll.find({}, { :batch_size => 100, :read => :secondary }) do |c|
+      c.each do |result|
+        as << result['a']
+        @coll.find({:a => result['a']}, :read => :primary).map
+      end
+    end
+    assert_equal(as.sort, 0.upto(199).to_a)
+  end
+
   def benchmark_queries
     t1 = Time.now
     10000.times { @coll.find_one }