This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Merge Replica Set Refactor
* Removes Server, and Socket; replaced with Node, and Connection.

  Replica sets are now much more robustly supported, including failover
  and discovery.

* Refactors specs.

  Internal APIs are now tested with integration specs through the public
  APIs.

* More documentation.
bernerdschaefer committed Apr 17, 2012
1 parent 9084773 commit dd5a7c1
Showing 42 changed files with 2,057 additions and 3,322 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,5 +2,6 @@ Gemfile.lock
doc
.yardoc
.rvmrc
.env
perf/results
tmp/
160 changes: 160 additions & 0 deletions README.md
@@ -15,6 +15,21 @@ session[:artists].find(name: "Syd Vicious").
)
```

## Features

* Automated replica set node discovery and failover.
* No C or Java extensions.
* No external dependencies.
* Simple, stable, public API.

### Unsupported Features

* GridFS
* Map/Reduce

These features are possible to implement, but outside the scope of Moped's
goals. Consider them perfect opportunities to write a companion gem!

# Project Breakdown

Moped is composed of three parts: an implementation of the [BSON
@@ -43,6 +58,31 @@ id.generation_time # => 2012-04-11 13:14:29 UTC
id == Moped::BSON::ObjectId.from_string(id.to_s) # => true
```

<table><tbody>

<tr><th>new</th>
<td>Creates a new object id.</td></tr>

<tr><th>from_string</th>
<td>Creates a new object id from an object id string.
<br>
<code>Moped::BSON::ObjectId.from_string("4f8d8c66e5a4e45396000009")</code>
</td></tr>

<tr><th>from_time</th>
<td>Creates a new object id from a time.
<br>
<code>Moped::BSON::ObjectId.from_time(Time.new)</code>
</td></tr>

<tr><th>legal?</th>
<td>Validates an object id string.
<br>
<code>Moped::BSON::ObjectId.legal?("4f8d8c66e5a4e45396000009")</code>
</td></tr>

</tbody></table>
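
To illustrate what these strings encode: the first four bytes of an object id are a big-endian timestamp, which is what `generation_time` reads. The following is a minimal sketch in plain Ruby that does not use Moped itself; `decode_object_id_time` is a hypothetical helper name introduced only for this example.

```ruby
# Hypothetical helper, not part of Moped: extracts the timestamp encoded in
# the first 4 bytes of a 24-character object id string.
def decode_object_id_time(hex)
  raise ArgumentError, "not a legal object id: #{hex.inspect}" unless hex =~ /\A\h{24}\z/

  bytes   = [hex].pack("H*")       # 24 hex characters -> 12 raw bytes
  seconds = bytes.unpack("N")[0]   # first 4 bytes, network (big-endian) order
  Time.at(seconds).utc
end

decode_object_id_time("4f8d8c66e5a4e45396000009").to_i # => 1334676582
```

With Moped loaded, `Moped::BSON::ObjectId.from_string(...).generation_time` should give the same time.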

### Moped::BSON::Code

The `Code` class is used for working with JavaScript on the server.
@@ -299,6 +339,126 @@ scope.one # nil

</tbody></table>

# Exceptions

Here's a list of the exceptions generated by Moped.

<table><tbody>

<tr><th>Moped::Errors::ConnectionFailure</th>
<td>Raised when a node cannot be reached or a connection is lost.
<br>
<strong>Note:</strong> this exception is only raised if Moped could not
reconnect, so you shouldn't attempt to rescue this.</td></tr>

<tr><th>Moped::Errors::OperationFailure</th>
<td>Raised when a command fails or is invalid, such as when an insert fails in
safe mode.</td></tr>

<tr><th>Moped::Errors::QueryFailure</th>
<td>Raised when an invalid query was sent to the database.</td></tr>

<tr><th>Moped::Errors::AuthenticationFailure</th>
<td>Raised when invalid credentials were passed to <code>session.login</code>.</td></tr>

<tr><th>Moped::Errors::SocketError</th>
<td>Not a real exception, but a module used to tag unhandled exceptions raised
inside a node's networking code. This lets you
<code>rescue Moped::Errors::SocketError</code> while preserving the real
exception.</td></tr>

</tbody></table>

Other exceptions are possible while running commands, such as IO Errors around
failed connections. Moped tries to be smart about managing its connections,
such as checking if they're dead before executing a command; but those checks
aren't foolproof, and Moped is conservative about handling unexpected errors on
its connections. Namely, Moped will *not* retry a command if an unexpected
exception is raised. Why? Because it's impossible to know whether the command
was actually received by the remote Mongo instance, and without domain
knowledge it cannot be safely retried.

Take for example this case:

```ruby
session.with(safe: true)["users"].insert(name: "John")
```

It's entirely possible that the insert command will be sent to Mongo, but the
connection gets closed before we read the result for `getLastError`. In this
case, there's no way to know whether the insert was actually successful!

If, however, you want to gracefully handle this in your own application, you
could do something like:

```ruby
document = { _id: Moped::BSON::ObjectId.new, name: "John" }

begin
session["users"].insert(document)
rescue Moped::Errors::SocketError
session["users"].find(_id: document[:_id]).upsert(document)
end
```

# Replica Sets

Moped has full support for replica sets including automatic failover and node
discovery.

## Automatic Failover

Moped will automatically retry lost connections and attempt to detect dead
connections before sending an operation. Note that it will *not* retry
individual operations! For example, these cases will work and not raise any
exceptions:

```ruby
session[:users].insert(name: "John")
# kill primary node and promote secondary
session[:users].insert(name: "John")
session[:users].find.count # => 2.0

# primary node drops our connection
session[:users].insert(name: "John")
```

However, you'll get an operation error in a case like:

```ruby
# primary node goes down while reading the reply
session.with(safe: true)[:users].insert(name: "John")
```

And you'll get a connection error in a case like:

```ruby
# primary node goes down, no new primary available yet
session[:users].insert(name: "John")
```

If your session is running with eventual consistency, read operations will
never raise connection errors as long as any secondary or primary node is
running. The only case where you'll see a connection failure is if a node goes
down while attempting to retrieve more results from a cursor, because cursors
are tied to individual nodes.

When two attempts to connect to a node fail, it will be marked as down. This
removes it from the list of available nodes for `:down_interval` (default 30
seconds). Note that the `:down_interval` only applies to normal operations;
that is, if you ask for a primary node and none is available, all nodes will be
retried. Likewise, if you ask for a secondary node, and no secondary or primary
node is available, all nodes will be retried.
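
The bookkeeping described above can be sketched roughly as follows. This is an illustration in plain Ruby, not Moped's actual `Node` or `Cluster` code; the `DownTracker` class and its method names are hypothetical, and the two-failed-attempts rule is left out for brevity.

```ruby
# Hypothetical illustration only; not Moped's implementation.
class DownTracker
  def initialize(down_interval = 30)
    @down_interval = down_interval
    @down_at = {} # address => time the node was marked down
  end

  def mark_down(address, now = Time.now)
    @down_at[address] = now
  end

  # Nodes considered for a normal operation: those never marked down, or
  # marked down at least down_interval seconds ago.
  def available(addresses, now = Time.now)
    addresses.select do |address|
      down_at = @down_at[address]
      down_at.nil? || now - down_at >= @down_interval
    end
  end

  # When no node is available, fall back to retrying every node.
  def candidates(addresses, now = Time.now)
    usable = available(addresses, now)
    usable.empty? ? addresses : usable
  end
end
```

The fallback in `candidates` mirrors the rule that the down interval only applies to normal operations: once nothing usable is left, every node is tried again.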

## Node Discovery

The addresses you pass into your session are used as seeds for setting up
replica set connections. After connection, each seed node will return a list of
other known nodes which will be added to the set.

This information is cached according to the `:refresh_interval` option (default:
5 minutes). That means, e.g., that if you add a new node to your replica set,
it should be represented in Moped within 5 minutes.
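
The caching behaviour can be pictured with a small sketch, written in plain Ruby under stated assumptions. `PeerCache` and its block-based discovery are hypothetical and do not reflect how Moped stores peers internally; only the idea of refreshing after `:refresh_interval` comes from the text above.

```ruby
# Hypothetical illustration only; not Moped's implementation.
class PeerCache
  # refresh_interval is in seconds (300 = 5 minutes). The block stands in
  # for asking the seed nodes which other nodes they know about.
  def initialize(refresh_interval = 300, &discover)
    @refresh_interval = refresh_interval
    @discover = discover
    @peers = nil
    @refreshed_at = nil
  end

  # Returns the cached peer list, re-running discovery only when the cache
  # is empty or older than the refresh interval.
  def peers(now = Time.now)
    if @refreshed_at.nil? || now - @refreshed_at >= @refresh_interval
      @peers = @discover.call
      @refreshed_at = now
    end
    @peers
  end
end
```

A node added to the replica set shortly after a refresh is therefore only picked up on the next refresh, which is why it can take up to the full interval to appear.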

# Thread-Safety

Moped is thread-safe -- depending on your definition of thread-safe. For Moped,
6 changes: 4 additions & 2 deletions lib/moped.rb
@@ -6,14 +6,16 @@
require "moped/bson"
require "moped/cluster"
require "moped/collection"
require "moped/connection"
require "moped/cursor"
require "moped/database"
require "moped/errors"
require "moped/indexes"
require "moped/logging"
require "moped/node"
require "moped/protocol"
require "moped/query"
require "moped/server"
require "moped/session"
require "moped/socket"
require "moped/session/context"
require "moped/threaded"
require "moped/version"
82 changes: 31 additions & 51 deletions lib/moped/bson/object_id.rb
@@ -8,31 +8,33 @@ class ObjectId
# Formatting string for outputting an ObjectId.
@@string_format = ("%02x" * 12).freeze

attr_reader :data

class << self
def from_string(string)
raise Errors::InvalidObjectId.new(string) unless legal?(string)
data = []
data = ""
12.times { |i| data << string[i*2, 2].to_i(16) }
new data
from_data data
end

def from_time(time)
from_data @@generator.generate(time.to_i)
end

def legal?(str)
!!str.match(/^[0-9a-f]{24}$/i)
!!str.match(/\A\h{24}\Z/i)
end
end

def initialize(data = nil, time = nil)
if data
@data = data
elsif time
@data = @@generator.generate(time.to_i)
else
@data = @@generator.next
def from_data(data)
id = allocate
id.instance_variable_set :@data, data
id
end
end

def data
@data ||= @@generator.next
end

def ==(other)
BSON::ObjectId === other && data == other.data
end
@@ -43,78 +45,56 @@ def hash
end

def to_s
@@string_format % data
@@string_format % data.unpack("C12")
end

# Return the UTC time at which this ObjectId was generated. This may
# be used instead of a created_at timestamp since this information
# is always encoded in the object id.
def generation_time
Time.at(@data.pack("C4").unpack("N")[0]).utc
Time.at(data.unpack("N")[0]).utc
end

class << self
def __bson_load__(io)
new io.read(12).unpack('C*')
from_data(io.read(12))
end

end

def __bson_dump__(io, key)
io << Types::OBJECT_ID
io << key
io << NULL_BYTE
io << data.pack('C12')
io << data
end

# @api private
class Generator
def initialize
# Generate and cache 3 bytes of identifying information from the current
# machine.
@machine_id = Digest::MD5.digest(Socket.gethostname).unpack("C3")
@machine_id = Digest::MD5.digest(Socket.gethostname).unpack("N")[0]

@mutex = Mutex.new
@last_timestamp = nil
@counter = 0
end

# Return object id data based on the current time, incrementing a
# counter for object ids generated in the same second.
# Return object id data based on the current time, incrementing the
# object id counter.
def next
now = Time.new.to_i

counter = @mutex.synchronize do
last_timestamp, @last_timestamp = @last_timestamp, now

if last_timestamp == now
@counter += 1
else
@counter = 0
end
@mutex.lock
begin
counter = @counter = (@counter + 1) % 0xFFFFFF
ensure
@mutex.unlock rescue nil
end

generate(now, counter)
generate(Time.new.to_i, counter)
end

# Generate object id data for a given time using the provided +inc+.
def generate(time, inc = 0)
pid = Process.pid % 0xFFFF

[
time >> 24 & 0xFF, # 4 bytes time (network order)
time >> 16 & 0xFF,
time >> 8 & 0xFF,
time & 0xFF,
@machine_id[0], # 3 bytes machine
@machine_id[1],
@machine_id[2],
pid >> 8 & 0xFF, # 2 bytes process id
pid & 0xFF,
inc >> 16 & 0xFF, # 3 bytes increment
inc >> 8 & 0xFF,
inc & 0xFF,
]
# Generate object id data for a given time using the provided +counter+.
def generate(time, counter = 0)
[time, @machine_id, Process.pid, counter << 8].pack("N NX lXX NX")
end
end
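
The pack format in the new `generate` relies on the `X` directive, which backs up one byte, to trim each 4-byte field to the width the object id layout needs: 4 bytes of time, 3 bytes of machine id, 2 bytes of process id and 3 bytes of counter. A standalone sketch with made-up values follows; `pack_object_id` is a hypothetical name used only here. Note that `l` is native-endian, so the two process id bytes depend on the platform.

```ruby
# Hypothetical standalone version of the packing step, for illustration.
def pack_object_id(time, machine_id, pid, counter)
  # N   -> 4 bytes, big-endian time
  # NX  -> 4 bytes big-endian, then drop the last byte (3 bytes of machine id)
  # lXX -> 4 bytes native-endian, then drop the last two (2 bytes of pid)
  # NX  -> counter shifted left 8 bits, last byte dropped (3 bytes of counter)
  [time, machine_id, pid, counter << 8].pack("N NX lXX NX")
end

data = pack_object_id(1334676582, 0xAABBCCDD, 0x1234, 9)
data.bytesize            # => 12
data.unpack("N")[0]      # => 1334676582
data[4, 3].unpack("C3")  # => [170, 187, 204]
data[9, 3].unpack("C3")  # => [0, 0, 9]
```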
