Support for GridFS append mode #154

Closed
wants to merge 13 commits into from
4 changes: 0 additions & 4 deletions Gemfile
@@ -17,10 +17,6 @@ group :testing do
gem 'test-unit'
gem 'mocha', ">=0.13.0", :require => 'mocha/setup'
gem 'shoulda', ">=3.3.2"

gem 'growl'
gem 'guard-rspec'
gem 'rb-fsevent'
gem 'rspec'

gem 'sfl'
4 changes: 0 additions & 4 deletions Guardfile

This file was deleted.

29 changes: 9 additions & 20 deletions README.md
@@ -1,11 +1,16 @@
[![Build Status][travis-img]][travis-url] [![Jenkins Status][jenkins-img]][jenkins-url] [![Code Climate][codeclimate-img]][codeclimate-url]
[![Build Status][travis-img]][travis-url]
[![Jenkins Status][jenkins-img]][jenkins-url]
[![Code Climate][codeclimate-img]][codeclimate-url]
[![Latest Version][version-img]][version-url]

[travis-img]: https://secure.travis-ci.org/mongodb/mongo-ruby-driver.png
[travis-url]: http://travis-ci.org/mongodb/mongo-ruby-driver
[codeclimate-img]: https://codeclimate.com/badge.png
[codeclimate-url]: https://codeclimate.com/github/mongodb/mongo-ruby-driver
[jenkins-img]: https://jenkins.10gen.com/job/mongo-ruby-driver/badge/icon
[jenkins-url]: https://jenkins.10gen.com/job/mongo-ruby-driver/
[version-img]: https://badge.fury.io/rb/mongo.png
[version-url]: http://badge.fury.io/rb/mongo
[api-url]: http://api.mongodb.org/ruby/current

# Documentation
@@ -97,30 +102,14 @@ you can install it as a gem from the source by typing:

For extensive examples, see the [MongoDB Ruby Tutorial](https://github.com/mongodb/mongo-ruby-driver/wiki/Tutorial).

Bundled with the driver are many examples, located in the "docs/examples" subdirectory. Samples include using
the driver and using the GridFS class GridStore. MongoDB must be running for
these examples to work, of course.

Here's how to start MongoDB and run the "simple.rb" example:

$ cd path/to/mongo
$ ./mongod run
... then in another window ...
$ cd path/to/mongo-ruby-driver
$ ruby docs/examples/simple.rb

See also the test code, especially test/test_db_api.rb.

# GridFS

The Ruby driver includes two abstractions for storing large files: Grid and GridFileSystem.

The Grid class is a Ruby implementation of MongoDB's GridFS file storage
specification. GridFileSystem is essentially the same, but provides a more filesystem-like API
and assumes that filenames are unique.
specification. GridFileSystem is essentially the same, but provides a more filesystem-like API and assumes that filenames are unique.

An instance of both classes represents an individual file store. See the API reference
for details, and see examples/gridfs.rb for code that uses many of the Grid
features (metadata, content type, seek, tell, etc).
An instance of both classes represents an individual file store. See the API reference for details.

Examples:

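The original examples are truncated in this view; the following is a minimal sketch of both abstractions, assuming a `mongod` on the default `localhost:27017` and the 1.x driver API used elsewhere in this diff:

```ruby
require 'mongo'

db = Mongo::MongoClient.new['test']   # defaults to localhost:27017

# Grid: a low-level file store keyed by the ObjectId returned from #put.
grid = Mongo::Grid.new(db)
id   = grid.put("hello gridfs", :filename => 'hello.txt')
grid.get(id).read                     # => "hello gridfs"

# GridFileSystem: a filesystem-like API that assumes filenames are unique.
fs = Mongo::GridFileSystem.new(db)
fs.open('greeting.txt', 'w') { |f| f.write("hello gridfs") }
fs.open('greeting.txt', 'r') { |f| f.read }   # => "hello gridfs"
```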
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.8.1
1.8.2
2 changes: 1 addition & 1 deletion ext/cbson/version.h
@@ -14,4 +14,4 @@
* limitations under the License.
*/

#define VERSION "1.8.1"
#define VERSION "1.8.2"
Empty file modified lib/mongo.rb 100755 → 100644
Empty file.
8 changes: 7 additions & 1 deletion lib/mongo/cursor.rb
@@ -560,7 +560,8 @@ def construct_query_message
BSON::BSON_RUBY.serialize_cstr(message, "#{@db.name}.#{@collection.name}")
message.put_int(@skip)
message.put_int(@limit)
message.put_binary(BSON::BSON_CODER.serialize(construct_query_spec, false).to_s)
spec = query_contains_special_fields? ? construct_query_spec : @selector
message.put_binary(BSON::BSON_CODER.serialize(spec, false).to_s)
message.put_binary(BSON::BSON_CODER.serialize(@fields, false).to_s) if @fields
message
end
@@ -593,6 +594,11 @@ def construct_query_spec
spec
end

def query_contains_special_fields?
@order || @explain || @hint || @snapshot || @show_disk_loc ||
@max_scan || @return_key || @comment || @connection.mongos?
end

def close_cursor_if_query_complete
if @limit > 0 && @returned >= @limit
close
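In short, the cursor change above sends the bare selector unless a special field (sort, hint, explain, snapshot, etc.) or a mongos connection requires the wrapped form. A hedged illustration with plain hashes (not driver internals; the values are made up):

```ruby
selector = { 'age' => { '$gt' => 21 } }

# No special fields set: the selector itself is serialized into the query
# message (the new fast path).
plain_spec = selector

# With, say, a sort present, construct_query_spec still builds the wrapped form.
wrapped_spec = { '$query' => selector, '$orderby' => { 'age' => 1 } }
```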
61 changes: 56 additions & 5 deletions lib/mongo/gridfs/grid_io.rb
@@ -71,8 +71,9 @@ def initialize(files, chunks, filename, mode, opts={})
case @mode
when 'r' then init_read
when 'w' then init_write(opts)
when 'a' then init_append(opts)
else
raise GridError, "Invalid file mode #{@mode}. Mode should be 'r' or 'w'."
raise GridError, "Invalid file mode #{@mode}. Mode should be 'r', 'w' or 'a'."
end
end

@@ -116,7 +117,8 @@ def read(length=nil)
# @return [Integer]
# the number of bytes written.
def write(io)
raise GridError, "file not opened for write" unless @mode[0] == ?w
raise GridError, "file not opened for write" unless @mode[0] == ?w or @mode[0] == ?a

if io.is_a? String
if Mongo::WriteConcern.gle?(@write_concern)
@local_md5.update(io)
@@ -234,13 +236,27 @@ def getc
# on GridIO#open is passed a block. Otherwise, it must be called manually.
#
# @return [BSON::ObjectId]
#def close
# if @current_chunk['n'].zero? && @chunk_position.zero?
# warn "Warning: Storing a file with zero length."
# end
# @upload_date = Time.now.utc
# object = to_mongo_object
# id = @files.update({_id: object['_id']}, object, {upsert: true})
# id
#end
def close
if @mode[0] == ?w
if @mode[0] == ?w or @mode[0] == ?a
if @current_chunk['n'].zero? && @chunk_position.zero?
warn "Warning: Storing a file with zero length."
end
@upload_date = Time.now.utc
id = @files.insert(to_mongo_object)
if @mode[0] == ?w
id = @files.insert(to_mongo_object)
elsif @mode[0] == ?a
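# The files document may already exist from an earlier append (or from
# init_write), so upsert it rather than inserting a duplicate entry.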
object = to_mongo_object
id = @files.update({_id: object['_id']}, object, {upsert: true})
end
end
id
end
@@ -285,7 +301,7 @@ def save_chunk(chunk)

def get_chunk(n)
chunk = @chunks.find({'files_id' => @files_id, 'n' => n}).next_document
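# In append mode, the position within a freshly fetched chunk starts at the end
# of its existing data; otherwise it resets to zero (revised assignment below).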
@chunk_position = 0
@chunk_position = (@mode == ?a ? chunk['data'].size : 0) unless chunk.nil?
chunk
end

@@ -438,7 +454,41 @@ def init_write(opts)
@current_chunk = create_chunk(0)
@file_position = 0
end

# Initialize the class for appending to a file.
def init_append(opts)
doc = @files.find(@query, @query_opts).next_document
return init_write(opts) unless doc

opts = doc.dup

@files_id = opts.delete('_id')
@content_type = opts.delete('contentType')
@chunk_size = opts.delete('chunkSize')
@upload_date = opts.delete('uploadDate')
@aliases = opts.delete('aliases')
@file_length = opts.delete('length')
@metadata = opts.delete('metadata')
@md5 = opts.delete('md5')
@filename = opts.delete('filename')
@custom_attrs = opts

# recalculate the md5 of the previous chunks
if Mongo::WriteConcern.gle?(@write_concern)
@current_chunk = get_chunk(0)
@file_position = 0
@local_md5 = Digest::MD5.new
@local_md5.update(read_all)
end

# position at the end of the file (last chunk)
last_chunk = @file_length / @chunk_size
@current_chunk = get_chunk(last_chunk)
chunk = get_chunk(last_chunk-1) if @current_chunk.nil?
@current_chunk ||= create_chunk(last_chunk)
@file_position = @chunk_size * last_chunk + @current_chunk['data'].size
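# Worked example with hypothetical numbers: for @file_length = 600 and @chunk_size = 256,
# last_chunk = 600 / 256 = 2, chunk 2 currently holds 600 - 512 = 88 bytes, and new
# writes resume at @file_position = 256 * 2 + 88 = 600, i.e. the current end of the file.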
end

def check_existing_file
if @files.find_one('_id' => @files_id)
raise GridError, "Attempting to overwrite with Grid#put. You must delete the file first."
@@ -466,6 +516,7 @@ def get_md5
md5_command['filemd5'] = @files_id
md5_command['root'] = @fs_name
@server_md5 = @files.db.command(md5_command)['md5']

if Mongo::WriteConcern.gle?(@write_concern)
@client_md5 = @local_md5.hexdigest
if @local_md5 == @server_md5
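Putting the grid_io.rb changes together, append mode is used as in the tests added below. A usage sketch (not documentation from this PR): `files` and `chunks` stand for the `fs.files` and `fs.chunks` collections, and GridIO is constructed directly, as the new tests do:

```ruby
require 'mongo'

db     = Mongo::MongoClient.new['test']   # assumes a local mongod
files  = db['fs.files']
chunks = db['fs.chunks']

# The first open in 'a' mode creates the file; later opens resume at its end.
log = Mongo::GridIO.new(files, chunks, 'app.log', 'a')
log.write("first line\n")
log.close

log = Mongo::GridIO.new(files, chunks, 'app.log', 'a')
log.write("second line\n")
log.close

Mongo::GridIO.new(files, chunks, 'app.log', 'r').read
# => "first line\nsecond line\n"
```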
12 changes: 9 additions & 3 deletions lib/mongo/mongo_client.rb
@@ -181,7 +181,7 @@ def self.multi(nodes, opts={})

# Initialize a connection to MongoDB using the MongoDB URI spec.
#
# Since MongoClient.new cannot be used with any <code>ENV["MONGODB_URI"]</code> that has multiple hosts (implying a replicaset),
# you may use this when the type of your connection varies by environment and should be determined solely from <code>ENV["MONGODB_URI"]</code>.
#
# @param uri [String]
@@ -479,7 +479,10 @@ def connect
raise ConnectionFailure, "Failed to connect to a master node at #{host_port.join(":")}"
end
end
alias :reconnect :connect

# Defined as a method rather than an alias so that #reconnect picks up the
# overridden #connect in the replica set and sharded clients.
def reconnect; connect end

# It's possible that we defined connected as all nodes being connected???
# NOTE: Do check if this needs to be more stringent.
@@ -511,7 +514,10 @@ def active?
def read_primary?
@read_primary
end
alias :primary? :read_primary?

# Defined as a method rather than an alias so that #primary? picks up the
# overridden #read_primary? in the replica set and sharded clients.
def primary?; read_primary? end

# The socket pool that this connection reads from.
#
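For context on the two comments above: a Ruby `alias` binds to the method body that exists when the alias is defined, so an alias created in MongoClient keeps calling MongoClient's #connect and #read_primary? even after MongoReplicaSetClient or MongoShardedClient overrides them. A standalone sketch (toy classes, not driver code):

```ruby
class Base
  def connect
    "Base#connect"
  end

  alias :reconnect_via_alias :connect      # bound to Base#connect at this point
  def reconnect_via_method; connect; end   # resolved dynamically on each call
end

class Sub < Base
  def connect
    "Sub#connect"
  end
end

Sub.new.reconnect_via_alias   # => "Base#connect"  (override ignored)
Sub.new.reconnect_via_method  # => "Sub#connect"   (override respected)
```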
1 change: 0 additions & 1 deletion lib/mongo/mongo_replica_set_client.rb 100755 → 100644
@@ -286,7 +286,6 @@ def nodes
def read_primary?
@manager.read_pool == @manager.primary_pool
end
alias :primary? :read_primary?

# Close the connection to the database.
def close(opts={})
Empty file modified lib/mongo/mongo_sharded_client.rb 100755 → 100644
Empty file.
22 changes: 10 additions & 12 deletions lib/mongo/util/pool.rb 100755 → 100644
@@ -1,5 +1,3 @@
# encoding: UTF-8

# --
# Copyright (C) 2008-2012 10gen Inc.
#
@@ -267,22 +265,16 @@ def checkout
@client.connect if !@client.connected?
start_time = Time.now
loop do
if (Time.now - start_time) > @timeout
raise ConnectionTimeoutError, "could not obtain connection within " +
"#{@timeout} seconds. The max pool size is currently #{@size}; " +
"consider increasing the pool size or timeout."
end

@connection_mutex.synchronize do
if socket_for_thread = thread_local[:sockets][self.object_id]
if !@checked_out.include?(socket_for_thread)
socket = checkout_existing_socket(socket_for_thread)
end
else # First checkout for this thread
if @checked_out.size < @sockets.size
socket = checkout_existing_socket
elsif @sockets.size < @size
else
if @sockets.size < @size
socket = checkout_new_socket
elsif @checked_out.size < @sockets.size
socket = checkout_existing_socket
end
end

@@ -307,6 +299,12 @@ def checkout
@queue.wait(@connection_mutex)
end
end

if (Time.now - start_time) > @timeout
raise ConnectionTimeoutError, "could not obtain connection within " +
"#{@timeout} seconds. The max pool size is currently #{@size}; " +
"consider increasing the pool size or timeout."
end
end
end

Empty file modified lib/mongo/util/pool_manager.rb 100755 → 100644
Empty file.
Empty file modified lib/mongo/util/ssl_socket.rb 100755 → 100644
Empty file.
Empty file modified lib/mongo/util/tcp_socket.rb 100755 → 100644
Empty file.
Empty file modified lib/mongo/util/thread_local_variable_manager.rb 100755 → 100644
Empty file.
2 changes: 1 addition & 1 deletion mongo.gemspec
@@ -12,7 +12,7 @@ Gem::Specification.new do |s|

s.files = ['mongo.gemspec', 'LICENSE', 'VERSION']
s.files += ['README.md', 'Rakefile', 'bin/mongo_console']
s.files += ['lib/mongo.rb'] + Dir['lib/mongo/**/*.rb'] + Dir['examples/**/*.rb']
s.files += ['lib/mongo.rb'] + Dir['lib/mongo/**/*.rb']

s.test_files = Dir['test/**/*.rb']
s.executables = ['mongo_console']
1 change: 0 additions & 1 deletion tasks/testing.rake
@@ -59,7 +59,6 @@ namespace :test do

Rake::TestTask.new(:functional) do |t|
t.test_files = FileList['test/functional/*_test.rb'] - [
"test/functional/pool_test.rb",
"test/functional/grid_io_test.rb",
"test/functional/grid_test.rb"
]
37 changes: 37 additions & 0 deletions test/functional/grid_io_test.rb
@@ -151,6 +151,43 @@ class GridIOTest < Test::Unit::TestCase
end
end

context "Appending" do
setup do
@filename = 'test'
@length = nil
@times = 2
end

should "correctly append two chunks" do

@times.times do |t|
file = GridIO.new(@files, @chunks, @filename, 'a')
file.write "#{t}" * 100
file.close
end

file = GridIO.new(@files, @chunks, @filename, 'r')
data = file.read
file.close

assert_equal data, @times.times.map {|i| "#{i}" * 100}.join
end

should "append two chunks of exactly chunk_size" do
@times.times do |t|
file = GridIO.new(@files, @chunks, @filename, 'a')
file.write "#{t}" * file.chunk_size
file.close
end

file = GridIO.new(@files, @chunks, @filename, 'r')
data = file.read
file.close

assert_equal data, @times.times.map {|i| "#{i}" * file.chunk_size}.join
end
end

context "Seeking" do
setup do
@filename = 'test'