Support for GridFS append mode #154

Closed
wants to merge 13 commits

5 participants

@alor

This adds the ability to open a GridFS file in append mode and write data to it, without having to pull the file from the db and put it back once modified.

Currently, the only way to append data to a file is to dump the entire file to a temporary file, append the data, and then put it back into GridFS: a very expensive way to add a few bytes to a file.

Hope it will be useful to everyone.

Best regards.
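The bookkeeping that makes in-place append cheap is mostly chunk arithmetic: find the last chunk from the stored file length and resume writing there. A minimal sketch in plain Ruby (illustrative names only, no MongoDB required; the patch itself does this inside `GridIO#init_append`):

```ruby
# Illustrative simulation of GridFS append positioning, mirroring the
# arithmetic in the patch's init_append. No MongoDB required.
CHUNK_SIZE = 256 * 1024  # GridFS default chunk size (256 KB)

# Given the current file length, compute which chunk to resume writing
# in and the absolute file position to resume at.
def append_position(file_length, chunk_size = CHUNK_SIZE)
  last_chunk = file_length / chunk_size               # chunk index to append into
  in_chunk   = file_length - last_chunk * chunk_size  # bytes already in that chunk
  # When the length is an exact multiple of the chunk size, the "last"
  # chunk does not exist yet and a fresh chunk is created at that index
  # (the patch handles this by falling back to create_chunk).
  { chunk_n: last_chunk, chunk_offset: in_chunk, file_position: file_length }
end

p append_position(100)             # resume inside the first chunk
p append_position(CHUNK_SIZE)      # exact boundary: new chunk, offset 0
p append_position(CHUNK_SIZE + 50) # 50 bytes into the second chunk
```

Only the final chunk is ever rewritten, which is why this beats the read-all/append/re-put round trip for large files.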

@alor

Oops, I created this branch from the 1.8.2 release; better if I recreate it against master (it included a lot of unneeded commits). Closing this request. Sorry.

alor closed this
alor deleted the alor:1.8.2_append branch
4 Gemfile
@@ -17,10 +17,6 @@ group :testing do
gem 'test-unit'
gem 'mocha', ">=0.13.0", :require => 'mocha/setup'
gem 'shoulda', ">=3.3.2"
-
- gem 'growl'
- gem 'guard-rspec'
- gem 'rb-fsevent'
gem 'rspec'
gem 'sfl'
4 Guardfile
@@ -1,4 +0,0 @@
-guard 'rspec', :rvm => ['1.8.7@mongo-ruby-driver', '1.9.3@mongo-ruby-driver'] do
- watch(%r{^spec/.+_spec\.rb$})
- watch('spec/spec_helper.rb') { "spec" }
-end
29 README.md
@@ -1,4 +1,7 @@
-[![Build Status][travis-img]][travis-url] [![Jenkins Status][jenkins-img]][jenkins-url] [![Code Climate][codeclimate-img]][codeclimate-url]
+[![Build Status][travis-img]][travis-url]
+[![Jenkins Status][jenkins-img]][jenkins-url]
+[![Code Climate][codeclimate-img]][codeclimate-url]
+[![Latest Version][version-img]][version-url]
[travis-img]: https://secure.travis-ci.org/mongodb/mongo-ruby-driver.png
[travis-url]: http://travis-ci.org/mongodb/mongo-ruby-driver
@@ -6,6 +9,8 @@
[codeclimate-url]: https://codeclimate.com/github/mongodb/mongo-ruby-driver
[jenkins-img]: https://jenkins.10gen.com/job/mongo-ruby-driver/badge/icon
[jenkins-url]: https://jenkins.10gen.com/job/mongo-ruby-driver/
+[version-img]: https://badge.fury.io/rb/mongo.png
+[version-url]: http://badge.fury.io/rb/mongo
[api-url]: http://api.mongodb.org/ruby/current
# Documentation
@@ -97,30 +102,14 @@ you can install it as a gem from the source by typing:
For extensive examples, see the [MongoDB Ruby Tutorial](https://github.com/mongodb/mongo-ruby-driver/wiki/Tutorial).
-Bundled with the driver are many examples, located in the "docs/examples" subdirectory. Samples include using
-the driver and using the GridFS class GridStore. MongoDB must be running for
-these examples to work, of course.
-
-Here's how to start MongoDB and run the "simple.rb" example:
-
- $ cd path/to/mongo
- $ ./mongod run
- ... then in another window ...
- $ cd path/to/mongo-ruby-driver
- $ ruby docs/examples/simple.rb
-
-See also the test code, especially test/test_db_api.rb.
-
# GridFS
The Ruby driver include two abstractions for storing large files: Grid and GridFileSystem.
+
The Grid class is a Ruby implementation of MongoDB's GridFS file storage
-specification. GridFileSystem is essentially the same, but provides a more filesystem-like API
-and assumes that filenames are unique.
+specification. GridFileSystem is essentially the same, but provides a more filesystem-like API and assumes that filenames are unique.
-An instance of both classes represents an individual file store. See the API reference
-for details, and see examples/gridfs.rb for code that uses many of the Grid
-features (metadata, content type, seek, tell, etc).
+An instance of both classes represents an individual file store. See the API reference for details.
Examples:
2 VERSION
@@ -1 +1 @@
-1.8.1
+1.8.2
2 ext/cbson/version.h
@@ -14,4 +14,4 @@
* limitations under the License.
*/
-#define VERSION "1.8.1"
+#define VERSION "1.8.2"
0 lib/mongo.rb 100755 → 100644
File mode changed.
8 lib/mongo/cursor.rb
@@ -560,7 +560,8 @@ def construct_query_message
BSON::BSON_RUBY.serialize_cstr(message, "#{@db.name}.#{@collection.name}")
message.put_int(@skip)
message.put_int(@limit)
- message.put_binary(BSON::BSON_CODER.serialize(construct_query_spec, false).to_s)
+ spec = query_contains_special_fields? ? construct_query_spec : @selector
+ message.put_binary(BSON::BSON_CODER.serialize(spec, false).to_s)
message.put_binary(BSON::BSON_CODER.serialize(@fields, false).to_s) if @fields
message
end
@@ -593,6 +594,11 @@ def construct_query_spec
spec
end
+ def query_contains_special_fields?
+ @order || @explain || @hint || @snapshot || @show_disk_loc ||
+ @max_scan || @return_key || @comment || @connection.mongos?
+ end
+
def close_cursor_if_query_complete
if @limit > 0 && @returned >= @limit
close
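The cursor change above serializes the plain selector unless query modifiers are actually set, avoiding the `$query` wrapper in the common case. The decision can be sketched standalone (illustrative: the real code reads the cursor's instance variables, shown here as an options hash):

```ruby
# Illustrative sketch of the query_contains_special_fields? logic from
# the patch: only wrap the selector in a $query document when at least
# one query modifier is present (or when talking to mongos).
def build_query_spec(selector, opts)
  special = opts.values_at(:order, :explain, :hint, :snapshot,
                           :show_disk_loc, :max_scan, :return_key,
                           :comment).any? || opts[:mongos]
  return selector unless special  # fast path: serialize the raw selector
  spec = { '$query' => selector }
  spec['$orderby'] = opts[:order] if opts[:order]
  spec['$hint']    = opts[:hint]  if opts[:hint]
  spec
end

p build_query_spec({ 'a' => 1 }, {})                       # plain selector
p build_query_spec({ 'a' => 1 }, { order: { 'a' => 1 } })  # wrapped form
```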
61 lib/mongo/gridfs/grid_io.rb
@@ -71,8 +71,9 @@ def initialize(files, chunks, filename, mode, opts={})
case @mode
when 'r' then init_read
when 'w' then init_write(opts)
+ when 'a' then init_append(opts)
else
- raise GridError, "Invalid file mode #{@mode}. Mode should be 'r' or 'w'."
+ raise GridError, "Invalid file mode #{@mode}. Mode should be 'r', 'w' or 'a'."
end
end
@@ -116,7 +117,8 @@ def read(length=nil)
# @return [Integer]
# the number of bytes written.
def write(io)
- raise GridError, "file not opened for write" unless @mode[0] == ?w
+ raise GridError, "file not opened for write" unless @mode[0] == ?w or @mode[0] == ?a
+
if io.is_a? String
if Mongo::WriteConcern.gle?(@write_concern)
@local_md5.update(io)
@@ -234,13 +236,27 @@ def getc
# on GridIO#open is passed a block. Otherwise, it must be called manually.
#
# @return [BSON::ObjectId]
+ #def close
+ # if @current_chunk['n'].zero? && @chunk_position.zero?
+ # warn "Warning: Storing a file with zero length."
+ # end
+ # @upload_date = Time.now.utc
+ # object = to_mongo_object
+ # id = @files.update({_id: object['_id']}, object, {upsert: true})
+ # id
+ #end
def close
- if @mode[0] == ?w
+ if @mode[0] == ?w or @mode[0] == ?a
if @current_chunk['n'].zero? && @chunk_position.zero?
warn "Warning: Storing a file with zero length."
end
@upload_date = Time.now.utc
- id = @files.insert(to_mongo_object)
+ if @mode[0] == ?w
+ id = @files.insert(to_mongo_object)
+ elsif @mode[0] == ?a
+ object = to_mongo_object
+ id = @files.update({_id: object['_id']}, object, {upsert: true})
+ end
end
id
end
@@ -285,7 +301,7 @@ def save_chunk(chunk)
def get_chunk(n)
chunk = @chunks.find({'files_id' => @files_id, 'n' => n}).next_document
- @chunk_position = 0
+ @chunk_position = (@mode == ?a ? chunk['data'].size : 0) unless chunk.nil?
chunk
end
@@ -438,7 +454,41 @@ def init_write(opts)
@current_chunk = create_chunk(0)
@file_position = 0
end
+
+ # Initialize the class for appending to a file.
+ def init_append(opts)
+ doc = @files.find(@query, @query_opts).next_document
+ return init_write(opts) unless doc
+
+ opts = doc.dup
+
+ @files_id = opts.delete('_id')
+ @content_type = opts.delete('contentType')
+ @chunk_size = opts.delete('chunkSize')
+ @upload_date = opts.delete('uploadDate')
+ @aliases = opts.delete('aliases')
+ @file_length = opts.delete('length')
+ @metadata = opts.delete('metadata')
+ @md5 = opts.delete('md5')
+ @filename = opts.delete('filename')
+ @custom_attrs = opts
+
+ # recalculate the md5 of the previous chunks
+ if Mongo::WriteConcern.gle?(@write_concern)
+ @current_chunk = get_chunk(0)
+ @file_position = 0
+ @local_md5 = Digest::MD5.new
+ @local_md5.update(read_all)
+ end
+ # position at the end of the file (last chunk)
+ last_chunk = @file_length / @chunk_size
+ @current_chunk = get_chunk(last_chunk)
+ chunk = get_chunk(last_chunk-1) if @current_chunk.nil?
+ @current_chunk ||= create_chunk(last_chunk)
+ @file_position = @chunk_size * last_chunk + @current_chunk['data'].size
+ end
+
def check_existing_file
if @files.find_one('_id' => @files_id)
raise GridError, "Attempting to overwrite with Grid#put. You must delete the file first."
@@ -466,6 +516,7 @@ def get_md5
md5_command['filemd5'] = @files_id
md5_command['root'] = @fs_name
@server_md5 = @files.db.command(md5_command)['md5']
+
if Mongo::WriteConcern.gle?(@write_concern)
@client_md5 = @local_md5.hexdigest
if @local_md5 == @server_md5
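The persistence rule introduced in `close` above is: write mode inserts a new files document, append mode upserts the existing one by `_id`. A standalone sketch (illustrative only; an in-memory `Hash` stands in for the files collection):

```ruby
# Illustrative sketch of the close-time persistence rule from the patch.
# FakeFilesCollection is a stand-in for the GridFS files collection.
class FakeFilesCollection
  def initialize; @docs = {}; end

  def insert(doc)
    @docs[doc['_id']] = doc
    doc['_id']
  end

  # Mimics update(query, doc, :upsert => true) as used by the patch.
  def update(query, doc, opts = {})
    id = query['_id'] || query[:_id]
    raise 'not found' if !opts[:upsert] && !@docs.key?(id)
    @docs[id] = doc
    id
  end

  def find_one(id); @docs[id]; end
end

def persist_on_close(files, mode, doc)
  case mode
  when 'w' then files.insert(doc)  # brand-new file document
  when 'a' then files.update({ '_id' => doc['_id'] }, doc, upsert: true)
  else raise "Invalid file mode #{mode}. Mode should be 'r', 'w' or 'a'."
  end
end

files = FakeFilesCollection.new
persist_on_close(files, 'w', { '_id' => 1, 'length' => 100 })
persist_on_close(files, 'a', { '_id' => 1, 'length' => 200 })
p files.find_one(1)['length']  # length after the append
```

The upsert also covers appending to a file that does not exist yet, which is why `init_append` falls back to `init_write` when no document is found.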
12 lib/mongo/mongo_client.rb
@@ -181,7 +181,7 @@ def self.multi(nodes, opts={})
# Initialize a connection to MongoDB using the MongoDB URI spec.
#
- # Since MongoClient.new cannot be used with any <code>ENV["MONGODB_URI"]</code> that has multiple hosts (implying a replicaset),
+ # Since MongoClient.new cannot be used with any <code>ENV["MONGODB_URI"]</code> that has multiple hosts (implying a replicaset),
# you may use this when the type of your connection varies by environment and should be determined solely from <code>ENV["MONGODB_URI"]</code>.
#
# @param uri [String]
@@ -479,7 +479,10 @@ def connect
raise ConnectionFailure, "Failed to connect to a master node at #{host_port.join(":")}"
end
end
- alias :reconnect :connect
+
+ # Ensures that the alias carries over to the overridden connect method when using
+ # the replica set or sharded clients.
+ def reconnect; connect end
# It's possible that we defined connected as all nodes being connected???
# NOTE: Do check if this needs to be more stringent.
@@ -511,7 +514,10 @@ def active?
def read_primary?
@read_primary
end
- alias :primary? :read_primary?
+
+ # Ensures that the alias carries over to the overridden read_primary? method when using
+ # the replica set or sharded clients.
+ def primary?; read_primary? end
# The socket pool that this connection reads from.
#
1 lib/mongo/mongo_replica_set_client.rb 100755 → 100644
@@ -286,7 +286,6 @@ def nodes
def read_primary?
@manager.read_pool == @manager.primary_pool
end
- alias :primary? :read_primary?
# Close the connection to the database.
def close(opts={})
0 lib/mongo/mongo_sharded_client.rb 100755 → 100644
File mode changed.
22 lib/mongo/util/pool.rb 100755 → 100644
@@ -1,5 +1,3 @@
-# encoding: UTF-8
-
# --
# Copyright (C) 2008-2012 10gen Inc.
#
@@ -267,22 +265,16 @@ def checkout
@client.connect if !@client.connected?
start_time = Time.now
loop do
- if (Time.now - start_time) > @timeout
- raise ConnectionTimeoutError, "could not obtain connection within " +
- "#{@timeout} seconds. The max pool size is currently #{@size}; " +
- "consider increasing the pool size or timeout."
- end
-
@connection_mutex.synchronize do
if socket_for_thread = thread_local[:sockets][self.object_id]
if !@checked_out.include?(socket_for_thread)
socket = checkout_existing_socket(socket_for_thread)
end
- else # First checkout for this thread
- if @checked_out.size < @sockets.size
- socket = checkout_existing_socket
- elsif @sockets.size < @size
+ else
+ if @sockets.size < @size
socket = checkout_new_socket
+ elsif @checked_out.size < @sockets.size
+ socket = checkout_existing_socket
end
end
@@ -307,6 +299,12 @@ def checkout
@queue.wait(@connection_mutex)
end
end
+
+ if (Time.now - start_time) > @timeout
+ raise ConnectionTimeoutError, "could not obtain connection within " +
+ "#{@timeout} seconds. The max pool size is currently #{@size}; " +
+ "consider increasing the pool size or timeout."
+ end
end
end
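The reordered checkout above changes the pool's preference: grow the pool with a new socket while it is under its max size, otherwise reuse an idle one, and only then block. The decision table can be sketched standalone (illustrative names; the real code also tracks per-thread socket affinity):

```ruby
# Illustrative sketch of the checkout-preference change in pool.rb:
# the patch checks @sockets.size < @size before falling back to an
# existing socket, so the pool grows to its max before sharing.
def pick_socket(sockets, checked_out, max_size)
  if sockets.size < max_size
    :new_socket        # room to grow: open a fresh connection
  elsif checked_out.size < sockets.size
    :existing_socket   # pool at max: reuse an idle socket
  else
    :wait              # everything busy: block until a checkin
  end
end

p pick_socket([:a], [:a], 5)          # under max: open a new socket
p pick_socket([:a, :b], [:a], 2)      # at max, one idle: reuse it
p pick_socket([:a, :b], [:a, :b], 2)  # at max, all busy: wait
```

Note the patch also moves the timeout check after the wait, so a checkout only raises `ConnectionTimeoutError` once it has actually waited past the deadline.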
0 lib/mongo/util/pool_manager.rb 100755 → 100644
File mode changed.
0 lib/mongo/util/ssl_socket.rb 100755 → 100644
File mode changed.
0 lib/mongo/util/tcp_socket.rb 100755 → 100644
File mode changed.
0 lib/mongo/util/thread_local_variable_manager.rb 100755 → 100644
File mode changed.
2 mongo.gemspec
@@ -12,7 +12,7 @@ Gem::Specification.new do |s|
s.files = ['mongo.gemspec', 'LICENSE', 'VERSION']
s.files += ['README.md', 'Rakefile', 'bin/mongo_console']
- s.files += ['lib/mongo.rb'] + Dir['lib/mongo/**/*.rb'] + Dir['examples/**/*.rb']
+ s.files += ['lib/mongo.rb'] + Dir['lib/mongo/**/*.rb']
s.test_files = Dir['test/**/*.rb']
s.executables = ['mongo_console']
1 tasks/testing.rake
@@ -59,7 +59,6 @@ namespace :test do
Rake::TestTask.new(:functional) do |t|
t.test_files = FileList['test/functional/*_test.rb'] - [
- "test/functional/pool_test.rb",
"test/functional/grid_io_test.rb",
"test/functional/grid_test.rb"
]
37 test/functional/grid_io_test.rb
@@ -151,6 +151,43 @@ class GridIOTest < Test::Unit::TestCase
end
end
+ context "Appending" do
+ setup do
+ @filename = 'test'
+ @length = nil
+ @times = 2
+ end
+
+ should "correctly append two chunks" do
+
+ @times.times do |t|
+ file = GridIO.new(@files, @chunks, @filename, 'a')
+ file.write "#{t}" * 100
+ file.close
+ end
+
+ file = GridIO.new(@files, @chunks, @filename, 'r')
+ data = file.read
+ file.close
+
+ assert_equal data, @times.times.map {|i| "#{i}" * 100}.join
+ end
+
+ should "append two chunks of exactly chunk_size" do
+ @times.times do |t|
+ file = GridIO.new(@files, @chunks, @filename, 'a')
+ file.write "#{t}" * file.chunk_size
+ file.close
+ end
+
+ file = GridIO.new(@files, @chunks, @filename, 'r')
+ data = file.read
+ file.close
+
+ assert_equal data, @times.times.map {|i| "#{i}" * file.chunk_size}.join
+ end
+ end
+
context "Seeking" do
setup do
@filename = 'test'
54 test/functional/pool_test.rb
@@ -5,51 +5,43 @@ class PoolTest < Test::Unit::TestCase
include Mongo
def setup
- @connection = standard_connection
+ @client ||= standard_connection({:pool_size => 500, :pool_timeout => 5})
+ @db = @client.db(MONGO_TEST_DB)
+ @collection = @db.collection("pool_test")
end
def test_pool_affinity
- @pool = Pool.new(@connection, TEST_HOST, TEST_PORT, :size => 5)
-
- @threads = []
+ pool = Pool.new(@client, TEST_HOST, TEST_PORT, :size => 5)
+ threads = []
10.times do
- @threads << Thread.new do
- original_socket = @pool.checkout
- @pool.checkin(original_socket)
+ threads << Thread.new do
+ original_socket = pool.checkout
+ pool.checkin(original_socket)
5000.times do
- socket = @pool.checkout
+ socket = pool.checkout
assert_equal original_socket, socket
- @pool.checkin(socket)
+ pool.checkin(socket)
end
end
end
- @threads.each { |t| t.join }
+ threads.each { |t| t.join }
end
- def test_pool_thread_pruning
- @pool = Pool.new(@connection, TEST_HOST, TEST_PORT, :size => 5)
-
- @threads = []
-
- 10.times do
- @threads << Thread.new do
- 50.times do
- socket = @pool.checkout
- @pool.checkin(socket)
- end
- end
+ def test_pool_affinity_max_size
+ 8000.times {|x| @collection.insert({:value => x})}
+ threads = []
+ threads << Thread.new do
+ @collection.find({"value" => {"$lt" => 100}}).each {|e| e}
+ Thread.pass
+ sleep(5)
+ @collection.find({"value" => {"$gt" => 100}}).each {|e| e}
end
-
- @threads.each { |t| t.join }
- assert_equal 10, @pool.instance_variable_get(:@threads_to_sockets).size
-
- # Thread-socket pool
- 10000.times do
- @pool.checkin(@pool.checkout)
+ sleep(1)
+ threads << Thread.new do
+ @collection.find({'$where' => "function() {for(i=0;i<8000;i++) {this.value};}"}).each {|e| e}
end
-
- assert_equal 1, @pool.instance_variable_get(:@threads_to_sockets).size
+ threads.each(&:join)
end
end
8 test/functional/timeout_test.rb
@@ -2,18 +2,16 @@
class TestTimeout < Test::Unit::TestCase
def test_op_timeout
- connection = standard_connection(:op_timeout => 2)
+ connection = standard_connection(:op_timeout => 1)
admin = connection.db('admin')
- command = BSON::OrderedHash.new
- command[:sleep] = 1
- command[:secs] = 1
+ command = {:eval => "sleep(500)"}
# Should not timeout
assert admin.command(command)
# Should timeout
- command[:secs] = 3
+ command = {:eval => "sleep(1500)"}
assert_raise Mongo::OperationTimeout do
admin.command(command)
end
22 test/replica_set/client_test.rb
@@ -28,6 +28,28 @@ def test_connect_bad_name
end
end
+ def test_reconnect_method_override
+ rescue_connection_failure do
+ @client = MongoReplicaSetClient.new(@rs.repl_set_seeds)
+ end
+
+ MongoReplicaSetClient.any_instance.expects(:connect)
+ MongoClient.any_instance.expects(:connect).never
+ assert_nothing_raised Mongo::ConnectionFailure do
+ @client.reconnect
+ end
+ end
+
+ def test_primary_method_override
+ rescue_connection_failure do
+ @client = MongoReplicaSetClient.new(@rs.repl_set_seeds)
+ end
+
+ MongoReplicaSetClient.any_instance.expects(:read_primary?)
+ MongoClient.any_instance.expects(:read_primary?).never
+ @client.primary?
+ end
+
def test_connect_with_first_secondary_node_terminated
@rs.secondaries.first.stop
2 test/tools/mongo_config.rb
@@ -102,7 +102,7 @@ def self.make_mongod(kind, opts)
noprealloc = opts[:noprealloc] || true
smallfiles = opts[:smallfiles] || true
quiet = opts[:quiet] || true
- fast_sync = opts[:fastsync] || true
+ fast_sync = opts[:fastsync] || false
params.merge(
:command => mongod,