Last-Modified and RightAws::Key::exists? changes #9

Open
kierse wants to merge 14 commits

2 participants

@kierse

Hi there,

I made three changes:

  1. I fixed a typo in xml_adapter.rb that resulted in an incorrectly named key element (LastModified).

  2. I added some code to identify and create objects for all keys found within a given bucket.

  3. I updated the S3Object class with a last_modified attribute and refactored the file_store, server, and xml_adapter classes to use the new attribute.

I'm using RightAWS, and these fixes were necessary to use the last_modified property and the exists? method on key objects.
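For context, here's a minimal sketch of the RightAws usage these fixes enable, mirroring the connection setup used in right_aws_commands_test.rb (the credentials, port, and key name are hypothetical):

    require 'logger'
    require 'right_aws'

    # Connect RightAws to a local FakeS3 server (hypothetical credentials/port).
    s3 = RightAws::S3.new('fake_access_key', 'fake_secret_key',
                          :server => 'localhost', :port => 10453,
                          :protocol => 'http', :no_subdomains => true,
                          :logger => Logger.new('/dev/null'))

    key = s3.bucket('s3media', true).key('example.txt')
    key.data = 'hello'
    key.put

    key.exists?       # relies on bucket keys being listed correctly
    key.last_modified # relies on the LastModified element in the listing XML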

kierse added some commits
@kierse kierse fixed typo in XML tag name e9fc740
@kierse kierse added lazy bucket key load
When a bucket is opened for the first time (LS_BUCKET), all keys will be
identified and loaded. This fix is needed so that exists? calls on RightAws::Key
objects return valid data.
d85b516
@kierse kierse setting creation and modification time on s3 objects
updated xml_adapter and server to use the new modified_date attribute when
responding to requests
6180ab4
@kierse kierse used predefined variable 9e9eddb
@kierse kierse new unit tests
added test to check that last modified date is being set for returned key data
added test to check that the exists? method on Right::Aws::Key objects works
as expected
278b12e
@jubos
Owner

These patches look good. Thanks for sending them over. Unfortunately, I am getting a test case failure on lines 37-38 of right_aws_commands_test.rb
when I test with Ruby 1.9.3 and right_aws 3.0.4. On a rake test, I get:


rake test TEST=test/right_aws_commands_test.rb TEST_OPTS="--name=/test_exists*/"

Run options: --name=/test_exists*/

Running tests:

E

Finished tests in 0.011442s, 87.3973 tests/s, 0.0000 assertions/s.

1) Error:
test_exists(RightAWSCommandsTest):
ArgumentError: no time information in ""
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:436:in `initialize'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:243:in `new'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:243:in `block (2 levels) in keys_and_service'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:241:in `each'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:241:in `block in keys_and_service'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3_interface.rb:369:in `incrementally_list_bucket'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:239:in `keys_and_service'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:225:in `keys'
    /Users/curtis/.rvm/gems/ruby-1.9.3-p0/gems/right_aws-3.0.4/lib/s3/right_s3.rb:266:in `key'
    /Users/curtis/code/fakes3/test/right_aws_commands_test.rb:37:in `test_exists'

@kierse

I'm having trouble reproducing the error. I ran the problematic rake command without incident, running ruby 1.9.3-p0 (x86_64), rake 0.9.2.2, and right_aws 3.0.4:

> rake test TEST=test/right_aws_commands_test.rb TEST_OPTS="--name=/test_exists*/"
Run options: --name=/test_exists*/

# Running tests:

.

Finished tests in 0.016760s, 59.6659 tests/s, 119.3317 assertions/s.
1 tests, 2 assertions, 0 failures, 0 errors, 0 skips

Can you give me any more information to help debug?

@kierse

Is there anything else I can do to help move this along? If you're interested in the changes, I'd like to get them merged back into the primary repo.

@jubos

This is a dangerous call. On the test cases on OSX Ruby 1.8.7, I hit a "Too many open files" limit because the test cases put a lot of files in one of the buckets. The objects maintain a pointer to the file to allow WEBrick to do streaming, so this would need to be implemented at a higher level to avoid actually opening the files.

@jubos
Owner

Thanks for checking in on this merge request.

I put a comment on one of the changesets. I still get the failure I mentioned before, and stress testing also surfaced some open file limit issues when the buckets get very large. I think the right way to go about it is to maintain a set of file names in the bucket rather than full-scale objects, to avoid hitting this limit (see the sketch below).

When I run the test cases, I use a Vagrant box to do permutations of OSX, Linux, Ruby 1.8.7, and Ruby 1.9.2-1.9.3.
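A rough sketch of the name-based approach jubos describes (the class, constant, and helper names are hypothetical stand-ins, not FakeS3's actual API):

    # Hypothetical sketch: a bucket that tracks object *names* only and
    # defers building full objects (and opening any files) until lookup.
    class LazyBucket
      METADATA_DIR = '.metadata'       # placeholder for FakeS3's metadata dir
      S3ObjectStub = Struct.new(:name) # placeholder for FakeS3's S3Object

      def initialize(root, name)
        @root = root
        @name = name
      end

      # Cheap listing: a directory walk yields key names; no files are opened.
      def key_names
        base = File.join(@root, @name)
        Dir.glob(File.join(base, '**', METADATA_DIR))
           .map { |dir| File.dirname(dir).sub("#{base}/", '') }
      end

      # Expensive work (metadata reads, content handles) happens only when a
      # specific key is requested.
      def find(key)
        key_names.include?(key) ? S3ObjectStub.new(key) : nil
      end
    end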

@kierse

Thanks for getting back to me. I did some more digging and am now able to reproduce the error you previously reported. I'm not sure how I missed the problem before, as it was pretty trivial.

I'll look into the open file limit issue as well. To be honest, I thought I was maintaining a list of file names; clearly that's not the case.

kierse added some commits
@kierse kierse added missing creation and modification times to object
When a file was uploaded, it was correctly being added to the parent bucket's
file list. However, when the file object was created, no creation or modification
time data was set. This is incorrect and caused an issue for RightAWS.
b2554c2
@kierse kierse refactored unit tests to be clearer and properly clean up after themselves
b7f0138
@kierse kierse refactored how object metadata is retrieved from disk 26e7c57
@kierse kierse refactored to simplify code that loops over directory b47144b
@kierse kierse refactored RateLimitableFile to avoid creating a file handle
The server was reaching its allotted file handle limit during a heavy unit test.
When a bucket is opened for the first time, the server creates an object representing
each file found. Part of creating each object involved opening a file handle to the
uploaded content. Refactored to lazily open the file when the content is actually
requested.
13a8c45
@kierse kierse changed variable names to be consistent with codebase c9a1dc9
@kierse kierse refactored the copy_object method to make it simpler 90af615
@kierse kierse Merge remote-tracking branch 'jubos/master'
Conflicts:
	lib/fakes3/file_store.rb
	lib/fakes3/server.rb
9316f4a
@kierse

I made several changes since you last commented:

  1. fixed the test error you originally reported

  2. refactored all code in file_store.rb that interacts with the filesystem to avoid manually opening files where possible. In cases where files must be opened, I updated the code to ensure they are
    a) open for as short a time as possible, and
    b) properly closed

  3. refactored copy_object() to simplify the algorithm and reuse existing code where possible.

  4. refactored RateLimitableFile. Whenever a new S3Object was created, a new RateLimitableFile object was created and opened. This behaviour was the source of the open file limit warnings you were seeing. I rewrote the class so that it lazily opens the file when the server requests the object's content.

  5. some minor code cleanup

I've run the full suite of tests and am happy to report that everything runs without error, with the exception of the boto_test.rb script (which seems to be failing in the repo as well). One final note: there does not appear to be any test coverage for setting a Range header on file requests. Now that the RateLimitableFile objects are properly loaded and used, we might want to introduce some tests to ensure this functionality works as expected; see the sketch below.
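For instance, a hypothetical sketch of such a range test, assuming the same @s3 setup as the other tests in right_aws_commands_test.rb and that S3Interface#get forwards extra request headers:

    # Untested sketch, not part of the actual suite.
    def test_range_request
      @s3.interface.put("s3media", "range_test", "0123456789")

      output = ""
      @s3.interface.get("s3media", "range_test",
                        { "Range" => "bytes=2-5" }) do |chunk|
        output << chunk
      end

      # bytes 2 through 5 (inclusive) of "0123456789"
      assert_equal "2345", output
    end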

@jubos
Owner

Kieran,

Thanks for this. I will test it shortly.

-Curtis

3  lib/fakes3/bucket.rb
@@ -5,7 +5,7 @@
module FakeS3
class Bucket
- attr_accessor :name,:creation_date,:objects
+ attr_accessor :name,:creation_date,:objects,:opened
def initialize(name,creation_date,objects)
@name = name
@@ -14,6 +14,7 @@ def initialize(name,creation_date,objects)
objects.each do |obj|
@objects.add(obj)
end
+ @opened = false
@mutex = Mutex.new
end
87 lib/fakes3/file_store.rb
@@ -53,6 +53,35 @@ def get_bucket(bucket)
@bucket_hash[bucket]
end
+ def get_sorted_object_list(bucket)
+ list = SortedObjectList.new
+ for object in get_objects_under_path(bucket, "")
+ list.add(object)
+ end
+ return list
+ end
+
+ def get_objects_under_path(bucket, path)
+ objects = []
+ current = File.join(@root, bucket.name, path)
+ Dir.entries(current).each do |file|
+ next if file =~ /^\./
+ if path.empty?
+ new_path = file
+ else
+ new_path = File.join(path, file)
+ end
+ if File.directory?(File.join(current, file, SHUCK_METADATA_DIR))
+ objects.push(get_object(bucket.name, new_path, ""))
+ else
+ objects |= get_objects_under_path(bucket, new_path)
+ end
+ end
+
+ return objects
+ end
+ private :get_objects_under_path
+
def create_bucket(bucket)
FileUtils.mkdir_p(File.join(@root,bucket))
bucket_obj = Bucket.new(bucket,Time.now,[])
@@ -71,16 +100,16 @@ def delete_bucket(bucket_name)
@bucket_hash.delete(bucket_name)
end
- def get_object(bucket,object_name, request)
+ def get_object(bucket_name,object_name, request)
begin
real_obj = S3Object.new
- obj_root = File.join(@root,bucket,object_name,SHUCK_METADATA_DIR)
- metadata = YAML.load(File.open(File.join(obj_root,"metadata"),'rb'))
+ obj_root = File.join(@root,bucket_name,object_name,SHUCK_METADATA_DIR)
+ metadata = YAML.load_file(File.join(obj_root,"metadata"))
real_obj.name = object_name
real_obj.md5 = metadata[:md5]
real_obj.content_type = metadata.fetch(:content_type) { "application/octet-stream" }
#real_obj.io = File.open(File.join(obj_root,"content"),'rb')
- real_obj.io = RateLimitableFile.open(File.join(obj_root,"content"),'rb')
+ real_obj.io = RateLimitableFile.new(File.join(obj_root,"content"))
real_obj.size = metadata.fetch(:size) { 0 }
real_obj.creation_date = File.ctime(obj_root).iso8601()
real_obj.modified_date = metadata.fetch(:modified_date) { File.mtime(File.join(obj_root,"content")).iso8601() }
@@ -96,45 +125,22 @@ def object_metadata(bucket,object)
end
def copy_object(src_bucket_name,src_name,dst_bucket_name,dst_name)
- src_root = File.join(@root,src_bucket_name,src_name,SHUCK_METADATA_DIR)
- src_metadata_filename = File.join(src_root,"metadata")
- src_metadata = YAML.load(File.open(src_metadata_filename,'rb').read)
- src_content_filename = File.join(src_root,"content")
-
- dst_filename= File.join(@root,dst_bucket_name,dst_name)
- FileUtils.mkdir_p(dst_filename)
-
- metadata_dir = File.join(dst_filename,SHUCK_METADATA_DIR)
- FileUtils.mkdir_p(metadata_dir)
-
- content = File.join(metadata_dir,"content")
- metadata = File.join(metadata_dir,"metadata")
+ obj = nil
+ if src_bucket_name == dst_bucket_name && src_name == dst_name
+ # source and destination are the same, nothing to do but
+ # find current object so it can be returned
+ obj = src_bucket.find(src_name)
+ else
+ src_root = File.join(@root,src_bucket_name,src_name,SHUCK_METADATA_DIR)
+ dst_root = File.join(@root,dst_bucket_name,dst_name,SHUCK_METADATA_DIR)
- File.open(content,'wb') do |f|
- File.open(src_content_filename,'rb') do |input|
- f << input.read
- end
- end
+ FileUtils.mkdir_p(dst_root)
+ FileUtils.copy_file(File.join(src_root,"content"),File.join(dst_root,"content"))
+ FileUtils.copy_file(File.join(src_root,"metadata"), File.join(dst_root,"metadata"))
- File.open(metadata,'w') do |f|
- File.open(src_metadata_filename,'r') do |input|
- f << input.read
- end
+ dst_bucket = self.get_bucket(dst_bucket_name)
+ dst_bucket.add(get_object(dst_bucket.name, dst_name, ""))
end
-
- src_bucket = self.get_bucket(src_bucket_name)
- dst_bucket = self.get_bucket(dst_bucket_name)
-
- obj = S3Object.new
- obj.name = dst_name
- obj.md5 = src_metadata[:md5]
- obj.content_type = src_metadata[:content_type]
- obj.size = src_metadata[:size]
- obj.modified_date = src_metadata[:modified_date]
-
- src_obj = src_bucket.find(src_name)
- dst_bucket.add(obj)
- src_bucket.remove(src_obj)
return obj
end
@@ -174,6 +180,7 @@ def store_object(bucket,object_name,request)
obj.md5 = metadata_struct[:md5]
obj.content_type = metadata_struct[:content_type]
obj.size = metadata_struct[:size]
+ obj.creation_date = File.ctime(metadata_dir)
obj.modified_date = metadata_struct[:modified_date]
bucket.add(obj)
23 lib/fakes3/rate_limitable_file.rb
@@ -1,5 +1,12 @@
module FakeS3
- class RateLimitableFile < File
+ class RateLimitableFile
+
+ attr_reader :path, :pos
+ def initialize(path, pos = nil)
+ @path = path
+ @pos = pos
+ end
+
@@rate_limit = nil
# Specify a rate limit in bytes per second
def self.rate_limit
@@ -10,12 +17,22 @@ def self.rate_limit=(rate_limit)
@@rate_limit = rate_limit
end
- def read(args)
+ def read(args = nil)
if @@rate_limit
time_to_sleep = args / @@rate_limit
sleep(time_to_sleep)
end
- return super(args)
+
+ return File.open(@path) do |file|
+ if !pos.nil?
+ file.pos = @pos
+ end
+ if args.nil?
+ file.read
+ else
+ file.read(args)
+ end
+ end
end
end
end
14 lib/fakes3/server.rb
@@ -58,6 +58,10 @@ def do_GET(request, response)
when 'LS_BUCKET'
bucket_obj = @store.get_bucket(s_req.bucket)
if bucket_obj
+ unless bucket_obj.opened
+ bucket_obj.objects = @store.get_sorted_object_list(bucket_obj)
+ bucket_obj.opened = true
+ end
response.status = 200
response['Content-Type'] = "application/xml"
query = {
@@ -117,12 +121,12 @@ def do_GET(request, response)
return
end
end
- response['Content-Length'] = File::Stat.new(real_obj.io.path).size
+ response['Content-Length'] = content_length
response['Last-Modified'] = real_obj.modified_date
if s_req.http_verb == 'HEAD'
response.body = ""
else
- response.body = real_obj.io
+ response.body = real_obj.io.read()
end
end
end
@@ -132,7 +136,9 @@ def do_PUT(request,response)
case s_req.type
when Request::COPY
+ bucket = @store.get_bucket(s_req.src_bucket)
@store.copy_object(s_req.src_bucket,s_req.src_object,s_req.bucket,s_req.object)
+ @store.delete_object(bucket,s_req.src_object, "")
when Request::STORE
bucket_obj = @store.get_bucket(s_req.bucket)
if !bucket_obj
@@ -160,8 +166,8 @@ def do_DELETE(request,response)
case s_req.type
when Request::DELETE_OBJECT
- bucket_obj = @store.get_bucket(s_req.bucket)
- @store.delete_object(bucket_obj,s_req.object,s_req.webrick_request)
+ bucket = @store.get_bucket(s_req.bucket)
+ @store.delete_object(bucket,s_req.object,s_req.webrick_request)
when Request::DELETE_BUCKET
@store.delete_bucket(s_req.bucket)
end
43 test/right_aws_commands_test.rb
@@ -6,57 +6,74 @@
class RightAWSCommandsTest < Test::Unit::TestCase
def setup
- @s3 = RightAws::S3Interface.new('1E3GDYEOGFJPIT7XXXXXX','hgTHt68JY07JKUY08ftHYtERkjgtfERn57XXXXXX',
+ @s3 = RightAws::S3.new('1E3GDYEOGFJPIT7XXXXXX','hgTHt68JY07JKUY08ftHYtERkjgtfERn57XXXXXX',
{:multi_thread => false, :server => 'localhost',
:port => 10453, :protocol => 'http',:logger => Logger.new("/dev/null"),:no_subdomains => true })
end
def teardown
+ @s3.interface.delete("s3media", "last_modified")
+ @s3.interface.delete("s3media", "exists")
end
def test_create_bucket
- bucket = @s3.create_bucket("s3media")
+ bucket = @s3.interface.create_bucket("s3media")
assert_not_nil bucket
end
def test_store
- @s3.put("s3media","helloworld","Hello World Man!")
- obj = @s3.get("s3media","helloworld")
+ @s3.interface.put("s3media","helloworld","Hello World Man!")
+ obj = @s3.interface.get("s3media","helloworld")
assert_equal "Hello World Man!",obj[:object]
- obj = @s3.get("s3media","helloworld")
+ obj = @s3.interface.get("s3media","helloworld")
+ end
+
+ def test_header_last_modified
+ @s3.interface.put("s3media","last_modified","foo")
+ obj = @s3.interface.get("s3media","last_modified")
+ assert_not_nil obj[:headers]["last-modified"]
+ end
+
+ def test_exists
+ key = @s3.bucket("s3media").key("exists")
+ assert !key.exists?
+
+ key.data = 'foo'
+ key.put
+ assert key.exists?
end
def test_large_store
- @s3.put("s3media","helloworld","Hello World Man!")
+ @s3.interface.put("s3media","helloworld","Hello World Man!")
buffer = ""
500000.times do
buffer << "#{(rand * 100).to_i}"
end
buf_len = buffer.length
- @s3.put("s3media","big",buffer)
+ @s3.interface.put("s3media","big",buffer)
output = ""
- @s3.get("s3media","big") do |chunk|
+ @s3.interface.get("s3media","big") do |chunk|
output << chunk
end
assert_equal buf_len,output.size
end
def test_multi_directory
- @s3.put("s3media","dir/right/123.txt","recursive")
+ @s3.interface.put("s3media","dir/right/123.txt","recursive")
output = ""
- obj = @s3.get("s3media","dir/right/123.txt") do |chunk|
+ obj = @s3.interface.get("s3media","dir/right/123.txt") do |chunk|
output << chunk
end
assert_equal "recursive", output
end
def test_intra_bucket_copy
- @s3.put("s3media","original.txt","Hello World")
- @s3.copy("s3media","original.txt","s3media","copy.txt")
- obj = @s3.get("s3media","copy.txt")
+ @s3.interface.put("s3media","original.txt","Hello World")
+ @s3.interface.copy("s3media","original.txt","s3media","copy.txt")
+ obj = @s3.interface.get("s3media","copy.txt")
assert_equal "Hello World",obj[:object]
end