Moving aws-s3 into git

commit d390edeea995328e71f92dd3f380cd1a890c13d5 0 parents
@marcel authored
Showing with 8,664 additions and 0 deletions.
  1. +67 −0 CHANGELOG
  2. +19 −0 COPYING
  3. +55 −0 INSTALL
  4. +545 −0 README
  5. +58 −0 README.erb
  6. +328 −0 Rakefile
  7. +26 −0 TODO
  8. +6 −0 bin/s3sh
  9. +10 −0 bin/setup.rb
  10. +61 −0 lib/aws/s3.rb
  11. +636 −0 lib/aws/s3/acl.rb
  12. +218 −0 lib/aws/s3/authentication.rb
  13. +232 −0 lib/aws/s3/base.rb
  14. +58 −0 lib/aws/s3/bittorrent.rb
  15. +320 −0 lib/aws/s3/bucket.rb
  16. +314 −0 lib/aws/s3/connection.rb
  17. +69 −0 lib/aws/s3/error.rb
  18. +133 −0 lib/aws/s3/exceptions.rb
  19. +323 −0 lib/aws/s3/extensions.rb
  20. +306 −0 lib/aws/s3/logging.rb
  21. +610 −0 lib/aws/s3/object.rb
  22. +44 −0 lib/aws/s3/owner.rb
  23. +99 −0 lib/aws/s3/parsing.rb
  24. +180 −0 lib/aws/s3/response.rb
  25. +51 −0 lib/aws/s3/service.rb
  26. +12 −0 lib/aws/s3/version.rb
  27. +41 −0 site/index.erb
  28. BIN  site/public/images/box-and-gem.gif
  29. BIN  site/public/images/favicon.ico
  30. +18 −0 site/public/ruby.css
  31. +99 −0 site/public/screen.css
  32. +18 −0 support/faster-xml-simple/COPYING
  33. +8 −0 support/faster-xml-simple/README
  34. +54 −0 support/faster-xml-simple/Rakefile
  35. +187 −0 support/faster-xml-simple/lib/faster_xml_simple.rb
  36. +4 −0 support/faster-xml-simple/test/fixtures/test-1.rails.yml
  37. +3 −0  support/faster-xml-simple/test/fixtures/test-1.xml
  38. +4 −0 support/faster-xml-simple/test/fixtures/test-1.yml
  39. +6 −0 support/faster-xml-simple/test/fixtures/test-2.rails.yml
  40. +3 −0  support/faster-xml-simple/test/fixtures/test-2.xml
  41. +6 −0 support/faster-xml-simple/test/fixtures/test-2.yml
  42. +6 −0 support/faster-xml-simple/test/fixtures/test-3.rails.yml
  43. +5 −0 support/faster-xml-simple/test/fixtures/test-3.xml
  44. +6 −0 support/faster-xml-simple/test/fixtures/test-3.yml
  45. +5 −0 support/faster-xml-simple/test/fixtures/test-4.rails.yml
  46. +7 −0 support/faster-xml-simple/test/fixtures/test-4.xml
  47. +5 −0 support/faster-xml-simple/test/fixtures/test-4.yml
  48. +8 −0 support/faster-xml-simple/test/fixtures/test-5.rails.yml
  49. +7 −0 support/faster-xml-simple/test/fixtures/test-5.xml
  50. +8 −0 support/faster-xml-simple/test/fixtures/test-5.yml
  51. +43 −0 support/faster-xml-simple/test/fixtures/test-6.rails.yml
  52. +29 −0 support/faster-xml-simple/test/fixtures/test-6.xml
  53. +41 −0 support/faster-xml-simple/test/fixtures/test-6.yml
  54. +23 −0 support/faster-xml-simple/test/fixtures/test-7.rails.yml
  55. +22 −0 support/faster-xml-simple/test/fixtures/test-7.xml
  56. +22 −0 support/faster-xml-simple/test/fixtures/test-7.yml
  57. +14 −0 support/faster-xml-simple/test/fixtures/test-8.rails.yml
  58. +8 −0 support/faster-xml-simple/test/fixtures/test-8.xml
  59. +11 −0 support/faster-xml-simple/test/fixtures/test-8.yml
  60. +47 −0 support/faster-xml-simple/test/regression_test.rb
  61. +17 −0 support/faster-xml-simple/test/test_helper.rb
  62. +46 −0 support/faster-xml-simple/test/xml_simple_comparison_test.rb
  63. +211 −0 support/rdoc/code_info.rb
  64. +254 −0 test/acl_test.rb
  65. +96 −0 test/authentication_test.rb
  66. +143 −0 test/base_test.rb
  67. +48 −0 test/bucket_test.rb
  68. +190 −0 test/connection_test.rb
  69. +75 −0 test/error_test.rb
  70. +331 −0 test/extensions_test.rb
  71. +89 −0 test/fixtures.rb
  72. +102 −0 test/fixtures/buckets.yml
  73. +34 −0 test/fixtures/errors.yml
  74. +3 −0  test/fixtures/headers.yml
  75. +15 −0 test/fixtures/logging.yml
  76. +5 −0 test/fixtures/loglines.yml
  77. +7 −0 test/fixtures/logs.yml
  78. +16 −0 test/fixtures/policies.yml
  79. +89 −0 test/logging_test.rb
  80. +89 −0 test/mocks/base.rb
  81. +217 −0 test/object_test.rb
  82. +66 −0 test/parsing_test.rb
  83. +117 −0 test/remote/acl_test.rb
  84. +45 −0 test/remote/bittorrent_test.rb
  85. +146 −0 test/remote/bucket_test.rb
  86. +82 −0 test/remote/logging_test.rb
  87. +371 −0 test/remote/object_test.rb
  88. BIN  test/remote/test_file.data
  89. +30 −0 test/remote/test_helper.rb
  90. +70 −0 test/response_test.rb
  91. +26 −0 test/service_test.rb
  92. +86 −0 test/test_helper.rb
67 CHANGELOG
@@ -0,0 +1,67 @@
+trunk:
+
+0.4.0:
+
+- Various adjustments to connection handling to try to mitigate exceptions raised from deep within Net::HTTP.
+
+- Don't coerce numbers that start with a zero because the zero will be lost. If a bucket, for example, has a name like '0815', all operations on it will fail. Closes ticket #10089 [reported anonymously]
+
+- Add ability to connect through a proxy using the :proxy option when establishing a connection. Suggested by [Simon Horne <simon@soulware.co.uk>]
+
+- Add :authenticated option to url_for. When passing false, don't generate signature parameters for the query string.
+
+- Make url_for accept custom port settings. [Rich Olson]
+
+0.3.0:
+
+- Ensure content type is eventually set to account for changes made to Net::HTTP in Ruby version 1.8.5. Reported by [David Hanson, Stephen Caudill, Tom Mornini <tmornini@engineyard.com>]
+
+- Add :persistent option to connections which keeps a persistent connection rather than creating a new one per request, defaulting to true. Based on a patch by [Metalhead <metalhead@metalhead.ws>]
+
+- If we are retrying a request after rescuing one of the retry exceptions, rewind the body if it's an IO stream so it starts at the beginning. [Jamis Buck]
+
+- Ensure that all paths being submitted to S3 are valid utf8. If they are not, we remove the extended characters. Ample help from [Jamis Buck]
+
+- Wrap logs in Log objects which expose each line as a Log::Line that has accessors by name for each field.
+
+- Various performance optimizations for the extensions code. [Roman LE NEGRATE <roman2k@free.fr>]
+
+- Make S3Object.copy more efficient by streaming in both directions in parallel.
+
+- Open up Net::HTTPGenericRequest to make the chunk size 1 megabyte, up from 1 kilobyte.
+
+- Add S3Object.exists?
+
+0.2.1:
+
+- When the bucket name argument (for e.g. Bucket.objects) is being used as the option hash, reassign it to the options variable and set the bucket to nil so bucket inference + options works.
+
+- Don't call CGI.escape on query string parameters in Hash#to_query_string since all paths get passed through URI.escape right before the request is made. Paths were getting double escaped. Bug spotted by [David Hanson]
+
+- Make s3sh exec irb.bat if on Windows. Bug spotted by [N. Sathish Kumar <nsathishk@yahoo.com>]
+
+- Avoid class_variable_(get|set) since it was only recently added to Ruby. Spotted by [N. Sathish Kumar <nsathishk@yahoo.com>]
+
+- Raise NoSuchKey if S3Object.about requests a key that does not exist.
+
+- If the response body is an empty string, don't try to parse it as xml.
+
+- Don't reject every body type save for IO and String at the door when making a request. Suggested by [Alex MacCaw <maccman@gmail.com>]
+
+- Allow dots in bucket names. [Jesse Newland]
+
+0.2.0:
+
+- Infer content type for an object when calling S3Object.store without explicitly passing in the :content_type option.
+
+0.1.2:
+
+- Scrap (overly) fancy generator based version of CoercibleString with a much simpler and clearer case statement. Continuations are really slow and the specific use of the generator was leaking memory. Bug spotted by [Remco van't Veer]
+
+0.1.1:
+
+- Don't add the underscore method to String if it is already defined (like, for example, from ActiveSupport). Bug spotted by [Matt White <stockliasteroid@gmail.com>]
+
+0.1.0:
+
+- Initial public release
19 COPYING
@@ -0,0 +1,19 @@
+#
+# Copyright (c) 2006-2007 Marcel Molina Jr. <marcel@vernix.org>
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy of
+# this software and associated documentation files (the "Software"), to deal in the
+# Software without restriction, including without limitation the rights to use,
+# copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
+# Software, and to permit persons to whom the Software is furnished to do so,
+# subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+# FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+# COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN
+# AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
55 INSTALL
@@ -0,0 +1,55 @@
+== Rubygems
+
+The easiest way to install aws/s3 is with Rubygems:
+
+ % sudo gem i aws-s3 -ry
+
+== Directly from svn
+
+ % svn co svn://rubyforge.org/var/svn/amazon/s3/trunk aws
+
+== As a Rails plugin
+
+If you want to use aws/s3 with a Rails application, you can export the repository
+into your plugins directory and then check it in:
+
+ % cd my-rails-application/vendor/plugins
+ % svn export svn://rubyforge.org/var/svn/amazon/s3/trunk aws
+ % svn add aws
+
+Or you could pull it down with an svn:externals definition:
+
+ % cd my-rails-application/vendor/plugins
+ % svn propedit svn:externals .
+
+Then add the following line, save and exit:
+
+ aws svn://rubyforge.org/var/svn/amazon/s3/trunk
+
+If you go the svn route, be sure that you have all the dependencies installed. The list of dependencies follows.
+
+== Dependencies
+
+AWS::S3 requires Ruby 1.8.4 or greater.
+
+It also has the following dependencies:
+
+ sudo gem i xml-simple -ry
+ sudo gem i builder -ry
+ sudo gem i mime-types -ry
+
+=== XML parsing (xml-simple)
+
+AWS::S3 depends on XmlSimple (http://xml-simple.rubyforge.org/). When installing aws/s3 with
+Rubygems, this dependency will be taken care of for you. Otherwise, installation instructions are listed on the xml-simple
+site.
+
+If your system has the Ruby libxml bindings installed (http://libxml.rubyforge.org/) they will be used instead of REXML (which is what XmlSimple uses). For those concerned with speed and efficiency, it would behoove you to install libxml (instructions here: http://libxml.rubyforge.org/install.html) as it is considerably faster and less expensive than REXML.
+
+=== XML generation (builder)
+
+AWS::S3 also depends on the Builder library (http://builder.rubyforge.org/ and http://rubyforge.org/projects/builder/). This will also automatically be installed for you when using Rubygems.
+
+=== Content type inference (mime-types)
+
+AWS::S3 depends on the MIME::Types library (http://mime-types.rubyforge.org/) to infer the content type of an object that does not explicitly specify it. This library will automatically be installed for you when using Rubygems.
545 README
@@ -0,0 +1,545 @@
+= AWS::S3
+
+AWS::S3 is a Ruby library for Amazon's Simple Storage Service's REST API (http://aws.amazon.com/s3).
+Full documentation of the currently supported API can be found at http://docs.amazonwebservices.com/AmazonS3/2006-03-01.
+
+== Getting started
+
+To get started you need to require 'aws/s3':
+
+ % irb -rubygems
+ irb(main):001:0> require 'aws/s3'
+ # => true
+
+The AWS::S3 library ships with an interactive shell called <tt>s3sh</tt>. From within it, you have access to all the operations the library exposes from the command line.
+
+ % s3sh
+ >> Version
+
+Before you can do anything, you must establish a connection using Base.establish_connection!. A basic connection would look something like this:
+
+ AWS::S3::Base.establish_connection!(
+ :access_key_id => 'abc',
+ :secret_access_key => '123'
+ )
+
+The minimum connection options that you must specify are your access key id and your secret access key.
+
+(If you don't already have your access keys, all you need to sign up for the S3 service is an account at Amazon. You can sign up for S3 and get access keys by visiting http://aws.amazon.com/s3.)
+
+For convenience, if you set two special environment variables with the value of your access keys, the console will automatically create a default connection for you. For example:
+
+ % cat .amazon_keys
+ export AMAZON_ACCESS_KEY_ID='abcdefghijklmnop'
+ export AMAZON_SECRET_ACCESS_KEY='1234567891012345'
+
+Then load it in your shell's rc file.
+
+ % cat .zshrc
+ if [[ -f "$HOME/.amazon_keys" ]]; then
+ source "$HOME/.amazon_keys";
+ fi
+
+See more connection details at AWS::S3::Connection::Management::ClassMethods.
+
+
+== AWS::S3 Basics
+=== The service, buckets and objects
+
+The three main concepts of S3 are the service, buckets and objects.
+
+==== The service
+
+The service lets you find out general information about your account, like what buckets you have.
+
+ Service.buckets
+ # => []
+
+
+==== Buckets
+
+Buckets are containers for objects (the files you store on S3). To create a new bucket you just specify its name.
+
+ # Pick a unique name, or else you'll get an error
+ # if the name is already taken.
+ Bucket.create('jukebox')
+
+Bucket names must be unique across the entire S3 system, sort of like domain names across the internet. If you try
+to create a bucket with a name that is already taken, you will get an error.
+
+Assuming the name you chose isn't already taken, your new bucket will now appear in the bucket list:
+
+ Service.buckets
+ # => [#<AWS::S3::Bucket @attributes={"name"=>"jukebox"}>]
+
+Once you have successfully created a bucket you can fetch it by name using Bucket.find.
+
+ music_bucket = Bucket.find('jukebox')
+
+The bucket that is returned will contain a listing of all the objects in the bucket.
+
+ music_bucket.objects.size
+ # => 0
+
+If all you are interested in is the objects of the bucket, you can get to them directly using Bucket.objects.
+
+ Bucket.objects('jukebox').size
+ # => 0
+
+By default all objects will be returned, though there are several options you can use to limit what is returned, such as
+asking for only those objects whose names sort after a given marker. Details about these options can
+be found in the documentation for Bucket.find.
+
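+For example, to page through a bucket's objects a few at a time (a sketch; the exact option names are documented on Bucket.find):
+
+  # Ask for at most 10 objects whose keys sort after 'black-flowers.mp3'
+  Bucket.objects('jukebox',
+    :max_keys => 10,
+    :marker   => 'black-flowers.mp3'
+  )
+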
+To add an object to a bucket you specify the name of the object, its value, and the bucket to put it in.
+
+ file = 'black-flowers.mp3'
+ S3Object.store(file, open(file), 'jukebox')
+
+You'll see that your file has been added to the bucket:
+
+ music_bucket.objects
+ # => [#<AWS::S3::S3Object '/jukebox/black-flowers.mp3'>]
+
+You can treat your bucket like a hash and access objects by name:
+
+ music_bucket['black-flowers.mp3']
+ # => #<AWS::S3::S3Object '/jukebox/black-flowers.mp3'>
+
+In the event that you want to delete a bucket, you can use Bucket.delete.
+
+ Bucket.delete('jukebox')
+
+Keep in mind, like unix directories, you can not delete a bucket unless it is empty. Trying to delete a bucket
+that contains objects will raise a BucketNotEmpty exception.
+
+Passing the :force => true option to delete will take care of deleting all the bucket's objects for you.
+
+ Bucket.delete('photos', :force => true)
+ # => true
+
+
+==== Objects
+
+S3Objects represent the data you store on S3. They have a key (their name) and a value (their data). All objects belong to a
+bucket.
+
+You can store an object on S3 by specifying a key, its data and the name of the bucket you want to put it in:
+
+ S3Object.store('me.jpg', open('headshot.jpg'), 'photos')
+
+The content type of the object will be inferred from its extension. If the appropriate content type can not be inferred, S3 defaults
+to <tt>binary/octet-stream</tt>.
+
+If you want to override this, you can explicitly indicate what content type the object should have with the <tt>:content_type</tt> option:
+
+ file = 'black-flowers.m4a'
+ S3Object.store(
+ file,
+ open(file),
+ 'jukebox',
+ :content_type => 'audio/mp4a-latm'
+ )
+
+You can read more about storing files on S3 in the documentation for S3Object.store.
+
+If you want to fetch an object you've stored on S3, you just specify its name and its bucket:
+
+ picture = S3Object.find 'headshot.jpg', 'photos'
+
+N.B. The actual data for the file is not downloaded in either case: not when the file appears in a bucket listing, and not when it is fetched directly.
+You get the data for the file like this:
+
+ picture.value
+
+You can fetch just the object's data directly:
+
+ S3Object.value 'headshot.jpg', 'photos'
+
+Or stream it by passing a block to <tt>stream</tt>:
+
+ open('song.mp3', 'w') do |file|
+ S3Object.stream('song.mp3', 'jukebox') do |chunk|
+ file.write chunk
+ end
+ end
+
+The data of the file, once downloaded, is cached, so subsequent calls to <tt>value</tt> won't redownload the file unless you
+tell the object to reload its <tt>value</tt>:
+
+ # Redownloads the file's data
+ song.value(:reload)
+
+Other functionality includes:
+
+ # Check if an object exists?
+ S3Object.exists? 'headshot.jpg', 'photos'
+
+ # Copying an object
+ S3Object.copy 'headshot.jpg', 'headshot2.jpg', 'photos'
+
+ # Renaming an object
+ S3Object.rename 'headshot.jpg', 'portrait.jpg', 'photos'
+
+ # Deleting an object
+ S3Object.delete 'headshot.jpg', 'photos'
+
+==== More about objects and their metadata
+
+You can find out the content type of your object with the <tt>content_type</tt> method:
+
+ song.content_type
+ # => "audio/mpeg"
+
+You can change the content type as well if you like:
+
+ song.content_type = 'application/pdf'
+ song.store
+
+(Keep in mind that due to limitations in S3's exposed API, the only way to change things like the content_type
+is to PUT the object onto S3 again. In the case of large files, this will result in fully re-uploading the file.)
+
+A bevy of information about an object can be had using the <tt>about</tt> method:
+
+ pp song.about
+ {"last-modified" => "Sat, 28 Oct 2006 21:29:26 GMT",
+ "content-type" => "binary/octect-stream",
+ "etag" => "\"dc629038ffc674bee6f62eb64ff3a\"",
+ "date" => "Sat, 28 Oct 2006 21:30:41 GMT",
+ "x-amz-request-id" => "B7BC68F55495B1C8",
+ "server" => "AmazonS3",
+ "content-length" => "3418766"}
+
+You can get and set metadata for an object:
+
+ song.metadata
+ # => {}
+ song.metadata[:album] = "A River Ain't Too Much To Love"
+ # => "A River Ain't Too Much To Love"
+ song.metadata[:released] = 2005
+ pp song.metadata
+ {"x-amz-meta-released" => 2005,
+ "x-amz-meta-album" => "A River Ain't Too Much To Love"}
+ song.store
+
+That metadata will be saved in S3 and is henceforth available from that object:
+
+ song = S3Object.find('black-flowers.mp3', 'jukebox')
+ pp song.metadata
+ {"x-amz-meta-released" => "2005",
+ "x-amz-meta-album" => "A River Ain't Too Much To Love"}
+ song.metadata[:released]
+ # => "2005"
+ song.metadata[:released] = 2006
+ pp song.metadata
+ {"x-amz-meta-released" => 2006,
+ "x-amz-meta-album" => "A River Ain't Too Much To Love"}
+
+
+==== Streaming uploads
+
+When storing an object on the S3 servers using S3Object.store, the <tt>data</tt> argument can be a string or an I/O stream.
+If <tt>data</tt> is an I/O stream it will be read in segments and written to the socket incrementally. This approach
+may be desirable for very large files so they are not read into memory all at once.
+
+ # Non streamed upload
+ S3Object.store('greeting.txt', 'hello world!', 'marcel')
+
+ # Streamed upload
+ S3Object.store('roots.mpeg', open('roots.mpeg'), 'marcel')
+
+
+== Setting the current bucket
+==== Scoping operations to a specific bucket
+
+If you plan on always using a specific bucket for certain files, you can skip always having to specify the bucket by creating
+a subclass of Bucket or S3Object and telling it what bucket to use:
+
+ class JukeBoxSong < AWS::S3::S3Object
+ set_current_bucket_to 'jukebox'
+ end
+
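+The same approach works for buckets, since set_current_bucket_to comes from AWS::S3::Base. A sketch with a hypothetical JukeBoxBucket class:
+
+  class JukeBoxBucket < AWS::S3::Bucket
+    set_current_bucket_to 'jukebox'
+  end
+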
+For all methods that take a bucket name as an argument, the current bucket will be used if the bucket name argument is omitted.
+
+ other_song = 'baby-please-come-home.mp3'
+ JukeBoxSong.store(other_song, open(other_song), :content_type => 'audio/mpeg')
+
+This time we didn't have to explicitly pass in the bucket name, as the JukeBoxSong class knows that it will
+always use the 'jukebox' bucket.
+
+"Astute readers", as they say, may have noticed that we used the third parameter to pass in the content type,
+rather than the fourth parameter as we had the last time we created an object. If the bucket can be inferred, or
+is explicitly set, as we've done in the JukeBoxSong class, then the third argument can be used to pass in
+options.
+
+Now all operations that would have required a bucket name no longer do.
+
+ other_song = JukeBoxSong.find('baby-please-come-home.mp3')
+
+
+== BitTorrent
+==== Another way to download large files
+
+Objects on S3 can be distributed via the BitTorrent file sharing protocol.
+
+You can get a torrent file for an object by calling <tt>torrent_for</tt>:
+
+ S3Object.torrent_for 'kiss.jpg', 'marcel'
+
+Or just call the <tt>torrent</tt> method if you already have the object:
+
+ song = S3Object.find 'kiss.jpg', 'marcel'
+ song.torrent
+
+Calling <tt>grant_torrent_access_to</tt> on an object will allow anyone to anonymously
+fetch the torrent file for that object:
+
+ S3Object.grant_torrent_access_to 'kiss.jpg', 'marcel'
+
+Anonymous requests to
+
+ http://s3.amazonaws.com/marcel/kiss.jpg?torrent
+
+will serve up the torrent file for that object.
+
+
+== Access control
+==== Using canned access control policies
+
+By default buckets are private. This means that only the owner has access rights to the bucket and its objects.
+Objects in that bucket inherit the permission of the bucket unless otherwise specified. When an object is private, the owner can
+generate a signed url that exposes the object to anyone who has that url. Alternatively, buckets and objects can be given other
+access levels. Several canned access levels are defined:
+
+* <tt>:private</tt> - Owner gets FULL_CONTROL. No one else has any access rights. This is the default.
+* <tt>:public_read</tt> - Owner gets FULL_CONTROL and the anonymous principal is granted READ access. If this policy is used on an object, it can be read from a browser with no authentication.
+* <tt>:public_read_write</tt> - Owner gets FULL_CONTROL, the anonymous principal is granted READ and WRITE access. This is a useful policy to apply to a bucket, if you intend for any anonymous user to PUT objects into the bucket.
+* <tt>:authenticated_read</tt> - Owner gets FULL_CONTROL, and any principal authenticated as a registered Amazon S3 user is granted READ access.
+
+You can set a canned access level when you create a bucket or an object by using the <tt>:access</tt> option:
+
+ S3Object.store(
+ 'kiss.jpg',
+ data,
+ 'marcel',
+ :access => :public_read
+ )
+
+Since the image we created is publicly readable, we can access it directly from a browser by going to the corresponding bucket name
+and specifying the object's key without a special authenticated url:
+
+ http://s3.amazonaws.com/marcel/kiss.jpg
+
+==== Building custom access policies
+
+For both buckets and objects, you can use the <tt>acl</tt> method to see its access control policy:
+
+ policy = S3Object.acl('kiss.jpg', 'marcel')
+ pp policy.grants
+ [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ #<AWS::S3::ACL::Grant READ to AllUsers Group>]
+
+Policies are made up of one or more grants which grant a specific permission to some grantee. Here we see the default FULL_CONTROL grant
+to the owner of this object. There is also READ permission granted to the AllUsers Group, which means anyone has read access for the object.
+
+Say we wanted to grant access to anyone to read the access policy of this object. The current READ permission only grants them permission to read
+the object itself (for example, from a browser) but it does not allow them to read the access policy. For that we will need to grant the AllUsers group the READ_ACP permission.
+
+First we'll create a new grant object:
+
+ grant = ACL::Grant.new
+ # => #<AWS::S3::ACL::Grant (permission) to (grantee)>
+ grant.permission = 'READ_ACP'
+
+Now we need to indicate who this grant is for. In other words, who the grantee is:
+
+ grantee = ACL::Grantee.new
+ # => #<AWS::S3::ACL::Grantee (xsi not set yet)>
+
+There are three ways to specify a grantee: 1) by their internal amazon id, such as the one returned with an object's Owner,
+2) by their Amazon account email address or 3) by specifying a group. As of this writing you can not create custom groups, but
+Amazon does provide three already: AllUsers, Authenticated and LogDelivery. In this case we want to provide the grant to all users.
+This effectively means "anyone".
+
+ grantee.group = 'AllUsers'
+
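+(Grantees can alternatively be specified by canonical user id or by Amazon account email address; a sketch, where some_object is any object you have already fetched:)
+
+  # By canonical user id...
+  grantee.id = some_object.owner.id
+  # ...or by Amazon account email address
+  grantee.email_address = 'joe@example.org'
+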
+Now that our grantee is set up, we'll associate it with the grant:
+
+ grant.grantee = grantee
+ grant
+ # => #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>
+
+Our grant has all the information we need. Now that it's ready, we'll add it on to the object's access control policy's list of grants:
+
+ policy.grants << grant
+ pp policy.grants
+ [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ #<AWS::S3::ACL::Grant READ to AllUsers Group>,
+ #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>]
+
+Now that the policy has the new grant, we reuse the <tt>acl</tt> method to persist the policy change:
+
+ S3Object.acl('kiss.jpg', 'marcel', policy)
+
+If we fetch the object's policy again, we see that the grant has been added:
+
+ pp S3Object.acl('kiss.jpg', 'marcel').grants
+ [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ #<AWS::S3::ACL::Grant READ to AllUsers Group>,
+ #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>]
+
+If we were to access this object's acl url from a browser:
+
+ http://s3.amazonaws.com/marcel/kiss.jpg?acl
+
+we would be shown its access control policy.
+
+==== Pre-prepared grants
+
+Alternatively, the ACL::Grant class defines a set of stock grant policies that you can fetch by name. In most cases, you can
+just use one of these pre-prepared grants rather than building grants by hand. Two of these stock policies are <tt>:public_read</tt>
+and <tt>:public_read_acp</tt>, which happen to be the two grants that we built by hand above. In this case we could have simply written:
+
+ policy.grants << ACL::Grant.grant(:public_read)
+ policy.grants << ACL::Grant.grant(:public_read_acp)
+ S3Object.acl('kiss.jpg', 'marcel', policy)
+
+The full details can be found in ACL::Policy, ACL::Grant and ACL::Grantee.
+
+
+==== Accessing private objects from a browser
+
+All private objects are accessible via an authenticated GET request to the S3 servers. You can generate an
+authenticated url for an object like this:
+
+ S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
+
+By default authenticated urls expire 5 minutes after they are generated.
+
+Expiration options can be specified either as an absolute time since the epoch with the <tt>:expires</tt> option,
+or as a number of seconds relative to now with the <tt>:expires_in</tt> option:
+
+ # Absolute expiration date
+ # (Expires January 18th, 2038)
+ doomsday = Time.mktime(2038, 1, 18).to_i
+ S3Object.url_for('beluga_baby.jpg',
+ 'marcel',
+ :expires => doomsday)
+
+ # Expiration relative to now specified in seconds
+ # (Expires in 3 hours)
+ S3Object.url_for('beluga_baby.jpg',
+ 'marcel',
+ :expires_in => 60 * 60 * 3)
+
+You can specify whether the url should go over SSL with the <tt>:use_ssl</tt> option:
+
+ # Url will use https protocol
+ S3Object.url_for('beluga_baby.jpg',
+ 'marcel',
+ :use_ssl => true)
+
+By default, the ssl settings for the current connection will be used.
+
+If you have an object handy, you can use its <tt>url</tt> method with the same options:
+
+ song.url(:expires_in => 30)
+
+To get an unauthenticated url for the object, such as in the case
+when the object is publicly readable, pass the
+<tt>:authenticated</tt> option with a value of <tt>false</tt>.
+
+ S3Object.url_for('beluga_baby.jpg',
+ 'marcel',
+ :authenticated => false)
+ # => http://s3.amazonaws.com/marcel/beluga_baby.jpg
+
+
+== Logging
+==== Tracking requests made on a bucket
+
+A bucket can be set to log the requests made on it. By default logging is turned off. You can check if a bucket has logging enabled:
+
+ Bucket.logging_enabled_for? 'jukebox'
+ # => false
+
+Enabling it is easy:
+
+ Bucket.enable_logging_for('jukebox')
+
+Unless you specify otherwise, logs will be written to the bucket you want to log. The logs are just like any other object. By default their keys will start with the prefix 'log-'. You can customize which bucket the logs are delivered to, as well as what the log objects' keys are prefixed with, by setting the <tt>target_bucket</tt> and <tt>target_prefix</tt> options:
+
+ Bucket.enable_logging_for(
+ 'jukebox', 'target_bucket' => 'jukebox-logs'
+ )
+
+Now instead of logging right into the jukebox bucket, the logs will go into the bucket called jukebox-logs.
+
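+The log key prefix can be customized the same way (a sketch, using a hypothetical prefix):
+
+  Bucket.enable_logging_for(
+    'jukebox', 'target_bucket' => 'jukebox-logs',
+               'target_prefix' => 'jukebox-log-'
+  )
+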
+Once logs have accumulated, you can access them using the <tt>logs</tt> method:
+
+ pp Bucket.logs('jukebox')
+ [#<AWS::S3::Logging::Log '/jukebox-logs/log-2006-11-14-07-15-24-2061C35880A310A1'>,
+ #<AWS::S3::Logging::Log '/jukebox-logs/log-2006-11-14-08-15-27-D8EEF536EC09E6B3'>,
+ #<AWS::S3::Logging::Log '/jukebox-logs/log-2006-11-14-08-15-29-355812B2B15BD789'>]
+
+Each log has a <tt>lines</tt> method that gives you information about each request in that log. All the fields are available
+as named methods. More information is available in Logging::Log::Line.
+
+ logs = Bucket.logs('jukebox')
+ log = logs.first
+ line = log.lines.first
+ line.operation
+ # => 'REST.GET.LOGGING_STATUS'
+ line.request_uri
+ # => 'GET /jukebox?logging HTTP/1.1'
+ line.remote_ip
+ # => "67.165.183.125"
+
+Disabling logging is just as simple as enabling it:
+
+ Bucket.disable_logging_for('jukebox')
+
+
+== Errors
+==== When things go wrong
+
+Anything you do that makes a request to S3 could result in an error. If it does, the AWS::S3 library will raise an exception
+specific to the error. All exceptions that are raised as a result of a request returning an error response inherit from the
+ResponseError exception. So should you choose to rescue any such exception, you can simply rescue ResponseError.
+
+Say you go to delete a bucket, but the bucket turns out to not be empty. This results in a BucketNotEmpty error (one of the many
+errors listed at http://docs.amazonwebservices.com/AmazonS3/2006-03-01/ErrorCodeList.html):
+
+ begin
+ Bucket.delete('jukebox')
+ rescue ResponseError => error
+ # ...
+ end
+
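+Since every specific error class inherits from ResponseError, you can also rescue one individually. A sketch that falls back to a forced delete when the bucket isn't empty:
+
+  begin
+    Bucket.delete('jukebox')
+  rescue BucketNotEmpty
+    Bucket.delete('jukebox', :force => true)
+  end
+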
+Once you've captured the exception, you can extract the error message from S3, as well as the full error response, which includes
+things like the HTTP response code:
+
+ error
+ # => #<AWS::S3::BucketNotEmpty The bucket you tried to delete is not empty>
+ error.message
+ # => "The bucket you tried to delete is not empty"
+ error.response.code
+ # => 409
+
+You could use this information to redisplay the error in a way you see fit, or just to log the error and continue on.
+
+
+==== Accessing the last request's response
+
+Sometimes methods that make requests to the S3 servers return some object, like a Bucket or an S3Object.
+Other times they return just <tt>true</tt>. Still other times they raise an exception that you may want to rescue. Despite all these
+possible outcomes, every method that makes a request stores its response object for you in Service.response. You can always
+get to the last request's response via Service.response.
+
+ objects = Bucket.objects('jukebox')
+ Service.response.success?
+ # => true
+
+This is also useful when an exception you weren't expecting is raised in the console. You can
+root around in the response to get more details of what might have gone wrong.
+
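+For example, after rescuing an error you weren't expecting, you might inspect the stored response (a sketch; the response object is the same kind exposed by error.response above):
+
+  begin
+    Bucket.delete('jukebox')
+  rescue ResponseError
+    Service.response.code
+    # => 409
+  end
+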
+
58 README.erb
@@ -0,0 +1,58 @@
+= AWS::S3
+
+<%= docs_for['AWS::S3'] %>
+
+== AWS::S3 Basics
+=== The service, buckets and objects
+
+The three main concepts of S3 are the service, buckets and objects.
+
+==== The service
+
+<%= docs_for['AWS::S3::Service'] %>
+
+==== Buckets
+
+<%= docs_for['AWS::S3::Bucket'] %>
+
+==== Objects
+
+<%= docs_for['AWS::S3::S3Object'] %>
+
+==== Streaming uploads
+
+<%= docs_for['AWS::S3::S3Object::store'] %>
+
+== Setting the current bucket
+==== Scoping operations to a specific bucket
+
+<%= docs_for['AWS::S3::Base.set_current_bucket_to'] %>
+
+== BitTorrent
+==== Another way to download large files
+
+<%= docs_for['AWS::S3::BitTorrent'] %>
+
+== Access control
+==== Using canned access control policies
+
+<%= docs_for['AWS::S3::ACL'] %>
+
+==== Accessing private objects from a browser
+
+<%= docs_for['AWS::S3::S3Object.url_for'] %>
+
+== Logging
+==== Tracking requests made on a bucket
+
+<%= docs_for['AWS::S3::Logging'] %>
+
+== Errors
+==== When things go wrong
+
+<%= docs_for['AWS::S3::Error'] %>
+
+==== Accessing the last request's response
+
+<%= docs_for['AWS::S3::Service.response'] %>
+
328 Rakefile
@@ -0,0 +1,328 @@
+require 'rubygems'
+require 'rake'
+require 'rake/testtask'
+require 'rake/rdoctask'
+require 'rake/packagetask'
+require 'rake/gempackagetask'
+
+require File.dirname(__FILE__) + '/lib/aws/s3'
+
+def library_root
+ File.dirname(__FILE__)
+end
+
+task :default => :test
+
+Rake::TestTask.new do |test|
+ test.pattern = 'test/*_test.rb'
+ test.verbose = true
+end
+
+namespace :doc do
+ Rake::RDocTask.new do |rdoc|
+ rdoc.rdoc_dir = 'doc'
+ rdoc.title = "AWS::S3 -- Support for Amazon S3's REST api"
+ rdoc.options << '--line-numbers' << '--inline-source'
+ rdoc.rdoc_files.include('README')
+ rdoc.rdoc_files.include('COPYING')
+ rdoc.rdoc_files.include('INSTALL')
+ rdoc.rdoc_files.include('lib/**/*.rb')
+ end
+
+ task :rdoc => 'doc:readme'
+
+ task :refresh => :rerdoc do
+ system 'open doc/index.html'
+ end
+
+ task :readme do
+ require 'support/rdoc/code_info'
+ RDoc::CodeInfo.parse('lib/**/*.rb')
+
+ strip_comments = lambda {|comment| comment.gsub(/^# ?/, '')}
+ docs_for = lambda do |location|
+ info = RDoc::CodeInfo.for(location)
+ raise RuntimeError, "Couldn't find documentation for `#{location}'" unless info
+ strip_comments[info.comment]
+ end
+
+ open('README', 'w') do |file|
+ file.write ERB.new(IO.read('README.erb')).result(binding)
+ end
+ end
+
+ task :deploy => :rerdoc do
+ sh %(scp -r doc marcel@rubyforge.org:/var/www/gforge-projects/amazon/)
+ end
+end
+
+namespace :dist do
+ spec = Gem::Specification.new do |s|
+ s.name = 'aws-s3'
+ s.version = Gem::Version.new(AWS::S3::Version)
+ s.summary = "Client library for Amazon's Simple Storage Service's REST API"
+ s.description = s.summary
+ s.email = 'marcel@vernix.org'
+ s.author = 'Marcel Molina Jr.'
+ s.has_rdoc = true
+ s.extra_rdoc_files = %w(README COPYING INSTALL)
+ s.homepage = 'http://amazon.rubyforge.org'
+ s.rubyforge_project = 'amazon'
+ s.files = FileList['Rakefile', 'lib/**/*.rb', 'bin/*', 'support/**/*.rb']
+ s.executables << 's3sh'
+ s.test_files = Dir['test/**/*']
+
+ s.add_dependency 'xml-simple'
+ s.add_dependency 'builder'
+ s.add_dependency 'mime-types'
+ s.rdoc_options = ['--title', "AWS::S3 -- Support for Amazon S3's REST api",
+ '--main', 'README',
+ '--line-numbers', '--inline-source']
+ end
+
+ # Regenerate README before packaging
+ task :package => 'doc:readme'
+ Rake::GemPackageTask.new(spec) do |pkg|
+ pkg.need_tar_gz = true
+ pkg.package_files.include('{lib,script,test,support}/**/*')
+ pkg.package_files.include('README')
+ pkg.package_files.include('COPYING')
+ pkg.package_files.include('INSTALL')
+ pkg.package_files.include('Rakefile')
+ end
+
+ desc 'Install with gems'
+ task :install => :repackage do
+ sh "sudo gem i pkg/#{spec.name}-#{spec.version}.gem"
+ end
+
+ desc 'Uninstall gem'
+ task :uninstall do
+ sh "sudo gem uninstall #{spec.name} -x"
+ end
+
+ desc 'Reinstall gem'
+ task :reinstall => [:uninstall, :install]
+
+ task :confirm_release do
+ print "Releasing version #{spec.version}. Are you sure you want to proceed? [Yn] "
+ abort if STDIN.getc == ?n
+ end
+
+ desc 'Tag release'
+ task :tag do
+ svn_root = 'svn+ssh://marcel@rubyforge.org/var/svn/amazon/s3'
+ sh %(svn cp #{svn_root}/trunk #{svn_root}/tags/rel-#{spec.version} -m "Tag #{spec.name} release #{spec.version}")
+ end
+
+ desc 'Update changelog to include a release marker'
+ task :add_release_marker_to_changelog do
+ changelog = IO.read('CHANGELOG')
+ changelog.sub!(/^trunk:/, "#{spec.version}:")
+
+ open('CHANGELOG', 'w') do |file|
+ file.write "trunk:\n\n#{changelog}"
+ end
+ end
+
+ task :commit_changelog do
+ sh %(svn ci CHANGELOG -m "Bump changelog version marker for release")
+ end
+
+ package_name = lambda {|specification| File.join('pkg', "#{specification.name}-#{specification.version}")}
+
+ desc 'Push a release to rubyforge'
+ task :release => [:confirm_release, :clean, :add_release_marker_to_changelog, :package, :commit_changelog, :tag] do
+ require 'rubyforge'
+ package = package_name[spec]
+
+ rubyforge = RubyForge.new
+ rubyforge.login
+
+ version_already_released = lambda do
+ releases = rubyforge.autoconfig['release_ids']
+ releases.has_key?(spec.name) && releases[spec.name][spec.version]
+ end
+
+ abort("Release #{spec.version} already exists!") if version_already_released.call
+
+ if release_id = rubyforge.add_release(spec.rubyforge_project, spec.name, spec.version, "#{package}.tar.gz")
+ rubyforge.add_file(spec.rubyforge_project, spec.name, release_id, "#{package}.gem")
+ else
+ puts 'Release failed!'
+ end
+ end
+
+ desc 'Upload a beta gem'
+ task :push_beta_gem => [:clobber_package, :package] do
+ beta_gem = package_name[spec]
+ sh %(scp #{beta_gem}.gem marcel@rubyforge.org:/var/www/gforge-projects/amazon/beta)
+ end
+
+ task :spec do
+ puts spec.to_ruby
+ end
+end
+
+desc 'Check code to test ratio'
+task :stats do
+ library_files = FileList["#{library_root}/lib/**/*.rb"]
+ test_files = FileList["#{library_root}/test/**/*_test.rb"]
+ count_code_lines = Proc.new do |lines|
+ lines.inject(0) do |code_lines, line|
+ next code_lines if [/^\s*$/, /^\s*#/].any? {|non_code_line| non_code_line === line}
+ code_lines + 1
+ end
+ end
+
+ count_code_lines_for_files = Proc.new do |files|
+ files.inject(0) {|code_lines, file| code_lines + count_code_lines[IO.read(file)]}
+ end
+
+ library_code_lines = count_code_lines_for_files[library_files]
+ test_code_lines = count_code_lines_for_files[test_files]
+ ratio = Proc.new { sprintf('%.2f', test_code_lines.to_f / library_code_lines)}
+
+ puts "Code LOC: #{library_code_lines} Test LOC: #{test_code_lines} Code to Test Ratio: 1:#{ratio.call}"
+end
+
+namespace :test do
+ find_file = lambda do |name|
+ file_name = lambda {|path| File.join(path, "#{name}.rb")}
+ root = $:.detect do |path|
+ File.exist?(file_name[path])
+ end
+ file_name[root] if root
+ end
+
+ TEST_LOADER = find_file['rake/rake_test_loader']
+ multiruby = lambda do |glob|
+ system 'multiruby', TEST_LOADER, *Dir.glob(glob)
+ end
+
+ desc 'Check test coverage'
+ task :coverage do
+ system("rcov --sort coverage #{File.join(library_root, 'test/*_test.rb')}")
+ show_test_coverage_results
+ end
+
+ Rake::TestTask.new(:remote) do |test|
+ test.pattern = 'test/remote/*_test.rb'
+ test.verbose = true
+ end
+
+ Rake::TestTask.new(:all) do |test|
+ test.pattern = 'test/**/*_test.rb'
+ test.verbose = true
+ end
+
+ desc 'Check test coverage of full stack remote tests'
+ task :full_coverage do
+ system("rcov --sort coverage #{File.join(library_root, 'test/remote/*_test.rb')} #{File.join(library_root, 'test/*_test.rb')}")
+ show_test_coverage_results
+ end
+
+ desc 'Run local tests against multiple versions of Ruby'
+ task :version_audit do
+ multiruby['test/*_test.rb']
+ end
+
+ namespace :version_audit do
+ desc 'Run remote tests against multiple versions of Ruby'
+ task :remote do
+ multiruby['test/remote/*_test.rb']
+ end
+
+ desc 'Run all tests against multiple versions of Ruby'
+ task :all do
+ multiruby['test/**/*_test.rb']
+ end
+ end
+
+ def show_test_coverage_results
+ system("open #{File.join(library_root, 'coverage/index.html')}") if PLATFORM['darwin']
+ end
+
+ desc 'Remove coverage products'
+ task :clobber_coverage do
+ rm_r 'coverage' rescue nil
+ end
+end
+
+namespace :todo do
+ class << TODOS = IO.read(File.join(library_root, 'TODO'))
+ def items
+      split("\n").grep(/^\[[\sX]\]/)
+ end
+
+ def completed
+ find_items_matching(/^\[X\]/)
+ end
+
+ def uncompleted
+ find_items_matching(/^\[\s\]/)
+ end
+
+ def find_items_matching(regexp)
+ items.grep(regexp).instance_eval do
+ def display
+ puts map {|item| "* #{item.sub(/^\[[^\]]\]\s/, '')}"}
+ end
+ self
+ end
+ end
+ end
+
+ desc 'Completed todo items'
+ task :completed do
+ TODOS.completed.display
+ end
+
+ desc 'Incomplete todo items'
+ task :uncompleted do
+ TODOS.uncompleted.display
+ end
+end if File.exists?(File.join(library_root, 'TODO'))
+
+namespace :site do
+ require 'erb'
+ require 'rdoc/markup/simple_markup'
+ require 'rdoc/markup/simple_markup/to_html'
+
+ readme = lambda { IO.read('README')[/^== Getting started\n(.*)/m, 1] }
+
+ readme_to_html = lambda do
+ handler = SM::ToHtml.new
+ handler.instance_eval do
+ require 'syntax'
+ require 'syntax/convertors/html'
+ def accept_verbatim(am, fragment)
+ syntax = Syntax::Convertors::HTML.for_syntax('ruby')
+ @res << %(<div class="ruby">#{syntax.convert(fragment.txt, true)}</div>)
+ end
+ end
+ SM::SimpleMarkup.new.convert(readme.call, handler)
+ end
+
+ desc 'Regenerate the public website page'
+ task :build => 'doc:readme' do
+ open('site/public/index.html', 'w') do |file|
+ erb_data = {}
+ erb_data[:readme] = readme_to_html.call
+ file.write ERB.new(IO.read('site/index.erb')).result(binding)
+ end
+ end
+
+ task :refresh => :build do
+ system 'open site/public/index.html'
+ end
+
+ desc 'Update the live website'
+ task :deploy => :build do
+ site_files = FileList['site/public/*']
+ site_files.delete_if {|file| File.directory?(file)}
+ sh %(scp #{site_files.join ' '} marcel@rubyforge.org:/var/www/gforge-projects/amazon/)
+ end
+end
+
+task :clean => ['dist:clobber_package', 'doc:clobber_rdoc', 'test:clobber_coverage']
26 TODO
@@ -0,0 +1,26 @@
+0.3.0
+
+  [ ] Make a non-bang alias for establish_connection!
+
+ [ ] Pass filter criteria like :max_keys onto methods like logs_for and logs which return logs.
+ [ ] Add high level support to custom logging information as documented in the "Adding Custom Information..." here http://docs.amazonwebservices.com/AmazonS3/2006-03-01/LogFormat.html
+
+[ ] Bucket.delete(:force => true) needs to fetch all objects in the bucket until there are no more, taking into account the max-keys limit of 1000 objects at a time and it needs to do so in a very efficient manner so it can handle very large buckets (using :prefix and :marker)
+[ ] Ability to set content_type on S3Object that has not been stored yet
+[ ] Allow symbol and abbreviated version of logging options ('target_prefix' => :prefix, 'target_bucket' => :bucket)
+[ ] Allow symbol options for grant's constructor ('permission' => :permission)
+[ ] Reconsider save method to Policies returned by Bucket and S3Object's acl instance method so you can do some_object.acl.save after modifying it rather than some_object.acl(some_object.acl)
+
+[X] S3Object.copy and S3Object.move should preserve the acl
+[X] Consider opening up Net::HTTPGenericRequest to replace hardcoded chunk_size to something greater than 1k (maybe 500k since the files are presumed to be quite large)
+[X] Add S3Object.exists?
+[X] See about replacing XmlSimple with libxml if it's installed since XmlSimple can be rather slow (due to wrapping REXML)
+[X] Ability to build up the README from internal docs so documentation for various classes and the README can feed from a single source
+[X] Bittorrent documentation
+[X] Document logging methods
+[X] Bittorrent
+[X] ACL documentation
+[X] Log management ([de]activation & retrieval)
+[X] Remote ACL tests
+[X] ACL requesting and parsing
+[X] ACL updating for already stored objects which merges with existing ACL
6 bin/s3sh
@@ -0,0 +1,6 @@
+#!/usr/bin/env ruby
+s3_lib = File.dirname(__FILE__) + '/../lib/aws/s3'
+setup = File.dirname(__FILE__) + '/setup'
+irb_name = RUBY_PLATFORM =~ /mswin32/ ? 'irb.bat' : 'irb'
+
+exec "#{irb_name} -r #{s3_lib} -r #{setup} --simple-prompt"
10 bin/setup.rb
@@ -0,0 +1,10 @@
+#!/usr/bin/env ruby
+if ENV['AMAZON_ACCESS_KEY_ID'] && ENV['AMAZON_SECRET_ACCESS_KEY']
+ AWS::S3::Base.establish_connection!(
+ :access_key_id => ENV['AMAZON_ACCESS_KEY_ID'],
+ :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY']
+ )
+end
+
+require File.dirname(__FILE__) + '/../test/fixtures'
+include AWS::S3
61 lib/aws/s3.rb
@@ -0,0 +1,61 @@
+require 'base64'
+require 'cgi'
+require 'uri'
+require 'openssl'
+require 'digest/sha1'
+require 'net/https'
+require 'time'
+require 'date'
+require 'open-uri'
+
+$:.unshift(File.dirname(__FILE__))
+require 's3/extensions'
+require_library_or_gem 'builder' unless defined? Builder
+require_library_or_gem 'mime/types' unless defined? MIME::Types
+
+require 's3/base'
+require 's3/version'
+require 's3/parsing'
+require 's3/acl'
+require 's3/logging'
+require 's3/bittorrent'
+require 's3/service'
+require 's3/owner'
+require 's3/bucket'
+require 's3/object'
+require 's3/error'
+require 's3/exceptions'
+require 's3/connection'
+require 's3/authentication'
+require 's3/response'
+
+AWS::S3::Base.class_eval do
+ include AWS::S3::Connection::Management
+end
+
+AWS::S3::Bucket.class_eval do
+ include AWS::S3::Logging::Management
+ include AWS::S3::ACL::Bucket
+end
+
+AWS::S3::S3Object.class_eval do
+ include AWS::S3::ACL::S3Object
+ include AWS::S3::BitTorrent
+end
+
+require_library_or_gem 'xmlsimple' unless defined? XmlSimple
+# If libxml is installed, we use the FasterXmlSimple library, which provides most of the functionality of XmlSimple
+# except it uses the xml/libxml library for xml parsing (rather than REXML). If libxml isn't installed, we just fall back on
+# XmlSimple.
+AWS::S3::Parsing.parser =
+ begin
+ require_library_or_gem 'xml/libxml'
+    # Older versions of libxml aren't stable (bus error when requesting attributes that don't exist) so we
+ # have to use a version greater than '0.3.8.2'.
+ raise LoadError unless XML::Parser::VERSION > '0.3.8.2'
+ $:.push(File.join(File.dirname(__FILE__), '..', '..', 'support', 'faster-xml-simple', 'lib'))
+ require_library_or_gem 'faster_xml_simple'
+ FasterXmlSimple
+ rescue LoadError
+ XmlSimple
+ end
636 lib/aws/s3/acl.rb
@@ -0,0 +1,636 @@
+module AWS
+ module S3
+ # By default buckets are private. This means that only the owner has access rights to the bucket and its objects.
+ # Objects in that bucket inherit the permission of the bucket unless otherwise specified. When an object is private, the owner can
+ # generate a signed url that exposes the object to anyone who has that url. Alternatively, buckets and objects can be given other
+ # access levels. Several canned access levels are defined:
+ #
+ # * <tt>:private</tt> - Owner gets FULL_CONTROL. No one else has any access rights. This is the default.
+ # * <tt>:public_read</tt> - Owner gets FULL_CONTROL and the anonymous principal is granted READ access. If this policy is used on an object, it can be read from a browser with no authentication.
+ # * <tt>:public_read_write</tt> - Owner gets FULL_CONTROL, the anonymous principal is granted READ and WRITE access. This is a useful policy to apply to a bucket, if you intend for any anonymous user to PUT objects into the bucket.
+ # * <tt>:authenticated_read</tt> - Owner gets FULL_CONTROL, and any principal authenticated as a registered Amazon S3 user is granted READ access.
+ #
+ # You can set a canned access level when you create a bucket or an object by using the <tt>:access</tt> option:
+ #
+ # S3Object.store(
+ # 'kiss.jpg',
+ # data,
+ # 'marcel',
+ # :access => :public_read
+ # )
+ #
+ # Since the image we created is publicly readable, we can access it directly from a browser by going to the corresponding bucket name
+ # and specifying the object's key without a special authenticated url:
+ #
+ # http://s3.amazonaws.com/marcel/kiss.jpg
+ #
+    # ==== Building custom access policies
+ #
+ # For both buckets and objects, you can use the <tt>acl</tt> method to see its access control policy:
+ #
+ # policy = S3Object.acl('kiss.jpg', 'marcel')
+ # pp policy.grants
+ # [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ # #<AWS::S3::ACL::Grant READ to AllUsers Group>]
+ #
+ # Policies are made up of one or more grants which grant a specific permission to some grantee. Here we see the default FULL_CONTROL grant
+    # to the owner of this object. There is also READ permission granted to the AllUsers Group, which means anyone has read access for the object.
+ #
+ # Say we wanted to grant access to anyone to read the access policy of this object. The current READ permission only grants them permission to read
+ # the object itself (for example, from a browser) but it does not allow them to read the access policy. For that we will need to grant the AllUsers group the READ_ACP permission.
+ #
+ # First we'll create a new grant object:
+ #
+ # grant = ACL::Grant.new
+ # # => #<AWS::S3::ACL::Grant (permission) to (grantee)>
+ # grant.permission = 'READ_ACP'
+ #
+ # Now we need to indicate who this grant is for. In other words, who the grantee is:
+ #
+ # grantee = ACL::Grantee.new
+ # # => #<AWS::S3::ACL::Grantee (xsi not set yet)>
+ #
+ # There are three ways to specify a grantee: 1) by their internal amazon id, such as the one returned with an object's Owner,
+ # 2) by their Amazon account email address or 3) by specifying a group. As of this writing you can not create custom groups, but
+ # Amazon does provide three already: AllUsers, Authenticated and LogDelivery. In this case we want to provide the grant to all users.
+ # This effectively means "anyone".
+ #
+ # grantee.group = 'AllUsers'
+ #
+    # Now that our grantee is set up, we'll associate it with the grant:
+ #
+ # grant.grantee = grantee
+ # grant
+ # # => #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>
+ #
+    # Our grant has all the information we need. Now that it's ready, we'll add it on to the object's access control policy's list of grants:
+ #
+ # policy.grants << grant
+ # pp policy.grants
+ # [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ # #<AWS::S3::ACL::Grant READ to AllUsers Group>,
+ # #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>]
+ #
+ # Now that the policy has the new grant, we reuse the <tt>acl</tt> method to persist the policy change:
+ #
+ # S3Object.acl('kiss.jpg', 'marcel', policy)
+ #
+ # If we fetch the object's policy again, we see that the grant has been added:
+ #
+ # pp S3Object.acl('kiss.jpg', 'marcel').grants
+ # [#<AWS::S3::ACL::Grant FULL_CONTROL to noradio>,
+ # #<AWS::S3::ACL::Grant READ to AllUsers Group>,
+ # #<AWS::S3::ACL::Grant READ_ACP to AllUsers Group>]
+ #
+ # If we were to access this object's acl url from a browser:
+ #
+ # http://s3.amazonaws.com/marcel/kiss.jpg?acl
+ #
+ # we would be shown its access control policy.
+ #
+ # ==== Pre-prepared grants
+ #
+ # Alternatively, the ACL::Grant class defines a set of stock grant policies that you can fetch by name. In most cases, you can
+ # just use one of these pre-prepared grants rather than building grants by hand. Two of these stock policies are <tt>:public_read</tt>
+ # and <tt>:public_read_acp</tt>, which happen to be the two grants that we built by hand above. In this case we could have simply written:
+ #
+ # policy.grants << ACL::Grant.grant(:public_read)
+ # policy.grants << ACL::Grant.grant(:public_read_acp)
+ # S3Object.acl('kiss.jpg', 'marcel', policy)
+ #
+ # The full details can be found in ACL::Policy, ACL::Grant and ACL::Grantee.
+ module ACL
+ # The ACL::Policy class lets you inspect and modify access controls for buckets and objects.
+ # A policy is made up of one or more Grants which specify a permission and a Grantee to whom that permission is granted.
+ #
+ # Buckets and objects are given a default access policy which contains one grant permitting the owner of the bucket or object
+ # FULL_CONTROL over its contents. This means they can read the object, write to the object, as well as read and write its
+ # policy.
+ #
+ # The <tt>acl</tt> method for both buckets and objects returns the policy object for that entity:
+ #
+ # policy = Bucket.acl('some-bucket')
+ #
+ # The <tt>grants</tt> method of a policy exposes its grants. You can treat this collection as an array and push new grants onto it:
+ #
+ # policy.grants << grant
+ #
+ # Check the documentation for Grant and Grantee for more details on how to create new grants.
+ class Policy
+ include SelectiveAttributeProxy #:nodoc:
+ attr_accessor :owner, :grants
+
+ def initialize(attributes = {})
+ @attributes = attributes
+ @grants = [].extend(GrantListExtensions)
+ extract_owner! if owner?
+ extract_grants! if grants?
+ end
+
+ # The xml representation of the policy.
+ def to_xml
+ Builder.new(owner, grants).to_s
+ end
+
+ private
+
+ def owner?
+ attributes.has_key?('owner') || !owner.nil?
+ end
+
+ def grants?
+ (attributes.has_key?('access_control_list') && attributes['access_control_list']['grant']) || !grants.empty?
+ end
+
+ def extract_owner!
+ @owner = Owner.new(attributes.delete('owner'))
+ end
+
+ def extract_grants!
+ attributes['access_control_list']['grant'].each do |grant|
+ grants << Grant.new(grant)
+ end
+ end
+
+ module GrantListExtensions #:nodoc:
+ def include?(grant)
+ case grant
+ when Symbol
+ super(ACL::Grant.grant(grant))
+ else
+ super
+ end
+ end
+
+ def delete(grant)
+ case grant
+ when Symbol
+ super(ACL::Grant.grant(grant))
+ else
+ super
+ end
+ end
+
+ # Two grant lists are equal if they have identical grants both in terms of permission and grantee.
+ def ==(grants)
+ size == grants.size && all? {|grant| grants.include?(grant)}
+ end
+ end
+
+ class Builder < XmlGenerator #:nodoc:
+ attr_reader :owner, :grants
+ def initialize(owner, grants)
+ @owner = owner
+ @grants = grants.uniq # There could be some duplicate grants
+ super()
+ end
+
+ def build
+ xml.tag!('AccessControlPolicy', 'xmlns' => 'http://s3.amazonaws.com/doc/2006-03-01/') do
+ xml.Owner do
+ xml.ID owner.id
+ xml.DisplayName owner.display_name
+ end
+
+ xml.AccessControlList do
+ xml << grants.map {|grant| grant.to_xml}.join("\n")
+ end
+ end
+ end
+ end
+ end
+
+ # A Policy is made up of one or more Grant objects. A grant sets a specific permission and grants it to the associated grantee.
+ #
+ # When creating a new grant to add to a policy, you need only set its permission and then associate with a Grantee.
+ #
+ # grant = ACL::Grant.new
+ # => #<AWS::S3::ACL::Grant (permission) to (grantee)>
+ #
+ # Here we see that neither the permission nor the grantee have been set. Let's make this grant provide the READ permission.
+ #
+ # grant.permission = 'READ'
+ # grant
+ # => #<AWS::S3::ACL::Grant READ to (grantee)>
+ #
+ # Now let's assume we have a grantee to the AllUsers group already set up. Just associate that grantee with our grant.
+ #
+ # grant.grantee = all_users_group_grantee
+ # grant
+ # => #<AWS::S3::ACL::Grant READ to AllUsers Group>
+ #
+      # And now our grant is complete. It provides READ permission to the AllUsers group, effectively making this object publicly readable
+ # without any authorization.
+ #
+ # Assuming we have some object's policy available in a local variable called <tt>policy</tt>, we can now add this grant onto its
+ # collection of grants.
+ #
+ # policy.grants << grant
+ #
+ # And then we send the updated policy to the S3 servers.
+ #
+ # some_s3object.acl(policy)
+ class Grant
+ include SelectiveAttributeProxy #:nodoc:
+ constant :VALID_PERMISSIONS, %w(READ WRITE READ_ACP WRITE_ACP FULL_CONTROL)
+ attr_accessor :grantee
+
+ class << self
+        # Returns the stock grant named <tt>type</tt>.
+ #
+ # public_read_grant = ACL::Grant.grant :public_read
+ # => #<AWS::S3::ACL::Grant READ to AllUsers Group>
+ #
+ # Valid stock grant types are:
+ #
+ # * <tt>:authenticated_read</tt>
+ # * <tt>:authenticated_read_acp</tt>
+ # * <tt>:authenticated_write</tt>
+ # * <tt>:authenticated_write_acp</tt>
+ # * <tt>:logging_read</tt>
+ # * <tt>:logging_read_acp</tt>
+ # * <tt>:logging_write</tt>
+ # * <tt>:logging_write_acp</tt>
+ # * <tt>:public_read</tt>
+ # * <tt>:public_read_acp</tt>
+ # * <tt>:public_write</tt>
+ # * <tt>:public_write_acp</tt>
+ def grant(type)
+ case type
+ when *stock_grant_map.keys
+ build_stock_grant_for type
+ else
+ raise ArgumentError, "Unknown grant type `#{type}'"
+ end
+ end
+
+ private
+ def stock_grant_map
+ grant = lambda {|permission, group| {:permission => permission, :group => group}}
+ groups = {:public => 'AllUsers', :authenticated => 'Authenticated', :logging => 'LogDelivery'}
+ permissions = %w(READ WRITE READ_ACP WRITE_ACP)
+ stock_grants = {}
+ groups.each do |grant_group_name, group_name|
+ permissions.each do |permission|
+ stock_grants["#{grant_group_name}_#{permission.downcase}".to_sym] = grant[permission, group_name]
+ end
+ end
+ stock_grants
+ end
+ memoized :stock_grant_map
+
+ def build_stock_grant_for(type)
+ stock_grant = stock_grant_map[type]
+ grant = new do |g|
+ g.permission = stock_grant[:permission]
+ end
+ grant.grantee = Grantee.new do |gr|
+ gr.group = stock_grant[:group]
+ end
+ grant
+ end
+ end
+
+ def initialize(attributes = {})
+ attributes = {'permission' => nil}.merge(attributes)
+ @attributes = attributes
+ extract_grantee!
+ yield self if block_given?
+ end
+
+ # Set the permission for this grant.
+ #
+ # grant.permission = 'READ'
+ # grant
+ # => #<AWS::S3::ACL::Grant READ to (grantee)>
+ #
+      # If the specified permission level is not valid, an <tt>InvalidAccessControlLevel</tt> exception will be raised.
+ def permission=(permission_level)
+ unless self.class.valid_permissions.include?(permission_level)
+ raise InvalidAccessControlLevel.new(self.class.valid_permissions, permission_level)
+ end
+ attributes['permission'] = permission_level
+ end
+
+ # The xml representation of this grant.
+ def to_xml
+ Builder.new(permission, grantee).to_s
+ end
+
+ def inspect #:nodoc:
+ "#<%s:0x%s %s>" % [self.class, object_id, self]
+ end
+
+ def to_s #:nodoc:
+ [permission || '(permission)', 'to', grantee ? grantee.type_representation : '(grantee)'].join ' '
+ end
+
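+      # Equality and hashing are keyed off the grant's string representation; for example,
+      #
+      #   ACL::Grant.grant(:public_read) == ACL::Grant.grant(:public_read)
+      #   # => true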
+ def eql?(grant) #:nodoc:
+ # This won't work for an unposted AmazonCustomerByEmail because of the normalization
+ # to CanonicalUser but it will work for groups.
+ to_s == grant.to_s
+ end
+ alias_method :==, :eql?
+
+ def hash #:nodoc:
+ to_s.hash
+ end
+
+ private
+
+ def extract_grantee!
+ @grantee = Grantee.new(attributes['grantee']) if attributes['grantee']
+ end
+
+ class Builder < XmlGenerator #:nodoc:
+ attr_reader :grantee, :permission
+
+ def initialize(permission, grantee)
+ @permission = permission
+ @grantee = grantee
+ super()
+ end
+
+ def build
+ xml.Grant do
+ xml << grantee.to_xml
+ xml.Permission permission
+ end
+ end
+ end
+ end
+
+    # Grants bestow an access permission on grantees. Each grant of an access control list Policy is associated with a grantee.
+    # At the time of this writing there are three ways of specifying a grantee:
+ #
+ # * By canonical user - This format uses the <tt>id</tt> of a given Amazon account. The id value for a given account is available in the
+ # Owner object of a bucket, object or policy.
+ #
+ # grantee.id = 'bb2041a25975c3d4ce9775fe9e93e5b77a6a9fad97dc7e00686191f3790b13f1'
+ #
+ # Often the id will just be fetched from some owner object.
+ #
+ # grantee.id = some_object.owner.id
+ #
+    # * By Amazon email address - You can specify an email address for any Amazon account. The Amazon account need not be signed up with the S3 service,
+    #   though the address must be unique across the entire Amazon system. This email address is normalized into a canonical user representation once the grant
+ # has been sent back up to the S3 servers.
+ #
+ # grantee.email_address = 'joe@example.org'
+ #
+    # * By group - As of this writing you cannot create custom groups, but Amazon provides three groups that you can use. See the documentation for the
+ # Grantee.group= method for details.
+ #
+ # grantee.group = 'Authenticated'
+ class Grantee
+ include SelectiveAttributeProxy #:nodoc:
+
+ undef_method :id # Get rid of Object#id
+
+ def initialize(attributes = {})
+ # Set default values for attributes that may not be passed in but we still want the object
+ # to respond to
+ attributes = {'id' => nil, 'display_name' => nil, 'email_address' => nil, 'uri' => nil}.merge(attributes)
+ @attributes = attributes
+ extract_type!
+ yield self if block_given?
+ end
+
+ # The xml representation of the current grantee object.
+ def to_xml
+ Builder.new(self).to_s
+ end
+
+ # Returns the type of grantee. Will be one of <tt>CanonicalUser</tt>, <tt>AmazonCustomerByEmail</tt> or <tt>Group</tt>.
+ def type
+ return attributes['type'] if attributes['type']
+
+ # Lookups are in order of preference so if, for example, you set the uri but display_name and id are also
+ # set, we'd rather go with the canonical representation.
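+        # For instance, a hypothetical grantee with display_name, id and uri all set
+        # reports its type as 'CanonicalUser', not 'Group'.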
+ if display_name && id
+ 'CanonicalUser'
+ elsif email_address
+ 'AmazonCustomerByEmail'
+ elsif uri
+ 'Group'
+ end
+ end
+
+ # Sets the grantee's group by name.
+ #
+ # grantee.group = 'AllUsers'
+ #
+ # Currently, valid groups defined by S3 are:
+ #
+ # * <tt>AllUsers</tt>: This group represents anyone. In other words, an anonymous request.
+ # * <tt>Authenticated</tt>: Any authenticated account on the S3 service.
+ # * <tt>LogDelivery</tt>: The entity that delivers bucket access logs.
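+      #
+      # Group names map onto Amazon's group URIs; for example:
+      #
+      #   grantee.group = 'LogDelivery'
+      #   grantee.uri # => 'http://acs.amazonaws.com/groups/s3/LogDelivery'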
+ def group=(group_name)
+ section = %w(AllUsers Authenticated).include?(group_name) ? 'global' : 's3'
+ self.uri = "http://acs.amazonaws.com/groups/#{section}/#{group_name}"
+ end
+
+ # Returns the grantee's group. If the grantee is not a group, <tt>nil</tt> is returned.
+ def group
+ return unless uri
+ uri[%r([^/]+$)]
+ end
+
+ def type_representation #:nodoc:
+ case type
+ when 'CanonicalUser' then display_name || id
+ when 'AmazonCustomerByEmail' then email_address
+ when 'Group' then "#{group} Group"
+ end
+ end
+
+ def inspect #:nodoc:
+ "#<%s:0x%s %s>" % [self.class, object_id, type_representation || '(type not set yet)']
+ end
+
+ private
+ def extract_type!
+ attributes['type'] = attributes.delete('xsi:type')
+ end
+
+ class Builder < XmlGenerator #:nodoc:
+
+ def initialize(grantee)
+ @grantee = grantee
+ super()
+ end
+
+ def build
+ xml.tag!('Grantee', attributes) do
+ representation
+ end
+ end
+
+ private
+ attr_reader :grantee
+
+ def representation
+ case grantee.type
+ when 'CanonicalUser'
+ xml.ID grantee.id
+ xml.DisplayName grantee.display_name
+ when 'AmazonCustomerByEmail'
+ xml.EmailAddress grantee.email_address
+ when 'Group'
+ xml.URI grantee.uri
+ end
+ end
+
+ def attributes
+ {'xsi:type' => grantee.type, 'xmlns:xsi' => 'http://www.w3.org/2001/XMLSchema-instance'}
+ end
+ end
+ end
+
+ module Bucket
+ def self.included(klass) #:nodoc:
+ klass.extend(ClassMethods)
+ end
+
+ module ClassMethods
+ # The acl method is the single point of entry for reading and writing access control list policies for a given bucket.
+ #
+ # # Fetch the acl for the 'marcel' bucket
+ # policy = Bucket.acl 'marcel'
+ #
+ # # Modify the policy ...
+ # # ...
+ #
+ # # Send updated policy back to the S3 servers
+ # Bucket.acl 'marcel', policy
+ def acl(name = nil, policy = nil)
+ if name.is_a?(ACL::Policy)
+ policy = name
+ name = nil
+ end
+
+ path = path(name) << '?acl'
+ respond_with ACL::Policy::Response do
+            policy ? put(path, {}, policy.to_xml) : ACL::Policy.new(get(path).policy)
+ end
+ end
+ end
+
+ # The acl method returns and updates the acl for a given bucket.
+ #
+ # # Fetch a bucket
+ # bucket = Bucket.find 'marcel'
+ #
+ # # Add a grant to the bucket's policy
+ # bucket.acl.grants << some_grant
+ #
+ # # Write the changes to the policy
+ # bucket.acl(bucket.acl)
+ def acl(reload = false)
+ policy = reload.is_a?(ACL::Policy) ? reload : nil
+ memoize(reload) do
+ self.class.acl(name, policy) if policy
+ self.class.acl(name)
+ end
+ end
+ end
+
+ module S3Object
+ def self.included(klass) #:nodoc:
+ klass.extend(ClassMethods)
+ end
+
+ module ClassMethods
+ # The acl method is the single point of entry for reading and writing access control list policies for a given object.
+ #
+ # # Fetch the acl for the 'kiss.jpg' object in the 'marcel' bucket
+ # policy = S3Object.acl 'kiss.jpg', 'marcel'
+ #
+ # # Modify the policy ...
+ # # ...
+ #
+ # # Send updated policy back to the S3 servers
+ # S3Object.acl 'kiss.jpg', 'marcel', policy
+ def acl(name, bucket = nil, policy = nil)
+ # We're using the second argument as the ACL::Policy
+ if bucket.is_a?(ACL::Policy)
+ policy = bucket
+ bucket = nil
+ end
+
+ bucket = bucket_name(bucket)
+ path = path!(bucket, name) << '?acl'
+
+ respond_with ACL::Policy::Response do
+ policy ? put(path, {}, policy.to_xml) : ACL::Policy.new(get(path).policy)
+ end
+ end
+ end
+
+      # The acl method returns and updates the acl for a given S3 object.
+ #
+      #   # Fetch the object
+ # object = S3Object.find 'kiss.jpg', 'marcel'
+ #
+      #   # Add a grant to the object's policy
+ # object.acl.grants << some_grant
+ #
+ # # Write the changes to the policy
+ # object.acl(object.acl)
+ def acl(reload = false)
+ policy = reload.is_a?(ACL::Policy) ? reload : nil
+ memoize(reload) do
+ self.class.acl(key, bucket.name, policy) if policy
+ self.class.acl(key, bucket.name)
+ end
+ end
+ end
+
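+    # OptionProcessor folds a requested canned access level into the header S3 expects.
+    # A sketch of its effect on a hypothetical options hash:
+    #
+    #   options = {:access => :public_read}
+    #   ACL::OptionProcessor.process!(options)
+    #   options # => {'x-amz-acl' => 'public-read'}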
+ class OptionProcessor #:nodoc:
+ attr_reader :options
+ class << self
+ def process!(options)
+ new(options).process!
+ end
+ end
+
+ def initialize(options)
+ options.to_normalized_options!
+ @options = options
+ @access_level = extract_access_level
+ end
+
+ def process!
+ return unless access_level_specified?
+ validate!
+ options['x-amz-acl'] = access_level
+ end
+
+ private
+ def extract_access_level
+ options.delete('access') || options.delete('x-amz-acl')
+ end
+
+ def validate!
+ raise InvalidAccessControlLevel.new(valid_levels, access_level) unless valid?
+ end
+
+ def valid?
+ valid_levels.include?(access_level)
+ end
+
+ def access_level_specified?
+ !@access_level.nil?
+ end
+
+ def valid_levels
+ %w(private public-read public-read-write authenticated-read)
+ end
+
+ def access_level
+ @normalized_access_level ||= @access_level.to_header
+ end
+ end
+ end
+ end
+end
218 lib/aws/s3/authentication.rb
@@ -0,0 +1,218 @@
+module AWS
+ module S3
+    # All authentication is taken care of for you by the AWS::S3 library. Nonetheless, some details of the two types
+    # of authentication and when each is used may be of interest.
+ #
+ # === Header based authentication
+ #
+ # Header based authentication is achieved by setting a special <tt>Authorization</tt> header whose value
+ # is formatted like so:
+ #
+ # "AWS #{access_key_id}:#{encoded_canonical}"
+ #
+ # The <tt>access_key_id</tt> is the public key that is assigned by Amazon for a given account which you use when
+    # establishing your initial connection. The <tt>encoded_canonical</tt> is computed according to rules laid out
+ # by Amazon which we will describe presently.
+ #
+ # ==== Generating the encoded canonical string
+ #
+ # The "canonical string", generated by the CanonicalString class, is computed by collecting the current request method,
+ # a set of significant headers of the current request, and the current request path into a string.
+    # That canonical string is then signed, via HMAC-SHA1, with the <tt>secret_access_key</tt> assigned by Amazon. The resulting signed canonical
+    # string is then base64 encoded.
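+    #
+    # A minimal sketch of that computation, using OpenSSL and Base64 from the standard library
+    # (+canonical_string+ and +secret_access_key+ assumed in scope; this mirrors Signature#encoded_canonical below):
+    #
+    #   digest = OpenSSL::Digest::Digest.new('sha1')
+    #   Base64.encode64(OpenSSL::HMAC.digest(digest, secret_access_key, canonical_string)).strip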
+ #
+ # === Query string based authentication
+ #
+    # When accessing a restricted object from the browser, you can authenticate via the query string by setting the following parameters:
+ #
+ # "AWSAccessKeyId=#{access_key_id}&Expires=#{expires}&Signature=#{encoded_canonical}"
+ #
+ # The QueryString class is responsible for generating the appropriate parameters for authentication via the
+ # query string.
+ #
+ # The <tt>access_key_id</tt> and <tt>encoded_canonical</tt> are the same as described in the Header based authentication section.
+ # The <tt>expires</tt> value dictates for how long the current url is valid (by default, it will expire in 5 minutes). Expiration can be specified
+ # either by an absolute time (expressed in seconds since the epoch), or in relative time (in number of seconds from now).
+ # Details of how to customize the expiration of the url are provided in the documentation for the QueryString class.
+ #
+ # All requests made by this library use header authentication. When a query string authenticated url is needed,
+ # the S3Object#url method will include the appropriate query string parameters.
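+    #
+    # For example, a time-limited url might be generated like so (see the documentation for
+    # S3Object.url_for for the full set of options):
+    #
+    #   S3Object.url_for('kiss.jpg', 'marcel', :expires_in => 60 * 5)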
+ #
+ # === Full authentication specification
+ #
+ # The full specification of the authentication protocol can be found at
+ # http://docs.amazonwebservices.com/AmazonS3/2006-03-01/RESTAuthentication.html
+ class Authentication
+ constant :AMAZON_HEADER_PREFIX, 'x-amz-'
+
+ # Signature is the abstract super class for the Header and QueryString authentication methods. It does the job
+ # of computing the canonical_string using the CanonicalString class as well as encoding the canonical string. The subclasses
+    # parameterize these computations and arrange them in a string form appropriate to how they are used, in one case an HTTP request
+    # header value, and in the other key/value query string parameter pairs.
+ class Signature < String #:nodoc:
+ attr_reader :request, :access_key_id, :secret_access_key
+
+ def initialize(request, access_key_id, secret_access_key, options = {})
+ super()
+ @request, @access_key_id, @secret_access_key = request, access_key_id, secret_access_key
+ @options = options
+ end
+
+ private
+
+ def canonical_string
+ options = {}
+ options[:expires] = expires if expires?
+ CanonicalString.new(request, options)
+ end
+ memoized :canonical_string
+
+ def encoded_canonical
+ digest = OpenSSL::Digest::Digest.new('sha1')
+ b64_hmac = Base64.encode64(OpenSSL::HMAC.digest(digest, secret_access_key, canonical_string)).strip
+ url_encode? ? CGI.escape(b64_hmac) : b64_hmac
+ end
+
+ def url_encode?
+ !@options[:url_encode].nil?
+ end
+
+ def expires?
+ is_a? QueryString
+ end
+
+ def date
+ request['date'].to_s.strip.empty? ? Time.now : Time.parse(request['date'])
+ end
+ end
+
+ # Provides header authentication by computing the value of the Authorization header. More details about the
+ # various authentication schemes can be found in the docs for its containing module, Authentication.
+ class Header < Signature #:nodoc:
+ def initialize(*args)
+ super
+ self << "AWS #{access_key_id}:#{encoded_canonical}"
+ end
+ end
+
+ # Provides query string authentication by computing the three authorization parameters: AWSAccessKeyId, Expires and Signature.
+ # More details about the various authentication schemes can be found in the docs for its containing module, Authentication.
+ class QueryString < Signature #:nodoc:
+ constant :DEFAULT_EXPIRY, 300 # 5 minutes
+
+ def initialize(*args)
+ super
+ @options[:url_encode] = true
+ self << build
+ end
+
+ private
+
+ # Will return one of three values, in the following order of precedence:
+ #
+ # 1) Seconds since the epoch explicitly passed in the +:expires+ option
+ # 2) The current time in seconds since the epoch plus the number of seconds passed in
+ # the +:expires_in+ option
+      # 3) The current time in seconds since the epoch plus the default number of seconds (<tt>DEFAULT_EXPIRY</tt>: 300 seconds, i.e. 5 minutes)
+ def expires
+ return @options[:expires] if @options[:expires]
+ date.to_i + (@options[:expires_in] || DEFAULT_EXPIRY)
+ end
+
+ # Keep in alphabetical order
+ def build
+ "AWSAccessKeyId=#{access_key_id}&Expires=#{expires}&Signature=#{encoded_canonical}"
+ end
+ end
+
+    # The CanonicalString is used to generate a signature, signed with your secret access key. It is composed of
+ # data related to the given request for which it provides authentication. This data includes the request method, request headers,
+ # and the request path. Both Header and QueryString use it to generate their signature.
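+    #
+    # Schematically, a GET of '/marcel/kiss.jpg' with no content headers produces a canonical
+    # string shaped like the following (the date is illustrative):
+    #
+    #   "GET\n\n\nThu, 08 Mar 2007 00:00:45 GMT\n/marcel/kiss.jpg"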
+ class CanonicalString < String #:nodoc:
+ class << self
+ def default_headers
+ %w(content-type content-md5)
+ end
+
+ def interesting_headers
+ ['content-md5', 'content-type', 'date', amazon_header_prefix]
+ end
+
+ def amazon_header_prefix
+ /^#{AMAZON_HEADER_PREFIX}/io
+ end
+ end
+
+ attr_reader :request, :headers
+
+ def initialize(request, options = {})
+ super()
+ @request = request
+ @headers = {}
+ @options = options
+ # "For non-authenticated or anonymous requests. A NotImplemented error result code will be returned if
+ # an authenticated (signed) request specifies a Host: header other than 's3.amazonaws.com'"
+ # (from http://docs.amazonwebservices.com/AmazonS3/2006-03-01/VirtualHosting.html)
+ request['Host'] = DEFAULT_HOST
+ build
+ end
+
+ private
+ def build
+ self << "#{request.method}\n"
+ ensure_date_is_valid
+
+ initialize_headers
+ set_expiry!
+
+ headers.sort_by {|k, _| k}.each do |key, value|
+ value = value.to_s.strip
+ self << (key =~ self.class.amazon_header_prefix ? "#{key}:#{value}" : value)
+ self << "\n"
+ end
+ self << path
+ end
+
+ def initialize_headers
+ identify_interesting_headers
+ set_default_headers
+ end
+
+ def set_expiry!
+ self.headers['date'] = @options[:expires] if @options[:expires]
+ end
+
+ def ensure_date_is_valid
+ request['Date'] ||= Time.now.httpdate
+ end
+
+ def identify_interesting_headers
+ request.each do |key, value|
+ key = key.downcase # Can't modify frozen string so no bang
+ if self.class.interesting_headers.any? {|header| header === key}
+ self.headers[key] = value.to_s.strip
+ end
+ end
+ end
+
+ def set_default_headers
+ self.class.default_headers.each do |header|
+ self.headers[header] ||= ''
+ end
+ end
+
+ def path
+ [only_path, extract_significant_parameter].compact.join('?')
+ end
+
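+      # Only the acl, torrent and logging sub-resources are significant when signing; ordinary
+      # query parameters are dropped. For example, '/marcel/kiss.jpg?acl' is signed as-is,
+      # while '/marcel/kiss.jpg?foo=bar' is signed as '/marcel/kiss.jpg'.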
+ def extract_significant_parameter
+ request.path[/[&?](acl|torrent|logging)(?:&|=|$)/, 1]
+ end
+
+ def only_path
+ request.path[/^[^?]*/]
+ end
+ end
+ end
+ end
+end
232 lib/aws/s3/base.rb
@@ -0,0 +1,232 @@
+module AWS #:nodoc:
+ # AWS::S3 is a Ruby library for Amazon's Simple Storage Service's REST API (http://aws.amazon.com/s3).
+ # Full documentation of the currently supported API can be found at http://docs.amazonwebservices.com/AmazonS3/2006-03-01.
+ #
+ # == Getting started
+ #
+ # To get started you need to require 'aws/s3':
+ #
+ # % irb -rubygems
+ # irb(main):001:0> require 'aws/s3'
+ # # => true
+ #
+ # The AWS::S3 library ships with an interactive shell called <tt>s3sh</tt>. From within it, you have access to all the operations the library exposes from the command line.
+ #
+ # % s3sh
+ # >> Version
+ #
+ # Before you can do anything, you must establish a connection using Base.establish_connection!. A basic connection would look something like this:
+ #
+ # AWS::S3::Base.establish_connection!(
+ # :access_key_id => 'abc',
+ # :secret_access_key => '123'
+ # )
+ #
+ # The minimum connection options that you must specify are your access key id and your secret access key.
+ #
+ # (If you don't already have your access keys, all you need to sign up for the S3 service is an account at Amazon. You can sign up for S3 and get access keys by visiting http://aws.amazon.com/s3.)
+ #
+ # For convenience, if you set two special environment variables with the value of your access keys, the console will automatically create a default connection for you. For example:
+ #
+ # % cat .amazon_keys
+ # export AMAZON_ACCESS_KEY_ID='abcdefghijklmnop'
+ # export AMAZON_SECRET_ACCESS_KEY='1234567891012345'
+ #
+ # Then load it in your shell's rc file.
+ #
+ # % cat .zshrc
+ # if [[ -f "$HOME/.amazon_keys" ]]; then
+ # source "$HOME/.amazon_keys";
+ # fi
+ #
+ # See more connection details at AWS::S3::Connection::Management::ClassMethods.
+ module S3
+ constant :DEFAULT_HOST, 's3.amazonaws.com'
+
+    # AWS::S3::Base is the abstract super class of all classes that make requests against S3, such as the built-in
+ # Service, Bucket and S3Object classes. It provides methods for making requests, inferring or setting response classes,
+ # processing request options, and accessing attributes from S3's response data.
+ #
+ # Establishing a connection with the Base class is the entry point to using the library:
+ #
+ # AWS::S3::Base.establish_connection!(:access_key_id => '...', :secret_access_key => '...')
+ #
+ # The <tt>:access_key_id</tt> and <tt>:secret_access_key</tt> are the two required connection options. More
+ # details can be found in the docs for Connection::Management::ClassMethods.
+ #
+ # Extensive examples can be found in the README[link:files/README.html].
+ class Base
+ class << self
+ # Wraps the current connection's request method and picks the appropriate response class to wrap the response in.
+ # If the response is an error, it will raise that error as an exception. All such exceptions can be caught by rescuing
+ # their superclass, the ResponseError exception class.
+ #
+ # It is unlikely that you would call this method directly. Subclasses of Base have convenience methods for each http request verb
+ # that wrap calls to request.
+ def request(verb, path, options = {}, body = nil, attempts = 0, &block)
+ Service.response = nil
+ process_options!(options, verb)
+ response = response_class.new(connection.request(verb, path, options, body, attempts, &block))
+ Service.response = response
+
+ Error::Response.new(response.response).error.raise if response.error?
+ response
+ # Once in a while, a request to S3 returns an internal error. A glitch in the matrix I presume. Since these
+            # errors are few and far between, the request method will rescue InternalErrors up to three times
+            # and retry the request. Most of the time the second attempt will work.
+ rescue *retry_exceptions
+ attempts == 3 ? raise : (attempts += 1; retry)
+ end
+
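+          # Generates one convenience wrapper per HTTP verb; the generated get method,
+          # for example, is equivalent to:
+          #
+          #   def get(path, headers = {}, body = nil, &block)
+          #     request(:get, path, headers, body, &block)
+          #   end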
+ [:get, :post, :put, :delete, :head].each do |verb|
+ class_eval(<<-EVAL, __FILE__, __LINE__)
+ def #{verb}(path, headers = {}, body = nil, &block)
+ request(:#{verb}, path, headers, body, &block)
+ end
+ EVAL
+ end
+
+ # Called when a method which requires a bucket name is called without that bucket name specified. It will try to
+ # infer the current bucket by looking for it as the subdomain of the current connection's address. If no subdomain
+ # is found, CurrentBucketNotSpecified will be raised.
+ #
+ # MusicBucket.establish_connection! :server => 'jukeboxzero.s3.amazonaws.com'
+ # MusicBucket.connection.server
+ # => 'jukeboxzero.s3.amazonaws.com'
+ # MusicBucket.current_bucket
+ # => 'jukeboxzero'
+ #
+          # Rather than inferring the current bucket from the subdomain, the current class's bucket can be explicitly set with
+ # set_current_bucket_to.
+ def current_bucket
+ connection.subdomain or raise CurrentBucketNotSpecified.new(connection.http.address)
+ end
+
+ # If you plan on always using a specific bucket for certain files, you can skip always having to specify the bucket by creating
+ # a subclass of Bucket or S3Object and telling it what bucket to use:
+ #
+ # class JukeBoxSong < AWS::S3::S3Object
+ # set_current_bucket_to 'jukebox'
+ # end
+ #
+ # For all methods that take a bucket name as an argument, the current bucket will be used if the bucket name argument is omitted.
+ #
+ # other_song = 'baby-please-come-home.mp3'
+ # JukeBoxSong.store(other_song, open(other_song))
+ #
+ # This time we didn't have to explicitly pass in the bucket name, as the JukeBoxSong class knows that it will
+ # always use the 'jukebox' bucket.
+ #
+ # "Astute readers", as they say, may have noticed that we used the third parameter to pass in the content type,
+ # rather than the fourth parameter as we had the last time we created an object. If the bucket can be inferred, or
+ # is explicitly set, as we've done in the JukeBoxSong class, then the third argument can be used to pass in
+ # options.
+ #
+ # Now all operations that would have required a bucket name no longer do.
+ #
+ # other_song = JukeBoxSong.find('baby-please-come-home.mp3')
+ def set_current_bucket_to(name)
+ raise ArgumentError, "`#{__method__}' must be called on a subclass of #{self.name}" if self == AWS::S3::Base
+ instance_eval(<<-EVAL)
+ def current_bucket
+ '#{name}'
+ end
+ EVAL
+ end
+ alias_method :current_bucket=, :set_current_bucket_to
+
+ private
+
+ def response_class
+ FindResponseClass.for(self)
+ end
+
+ def process_options!(options, verb)
+ options.replace(RequestOptions.process(options, verb))
+ end
+
+          # Using the conventions laid out in <tt>response_class</tt> works more than 80% of the time.
+ # There are a few edge cases though where we want a given class to wrap its responses in different
+ # response classes depending on which method is being called.
+ def respond_with(klass)
+ eval(<<-EVAL, binding, __FILE__, __LINE__)
+ def new_response_class
+ #{klass}
+ end
+
+ class << self
+ alias_method :old_response_class, :response_class
+ alias_method :response_class, :new_response_class
+ end
+ EVAL
+
+ yield
+ ensure
+ # Restore the original version
+ eval(<<-EVAL, binding, __FILE__, __LINE__)
+ class << self
+ alias_method :response_class, :old_response_class
+ end
+ EVAL
+ end
+
+ def bucket_name(name)
+ name || current_bucket
+ end
+
+ def retry_exceptions
+ [InternalError, RequestTimeout]
+ end
+
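+          # RequestOptions normalizes request options and, for PUT requests, folds any canned
+          # access level into the proper header. A sketch of its effect on hypothetical input:
+          #
+          #   RequestOptions.process({:access => :public_read}, :put)
+          #   # => {'x-amz-acl' => 'public-read'}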
+ class RequestOptions < Hash #:nodoc:
+ attr_reader :options, :verb
+
+ class << self
+ def process(*args, &block)
+ new(*args, &block).process!
+ end
+ end
+
+ def initialize(options, verb = :get)
+ @options = options.to_normalized_options
+ @verb = verb
+ super()
+ end
+
+ def process!
+ set_access_controls! if verb == :put
+ replace(options)
+ end
+
+ private
+ def set_access_controls!
+ ACL::OptionProcessor.process!(options)
+ end
+ end
+ end
+
+ def initialize(attributes = {}) #:nodoc:
+ @attributes = attributes
+ end
+
+ private
+ attr_reader :attributes
+
+ def connection
+ self.class.connection
+ end
+
+ def http
+ connection.http
+ end
+
+ def request(*args, &block)
+ self.class.request(*args, &block)
+ end
+
+ def method_missing(method, *args, &block)
+ attributes[method.to_s] || attributes[method] || super
+ end
+ end
+ end
+end
58 lib/aws/s3/bittorrent.rb
@@ -0,0 +1,58 @@
+module AWS
+ module S3
+ # Objects on S3 can be distributed via the BitTorrent file sharing protocol.
+ #
+ # You can get a torrent file for an object by calling <tt>torrent_for</tt>:
+ #
+ # S3Object.torrent_for 'kiss.jpg', 'marcel'
+ #
+ # Or just call the <tt>torrent</tt> method if you already have the object:
+ #
+ # song = S3Object.find 'kiss.jpg', 'marcel'
+ # song.torrent
+ #
+    # Calling <tt>grant_torrent_access_to</tt> on an object will allow anyone to anonymously
+ # fetch the torrent file for that object:
+ #
+ # S3Object.grant_torrent_access_to 'kiss.jpg', 'marcel'
+ #
+ # Anonymous requests to
+ #
+ # http://s3.amazonaws.com/marcel/kiss.jpg?torrent
+ #
+ # will serve up the torrent file for that object.
+ module BitTorrent
+ def self.included(klass) #:nodoc:
+ klass.extend ClassMethods
+ end
+
+ # Adds methods to S3Object for accessing the torrent of a given object.
+ module ClassMethods
+ # Returns the torrent file for the object with the given <tt>key</tt>.
+ def torrent_for(key, bucket = nil)
+ get(path!(bucket, key) << '?torrent').body
+ end
+ alias_method :torrent, :torrent_for
+
+ # Grants access to the object with the given <tt>key</tt> to be accessible as a torrent.
+ def grant_torrent_access_to(key, bucket = nil)
+ policy = acl(key, bucket)
+ return true if policy.grants.include?(:public_read)
+ policy.grants << ACL::Grant.grant(:public_read)
+ acl(key, bucket, policy)
+ end
+ alias_method :grant_torrent_access, :grant_torrent_access_to
+ end
+
+ # Returns the torrent file for the object.
+ def torrent
+ self.class.torrent_for(key, bucket.name)
+ end
+
+ # Grants torrent access publicly to anyone who requests it on this object.
+ def grant_torrent_access
+ self.class.grant_torrent_access_to(key, bucket.name)
+ end
+ end
+ end
+end
320 lib/aws/s3/bucket.rb
@@ -0,0 +1,320 @@
+module AWS
+ module S3
+ # Buckets are containers for objects (the files you store on S3). To create a new bucket you just specify its name.
+ #
+ # # Pick a unique name, or else you'll get an error
+ # # if the name is already taken.
+ # Bucket.create('jukebox')
+ #
+ # Bucket names must be unique across the entire S3 system, sort of like domain names across the internet. If you try
+ # to create a bucket with a name that is already taken, you will get an error.
+ #
+ # Assuming the name you chose isn't already taken, your new bucket will now appear in the bucket list:
+ #
+ # Service.buckets
+ # # => [#<AWS::S3::Bucket @attributes={"name"=>"jukebox"}>]
+ #
+    # Once you have successfully created a bucket you can fetch it by name using Bucket.find.
+ #
+ # music_bucket = Bucket.find('jukebox')
+ #
+ # The bucket that is returned will contain a listing of all the objects in the bucket.
+ #
+ # music_bucket.objects.size
+ # # => 0
+ #
+ # If all you are interested in is the objects of the bucket, you can get to them directly using Bucket.objects.
+ #
+ # Bucket.objects('jukebox').size
+ # # => 0
+ #
+ # By default all objects will be returned, though there are several options you can use to limit what is returned, such as
+    # specifying that only objects whose names occur after a certain place in the alphabet be returned. Details about these options can
+ # be found in the documentation for Bucket.find.
+ #
+ # To add an object to a bucket you specify the name of the object, its value, and the bucket to put it in.
+ #
+ # file = 'black-flowers.mp3'
+ # S3Object.store(file, open(file), 'jukebox')
+ #
+    # You'll see that your file has been added to the bucket:
+ #
+ # music_bucket.objects
+ # # => [#<AWS::S3::S3Object '/jukebox/black-flowers.mp3'>]
+ #
+ # You can treat your bucket like a hash and access objects by name:
+ #
+    #   music_bucket['black-flowers.mp3']
+ # # => #<AWS::S3::S3Object '/jukebox/black-flowers.mp3'>
+ #
+ # In the event that you want to delete a bucket, you can use Bucket.delete.
+ #
+ # Bucket.delete('jukebox')
+ #
+    # Keep in mind that, like Unix directories, you cannot delete a bucket unless it is empty. Trying to delete a bucket
+ # that contains objects will raise a BucketNotEmpty exception.
+ #
+ # Passing the :force => true option to delete will take care of deleting all the bucket's objects for you.
+ #
+ # Bucket.delete('photos', :force => true)
+ # # => true
+ class Bucket < Base
+ class << self
+ # Creates a bucket named <tt>name</tt>.
+ #
+ # Bucket.create('jukebox')
+ #
+ # Your bucket name must be unique across all of S3. If the name
+ # you request has already been taken, you will get a 409 Conflict response, and a BucketAlreadyExists exception
+ # will be raised.
+ #
+ # By default new buckets have their access level set to private. You can override this using the <tt>:access</tt> option.
+ #
+ # Bucket.create('internet_drop_box', :access => :public_read_write)
+ #
+        # The full list of access levels that you can set on Bucket and S3Object creation is listed in the README[link:files/README.html]
+ # in the section called 'Setting access levels'.
+ def create(name, options = {})
+ validate_name!(name)
+ put("/#{name}", options).success?
+ end
+
+ # Fetches the bucket named <tt>name</tt>.
+ #
+ # Bucket.find('jukebox')
+ #
+        # If a default bucket is inferable from the current connection's subdomain, or if set explicitly with Base.set_current_bucket_to,
+ # it will be used if no bucket is specified.
+ #
+ # MusicBucket.current_bucket
+ # => 'jukebox'
+ # MusicBucket.find.name
+ # => 'jukebox'
+ #
+ # By default all objects contained in the bucket will be returned (sans their data) along with the bucket.
+ # You can access your objects using the Bucket#objects method.
+ #
+ # Bucket.find('jukebox').objects
+ #
+ # There are several options which allow you to limit which objects are retrieved. The list of object filtering options
+ # are listed in the documentation for Bucket.objects.
+ def find(name = nil, options = {})
+ new(get(path(name, options)).bucket)
+ end
+
+ # Return just the objects in the bucket named <tt>name</tt>.
+ #
+ # By default all objects of the named bucket will be returned. There are options, though, for filtering
+ # which objects are returned.
+ #
+ # === Object filtering options
+ #
+ # * <tt>:max_keys</tt> - The maximum number of keys you'd like to see in the response body.
+ # The server may return fewer than this many keys, but will not return more.
+ #
+ # Bucket.objects('jukebox').size
+ # # => 3
+ # Bucket.objects('jukebox', :max_keys => 1).size
+ # # => 1
+ #
+ # * <tt>:prefix</tt> - Restricts the response to only contain results that begin with the specified prefix.
+ #
+ # Bucket.objects('jukebox')
+        #   # => [<AWS::S3::S3Object '/jazz/miles.mp3'>, <AWS::S3::S3Object '/jazz/dolphy.mp3'>, <AWS::S3::S3Object '/classical/mahler.mp3'>]
+ # Bucket.objects('jukebox', :prefix => 'classical')
+        #   # => [<AWS::S3::S3Object '/classical/mahler.mp3'>]
+ #
+ # * <tt>:marker</tt> - Marker specifies where in the result set to resume listing. It restricts the response
+ # to only contain results that occur alphabetically _after_ the value of marker. To retrieve the next set of results,
+ # use the last key from the current page of results as the marker in your next request.
+ #
+ # # Skip 'mahler'
+ # Bucket.objects('jukebox', :marker => 'mb')
+ # # => [<AWS::S3::S3Object '/jazz/miles.mp3'>]
+ #
+ # === Examples
+ #
+        #   # Return no more than 2 objects whose keys occur alphabetically after the letter 'm'.
+ # Bucket.objects('jukebox', :marker => 'm', :max_keys => 2)
+ # # => [<AWS::S3::S3Object '/jazz/miles.mp3'>, <AWS::S3::S3Object '/classical/malher.mp3'>]
+ #
+        #   # Return no more than 2 objects whose keys occur alphabetically after the letter 'm' and have the 'jazz' prefix.
+ # Bucket.objects('jukebox', :marker => 'm', :max_keys => 2, :prefix => 'jazz')
+ # # => [<AWS::S3::S3Object '/jazz/miles.mp3'>]
+ def objects(name = nil, options = {})
+ find(name, options).object_cache
+ end
+
+ # Deletes the bucket named <tt>name</tt>.
+ #
+ # All objects in the bucket must be deleted before the bucket can be deleted. If the bucket is not empty,
+ # BucketNotEmpty will be raised.
+ #
+ # You can side step this issue by passing the :force => true option to delete which will take care of
+ # emptying the bucket before deleting it.
+ #
+ # Bucket.delete('photos', :force => true)
+ #
+ # Only the owner of a bucket can delete a bucket, regardless of the bucket's access control policy.
+ def delete(name = nil, options = {})
+ name = path(name)