:hash interpolation for generic obfuscated URLs and :timestamp bugfix #416

Merged
1 commit merged into from about 3 years ago

8 participants

John Mileham Jon Yurek Andrey Voronov Omar Abdel-Wahab Paul A Jungwirth jriff guyisra
John Mileham

Hi there,

I'm building a site that needs to securely obfuscate attachments in a public S3 bucket. The solution I arrived at seems like it might be useful to the community as a general-purpose tool that remains customizable enough to allow developers with different requirements to implement it differently without having to hack into Paperclip core. It is a set of storage-agnostic extensions to Attachment and Interpolations and supporting tests.

It features:

  • No extra data model required
  • Choice of HMAC digest algorithm on a per-attachment basis via openssl (modeled after ActiveSupport::MessageVerifier, default algo is SHA1)
  • Choice of fields to include in the hash on a per-attachment basis, which permits many developer-selected tradeoffs between security and flexibility -- some examples:
    • :hash_data => ":class/:attachment/:id" - Unique per asset, long-lived filename with deducible style URLs (e.g. you want a 3rd party site to be able to generate style URLs from a single URL trunk that you provide once and always get the latest attachment)
    • :hash_data => ":class/:attachment/:id/:style" - Adds non-deducible style URLs (e.g. you want to store uncompressed :originals but don't want end users to be able to find them and kill your bandwidth bill)
    • :hash_data => ":class/:attachment/:id/:style/:updated_at" - Adds non-deducible versions (e.g. if a user's private profile photo URL makes it out into the wild via a malicious friend, they can unfriend, upload a new photo and the leak is plugged) -- this is the default
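
As a rough illustration of the trade-offs above, here is a standalone sketch of how such a hash can be derived: interpolate the :hash_data pattern, then HMAC the result with the secret. The helper names (`interpolate_sketch`, `obfuscated_hash`), the sample values, and the secret are all assumptions made for this example and are not part of Paperclip; only the OpenSSL HMAC call mirrors what the patch does.

```ruby
require 'openssl'

# Hypothetical stand-in for Paperclip's interpolation: replaces each
# :placeholder in the pattern with a value from a plain Hash. Longer keys
# are substituted first so one key cannot clobber part of another.
def interpolate_sketch(pattern, values)
  keys = values.keys.sort_by { |key| -key.to_s.length }
  keys.inject(pattern.dup) do |result, key|
    result.gsub(":#{key}", values[key].to_s)
  end
end

# HMAC the interpolated data with the secret and return a hex digest.
def obfuscated_hash(secret, pattern, values, digest = "SHA1")
  data = interpolate_sketch(pattern, values)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new(digest), secret, data)
end

# Sample values (assumptions for illustration only).
values = { :class => "users", :attachment => "photos", :id => 42,
           :style => "thumb", :updated_at => 1234567890 }
secret = "a-long-random-secret" # assumption: supplied by your app config

per_asset   = obfuscated_hash(secret, ":class/:attachment/:id", values)
per_style   = obfuscated_hash(secret, ":class/:attachment/:id/:style", values)
per_version = obfuscated_hash(secret, ":class/:attachment/:id/:style/:updated_at", values)

puts per_asset, per_style, per_version
```

With the first pattern every style of the same asset shares one hash; adding :style or :updated_at to the pattern makes the hash change whenever that value changes.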

Since the hashes are extremely unlikely to collide, you can also remove other elements of the :path that were previously required for global uniqueness (like :id), which can give your users plausible deniability of ownership, or even completely obliterate the directory structure (:path => ":hash" -- or less controversially :path => ":hash.:extension") such that an attacker can't even prove what kind of attachment he/she has gained access to. Paired with a large :hash_secret and HTTPS, your users are left with a similar security guarantee to the analog hole -- attackers would need an authorized-user mole, or would need to compromise either your server-side secret or an authorized user's machine in order to gain access to protected attachments, and then would be able to freely distribute them (sadly, using your hosting infrastructure as an accomplice).

The :timestamp bug

While building this, I ran across what I think is a bug: The :timestamp interpolation is non-deterministic in the presence of per-thread time_zones, e.g. displaying time zones per user location, which would affect any such site that uses :timestamp in an attachment :path or :url (or if I had used :timestamp in :hash). While using :timestamp in paths and URLs is probably a little-used feature given the general wordiness of Time#to_s, and the intersection of sites that attempt to use it as well as implement per-user time zones is a vanishingly small slice of Paperclip users, I think it's worth taking a look at.

I took a stab at dealing with it by adding a new attachment config parameter :use_default_time_zone (defaulted to true) that explicitly churns out :timestamp interpolations in the server-wide default time zone. This is not perfect, because it's possible that a site owner would want to change the default time zone via config.time_zone=, which would in turn invalidate all of their existing attachment URLs. One option available to such an implementor would be to redefine my new method Attachment#time_zone explicitly to meet their requirements. Or I could add a :time_zone option on Attachment, defaulted to null, and falling through to the present Attachment#time_zone implementation. Such an option could even optionally take a block a la options[:url] and options[:path].

While the ideal path for :timestamp is unclear to me as I'm not deeply familiar with the design philosophy behind Paperclip or the direction it wants to head in (aggressively fixing holes like this or supporting users who may rely on legacy behavior?), it seemed natural to me to expose the integer Attachment#updated_at value as an interpolation as an alternative to :timestamp. Epoch seconds don't carry any time zone information and are therefore immune to this issue, plus they're presumably faster to convert to than text, and fewer characters to hash to boot.
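
To make the difference concrete, here is a small plain-Ruby sketch (no Rails or Paperclip required). It uses a fixed UTC offset as a stand-in for a per-thread Rails time zone, which is an assumption made to keep the example self-contained; the point is only that the textual form of an instant depends on the zone while its epoch seconds do not.

```ruby
# The same instant rendered in two zones: the text differs, the epoch
# seconds do not.
instant = Time.at(1234567890)

utc_text     = instant.getutc.to_s
eastern_text = instant.getlocal("-05:00").to_s

puts utc_text      # 2009-02-13 23:31:30 UTC
puts eastern_text  # 2009-02-13 18:31:30 -0500

# Epoch seconds are identical regardless of the zone used for display.
same_epoch = instant.getutc.to_i == instant.getlocal("-05:00").to_i
puts same_epoch    # true
```

A path built from the text form would differ between the two zones, whereas one built from the integer would not.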

So my present implementation has a provisionally "improved" behavior for :timestamp (which should be backward compatible for everybody who didn't run into the bug personally), and a new :updated_at interpolation that merely exposes the existing Attachment instance method. I'd be eager for feedback on what if anything should be done with that stuff.

The caveats:

  • I'm pretty new to testing, and this was definitely my first time writing tests using shoulda and test/unit. Feel free to slap me for transgressions in style and abuse of mocks/stubs/fakes/dummies.
  • I tried to follow house style to the extent I could derive it, but I'm sure I got it wrong in places.
  • I'm not sure what went on with the appraisal gemfile.locks reverting to an earlier version dependency on Rails -- perhaps this is expected? Again, never used appraisal before either. Tips would be more than welcome.

I'd be very excited if you see some or all of these changes as valuable to the Paperclip mainstream, and am happy to iterate this into something worthy of pulling.

Thanks a lot!

-john

Jon Yurek
Collaborator

Thanks for this patch! I've pulled it into master.

Andrey Voronov

Very good!!!

Omar Abdel-Wahab
owahab commented

Amazing feature with no tangible documentation. Any chance you can help the community using this amazing feature?

Paul A Jungwirth

This is a great feature, and I'm glad to see it pulled into master. But I'm having trouble with including :updated_at in the :hash parameter. My image gets stored on S3 at one path, but then when I say user.photo(:original) I get a different path. I guess this is because Paperclip is storing the file /before/ the model is saved, so that the path is immediately wrong because further requests for the path will depend on a new updated_at. But I find this hard to believe, because I guess the feature works for other folks?

Incidentally, it seems that another tangle about using updated_at is that if you alter anything else in the model, you'll need to remember to move the file(s) on S3. Is that right?

John Mileham

The updated at uses the attachment's updated at column so unrelated model changes shouldn't affect it. We've been using this code in production at ImpulseSave for about as long as the pull request has been in the wild and it manages file names without breaking references without problems as long as you don't go behind paperclip's back when updating your models.

I would make sure that the model instance you're asking the URL of is fresh and aware of the latest save and that you're storing an :original. If your hash_data includes style, that means that each style will have a distinct hash too of course, so different hashes don't necessarily mean out-of-date, possibly just for a different attachment style.

If you'd like to throw up a gist of what you're working with I'd be happy to take a quick look.

Can this work with existing models? I tried it, but it obfuscates the existing normal file name and breaks any existing paths.

Any way to just apply it to new uploads?

John Mileham

Migrating will probably take a little bit of creativity... Your best bet might be to start with a new paperclip attachment on the same model that uses hashes and either migrate the assets across in bulk (maybe via a rake task?) or use the new attachment moving forward and fall back to the old on a read-only basis.
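
One way to approach the bulk option is to first build a list of old-key/new-key pairs and then copy the files in a separate step. The sketch below only builds that list. Everything in it is hypothetical: the `LegacyRecord` struct stands in for a model row, the old and new key layouts are invented for the example, and in a real app you would more likely ask the old and new Paperclip attachments for their paths rather than recompute them by hand.

```ruby
require 'openssl'

# Hypothetical stand-in for a model row with Paperclip's attachment columns.
LegacyRecord = Struct.new(:id, :file_name, :updated_at)

MIGRATION_SECRET = "a-long-random-secret" # assumption: your :hash_secret

# Assumed legacy layout: videos/<id>/<style>/<original file name>
def old_key(record, style)
  "videos/#{record.id}/#{style}/#{record.file_name}"
end

# Assumed new layout: videos/<hmac>.<extension>
def new_key(record, style)
  data   = "videos/video/#{record.id}/#{style}/#{record.updated_at}"
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA1"), MIGRATION_SECRET, data)
  "videos/#{digest}#{File.extname(record.file_name)}"
end

# Returns [old_key, new_key] pairs for every record and style.
def rename_plan(records, styles)
  records.flat_map do |record|
    styles.map { |style| [old_key(record, style), new_key(record, style)] }
  end
end

records = [LegacyRecord.new(1, "intro.mp4", 1297300000),
           LegacyRecord.new(2, "demo.mov", 1297300500)]

rename_plan(records, %w[original thumb]).each do |from, to|
  puts "#{from} -> #{to}"
end
```

Keeping the plan separate from the copy step makes it possible to review or dry-run the migration before touching any stored files.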

Thanks for the quick response :-) I'll work something up in the morning when I'm freshly caffeinated.

jriff

I have a problem with the file name in the _file_name DB field. I do this:

has_attached_file :video,
  :storage => :s3,
  :s3_credentials => "#{Rails.root}/config/s3.yml",
  :path => "/videos/:hash.:extension",
  :hash_secret => "[my secret string]"

When a file is uploaded the _file_name field gets set to the original name of the file and not the hashed one. Am I doing something wrong?

John Mileham

The generated URL is never persisted to the database. my_model.video.url should return what you're looking for.

jriff

Thanks - but this gives the whole URL. How about adding something like this:

module Paperclip
  class Attachment
    def file_name
      File.basename(url)
    end
  end
end

John Mileham

The filename is not really meaningful to Paperclip without the rest of its URL -- depending on how you configure the :path, the filename might be identical for every instance, and might only be distinct based on the rest of the path. If presenting the filename on its own makes sense in your application, you probably just want to put that in a helper method.
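
A helper along those lines could look like the sketch below. The method name `attachment_basename` and the sample URLs are made up for illustration. It parses the URL first so that a query string, if one is appended to the URL, does not end up in the result.

```ruby
require 'uri'

# Hypothetical helper: returns only the file name portion of an attachment
# URL, dropping any directories and any query string.
def attachment_basename(url)
  File.basename(URI.parse(url).path)
end

puts attachment_basename("/videos/0123abcd.mp4?1297300000")  # 0123abcd.mp4
```

In a Rails app this would typically live in a view helper and be called with something like my_model.video.url.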

guyisra
guyisra commented

anyone figured out how to easily move attachments with old path to new obfuscated path?

This issue was closed.

Showing 1 unique commit by 1 author.

Feb 10, 2011
John Mileham Adds secure :hash interpolation and fixes time zone brittleness in :timestamp interpolation
13c86cf
2  gemfiles/rails3.gemfile
@@ -2,7 +2,7 @@
 
 source "http://rubygems.org"
 gem "ruby-debug"
-gem "rails", ">=3.0.3"
+gem "rails", "~>3.0.0"
 gem "rake"
 gem "sqlite3-ruby", "~>1.3.0"
 gem "shoulda"
2  gemfiles/rails3.gemfile.lock
@@ -90,7 +90,7 @@ DEPENDENCIES
   appraisal
   aws-s3
   mocha
-  rails (>= 3.0.3)
+  rails (~> 3.0.0)
   rake
   ruby-debug
   shoulda
78  lib/paperclip/attachment.rb
@@ -8,16 +8,19 @@ class Attachment
 
     def self.default_options
       @default_options ||= {
-        :url               => "/system/:attachment/:id/:style/:filename",
-        :path              => ":rails_root/public:url",
-        :styles            => {},
-        :processors        => [:thumbnail],
-        :convert_options   => {},
-        :default_url       => "/:attachment/:style/missing.png",
-        :default_style     => :original,
-        :storage           => :filesystem,
-        :use_timestamp     => true,
-        :whiny             => Paperclip.options[:whiny] || Paperclip.options[:whiny_thumbnails]
+        :url                   => "/system/:attachment/:id/:style/:filename",
+        :path                  => ":rails_root/public:url",
+        :styles                => {},
+        :processors            => [:thumbnail],
+        :convert_options       => {},
+        :default_url           => "/:attachment/:style/missing.png",
+        :default_style         => :original,
+        :storage               => :filesystem,
+        :use_timestamp         => true,
+        :whiny                 => Paperclip.options[:whiny] || Paperclip.options[:whiny_thumbnails],
+        :use_default_time_zone => true,
+        :hash_digest           => "SHA1",
+        :hash_data             => ":class/:attachment/:id/:style/:updated_at"
       }
     end
 
@@ -32,24 +35,28 @@ def initialize name, instance, options = {}
 
       options = self.class.default_options.merge(options)
 
-      @url               = options[:url]
-      @url               = @url.call(self) if @url.is_a?(Proc)
-      @path              = options[:path]
-      @path              = @path.call(self) if @path.is_a?(Proc)
-      @styles            = options[:styles]
-      @normalized_styles = nil
-      @default_url       = options[:default_url]
-      @default_style     = options[:default_style]
-      @storage           = options[:storage]
-      @use_timestamp     = options[:use_timestamp]
-      @whiny             = options[:whiny_thumbnails] || options[:whiny]
-      @convert_options   = options[:convert_options]
-      @processors        = options[:processors]
-      @options           = options
-      @queued_for_delete = []
-      @queued_for_write  = {}
-      @errors            = {}
-      @dirty             = false
+      @url                   = options[:url]
+      @url                   = @url.call(self) if @url.is_a?(Proc)
+      @path                  = options[:path]
+      @path                  = @path.call(self) if @path.is_a?(Proc)
+      @styles                = options[:styles]
+      @normalized_styles     = nil
+      @default_url           = options[:default_url]
+      @default_style         = options[:default_style]
+      @storage               = options[:storage]
+      @use_timestamp         = options[:use_timestamp]
+      @whiny                 = options[:whiny_thumbnails] || options[:whiny]
+      @use_default_time_zone = options[:use_default_time_zone]
+      @hash_digest           = options[:hash_digest]
+      @hash_data             = options[:hash_data]
+      @hash_secret           = options[:hash_secret]
+      @convert_options       = options[:convert_options]
+      @processors            = options[:processors]
+      @options               = options
+      @queued_for_delete     = []
+      @queued_for_write      = {}
+      @errors                = {}
+      @dirty                 = false
 
       initialize_storage
     end
@@ -197,6 +204,21 @@ def updated_at
       time && time.to_f.to_i
     end
 
+    # The time zone to use for timestamp interpolation.  Using the default
+    # time zone ensures that results are consistent across all threads.
+    def time_zone
+      @use_default_time_zone ? Time.zone_default : Time.zone
+    end
+
+    # Returns a unique hash suitable for obfuscating the URL of an otherwise
+    # publicly viewable attachment.
+    def hash
+      raise ArgumentError, "Unable to generate hash without :hash_secret" unless @hash_secret
+      require 'openssl' unless defined?(OpenSSL)
+      data = interpolate(@hash_data)
+      OpenSSL::HMAC.hexdigest(OpenSSL::Digest.const_get(@hash_digest).new, @hash_secret, data)
+    end
+
     def generate_fingerprint(source)
       data = source.read
       source.rewind if source.respond_to?(:rewind)
18  lib/paperclip/interpolations.rb
@@ -48,8 +48,18 @@ def url attachment, style_name
     end
 
     # Returns the timestamp as defined by the <attachment>_updated_at field
+    # in the server default time zone unless :use_default_time_zone is set
+    # to false.  Note that a Rails.config.time_zone change will still
+    # invalidate any path or URL that uses :timestamp.  For a
+    # time_zone-agnostic timestamp, use #updated_at.
     def timestamp attachment, style_name
-      attachment.instance_read(:updated_at).to_s
+      attachment.instance_read(:updated_at).in_time_zone(attachment.time_zone).to_s
+    end
+
+    # Returns an integer timestamp that is time zone-neutral, so that paths
+    # remain valid even if a server's time zone changes.
+    def updated_at attachment, style_name
+      attachment.updated_at
     end
 
     # Returns the Rails.root constant.
     # Returns the Rails.root constant.
@@ -94,6 +104,12 @@ def fingerprint attachment, style_name
       attachment.fingerprint
     end
 
+    # Returns the attachment hash.  See Paperclip::Attachment#hash for
+    # more details.
+    def hash attachment, style_name
+      attachment.hash
+    end
+
     # Returns the id of the instance in a split path form. e.g. returns
     # 000/001/234 for an id of 1234.
     def id_partition attachment, style_name
72  test/attachment_test.rb
@@ -96,6 +96,78 @@ class AttachmentTest < Test::Unit::TestCase
       assert_equal "1024.omg/1024-bbq/1024what/000/001/024.wtf", @dummy.avatar.path
     end
   end
+
+  context "An attachment with :timestamp interpolations" do
+    setup do
+      @file = StringIO.new("...")
+      @zone = 'UTC'
+      Time.stubs(:zone).returns(@zone)
+      @zone_default = 'Eastern Time (US & Canada)'
+      Time.stubs(:zone_default).returns(@zone_default)
+    end
+
+    context "using default time zone" do
+      setup do
+        rebuild_model :path => ":timestamp", :use_default_time_zone => true
+        @dummy = Dummy.new
+        @dummy.avatar = @file
+      end
+
+      should "return a time in the default zone" do
+        assert_equal @dummy.avatar_updated_at.in_time_zone(@zone_default).to_s, @dummy.avatar.path
+      end
+    end
+
+    context "using per-thread time zone" do
+      setup do
+        rebuild_model :path => ":timestamp", :use_default_time_zone => false
+        @dummy = Dummy.new
+        @dummy.avatar = @file
+      end
+
+      should "return a time in the per-thread zone" do
+        assert_equal @dummy.avatar_updated_at.in_time_zone(@zone).to_s, @dummy.avatar.path
+      end
+    end
+  end
+
+  context "An attachment with :hash interpolations" do
+    setup do
+      @file = StringIO.new("...")
+    end
+
+    should "raise if no secret is provided" do
+      @attachment = attachment :path => ":hash"
+      @attachment.assign @file
+
+      assert_raise ArgumentError do
+        @attachment.path
+      end
+    end
+
+    context "when secret is set" do
+      setup do
+        @attachment = attachment :path => ":hash", :hash_secret => "w00t"
+        @attachment.stubs(:instance_read).with(:updated_at).returns(Time.at(1234567890))
+        @attachment.stubs(:instance_read).with(:file_name).returns("bla.txt")
+        @attachment.instance.id = 1234
+        @attachment.assign @file
+      end
+
+      should "interpolate the hash data" do
+        @attachment.expects(:interpolate).with(@attachment.options[:hash_data]).returns("interpolated_stuff")
+        @attachment.hash
+      end
+
+      should "result in the correct interpolation" do
+        assert_equal "fake_models/avatars/1234/original/1234567890", @attachment.send(:interpolate,@attachment.options[:hash_data])
+      end
+
+      should "result in a correct hash" do
+        assert_equal "d22b617d1bf10016aa7d046d16427ae203f39fce", @attachment.path
+      end
+    end
+  end
 
   context "An attachment with a :rails_env interpolation" do
     setup do
2  test/helper.rb
@@ -90,7 +90,7 @@ def rebuild_class options = {}
 class FakeModel
   attr_accessor :avatar_file_name,
                 :avatar_file_size,
-                :avatar_last_updated,
+                :avatar_updated_at,
                 :avatar_content_type,
                 :avatar_fingerprint,
                 :id
18  test/interpolations_test.rb
@@ -112,9 +112,25 @@ def url(*args)
 
   should "return the timestamp" do
     now = Time.now
+    zone = 'UTC'
     attachment = mock
     attachment.expects(:instance_read).with(:updated_at).returns(now)
-    assert_equal now.to_s, Paperclip::Interpolations.timestamp(attachment, :style)
+    attachment.expects(:time_zone).returns(zone)
+    assert_equal now.in_time_zone(zone).to_s, Paperclip::Interpolations.timestamp(attachment, :style)
+  end
+
+  should "return updated_at" do
+    attachment = mock
+    seconds_since_epoch = 1234567890
+    attachment.expects(:updated_at).returns(seconds_since_epoch)
+    assert_equal seconds_since_epoch, Paperclip::Interpolations.updated_at(attachment, :style)
+  end
+
+  should "return hash" do
+    attachment = mock
+    fake_hash = "a_wicked_secure_hash"
+    attachment.expects(:hash).returns(fake_hash)
+    assert_equal fake_hash, Paperclip::Interpolations.hash(attachment, :style)
   end
 
   should "call all expected interpolations with the given arguments" do