s3cmd modify command #37

Closed
mckamey opened this Issue Mar 9, 2012 · 52 comments
@mckamey
mckamey commented Mar 9, 2012

It would be super awesome to have something akin to a modify command that allowed changing the settings that can be set during upload but that don't currently seem modifiable afterwards.

For instance, I'd like to be able to set headers, switch to reduced redundancy, make public, etc. for JPGs that are already uploaded:

s3cmd modify \
    --add-header='Cache-Control: public, max-age=31536000' \
    --reduced-redundancy \
    --cf-invalidate \
    --acl-public \
    --recursive \
    --exclude '*' \
    --include '*.jpg' \
    'foo' \
    s3://foo

Unless I'm missing something, these can be changed manually in the AWS Console, but not via s3cmd without a full delete and re-upload, which may not be feasible.

Thanks in advance and keep up the great work!

@mckamey
mckamey commented Mar 9, 2012

Is there a way to do that in bulk? If it has to be done one at a time, I can do that in the AWS Console. At the very least, I'd really love the ability to add a new header without having to re-upload everything.

@mckamey
mckamey commented Mar 9, 2012

The messaging implied that it worked, but using mv appears to have deleted all of the files in the directory:

s3cmd mv --add-header='Cache-Control: public, max-age=31536000' --reduced-redundancy --cf-invalidate --acl-public --recursive s3://bucket/folder/ s3://bucket/folder/

I haven't tried cp yet...

@mckamey
mckamey commented Mar 9, 2012

I'm finally getting back around to this. I just tried using cp, and it doesn't look like it works. On the plus side, though, it looks like it maintained the attributes that were already applied; I was wondering if I'd have to respecify them.

It's possible that S3 detects that the source and target are the same and does a NOOP.

@mckamey
mckamey commented Mar 9, 2012

Copying seems to work great; it even keeps the headers. But it doesn't seem to respect the --add-header option:

s3cmd cp --add-header='Cache-Control: public, max-age=31536000' --recursive s3://bucket/dir1/ s3://bucket/dir2/
@brandon-rhodes

I am getting an S3 error when I attempt to "cp" an object to itself with or without an "--add-header" option:


$ s3cmd cp -P --add-header="Cache-Control: max-age=86400"  s3://bucket/foo.jpg s3://bucket/foo.jpg

ERROR: S3 error: 400 (InvalidRequest): This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class or encryption attributes.

Oh, it will be painful if I have to re-upload all of these gigabytes of content just because I didn't know to set a Cache-Control header the first time around. Thanks for s3cmd, though; it has otherwise been an INCREDIBLE timesaver!

@brandon-rhodes

A bigger problem: even copying the resource to a new name does not make a new header appear.

$ s3cmd cp -P --add-header="Cache-Control: max-age=86400"  s3://bucket/foo.jpg s3://bucket/foo2.jpg

Downloading foo2.jpg does not show any additional headers beyond the ones that Amazon provides by default. I should note that doing a completely fresh "put" to a non-existent URL does cause the header to appear, as does doing the "put" to the already-existing resource so that you overwrite it.
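
For now, the overwrite workaround looks like a fresh put over the existing key; a sketch, assuming the original file is still on disk as foo.jpg:

s3cmd put -P --add-header="Cache-Control: max-age=86400" foo.jpg s3://bucket/foo.jpg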

@mckamey
mckamey commented Apr 4, 2012

It sounds like @brandon-rhodes has the same use case I do. I haven't had a chance to dig into the s3cmd code, but it does seem like the s3cmd cp command needs a tweak to support this.

To answer the question @mludvig raised about comma vs. semicolon: the comma is the correct syntax for extensions according to the HTTP/1.1 RFC:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.6

Regardless, I don't believe the content of the header should affect whether the header is set or not. I believe S3 filters by header name.

@willrc
willrc commented Sep 4, 2012

For some reason, getting this to work via Python's subprocess took a lot of fiddling. I thought it might be helpful to post what worked:

'--add-header=Cache-Control:public;max-age=31536000'

Note: semicolon, no spaces. Any other formatting causes s3cmd to complain when called via subprocess (though it works fine from the command line).
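
A minimal sketch of the invocation, with placeholder file and bucket names; each flag is passed as its own argv element so nothing gets re-split:

import subprocess

# Placeholder names; note the semicolon and the absence of spaces
# in the header value.
subprocess.check_call([
    's3cmd', 'put', 'photo.jpg', 's3://mybucket/photo.jpg',
    '--add-header=Cache-Control:public;max-age=31536000',
])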

@Valloric

I have the exact same use case as the other people here; I want to just add headers to files that are already in my bucket on S3. A modify command sounds like the cleanest approach from a usability perspective, since the mental model for cp would not need to change ("oh, I can also modify a file's metadata with cp if I copy a file to itself? Weird...").

@andreteves

+1 for this feature

@fgaudin
fgaudin commented Dec 6, 2012

Here is a patch to make it work: #94. You also have to define the content type.

@tszming
tszming commented Dec 9, 2012

I agree this feature is useful, and for those who need it now, consider using a script like this: http://amix.dk/blog/post/19687
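
The gist of that script is a copy-in-place through the S3 API with replaced metadata. A rough boto 2.x sketch of the idea (bucket, key, and header values are illustrative, not taken from the post):

import boto

conn = boto.connect_s3()
bucket = conn.get_bucket('mybucket')
key = bucket.get_key('photo.jpg')
# Passing a metadata dict to copy() makes boto send the REPLACE
# metadata directive, so a self-copy is legal.
key.copy('mybucket', 'photo.jpg',
         metadata={'Content-Type': 'image/jpeg',
                   'Cache-Control': 'max-age=31536000'},
         preserve_acl=True)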

@gollyg
gollyg commented Jan 9, 2013

I think one of the issues here is that s3cmd hard-codes a header at line 414 of S3.py. This header sets the x-amz-metadata-directive to COPY (there is a TODO to turn this into a switch). From the S3 documentation:

Constraints: Values other than COPY or REPLACE result in an immediate 400-based error response. You cannot copy an object to itself unless the MetadataDirective header is specified and its value set to REPLACE.

http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html

So I don't think that the final request is actually configured correctly. Manually adding this header does not overwrite the default value.

I have hacked on that file and successfully run a batch script that recursively changed 400,000 images from application/octet-stream to image/jpeg. But the TODO's idea of adding a switch is needed for release code.
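
To make the constraint concrete: a copy-in-place only succeeds when the COPY request (a PUT with a copy-source header) carries the REPLACE directive plus the replacement metadata, roughly like this (a sketch of the raw REST call, not s3cmd's actual code):

PUT /folder/photo.jpg HTTP/1.1
Host: bucket.s3.amazonaws.com
x-amz-copy-source: /bucket/folder/photo.jpg
x-amz-metadata-directive: REPLACE
Content-Type: image/jpeg
Cache-Control: max-age=31536000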

@luk3thomas

+1 for this as well

@johnboxall

For website redirects, I used the following pattern:

s3cmd -c .s3cfg put dummy/index.html s3://bucket/from/ -P --add-header="x-amz-website-redirect-location:/to/"

Make a dummy file dummy/index.html and upload it with the redirect header set on it. Boom!

@ShepBook

+1 from me as well. I could use this exact ability right about now. :)

@arxpoetica

+1

@conatus
conatus commented Jun 20, 2013

+1 for this!

@snbuback

+1

@pelcasandra

+1

@winzig
winzig commented Aug 9, 2013

+100

@joelso
joelso commented Sep 8, 2013

+1

@bendenoz

+1

@starbugs

+1000

@dbackeus

+1

@ianmaddox

+1

@reikjarloekl

+1

@Flash-
Flash- commented Jan 28, 2014

+1

@mdomsch
s3tools member
mdomsch commented Jan 28, 2014

All - while your +1 votes are nice, I'd really appreciate someone trying to tackle the patch to implement this. I'd be happy to review and merge something that's done well. There's a patch in #94 that may be a good start, but even the author indicates it has problems.

@cirofeitosa

+1

@mdomsch
s3tools member
@mdomsch mdomsch closed this May 1, 2014
@mlb5000
mlb5000 commented Jun 24, 2014

I just tried this today using the latest in Git to no avail.

s3cmd --access_key=<key> --secret_key=<key> --verbose -m image/jpeg modify s3://bucket/*.jpg

ERROR: S3 error: 400 (InvalidRequest): This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.
@bejayoharen

I used this to add a Cache-Control header, and it switched all my MIME types to binary/octet-stream. Major bummer.

@mlb5000
mlb5000 commented Jul 20, 2014

@bejayoharen I ended up modifying the script here to suit my needs: http://amix.dk/blog/post/19687

@genexp
genexp commented Sep 25, 2014

+1

@carl-amplidata

This seems to work on my latest s3cmd checkout:

-> ./s3cmd --version
s3cmd version 1.5.0-rc1

-> ./s3cmd --add-header=x-amz-metadata-directive:REPLACE -c s3cmd.scaler.cfg cp s3://bucketname/objectname s3://bucketname/objectname

-> ./s3cmd --add-header=x-amz-meta-newmetakey:newmetavalue --add-header=x-amz-metadata-directive:REPLACE -c s3cmd.scaler.cfg cp s3://bucketname/objectname s3://bucketname/objectname
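
Presumably the same trick carries any of the other headers discussed in this thread, e.g. (untested sketch):

-> ./s3cmd --add-header='Cache-Control: max-age=86400' --add-header=x-amz-metadata-directive:REPLACE -c s3cmd.scaler.cfg cp s3://bucketname/objectname s3://bucketname/objectname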

@CraigMason

As @bejayoharen said, using s3cmd modify appears to set the Content-Type header to binary/octet-stream. I tried using it in combination with numerous MIME-type settings, to no avail.

Trying to specify --add-header="Content-Type: image/jpeg" directly gave a signature error.

I resorted to re-putting the file and adding --add-header="Cache-Control:max-age=31536000"

@pablolibo

+1

@lindleywhite

So will modify work without creating a copy of the file in the same directory?

Assuming it will, can we use it like this?

s3cmd modify --recursive --add-header=Cache-Control:max-age=86400 s3://BUCKET/FOLDER
@lindleywhite

I pulled down the latest branch and am getting an error from Python.

Invoked as: /usr/local/bin/s3cmd modify --add-header=Cache-Control:max-age=86400 -m image/png s3://blendtec.com/images/Homepage/warranty_mobile.png
Problem: AttributeError: 'S3' object has no attribute 'object_modify'
S3cmd:   1.5.0-rc1
python:   2.7.9 (default, Dec 19 2014, 11:33:50) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)]
environment LANG=en_US.UTF-8

Traceback (most recent call last):
  File "/usr/local/bin/s3cmd", line 2547, in <module>
    rc = main()
  File "/usr/local/bin/s3cmd", line 2460, in main
    rc = cmd_func(args)
  File "/usr/local/bin/s3cmd", line 761, in cmd_modify
    return subcmd_cp_mv(args, s3.object_modify, "modify", u"File %(src)s modified")
AttributeError: 'S3' object has no attribute 'object_modify'
@ryandurfey

Just installed s3cmd 1.6, and the modify command suggested above worked. I've also included an extra example for pattern matching in the path name.

Modify everything below a directory:
s3cmd modify --recursive --add-header=Cache-Control:max-age=0 s3://bucketname/directory1/directory2

Modify everything with "directory4" in the path name:
s3cmd modify --recursive --add-header=Cache-Control:max-age=300 s3://bucketname --exclude "*" --include "*directory4*"
