No copy operations performed, even when destination is completely empty #16

Closed
ohTHATaaronbrown opened this issue Dec 9, 2013 · 8 comments

@ohTHATaaronbrown

I have a backups bucket in us-west-2 that I want to mirror to another bucket in us-east-1 for disaster recovery purposes. It contains ~25000 files consuming ~2TB.

The key structure is:
YYYYMMDD/CUSTOMER/TYPE/backup_file_name

I'm running s3s3mirror on an Ubuntu 12.04.3 LTS server, using the master branch of s3s3mirror.

I've got s3s3mirror using the credentials file from s3cmd (which I'm attempting to replace with s3s3mirror).

When I run it, here's what I get:

ubuntu@prod-logstash:/opt/s3s3mirror$ ./s3s3mirror.sh source_backups_bucket destination_backups_bucket
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - starting...
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats -
--------------------------------------------------------------------
STATS BEGIN
read: 10000
copied: 0
copy errors: 0
duration: 0:00:11
read rate: 50386.29492777964/minute
copy rate: 0.0/minute
bytes copied: 0 bytes
GET operations: 10066
COPY operations: 0
STATS END
--------------------------------------------------------------------

pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats -
--------------------------------------------------------------------
STATS BEGIN
read: 20000
copied: 0
copy errors: 0
duration: 0:00:17
read rate: 67613.2521974307/minute
copy rate: 0.0/minute
bytes copied: 0 bytes
GET operations: 19992
COPY operations: 0
STATS END
--------------------------------------------------------------------

pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - No more keys found in source bucket, exiting
Thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats -
--------------------------------------------------------------------
STATS BEGIN
read: 25655
copied: 0
copy errors: 0
duration: 0:00:20
read rate: 73965.6912209889/minute
copy rate: 0.0/minute
bytes copied: 0 bytes
GET operations: 25912
COPY operations: 0
STATS END
--------------------------------------------------------------------

Nothing copied, even though there are several hundred files in the source that don't exist in the destination. Weird. I thought, perhaps it's an issue crossing regions, so I created another bucket in the same region and tried again. Same result: nothing copied, even though the destination is now a completely empty bucket.

What gives?

@ohTHATaaronbrown
Author

Update: The only time anything actually happens is when I specify a destination prefix that differs from the source prefix. In that case it starts the KeyLister, hangs for a long time with no status output, and eventually gives me a lovely stack trace:

pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - starting...
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - No more keys found in source bucket, exiting
pool-1-thread-9 INFO : com.amazonaws.http.AmazonHttpClient - Unable to execute HTTP request: Read timed out
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:150)
        at java.net.SocketInputStream.read(SocketInputStream.java:121)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:110)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:260)
        at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:281)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
        at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:219)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:622)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:454)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:281)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:164)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:2839)
        at com.amazonaws.services.s3.AmazonS3Client.copyObject(AmazonS3Client.java:1211)
        at org.cobbzilla.s3s3mirror.KeyJob.run(KeyJob.java:59)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

If I hit ctrl+c, I get this:

--------------------------------------------------------------------
STATS BEGIN
read: 536
copied: 508
copy errors: 0
duration: 0:02:16
read rate: 234.81136965997618/minute
copy rate: 222.5451040807237/minute
bytes copied: 25.515344046987593 GB (27396892057 bytes)
GET operations: 1078
COPY operations: 508
STATS END
--------------------------------------------------------------------

So copying is happening, but only when the destination prefix path is not the same as the source prefix path.

But this leads to another issue: if the destination prefix differs from the source prefix, the copy always happens, even if the file already exists at the destination path.
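
For what it's worth, both symptoms (nothing copied when the prefixes match, everything re-copied when they differ) are what you would expect if the existence check and the copy disagree about which key or bucket they are looking at. Below is a minimal sketch of the decision as I would expect it to work, not s3s3mirror's actual code. The class name, method names, and sample keys are hypothetical, and comparing ETags to decide "same file" is an assumption:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a mirror copy decision; not s3s3mirror's actual code. */
final class MirrorDecision {

    /** Maps a source key to its destination key by swapping the leading prefix. */
    static String destinationKey(String sourceKey, String sourcePrefix, String destPrefix) {
        if (!sourceKey.startsWith(sourcePrefix)) {
            throw new IllegalArgumentException(
                    "key '" + sourceKey + "' is outside prefix '" + sourcePrefix + "'");
        }
        return destPrefix + sourceKey.substring(sourcePrefix.length());
    }

    /**
     * Returns true when the object should be copied: it is missing at the
     * destination, or its ETag differs. destEtags stands in for metadata
     * fetched from the DESTINATION bucket, keyed by DESTINATION key.
     * Assumption: equal ETags mean "same object".
     */
    static boolean shouldCopy(String destKey, String sourceEtag, Map<String, String> destEtags) {
        String destEtag = destEtags.get(destKey);
        return destEtag == null || !destEtag.equals(sourceEtag);
    }

    public static void main(String[] args) {
        String srcKey = "20131209/acme/db/backup.tar.gz"; // made-up sample key
        String destKey = destinationKey(srcKey, "20131209/", "dr/20131209/");

        Map<String, String> destEtags = new HashMap<>(); // empty destination bucket
        System.out.println(destKey + " copy? " + shouldCopy(destKey, "etag-a", destEtags));

        destEtags.put(destKey, "etag-a"); // object has now been mirrored
        System.out.println(destKey + " copy? " + shouldCopy(destKey, "etag-a", destEtags));
    }
}
```

The point of the sketch is that the lookup uses the remapped destination key against destination metadata. Checking the source key, or the source bucket, would produce exactly the "same as source, not copying" or "always copy" behaviour described above.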

@WubbleWobble

Sorry - just a "me too", but I seem to be having an identical issue. I've set the input prefix to just select a small subset of the files, and this small subset definitely does not exist at the destination.

wobble@saldejums:~/s3m3mirror$ ./s3s3mirror.sh -v -p images/rqWvqF -t 20 source-bucket dest-bucket 
main INFO : org.cobbzilla.s3s3mirror.KeyLister - added initial set of 8 keys 
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - 8 keys found in first batch from source bucket -- processing... 
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - starting... 
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - No more keys found in source bucket, exiting 
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - KeyLister run loop finished 
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - No more keys found in source bucket -- ALL DONE 
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image.jpg 
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image.jpg 
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_admin.jpg 
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_admin.jpg 
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_copy_large.jpg 
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_large.jpg 
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_copy_medium.jpg 
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_medium.jpg 
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_copy_small.jpg 
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_small.jpg 
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_copy_tiny.jpg 
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_tiny.jpg 
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_homepage-link.jpg 
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_homepage-link.jpg 
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - Destination file is same as source, not copying: images/rqWvqF/image_widget.jpg 
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_widget.jpg 
Thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats - 
-------------------------------------------------------------------- 
STATS BEGIN 
read: 8 
copied: 0 
copy errors: 0 
duration: 0:00:00 
read rate: 648.6486486486486/minute 
copy rate: 0.0/minute 
bytes copied: 0 bytes 
GET operations: 9 
COPY operations: 0 
STATS END 
--------------------------------------------------------------------  

This leads me to wonder whether it's doing something silly like using the same bucket for input and output (despite the specified dest-bucket)?

@WubbleWobble

Command still runs with identical output if I do something crazy like:

wobble@saldejums:~/s3m3mirror$ ./s3s3mirror.sh -v -p images/rqWvqF -t 20 source-bucket non.existent.name

So either it's ignoring the dest-bucket argument or I'm doing something very very stupid? (both are possibilities!)

@WubbleWobble

Ok - as far as I can tell, there's a regression in commit 8766708. Commit 03f547c works for me as expected.

In my test files, the destination files with the specified prefix definitely do not exist. Here's the broken output:

#
# Commit 876670843c8d4c5f02122f69926bedca7b405ad8 (broken)
#

wobble@saldejums:~/s3m3mirror$ ./s3s3mirror.sh -v -n -p images/rqWvqF -t 20 source-bucket dest-bucket
main INFO : org.cobbzilla.s3s3mirror.MirrorMain - Adding shutdown hook
main INFO : org.cobbzilla.s3s3mirror.KeyLister - added initial set of 8 keys
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - 8 keys found in first batch from source bucket -- processing...
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - No more keys found in source bucket, exiting
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - KeyLister run loop finished
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image.jpg
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_admin.jpg
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_large.jpg
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_medium.jpg
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_small.jpg
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_tiny.jpg
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_homepage-link.jpg
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - No more keys found in source bucket -- ALL DONE
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_widget.jpg
Thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats -
--------------------------------------------------------------------
STATS BEGIN
read: 0
copied: 0
copy errors: 0
duration: 0:00:00
read rate: 0.0/minute
copy rate: 0.0/minute
STATS END
--------------------------------------------------------------------

And here's the working version...

#
# Commit 03f547cc22e74f4877fd56c208451f0cc3a7ffbc
#

wobble@saldejums:~/s3m3mirror$ ./s3s3mirror.sh -v -n -p images/rqWvqF -t 20 source-bucket dest-bucket
main INFO : org.cobbzilla.s3s3mirror.MirrorMain - Adding shutdown hook
main INFO : org.cobbzilla.s3s3mirror.KeyLister - added initial set of 8 keys
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - 8 keys found in first batch from source bucket -- processing...
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - No more keys found in source bucket, exiting
pool-1-thread-1 INFO : org.cobbzilla.s3s3mirror.KeyLister - KeyLister run loop finished
main INFO : org.cobbzilla.s3s3mirror.MirrorMaster - No more keys found in source bucket -- ALL DONE
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image.jpg
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_copy_tiny.jpg
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_copy_small.jpg
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_copy_medium.jpg
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_copy_large.jpg
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_admin.jpg
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_homepage-link.jpg
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - Key not found in destination bucket (will copy): images/rqWvqF/image_widget.jpg
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image.jpg to destination
pool-1-thread-2 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image.jpg
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_copy_large.jpg to destination
pool-1-thread-4 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_large.jpg
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_widget.jpg to destination
pool-1-thread-9 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_widget.jpg
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_admin.jpg to destination
pool-1-thread-3 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_admin.jpg
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_copy_tiny.jpg to destination
pool-1-thread-7 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_tiny.jpg
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_homepage-link.jpg to destination
pool-1-thread-8 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_homepage-link.jpg
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_copy_small.jpg to destination
pool-1-thread-6 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_small.jpg
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - Would have copied images/rqWvqF/image_copy_medium.jpg to destination
pool-1-thread-5 INFO : org.cobbzilla.s3s3mirror.KeyJob - done with images/rqWvqF/image_copy_medium.jpg
Thread-1 INFO : org.cobbzilla.s3s3mirror.MirrorStats -
--------------------------------------------------------------------
STATS BEGIN
read: 0
copied: 0
copy errors: 0
duration: 0:00:01
read rate: 0.0/minute
copy rate: 0.0/minute
STATS END
--------------------------------------------------------------------

With regard to using "non.existent.name" as the destination bucket (mentioned in a previous comment): the "check file exists" part of the code seems to assume that the file doesn't exist in that case, and s3s3mirror only emits an error once it attempts to copy the file to the non-existent bucket.
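
A rough sketch of how the existence check could keep "key missing" apart from "bucket missing", so that a bad bucket name fails up front instead of at copy time. This is not s3s3mirror's code: the class, enum, and method names are hypothetical, and it assumes the lookup exposes an HTTP status plus an S3 error code. `NoSuchKey` and `NoSuchBucket` are real S3 error codes, but a HEAD-style metadata request may come back as a bare 404 without either, which the sketch reports as UNKNOWN rather than guessing:

```java
/** Hypothetical sketch of classifying a destination lookup; not s3s3mirror's actual code. */
final class DestinationLookup {

    enum Result { FOUND, KEY_MISSING, BUCKET_MISSING, UNKNOWN }

    /**
     * Classifies a lookup response. errorCode may be null, for example when the
     * response carried no error body.
     */
    static Result classify(int httpStatus, String errorCode) {
        if (httpStatus == 200) {
            return Result.FOUND;
        }
        if (httpStatus == 404) {
            if ("NoSuchBucket".equals(errorCode)) {
                return Result.BUCKET_MISSING;
            }
            if ("NoSuchKey".equals(errorCode)) {
                return Result.KEY_MISSING;
            }
        }
        // Anything else (bare 404, 403, 5xx, ...) is not evidence that the key is absent.
        return Result.UNKNOWN;
    }

    /** Only a confirmed missing key is treated as "will copy". */
    static boolean willCopy(Result result) {
        return result == Result.KEY_MISSING;
    }

    /** A missing bucket is a configuration error, so stop before scheduling copies. */
    static boolean shouldAbort(Result result) {
        return result == Result.BUCKET_MISSING;
    }

    public static void main(String[] args) {
        System.out.println(classify(404, "NoSuchKey"));    // missing key
        System.out.println(classify(404, "NoSuchBucket")); // e.g. non.existent.name
        System.out.println(classify(404, null));           // ambiguous, not assumed missing
    }
}
```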

@cobbzilla
Owner

Dammit. I should have tested that pull request more. This is unfortunate. I will get this fixed asap.

@cobbzilla
Owner

This wasn't the pull request, which was fine. This was a stupid bug I introduced. I've fixed it, and have added a test that covers this case. As future bugs are found, each one will get a test written, to validate that the fix has worked. Closing this as fixed, but will happily re-open if there are still any problems.

@WubbleWobble

Awesome - many thanks :)

@ohTHATaaronbrown
Author

Excellent, I'll pull it down and give it a whirl. Thanks mucho for quick response and fix!
