
Very small number of S3 multipart upload objects got truncated #564

Closed
tatsuya6502 opened this issue May 1, 2015 · 5 comments

Comments

@tatsuya6502

Hi,

One of my customers is using your S3 client and has uploaded over 1 million objects using the high-level abstractions for multipart upload (UploadBuilder::newInstance()->build(); $uploader->upload()).

However, they found that a very small number of the files on S3 are truncated: so far, 7 objects out of ~1 million uploads. They are all missing the last upload part, yet upload() didn't throw any exception.

I think all parts were uploaded without error, but for some reason the last part was omitted from the Complete Multipart Upload request. However, they have no evidence to support this, and I haven't been able to reproduce the problem.

Here are some details:

  • User Agent:
    • aws-sdk-php2/2.6.15 Guzzle/3.9.2 curl/7.19.7 PHP/5.4.28 MUP
  • These truncated objects have only 2 or 3 parts (the minimum part size is 5MB).
  • They use setConcurrency(300) and the default values for other parameters.
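
For context, here is roughly how the uploads are performed. This is only a minimal sketch, not the customer's actual code: the bucket, key, and source path are placeholders, and $client is assumed to be an already-configured Aws\S3\S3Client.

// Rough sketch of the upload flow (placeholders, not the real application code).
use Aws\Common\Exception\MultipartUploadException;
use Aws\S3\Model\MultipartUpload\UploadBuilder;

$uploader = UploadBuilder::newInstance()
    ->setClient($client)                  // assumed: a configured Aws\S3\S3Client
    ->setSource('/path/to/local/file')    // placeholder source path
    ->setBucket('example-bucket')         // placeholder bucket
    ->setKey('example/object-key')        // placeholder key
    ->setMinPartSize(5 * 1024 * 1024)     // 5MB minimum part size, as noted above
    ->setConcurrency(300)                 // the customer's concurrency setting
    ->build();

try {
    $uploader->upload();                  // returns without throwing, yet a few objects end up truncated
} catch (MultipartUploadException $e) {
    $uploader->abort();
}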

I suspect this is a race condition in Guzzle, but I'm posting the issue here because other users of this SDK could run into the same problem. I tried to debug the client, but I couldn't find anything that would explain the cause. (I have no PHP experience.)

Also, if this can't be fixed now (since I can't provide steps to reproduce), would you consider adding a "safety net" to the AWS SDK for PHP so my customer can detect when the problem happens again and capture some information to help analyze the root cause?

For instance, before sending the Complete Multipart Upload request, the SDK could compare the number of Upload Part commands issued with the number of successful responses received, and throw an exception with details of the missing part if they don't match.

Thanks,
Tatsuya

@jeremeamia
Contributor

Thank you for bringing this to our attention. We will investigate this to see if we can reproduce or find a cause for this behavior.

The uploader object has an event dispatcher that you can hook into to create your own "safety-net". 😄 Here is an example:

// Configure your builder.
$uploader = UploadBuilder::newInstance()->build();

// Attach some event listeners to keep track of parts uploaded.
$dispatcher = $uploader->getEventDispatcher();
$numUploads = 0;
$dispatcher->addListener($uploader::BEFORE_PART_UPLOAD, function ($event) use (&$numUploads) {
    $numUploads++;
});
$dispatcher->addListener($uploader::AFTER_UPLOAD, function ($event) use (&$numUploads) {
    $countedParts = count($event['state']);
    // Compare the number of parts recorded in the upload's state to the ones you counted.
    if ($countedParts !== $numUploads) {
        throw new \RuntimeException("Multipart upload is missing parts for completion. "
            . "Found {$countedParts}, but expected {$numUploads}.");
    }
});

// Trigger the upload.
$uploader->upload();

Let us know if you are able to provide any additional information.

@tatsuya6502
Author

Thank you very much for the code snippet. Wow, this is more than I expected!

I'll give it a try and share it with my customer. I'll also keep trying to reproduce the problem. I'll let you know if I find anything.

@mtdowling
Copy link
Member

They found so far 7 objects, out of ~1 million uploads, are truncated.

Is there any information that you can share about the 7 files that you observed as truncated? What file size was uploaded and what was the expected size? Is there a common pattern regarding the sizes (e.g., they are all X size but were expected to be Y size)?

They use setConcurrency(300) and the default values for other parameters.

That's a very large number. I think you'll get better throughput by reducing this number significantly. At 300 concurrent requests, you're probably at 100% CPU and possibly saturating your network connection.

@tatsuya6502
Author

They found so far 7 objects, out of ~1 million uploads, are truncated.

Is there any information that you can share about the 7 files that you observed as truncated? What file size was uploaded and what was the expected size? Is there a common pattern regarding the sizes (e.g., they are all X size but were expected to be Y size)?

Thanks for looking into this issue. Here is the information. All files (except #6) seem to be missing the last part. I don't see any strong relationship between the sizes.

unit: bytes
note: 5242880 = 5MB (part size)

   File Size on S3  Expected Size    Diff       Upload Parts
1:    5242880         6996186     -1753306   5242880 * 1 + 1753306
2:    5242880         8493874     -3250994   5242880 * 1 + 3250994
3:    5242880         8493891     -3251011   5242880 * 1 + 3251011
4:    5242880         8493894     -3251014   5242880 * 1 + 3251014
5:   10485760        11936050     -1450290   5242880 * 2 + 1450290
6:   18712585        13469705     -5242880   5242880 * 3 + 2983945
7:   31457280        32336458      -879178   5242880 * 6 +  879178
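
For illustration, a check like the one below would surface this kind of mismatch after an upload. It is only a sketch: the bucket, key, and local path are placeholders, and it assumes the original source file is still available to compare against.

// Hypothetical post-upload size check (placeholders; not taken from the customer's application).
$result = $client->headObject(array(
    'Bucket' => 'example-bucket',
    'Key'    => 'example/object-key',
));

$sizeOnS3     = (int) $result['ContentLength'];
$expectedSize = filesize('/path/to/local/file');

if ($sizeOnS3 !== $expectedSize) {
    printf("Truncated: %d bytes on S3, expected %d (diff %d)\n",
        $sizeOnS3, $expectedSize, $sizeOnS3 - $expectedSize);
}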

They use setConcurrency(300) and the default values for other parameters.

That's a very large number. I think you'll get better throughput by reducing this number significantly. At 300 concurrent requests, you're probably at 100% CPU and possibly saturating your network connection.

Understood. I don't think the high concurrency setting is related to this issue, though, because about 99% of the files they have uploaded are smaller than 20MB; with the 5MB minimum part size, that means at most 4 parts per upload, so the actual concurrency will be <= 4.

@tatsuya6502
Author

Unfortunately, I couldn't reproduce the problem, and no further information is available to continue the investigation.

Let me close this issue for now because:

  • I couldn't reproduce the problem.
  • It seems nobody else has this problem.
  • I reviewed the source code of AWS SDK for PHP 2 and Guzzle, and couldn't find anything that would explain how the problem occurred. Guzzle's Http\Curl\CurlMulti and PHP's cURL module look fine to me.
  • I don't have access to my customer's application source code, though I have received some fragments of it. As far as I know, it downloads objects from one S3 bucket and uploads them to another S3 bucket. There may be other places where things can go wrong.

If the problem occurs again and I get more information, I'll reopen this issue.

Thanks!
