Patch: Allow uploading files > 2G https://bugs.php.net/bug.php?id=44522 #372

Merged
merged 5 commits into from Sep 17, 2013

6 participants

@ralflang

Patch: Allow uploading files > 2G https://bugs.php.net/bug.php?id=44522

This is essentially the same as the patch
"uploads_larger_than_2g_HEAD_v2 (last revision 2012-03-26 03:59 UTC) by
jason at infininull dot com)" but using off_t instead of signed long
(originally: uint)

I tested this on 64-bit Linux and succeeded in uploading a file of 4.8 G.
The file did not get corrupted or truncated in any way.

I have not yet tested this under Windows or 32-bit Linux.

Note that there are still limitations:

  • Did not test for files > 8 G
  • PHP does not yet reject absurdly high values
  • Still limited by underlying file-system-specific limits and free space in
    the upload tmp dir and destination dir
ralflang added some commits Jun 28, 2013
@ralflang ralflang Patch for https://bugs.php.net/bug.php?id=44522 to allow uploading files
above 2G.

da04b2e
@ralflang ralflang ws f978f11
@johannes johannes commented on an outdated diff Jul 3, 2013
main/rfc1867.c
@@ -676,8 +676,9 @@ SAPI_API SAPI_POST_HANDLER_FUNC(rfc1867_post_handler) /* {{{ */
{
char *boundary, *s = NULL, *boundary_end = NULL, *start_arr = NULL, *array_index = NULL;
char *temp_filename = NULL, *lbuf = NULL, *abuf = NULL;
- int boundary_len = 0, total_bytes = 0, cancel_upload = 0, is_arr_upload = 0, array_len = 0;
- int max_file_size = 0, skip_upload = 0, anonindex = 0, is_anonymous;
+ int boundary_len = 0, cancel_upload = 0, is_arr_upload = 0, array_len = 0;
+ off_t total_bytes = 0, max_file_size = 0;
@johannes
php.net member
johannes added a note Jul 3, 2013

max_file_size is later (around line 900) set using

                if (!strcasecmp(param, "MAX_FILE_SIZE")) {
                    max_file_size = atol(value);
                }

This might cause issues on Windows, where sizeof(long) always equals 4, even on 64-bit systems. If off_t is a 64-bit integer (or an unsigned 32-bit one), this might lead to unexpected behavior.

Also later, around line 1026 I see

                if (PG(upload_max_filesize) > 0 && (long)(total_bytes+blen) > PG(upload_max_filesize)) {
#if DEBUG_FILE_UPLOAD
                    sapi_module.sapi_error(E_NOTICE, "upload_max_filesize of %ld bytes exceeded - file [%s=%s] not saved", PG(upload_max_filesize), param, filename);
#endif
                    cancel_upload = UPLOAD_ERROR_A;
                } else if (max_file_size && ((long)(total_bytes+blen) > max_file_size)) {
#if DEBUG_FILE_UPLOAD
                    sapi_module.sapi_error(E_NOTICE, "MAX_FILE_SIZE of %ld bytes exceeded - file [%s=%s] not saved", max_file_size, param, filename);
#endif
                    cancel_upload = UPLOAD_ERROR_B;
                } else if (blen > 0) {

Those casts to long seem to be wrong after this patch; casting to off_t is probably better. The %ld format should be double-checked too (although that's debug code only).
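
A minimal sketch of the limit check without the narrowing cast, assuming the byte counters are held in `int64_t` (the type the thread later settles on). `exceeds_limit` and `format_limit_notice` are hypothetical helpers for illustration, not functions in `main/rfc1867.c`; `PRId64` from `<inttypes.h>` stands in for `%ld` so the debug message is not truncated where `long` is 32 bits.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of main/rfc1867.c.
 * Returns 1 when accepting blen more bytes would exceed limit.
 * limit <= 0 means "no limit", mirroring the `> 0` guard in the quoted code.
 * The subtraction form avoids overflowing total_bytes + blen. */
static int exceeds_limit(int64_t total_bytes, int64_t blen, int64_t limit)
{
    assert(total_bytes >= 0 && blen >= 0);
    if (limit <= 0) {
        return 0;
    }
    return blen > limit - total_bytes;
}

/* Formats the debug notice with PRId64 instead of %ld, so the value is
 * printed in full even where long is 32 bits. Returns snprintf's result. */
static int format_limit_notice(char *buf, size_t len, int64_t limit)
{
    return snprintf(buf, len,
                    "upload_max_filesize of %" PRId64 " bytes exceeded", limit);
}
```

Comparing in the wide type throughout means no intermediate value is ever squeezed through a 32-bit `long`, which is the failure mode the casts above would introduce on Windows.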

Again for Windows: Line 1222 or so reads

                  file_size.value.lval = total_bytes;

lval is a long, and long on Windows is always 32 bits. We probably need an (imprecise) double or a string representation for files with total_bytes > LONG_MAX.

This is simpler if off_t on Win64 is 32 bits, too, which I don't know :-)
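
One way to picture the double fallback mentioned above: keep the integer when it fits in `long`, and switch to a double only when it does not. This is a sketch under assumptions; `size_value` is a hypothetical stand-in and deliberately not PHP's real zval layout, and `make_size_value` is an illustrative name.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a PHP value; NOT the real zval layout. */
typedef struct {
    int    is_double;  /* 0: lval is valid, 1: dval is valid */
    long   lval;
    double dval;
} size_value;

/* Stores total_bytes as a long when it fits, otherwise as a double.
 * On platforms where long is 64 bits the double branch is never taken. */
static size_value make_size_value(int64_t total_bytes)
{
    size_value v = {0, 0, 0.0};

    assert(total_bytes >= 0);
    if (total_bytes > (int64_t)LONG_MAX) {
        v.is_double = 1;
        v.dval = (double)total_bytes;  /* exact for sizes up to 2^53 bytes */
    } else {
        v.lval = (long)total_bytes;
    }
    return v;
}
```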

@weltling
weltling commented Jul 8, 2013

off_t is long, so 4 bytes on both x64 and x86 Windows. On 32-bit Linux it is, as they say, a 'signed integer'. That means there is no benefit from this patch on Windows and 32-bit Linux; only true 64-bit systems would benefit. I think double could be used instead of off_t on non-64-bit systems to make it worthwhile even there.

@ralflang
ralflang commented Jul 8, 2013

The off_t was advised by yohgaki@ohgaki.net on the php-dev internals list - to be used together with #define _FILE_OFFSET_BITS 64, which I haven't done yet. And I need to address Johannes' points.

Double has a wide range on all platforms, but it's not a precise, discrete integer type. I'm not sure a floating point type is appropriate here and would like to hear other opinions. I think I'm best off with an integer type that is 64 bits on all platforms.

@johannes
php.net member
johannes commented Jul 8, 2013

Be careful with _FILE_OFFSET_BITS 64 - as there is an off_t in a global header, this would have to affect all of PHP, which might cause trouble with external libs we're linking to. That can of worms is why we did not add large file support.

@ralflang
ralflang commented Jul 8, 2013

I've read a bit about Windows type lengths and it's probably best to define our own type filesize_t, which equals unsigned long long on Linux and whatever equivalent we can get on both 32-bit and 64-bit Windows (I'm not a Windows dev, but http://msdn.microsoft.com/en-us/library/s3f49ktz%28v=vs.80%29.aspx says their long long is also 64 bits). This could avoid _FILE_OFFSET_BITS.

@weltling
weltling commented Jul 8, 2013

With a 64-bit integer on a 32-bit platform, there would be a possible overflow when returning the file size to PHP, even when returning it as a string. double is good enough as long as it's IEEE 754 compliant, so it has a 52-bit mantissa for the integer part. And it can be returned to PHP without overflow. Or how would you do it?

@johannes
php.net member
@weltling
weltling commented Jul 8, 2013

That's true. Plus or minus one byte would corrupt a file. But what's the issue with using just the integer part, like double d = .0; int i = 42; d += (double)i; ... 42 would be converted to 42.0 even without an explicit cast. Or where is the pitfall?
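
To make the precision question concrete: an IEEE 754 double holds every integer exactly up to 2^53, and only beyond that do neighbouring integers start to collapse into the same value. The sketch below checks whether a byte count survives a trip through `double`; `survives_double_round_trip` and `DOUBLE_EXACT_LIMIT` are hypothetical names, and the helper assumes a non-negative input no larger than 2^62 so the conversion back to `int64_t` stays well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Every integer with magnitude up to 2^53 is exactly representable in an
 * IEEE 754 binary64 double. */
#define DOUBLE_EXACT_LIMIT (INT64_C(1) << 53)

/* Returns 1 if n is unchanged after conversion to double and back.
 * Precondition (assumption of this sketch): 0 <= n <= 2^62. */
static int survives_double_round_trip(int64_t n)
{
    volatile double d;  /* volatile: force rounding to a real 64-bit double */

    assert(n >= 0 && n <= (INT64_C(1) << 62));
    d = (double)n;
    return (int64_t)d == n;
}
```

So a counter kept in a double stays byte-exact for any realistic upload size (2^53 bytes is 8 PiB); the pitfall only appears past that limit, where an odd byte count such as 2^53 + 1 can no longer be represented.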

@ralflang

I settled for int64_t - sorry for the delay, but I was not familiar with the Windows build at first.
You seem to define a uint64_t too, but only for Windows? Actually, signed file sizes don't make too much sense, but it seems like glibc does not have an atoull either. strtoull seems an option, but I did not want to introduce too much change at once. 63 bits + sign is probably enough for the next decade or so.

@weltling

on windows it is _atoi64(value)

@weltling

Yep, that should already compile with VC. I'd however rewrite the check a bit more precisely

#if defined(PHP_WIN32) && !defined(HAVE_ATOLL)

like that, to prevent possible future conflicts. You can ping me again when I can test on Windows.

@weltling

or, as that's an isolated case, just do that in place, like

#ifdef PHP_WIN32
.... do _atoi64
#else
.... do atoll
#endif
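
Filled in, the in-place variant suggested above might look like the following sketch. `parse_byte_count` is a hypothetical helper name, and the compiler-defined `_WIN32` macro is used here only so the snippet compiles on its own; the thread's `PHP_WIN32` comes from PHP's build system.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal byte count (e.g. the MAX_FILE_SIZE
 * form field) into a 64-bit integer on every platform. */
static int64_t parse_byte_count(const char *value)
{
    assert(value != NULL);
#ifdef _WIN32
    /* long is 32 bits on Windows, so atol() would not do; the Microsoft
     * CRT provides _atoi64 for 64-bit parsing. */
    return (int64_t)_atoi64(value);
#else
    /* C99 atoll returns long long, which is at least 64 bits. */
    return (int64_t)atoll(value);
#endif
}
```

Neither call reports out-of-range input, so a stricter version would probably use `strtoll` and check `errno`; that is left out here to stay close to what the thread discusses.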

@ralflang

I've set up a Win7/VC11 build and it built. I'm currently reading up on how to set up a Windows test case with this build.

@m6w6 m6w6 added a commit to m6w6/php-src that referenced this pull request Aug 5, 2013
@m6w6 m6w6 Merge branch 'master' of github.com:/ralflang/php-src
merged pull request #372:
>2G uploads by Ralf Lang
php#372
69aed1b
@pierrejoye

I do not really understand why this patch got merged already. Not that it is not necessary :) but I do not think it is ready enough to get into master. There are a couple of areas that need more work (userland side, handling for large filesize, only meta infos, for example). As we discussed on IRC, it could be way easier to get it in via the 64-bit support patch/RFC.

What was the reasoning?

@m6w6
m6w6 commented Aug 16, 2013

It was ridiculous to not support 64bit upload size on platforms where long is large enough. The int64 change is a lot more work. I see that it is not perfect yet, but small steps are better than no progress at all. Actually, using int64_t instead of long would not be necessary, so I can change it back if you don't like it!

@ralflang

We had all this type changing back and forth (long, size_t, int64_t) in the patch to make sure it behaves consistently in linux and windows, 32 and 64 bit platforms.

signed long in 32 bit systems is only 32 bit, roughly 2G - not enough.

@php-pulls php-pulls merged commit d80a910 into php:master Sep 17, 2013

1 check failed

Details default The Travis CI build could not complete due to an error