Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream upload/download for GDrive #21357

Closed
PVince81 opened this issue Dec 23, 2015 · 16 comments · Fixed by #23323
Closed

Stream upload/download for GDrive #21357

PVince81 opened this issue Dec 23, 2015 · 16 comments · Fixed by #23323

Comments

@PVince81
Copy link
Contributor

Currently the whole file is stored in memory and then passed to the library: https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/lib/google.php#L469

This could easily cause out of memory errors in case the file is too big to fit in memory.

The question here is whether the library that OC uses called google-api-php-client supports stream upload ?

@icewind1991 @Xenopathic

@PVince81
Copy link
Contributor Author

If direct stream upload (from php input to google) is not possible, at least it should stream from the TMP file and not first load it all in memory...

@PVince81
Copy link
Contributor Author

@PVince81 PVince81 changed the title Stream upload for GDrive Stream upload/download for GDrive Dec 23, 2015
@PVince81
Copy link
Contributor Author

Looking a the code of the library, it seems that even downloading a file will store it in memory first through curl_exec's response: https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/3rdparty/google-api-php-client/src/Google/IO/Curl.php#L85

@PVince81
Copy link
Contributor Author

Seems the library has a Stream class, maybe that could be used instead https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/3rdparty/google-api-php-client/src/Google/IO/Stream.php ?

@PVince81
Copy link
Contributor Author

Looks like it's possible to tell the client to use its stream class, by adding this:

            $config = new \Google_Config();
            $config->setIoClass("Google_IO_Stream");
            $this->client = new \Google_Client($config);

I see that this class uses stream_context_create but still requires the full content to be given.
A quick googling seems to tell that stream_context_create cannot be used directly with a file handle. (http://stackoverflow.com/a/1691616)

This means the full file would still be in memory...

And for download, even though the response could be a file handle, that class still copies it to memory: https://github.com/owncloud/core/blob/v8.2.2/apps/files_external/3rdparty/google-api-php-client/src/Google/IO/Stream.php#L128

It almost looks like the library wasn't designed with download/upload of big files in mind, but rather as a client for various APIs.

@PVince81
Copy link
Contributor Author

If streaming is not possible, then at least "download directly to temp file" would still be better than "download to PHP string then save to temp file", at least would cause memory peak usage.

@PVince81
Copy link
Contributor Author

PVince81 commented Jan 8, 2016

If we anyway need a new library, maybe it's an opportunity to look into using Flysystem's GDrive implementation, if suitable ? (need to check that it doesn't store everything in memory...)

Just checked, and there is currently none 😦
It was requested here: thephpleague/flysystem#192

@PVince81
Copy link
Contributor Author

PVince81 commented Jan 8, 2016

Looks like someone started something here https://github.com/ignited/flysystem-google-drive (lots of TODOs) but they are using the same library so it will suffer from the same problems 😦

@PVince81
Copy link
Contributor Author

For the download part, seems like using Guzzle for download does the trick: #21732

@PVince81
Copy link
Contributor Author

I see mentions of Guzzle in the repo of the upstream library: https://github.com/google/google-api-php-client/
Maybe upgrading it would already solve the memory issue.

@ghost ghost modified the milestones: 9.0.1-next-maintenance, 9.0-current Feb 22, 2016
@PVince81
Copy link
Contributor Author

PVince81 commented Mar 9, 2016

Note: the memory issue with downloads has been solved in 9.0.

However there is still a memory issue with uploads, which might be more tricky to fix but not impossible. We might need to resort to a temporary file and then have Guzzle stream the local file up to the GDrive URL instead of using the library's method that only accepts the body a string.

@PVince81 PVince81 self-assigned this Mar 9, 2016
@PVince81
Copy link
Contributor Author

PVince81 commented Mar 9, 2016

Let's keep this on 9.0.1 to investigate the "direct curl" approach.

@PVince81
Copy link
Contributor Author

Looking at the documentation, it looks like it is not possible to stream-upload: https://developers.google.com/api-client-library/php/guide/media_upload

The best recommended way to upload is to use resumable/chunked upload as suggested here https://developers.google.com/api-client-library/php/guide/media_upload#resumable-file-upload

I can have a try with the example code...

@PVince81
Copy link
Contributor Author

It worked! Here is chunk upload for GDrive: #23323

@MorrisJobke MorrisJobke modified the milestones: 9.1-current, 9.0.1-current-maintenance Mar 22, 2016
@PVince81
Copy link
Contributor Author

Bringing back GDrive stream download now that the encryption socket issue has a fix: #23517

@lock
Copy link

lock bot commented Aug 5, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Aug 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants