
Large memory consumption #105

Closed · AndrewDryga opened this issue Sep 3, 2016 · 16 comments

Comments

@AndrewDryga
Contributor

Right now the BEAM process seems to require about 3x the file size in memory. Could you suggest how to send a file directly to S3 without reading it into memory? We deal with very large files (up to 2 GB).

@AndrewDryga
Contributor Author

I guess the problem is located here:

file.binary || File.read!(file.path)

You shouldn't read the whole file into memory when file streams and binary reads are available :(.
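
Roughly the difference, as a sketch (the file name and the 5 MB chunk size are just illustrative):

# Current behaviour: the whole file becomes a single binary on the heap,
# so a 2 GB upload needs at least 2 GB of memory before it even reaches S3.
contents = File.read!("large_file.bson")

# Streamed alternative: read lazily in fixed-size chunks so only one chunk
# is resident at a time.
"large_file.bson"
|> File.stream!([], 5 * 1024 * 1024)
|> Stream.each(fn chunk -> IO.puts("read #{byte_size(chunk)} bytes") end)
|> Stream.run()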

@stavro
Owner

stavro commented Sep 3, 2016

Looks like we can likely use https://hexdocs.pm/ex_aws/1.0.0-beta1/ExAws.S3.html#upload_part/6 to upload in chunks.
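
A rough sketch of what that could look like (bucket, key, file path, and the 5 MB part size are placeholders, and the parsed upload_id / "ETag" response shapes assume ExAws can decode S3's XML responses):

alias ExAws.S3

bucket = "my-bucket"
key = "uploads/large_file.bson"

# Start the multipart upload and grab its id from the parsed response.
%{body: %{upload_id: upload_id}} =
  S3.initiate_multipart_upload(bucket, key) |> ExAws.request!()

# Upload the file in 5 MB parts (S3's minimum part size), collecting each part's ETag.
parts =
  "large_file.bson"
  |> File.stream!([], 5 * 1024 * 1024)
  |> Stream.with_index(1)
  |> Enum.map(fn {chunk, i} ->
    %{headers: headers} = S3.upload_part(bucket, key, upload_id, i, chunk) |> ExAws.request!()
    {i, :proplists.get_value("ETag", headers)}
  end)

# Tell S3 to stitch the parts together.
S3.complete_multipart_upload(bucket, key, upload_id, parts) |> ExAws.request!()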

@stavro
Owner

stavro commented Sep 4, 2016

Can you try the ex_aws_beta branch and let me know if it works better for you?

It has some quirks with error handling, but a large file upload should work better.
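
If it helps, the branch can be pulled in straight from GitHub in mix.exs (a sketch, using mix's standard git dependency syntax):

defp deps do
  [
    # Track the ex_aws_beta branch instead of the Hex release.
    {:arc, github: "stavro/arc", branch: "ex_aws_beta"}
  ]
end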

@Samorai

Samorai commented Sep 15, 2016

Hi.
We tried the ex_aws_beta branch but got an error:
[error] Task #PID<0.1162.0> started from #PID<0.1161.0> terminating
** (MatchError) no match of right hand side value: %{body: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<InitiateMultipartUploadResult xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\"><Bucket>os-dev-batch-uploads</Bucket><Key>vivus.pl/2016/09/15/vivus.pl-47c6122ce5a2567d7709c75eb30f33d319330bf6.bson</Key><UploadId>X25x.4KcNAoY0e27807dJcgq_tny3f9Wf1M1Or5t4yuysKEVv7qzu7.szDlRWEuUjUWGURiXdrt0zaWj.yUt7C00HkEMW8OMebQ2s6J8ByURB3kwbYrcyyy95eCUA6kS</UploadId></InitiateMultipartUploadResult>", headers: [{"x-amz-id-2", "N1XDqA+ham+b3vVwSQYNTqCkEMWAY8KrDIy5M6W9ML895wKNAB5A6GP95Fx8DMjmQvOgZQCDYgo="}, {"x-amz-request-id", "29BF433D73AD79C0"}, {"Date", "Thu, 15 Sep 2016 08:40:38 GMT"}, {"Transfer-Encoding", "chunked"}, {"Server", "AmazonS3"}], status_code: 200}
    (ex_aws) lib/ex_aws/s3/upload.ex:41: ExAws.S3.Upload.initialize!/2
    (ex_aws) lib/ex_aws/s3/upload.ex:82: ExAws.Operation.ExAws.S3.Upload.perform/2
    (ex_aws) lib/ex_aws.ex:41: ExAws.request!/2
    (arc) lib/arc/storage/s3.ex:48: Arc.Storage.S3.do_put/3
    (elixir) lib/task/supervised.ex:94: Task.Supervised.do_apply/2
    (elixir) lib/task/supervised.ex:45: Task.Supervised.reply/5
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<0.113666932/0 in Arc.Actions.Store.async_put_version/3>
Args: []

Something is wrong with the ex_aws library, yes?
Debugging shows that ExAws.S3.Parsers.parse_initiate_multipart_upload/1 does not work correctly.

@jfrolich
Contributor

Same error here. Also trying to get large uploads working. Looking into the code.

@stavro
Owner

stavro commented Sep 29, 2016

The latest ExAws wasn't ready yet for file streaming when I tried. If you want to try updating that branch and see if it helps, I would appreciate it!

@jfrolich
Contributor

jfrolich commented Sep 29, 2016

Hey @stavro. As far as I tested this, it worked great with the current version of ex_aws! I am still doing some testing with really large files.

ex_aws, however, silently fails and returns raw XML when you do not have the sweet_xml library installed (as in the error above). I opened a pull request to trigger a clearer error when sweet_xml is not present.
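
So if anyone else hits the MatchError above, the workaround is just adding the XML parser to your own deps (the version constraint is approximate):

# In mix.exs — ex_aws needs sweet_xml to decode S3's XML responses;
# without it the raw body comes back unparsed and arc's match fails.
defp deps do
  [
    {:sweet_xml, "~> 0.6"}
  ]
end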

@stavro
Owner

stavro commented Sep 29, 2016

Amazing. Should this library then require sweet_xml?

I thought AWS had a way of requesting errors in JSON. If it's possible to request errors in JSON we should totally push that down to ExAws!

@jfrolich
Contributor

Good one. It shouldn't be too involved to change that; since the parser in ex_aws is just a module, it should be easy to create a JSON parser. Anyway, this works for now :)

It's probably best to include sweet_xml in this library for use with ex_aws 1.0.

@jfrolich
Contributor

To confirm: no trouble uploading 500+ MB files on a small Heroku dyno.

@stavro
Owner

stavro commented Sep 29, 2016

Amazing! Thanks for looking into it. I'll update the ex_aws_beta branch and start adding documentation about it next week. If you run into any other quirks please let me know. As soon as ExAws is out of beta we'll merge it into here. 🎉

@jfrolich
Contributor

jfrolich commented Sep 30, 2016

Cool. I'm running it in production because we need to support large file sizes, but it's okay because we have tests, and it seems to work fine 💃 Will report any issues that come up.

@AndrewDryga
Contributor Author

We will use the ex_aws_beta branch in production, so it would be awesome if you could notify us here when you release the next version of the Hex package, so we can stop using the GitHub repo dependency :).

@jfrolich
Contributor

jfrolich commented Oct 1, 2016

Same here, it might be good to publish a beta release?

@stavro
Owner

stavro commented Oct 1, 2016

Will do on Monday or Tuesday. Thanks for helping test everyone!


@stavro
Owner

stavro commented Oct 5, 2016

Released arc as v0.6.0-rc1 to track ExAws.

Please try it out and report any feedback. Thanks!
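
Switching off the branch dependency should just be a matter of pointing mix at the Hex release (a sketch; the constraint shown is one way to opt into the release candidate):

defp deps do
  [
    # Replace the GitHub branch dependency with the published release candidate.
    {:arc, "~> 0.6.0-rc1"}
  ]
end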

stavro closed this as completed Oct 5, 2016