Use temporary file to avoid out-of-memory when receiving big chunks. #216
Not perfect but I think it's a reasonable solution. For small request bodies, I suppose performance wouldn't be an issue. For large ones, this seems to be a necessary evil.
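For readers skimming the thread, the idea can be sketched roughly as follows. This is a minimal illustration, not the code of this PR: `spoolToTempFile` and `roundTrip` are hypothetical names, and only the `"gogs"` prefix is taken from the diff. The point is that `io.Copy` streams the body to disk in chunks, so memory use stays flat regardless of body size.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// spoolToTempFile (hypothetical helper) copies body into a temporary file and
// rewinds it, so the data can be read back without holding it all in memory.
// The caller must call the returned cleanup function when done.
func spoolToTempFile(body io.Reader) (*os.File, func(), error) {
	// os.CreateTemp is the current name for ioutil.TempFile.
	f, err := os.CreateTemp("", "gogs")
	if err != nil {
		return nil, nil, err
	}
	cleanup := func() {
		f.Close()
		os.Remove(f.Name())
	}
	// io.Copy moves the data in fixed-size chunks instead of buffering it all.
	if _, err := io.Copy(f, body); err != nil {
		cleanup()
		return nil, nil, err
	}
	// Rewind so the next reader starts at the beginning of the body.
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		cleanup()
		return nil, nil, err
	}
	return f, cleanup, nil
}

// roundTrip spools payload to disk and reads it back; it only exists to
// demonstrate the helper above.
func roundTrip(payload string) (string, error) {
	f, cleanup, err := spoolToTempFile(strings.NewReader(payload))
	if err != nil {
		return "", err
	}
	defer cleanup()
	data, err := io.ReadAll(f)
	return string(data), err
}

func main() {
	got, err := roundTrip("pretend this is a large push body")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Because the file is seekable, it can be rewound and read a second time, which is what makes the two-pass handling possible without `ioutil.ReadAll`.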
Any open issue for this?
if h.cfg.OnSucceed != nil {
	input, err = ioutil.ReadAll(reqBody)
	tmpfile, err := ioutil.TempFile("", "gogs")
I couldn't find any information about the permissions that TempFile sets on the file; I guess that should be checked.
https://github.com/golang/go/blob/master/src/io/ioutil/tempfile.go#L55
Is this what you are looking for?
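The linked source opens the file with mode 0600 (before umask), so only the owner can read or write it. A quick way to check this locally is sketched below; `tempFilePerm` is a hypothetical helper, and the reported bits may differ on Windows, where Unix permission bits are only emulated.

```go
package main

import (
	"fmt"
	"os"
)

// tempFilePerm (hypothetical helper) creates a temporary file and reports the
// permission bits the operating system recorded for it.
func tempFilePerm() (os.FileMode, error) {
	f, err := os.CreateTemp("", "gogs")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	perm, err := tempFilePerm()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the permission bits in octal.
	fmt.Printf("%04o\n", uint32(perm))
}
```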
Current coverage is 3.03% (diff: 100%)
@@           master   #216   diff @@
========================================
  Files          33     33
  Lines        8106   8106
  Methods         0      0
  Messages        0      0
  Branches        0      0
========================================
  Hits          246    246
  Misses       7840   7840
  Partials       20     20
No issue that I know of. Please leave a reference here to the Gogs PR, which in turn has more references to the Gogs issue, and if you feel like it, also add a local issue.
I don't think it's a good idea to always use a temporary file, since most of the time the uploaded data is small in the Git case. For large files, LFS may be a better choice, and we have a merge-ready PR, #122.
@lunny I am genuinely open to a better solution, but I am afraid LFS cannot solve the problem when a huge repository consists of many small files rather than a few big binary blobs. Maybe someone else has experience with how other similar projects deal with this issue?
I think this PR may resolve #218. Could we choose memory or a file according to the content size?
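One way the content-size switch could look is sketched below. Everything here is an assumption for illustration: `bufferBody` and `demo` are hypothetical names, and the threshold is passed in rather than read from any real setting. A negative content length is treated as "unknown" (which is how net/http reports chunked uploads) and goes to disk.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// bufferBody (hypothetical helper) makes body re-readable. Bodies with a known
// length at or below threshold stay in memory; larger bodies, and bodies of
// unknown length (contentLength < 0), are spooled to a temporary file.
// The bool result reports whether the data went to disk.
func bufferBody(body io.Reader, contentLength, threshold int64) (io.ReadSeeker, bool, func(), error) {
	if contentLength >= 0 && contentLength <= threshold {
		// Cap the read so a wrong Content-Length cannot exceed the threshold.
		data, err := io.ReadAll(io.LimitReader(body, threshold))
		if err != nil {
			return nil, false, nil, err
		}
		return bytes.NewReader(data), false, func() {}, nil
	}

	f, err := os.CreateTemp("", "gogs")
	if err != nil {
		return nil, false, nil, err
	}
	cleanup := func() {
		f.Close()
		os.Remove(f.Name())
	}
	if _, err := io.Copy(f, body); err != nil {
		cleanup()
		return nil, false, nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		cleanup()
		return nil, false, nil, err
	}
	return f, true, cleanup, nil
}

// demo buffers payload with the given threshold and reads it back.
func demo(payload string, threshold int64) (string, bool, error) {
	rs, onDisk, cleanup, err := bufferBody(strings.NewReader(payload), int64(len(payload)), threshold)
	if err != nil {
		return "", false, err
	}
	defer cleanup()
	data, err := io.ReadAll(rs)
	return string(data), onDisk, err
}

func main() {
	for _, threshold := range []int64{1024, 4} {
		got, onDisk, err := demo("a modest body", threshold)
		fmt.Println(got, onDisk, err)
	}
}
```

With this shape, small pushes never touch the disk, which addresses the concern about slowing down the common case.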
I think a max size must be defined (maybe in app.ini) to avoid an app crash or other strange issues if there is not enough free space to store the temp data.
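A cap like that could be enforced while copying, as in the sketch below. The names `copyWithLimit`, `accepts`, and `errBodyTooLarge` are hypothetical, and the limit is a plain parameter here; how it would be read from app.ini is left open. In an HTTP handler, the standard library's `http.MaxBytesReader` is another option for the same job.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errBodyTooLarge is a hypothetical sentinel error for oversized bodies.
var errBodyTooLarge = errors.New("request body exceeds the configured maximum")

// copyWithLimit copies src to dst but gives up once more than maxBytes have
// been seen. It reads at most maxBytes+1 bytes, so an oversized body is
// detected without consuming all of it. Note that up to maxBytes+1 bytes may
// already have been written to dst when the error is returned.
func copyWithLimit(dst io.Writer, src io.Reader, maxBytes int64) (int64, error) {
	n, err := io.Copy(dst, io.LimitReader(src, maxBytes+1))
	if err != nil {
		return n, err
	}
	if n > maxBytes {
		return n, errBodyTooLarge
	}
	return n, nil
}

// accepts reports whether payload fits within maxBytes.
func accepts(payload string, maxBytes int64) bool {
	_, err := copyWithLimit(io.Discard, strings.NewReader(payload), maxBytes)
	return err == nil
}

func main() {
	fmt.Println(accepts("12345", 5))  // exactly at the limit
	fmt.Println(accepts("123456", 5)) // one byte over the limit
}
```

In a real handler `dst` would be the temporary file, and the file would be removed when `errBodyTooLarge` is returned.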
The current approach will slow down all writes, and that's sad.
^
Is it not possible to process the incoming data as a stream rather than wait until the upload has completed?
Maybe we can use this library to do something: https://github.com/edsrzf/mmap-go
A stream-oriented approach would be best, but it isn't trivial to implement (@bkcsoft and I tried it without much luck). If I recall correctly, it had to do with the way the code was written; that is to say, a refactoring might make it easier to go stream-oriented. Want to give it another try?
I think |
If Windows support is not an issue then |
Gitea MUST work on Windows as well.
@tboerger If that's the case, then |
I'm replying to this comment, which I can't find on the web UI. On Tue, Nov 22, 2016 at 06:12:49PM -0800, typeless wrote:
This kind of refactoring did sound like a good idea, did you |
@strk Yeah, I deleted the comment right after I realized that it's not simply a matter of changing the function signature. The problem is that we have to read through the stream twice, and the reader is not necessarily seekable. I need to take a look at the underlying mechanism behind the request body (how the data is connected/copied to the io.Reader).
This is only because we maybe called |
Maybe there's the possibility of using a Tee
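For reference, the standard library's `io.TeeReader` works as sketched below: every chunk the primary consumer reads is also written to a second writer, so the stream is traversed only once. The `countingWriter` type and `teeOnce` function are hypothetical stand-ins. Whether this fits the hook handler depends on both consumers being able to process the data in lockstep, which this sketch does not establish.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingWriter is a hypothetical stand-in for the second consumer. It only
// counts the bytes it is shown, so it needs no buffering at all.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// teeOnce reads payload a single time. The primary consumer (io.Copy into
// io.Discard here) pulls from the tee, and each chunk it reads is also handed
// to the observer before the read returns.
func teeOnce(payload string) (int64, int64, error) {
	var observer countingWriter
	tee := io.TeeReader(strings.NewReader(payload), &observer)
	primary, err := io.Copy(io.Discard, tee)
	return primary, observer.n, err
}

func main() {
	primary, side, err := teeOnce("one pass over the request body")
	fmt.Println(primary, side, err)
}
```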
@lunny Yeah, hopefully someone knows why it has to be a two-pass operation in the first place.
@strk io.Pipe() + goroutines seems to be the straightforward way. But I guess that would make the |
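A rough sketch of the `io.Pipe()` plus goroutine shape, under the assumption that the two passes can be expressed as two independent consumers of an `io.Reader`. The names `fanOut` and `fanOutDemo` are hypothetical, and this is not taken from the Gitea code base.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// fanOut (hypothetical helper) feeds a single pass over body to two consumers.
// first reads through a TeeReader; everything it reads is also written into a
// pipe that second drains concurrently in its own goroutine. The pipe is
// unbuffered, so neither consumer can run far ahead of the other.
func fanOut(body io.Reader, first, second func(io.Reader) error) error {
	pr, pw := io.Pipe()
	done := make(chan error, 1)

	go func() {
		err := second(pr)
		// Closing the read side makes later writes fail instead of blocking
		// forever if second stops before the end of the stream.
		pr.CloseWithError(err)
		done <- err
	}()

	err := first(io.TeeReader(body, pw))
	// A nil error here means the reader side sees a normal EOF.
	pw.CloseWithError(err)

	if err2 := <-done; err == nil {
		err = err2
	}
	return err
}

// fanOutDemo runs two consumers that each collect the whole stream.
func fanOutDemo(payload string) (string, string, error) {
	var a, b string
	collect := func(dst *string) func(io.Reader) error {
		return func(r io.Reader) error {
			data, err := io.ReadAll(r)
			*dst = string(data)
			return err
		}
	}
	err := fanOut(strings.NewReader(payload), collect(&a), collect(&b))
	return a, b, err
}

func main() {
	a, b, err := fanOutDemo("streamed once, consumed twice")
	fmt.Println(a == b, err)
}
```

The trade-off is that both consumers must run concurrently, so an error in one has to be propagated to the other, which is part of what makes this harder than it first looks.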
How about this then: (AFAIK) you can check the payload-size and for anything above |
About using TeeWriter, I've tried that without success... The good fix for this is to move |
@bkcsoft I have a PR which adds what you suggested but the value of I have lost the original repo during the process of transitioning the Github fork to Gitea from Gogs (Local repo is still there, though I am afraid it's impossible to restore the tracking for Github). I would have to either resend this PR or send another PR addressing the threshold after this getting merged. Which means, I will probably leave the PR as-is and send follow-ups for |
If you're going to define a new constant, please use CamelCase.
I have moved it to 1.1.0 for now.
For the record, @unknwon merged this in Gogs: gogs/gogs#2960
A stopgap would be to write to a temporary file in another goroutine and write the action information to the database. Writing to disk every time will slow down operations, so I don't think merging this directly is a good idea.
Is there any opportunity to handle this more gracefully (i.e. to avoid copying completely) if those operations are replaced with git library calls instead of fork/exec to |
I think @bkcsoft's idea is the right direction.
@lunny using
|
I think this should be closed. Let's discuss in #218.
* Revert YAML-rendering of markdown files * tmp
Fixes #218