Skip to content

Conversation

typeless
Copy link
Contributor

@typeless typeless commented Nov 22, 2016

Not perfect but I think it's a reasonable solution.
For small request bodies, I suppose performance wouldn't be an issue.
For large ones, this seems to be a necessary evil.

Fixes #218

Not perfect but I think it's a reasonable solution.
For small request bodies, I suppose performance wouldn't be an issue.
For large ones, this seems to be a necessary evil.
@typeless
Copy link
Contributor Author

Any open issue for this?

@typeless typeless changed the title Develop Use temporary file to avoid out-of-memory when receiving big chunks. Nov 22, 2016

if h.cfg.OnSucceed != nil {
input, err = ioutil.ReadAll(reqBody)
tmpfile, err := ioutil.TempFile("", "gogs")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't found info about the permissions that TempFile would put on the file, I guess it should be checked

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codecov-io
Copy link

codecov-io commented Nov 22, 2016

Current coverage is 3.03% (diff: 100%)

Merging #216 into master will not change coverage

@@            master      #216   diff @@
========================================
  Files           33        33          
  Lines         8106      8106          
  Methods          0         0          
  Messages         0         0          
  Branches         0         0          
========================================
  Hits           246       246          
  Misses        7840      7840          
  Partials        20        20          

Powered by Codecov. Last update cb16028...a0ca633

@strk
Copy link
Member

strk commented Nov 22, 2016

No issue that I know of, please leave here a reference to the Gogs PR which in turn has more refs to Gogs issue and if you feel like it, also add a local issue

@lunny
Copy link
Member

lunny commented Nov 22, 2016

I don't think it's a good idea to use temporary file always since most time, the upload data is small data on git situation. For large files, lfs maybe a better choice and we have a merge-ready PR #122 .

@typeless
Copy link
Contributor Author

typeless commented Nov 22, 2016

@lunny I am genuinely open to a better solution but I am afraid that LFS cannot solve the problem if the repository in huge size consists of many small files rather than some big binary blobs.

Maybe someone else has experiences about how other similar projects deal with such issue?

@lunny
Copy link
Member

lunny commented Nov 22, 2016

I think this PR maybe resolve #218 . Could we chose memory or file according the content-size ?

@thibaultmeyer
Copy link
Contributor

thibaultmeyer commented Nov 22, 2016

I think a max size must be defined (maybe on app.ini) to avoid app crash or other kind of strange issue if free space is not enough to store temp data.

@tboerger
Copy link
Member

The current approach will slow down all writes, and that's sad

@lunny
Copy link
Member

lunny commented Nov 22, 2016

^
That's what I want to say. And most times, it will slow down.

@thibaultmeyer
Copy link
Contributor

Id not possible to process incoming data as stream rather than just wait upload was completed ?

@lunny
Copy link
Member

lunny commented Nov 22, 2016

Maybe we can use this library to do something https://github.com/edsrzf/mmap-go

@strk
Copy link
Member

strk commented Nov 22, 2016

A stream oriented approach would be best, but it isn't trivial to implement (me and @bkcsoft tried it with no much luck) - if I recall correctly it had to do with the way the code was written, that's to say a refactoring might make it easier to do stream-oriented. Want to give it another try ?

@lunny
Copy link
Member

lunny commented Nov 22, 2016

I think mmap is the correct method to resolve it. Keep the performance and resolve the large file bug.

@typeless
Copy link
Contributor Author

typeless commented Nov 23, 2016

If Windows support is not an issue then open(2) with O_TMPFILE could be an option as well. ( I am not sure how well the status of cross-platform support of Gitea is)

@tboerger
Copy link
Member

Gitea MUST work on Windows as well

@typeless
Copy link
Contributor Author

typeless commented Nov 23, 2016

@tboerger If that's the case, then O_TMPFILE and mmap would introduce some extra platform-dependent complexities.

@strk
Copy link
Member

strk commented Nov 23, 2016

I'm replying to this comment which I don't find on the web ui
anymore (deleted?):

On Tue, Nov 22, 2016 at 06:12:49PM -0800, typeless wrote:

Just took a glance and it seems a stream-oriented approach would be more probable if we can change the function signature of the [OnSucceed](https://github.com/go-gitea/gitea/blob/master/routers/repo/http.go#L233) from func(rpc string, input []byte) to func(rpc string, input io.Reader).

This kind of refactoring did sound like a good idea, did you
hit a roadblock that made you remove the comment ?

@typeless
Copy link
Contributor Author

@strk Yeah, I did delete the comment right after I realized that it's not just simply changing the function signature. The problem is that we have to read though the stream twice and the reader is not necessarily seekable.

Need to take a look at the underlying mechanism of the origin of the request body (how the data is connected/copied to the io.Reader).

@lunny
Copy link
Member

lunny commented Nov 23, 2016

This is only because we maybe called h.cfg.OnSucceed. Or we can just let cmd.Stdin = reqBody.

@strk
Copy link
Member

strk commented Nov 23, 2016 via email

@typeless
Copy link
Contributor Author

typeless commented Nov 23, 2016

This is only because we maybe called h.cfg.OnSucceed. Or we can just let cmd.Stdin = reqBody.

@lunny Yeah, hopefully someone knows about why it has to be a two-pass operation in the first place.

@typeless
Copy link
Contributor Author

Maybe there's the possibility of using a Tee

@strk io.Pipe() + goroutines seems to be the straightforward way. But I guess that would make the Read of Tee blocking if the other end of read in OnSucceed isn't running simultaneously.

@bkcsoft
Copy link
Member

bkcsoft commented Nov 23, 2016

How about this then: (AFAIK) you can check the payload-size and for anything above MAX_HTTP_PAYLOAD would go into a TempFile ?

@tboerger tboerger added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 23, 2016
@bkcsoft
Copy link
Member

bkcsoft commented Nov 23, 2016

About using TeeWriter, I've tried that without success... The good fix for this is to move h.Cfg.OnSucced to .git/hooks/post-recieve (where it belongs), but doing that would make said git-hook unavailable to the User/Admin (i.e. break backwards-compat).

@typeless
Copy link
Contributor Author

typeless commented Nov 24, 2016

@bkcsoft I have a PR which adds what you suggested but the value of MAX_HTTP_PAYLOAD is still a TBD.
Any suggestions on the value and the location of the constant definition?

I have lost the original repo during the process of transitioning the Github fork to Gitea from Gogs (Local repo is still there, though I am afraid it's impossible to restore the tracking for Github). I would have to either resend this PR or send another PR addressing the threshold after this getting merged.

Which means, I will probably leave the PR as-is and send follow-ups for MAX_HTTP_PAYLOAD if that's the agreed direction.

@tboerger tboerger added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 24, 2016
@strk
Copy link
Member

strk commented Nov 24, 2016 via email

@tboerger tboerger added this to the 1.1.0 milestone Nov 24, 2016
@tboerger tboerger added the type/enhancement An improvement of existing functionality label Nov 24, 2016
@tboerger
Copy link
Member

I have moved it to 1.1.0 for now.

@strk
Copy link
Member

strk commented Dec 24, 2016

For the record, @unkwon merged this in Gogs: gogs/gogs#2960

@lunny
Copy link
Member

lunny commented Dec 24, 2016

A temporary method is to write to a template file in an another goroutine and write actions information to database. Every time write disk will slow the operations. So I don't think merge this directly is a good idea.

@typeless
Copy link
Contributor Author

Is there any opportunity to handle this more gracefully (i.e. to avoid copying completely) if those operations are replaced with git library calls instead of fork/exec to git? Just saying.

@lunny
Copy link
Member

lunny commented Dec 24, 2016

I think @bkcsoft 's idea is the good direction.

About using TeeWriter, I've tried that without success... The good fix for this is to move h.Cfg.OnSucced to .git/hooks/post-recieve (where it belongs), but doing that would make said git-hook unavailable to the User/Admin (i.e. break backwards-compat).

@bkcsoft
Copy link
Member

bkcsoft commented Dec 25, 2016

@lunny using post-recieve-hook would be 2 PRs in this case

  1. move the user-provided hook somewhere else (hooks/gitea-post-recieve?) and have hooks/post-recieve call that script
  2. move this code into hooks/post-recieve after user-provided script is ran

@lunny
Copy link
Member

lunny commented Feb 5, 2017

I think this should be closed. And let's discuss in #218

@lunny lunny closed this Feb 5, 2017
@tboerger tboerger added reviewed/invalid and removed type/enhancement An improvement of existing functionality lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Feb 10, 2017
@tboerger tboerger removed this from the 1.1.0 milestone Feb 10, 2017
ethantkoenig added a commit to ethantkoenig/gitea that referenced this pull request Sep 18, 2017
* Revert YAML-rendering of markdown files

* tmp
@go-gitea go-gitea locked and limited conversation to collaborators Nov 23, 2020
@delvh delvh added issue/not-a-bug The reported issue is the intended behavior or the problem is not inside Gitea and removed reviewed/invalid labels Oct 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

issue/not-a-bug The reported issue is the intended behavior or the problem is not inside Gitea

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pushing a large repo with http leads to crash

8 participants