Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memleak during multiple file upload when authentication is required #24

Closed
ilyalavrinov opened this issue Sep 21, 2018 · 8 comments · Fixed by #50 or #71
Closed

memleak during multiple file upload when authentication is required #24

ilyalavrinov opened this issue Sep 21, 2018 · 8 comments · Fixed by #50 or #71

Comments

@ilyalavrinov
Copy link

Hello Collaborators,

Describe the bug
A short description of what you think the bug is.

Software

  • OS: Linux Mint 19 Tara
  • Golang: go version go1.10.1 linux/amd64
  • Version: latest studio-b12/gowebdav (loaded via go get)

To Reproduce
I've started to build a small tool which would allow file syncing via WebDAV. It's not fully ready but at least it allows to scan a folder and create a list of files which should be uploaded. The flow of the program now is the following:

  1. Create list of files for uploading
  2. Create N goroutines, each has its own gowebdav.Client
  3. Filenames for uploading are sent to these goroutines via channels (fan-out), then each file is uploaded independently via Client.WriteStream. The file is closed via defer.
  4. I used WebDAV API for Yandex.Disk, but I believe the same problem could be reproduced with any server with configured Basic authentication.

What I observe is enormous RAM usage by my tool (e.g. even if a folder contains files for 1G, it can easily take up to several gigabytes of RAM). I've tried to narrow the scope of the problem and here's what I've got:

  • GODEBUG=gctrace=1 shows that upon each gc cycle it could collect 0 space, and the space constantly grows
  • pprof.WriteHeapProfile at the end gives me a profile which shows that there's some inuse_space related to TeeReader from requests, though acquired by persistConn inside of http.Transport. It's not several gigabytes, but up to several hundred megabytes.
  • I tried to use Client.SetTransport() providing my http.Transport with reduced IdleConnTimeout, MaxIdleConns, DisableKeepAlive and other similar things like explicit call . Nothing changed.
  • Finally, I reverted gowebdav to revision prior to the fix for issue 401 Unauthorized when trying to login to Apache HTTPS Server #14 - no additional memory has been taken (memory footprint was just about several megabytes according to gctrace)

I believe there's something wrong with current usage of TeeReader introduced in #b45378c. It looks like persistConn inside of a Transport somehow doesn't allow to collect the unused objects (i.e even if a new file gets uploaded, the data for the old one doesn't get cleared). Unfortunately I'm pretty new to Go and couldn't resolve the issue by myself.

Expected
Well, no memleak

Additional context
Just in case you're curious how do I use gowebdav exactly, here is the tool I'm using.

Please let me know if you need any additional info.

@chripo
Copy link
Member

chripo commented Sep 24, 2018

thanks for reporting. i'll take a look.

@chripo
Copy link
Member

chripo commented Sep 27, 2018

i took a look inside your code and ours.
everything seems to be fine, or i can't spot the error atm.

we don't use a global state or caching or something else, therefore GO should collect all memory.
and it does it, but in it's on own way.

if some memory gets collected it's still available to the the application, so the application can use it again without acquire more from the system, which is much cheaper.

you can force it by calling FreeOsMemory

@ilyalavrinov
Copy link
Author

Thanks! I'll take a look at "runtime/debug" to dig deeper into this issue and probably collect some more data. However it still seems a bit strange that:

  • in revision b45378c of gowebdav I do not observe so huge memory usage (with the same version of my app)
  • the GC does not give back the memory to the system even if I tried to clear idle connections and other background entities.

Anyway, I'll play a bit with tools from "runtime/debug", double-check that the issue is present in the latest version of gowebdav and maybe narrow down the issue with some simple example + test config for local nginx/apache. I'll come back in a few days if I get any results.

@chripo
Copy link
Member

chripo commented Sep 28, 2018

Yes the memory consumption should be doubled since #14 because of teeing the body, which sucks badly. We still need to refactor this case and do some strategy for servers which prefere the auth in this way, see #15.

defere bb.Reset() may speed up GC.

@chripo
Copy link
Member

chripo commented Oct 21, 2018

@admirallarimda did you took a deeper look?

@ilyalavrinov
Copy link
Author

Unfortunately, not yet. I'll definitely try it out some day, but currently I cannot predict when I would have time for this.
If you think everything looks fine, feel free to close the issue. Once I dig deeper into the issue and prove that there's a real problem, I'll reopen it or send a PR if I manage to fix it somehow.

@ochanism
Copy link

bb := io.TeeReader(body, &ba)

TeeReader requires as much memory as the requested file size. This code needs to be improved to avoid out of memory when uploading large-size files.

@fghezzo
Copy link

fghezzo commented Jan 21, 2020

I also found the memory leak reported in this issue.
As a solution, after this line:

if rs.StatusCode == 401 && c.auth.Type() == "NoAuth" {

I've read the body of the response as it wasn't collected by GC. This would resolve the memory leak and also a not reported goroutine leak.

_, err = ioutil.ReadAll(rs.Body)
if err != nil {
	return nil, err
}

@chripo chripo linked a pull request Nov 8, 2021 that will close this issue
chripo added a commit that referenced this issue Feb 3, 2023
The main goal was to simplify the `req` method in
`request.go` and making it easier to add more authentication methods.

All the magic went into the `auth.go` file.

This feature introduces an `Authorizer` which acts as an
`Authenticator` factory. Under the hood it creates an authenticator
shim per request, which delegates the authentication flow
to our authenticators.

The authentication flow itself is broken down into `Authorize' and
`Verify' steps to encapsulate and control complex authentication challenges.

Furthermore, the default `NewAutoAuth' authenticator can be overridden
by a custom implementation for more control over flow and resources.
The `NewBacisAuth` Authorizer gives us the feel of the old days.
chripo added a commit that referenced this issue Feb 3, 2023
The main goal was to simplify the `req` method in
`request.go` and making it easier to add more authentication methods.

All the magic went into the `auth.go` file.

This feature introduces an `Authorizer` which acts as an
`Authenticator` factory. Under the hood it creates an authenticator
shim per request, which delegates the authentication flow
to our authenticators.

The authentication flow itself is broken down into `Authorize' and
`Verify' steps to encapsulate and control complex authentication challenges.

Furthermore, the default `NewAutoAuth' authenticator can be overridden
by a custom implementation for more control over flow and resources.
The `NewBacisAuth` Authorizer gives us the feel of the old days.
chripo added a commit that referenced this issue Feb 3, 2023
The main goal was to simplify the `req` method in
`request.go` and making it easier to add more authentication methods.

All the magic went into the `auth.go` file.

This feature introduces an `Authorizer` which acts as an
`Authenticator` factory. Under the hood it creates an authenticator
shim per request, which delegates the authentication flow
to our authenticators.

The authentication flow itself is broken down into `Authorize' and
`Verify' steps to encapsulate and control complex authentication challenges.

Furthermore, the default `NewAutoAuth` authenticator can be overridden
by a custom implementation for more control over flow and resources.
The `NewBacisAuth` Authorizer gives us the feel of the old days.
chripo added a commit that referenced this issue Feb 3, 2023
The main goal was to simplify the `req` method in
`request.go` and making it easier to add more authentication methods.

All the magic went into the `auth.go` file.

This feature introduces an `Authorizer` which acts as an
`Authenticator` factory. Under the hood it creates an authenticator
shim per request, which delegates the authentication flow
to our authenticators.

The authentication flow itself is broken down into `Authorize` and
`Verify' steps to encapsulate and control complex authentication challenges.

Furthermore, the default `NewAutoAuth` authenticator can be overridden
by a custom implementation for more control over flow and resources.
The `NewBacisAuth` Authorizer gives us the feel of the old days.
chripo added a commit that referenced this issue Feb 3, 2023
The main goal was to simplify the `req` method in
`request.go` and making it easier to add more authentication methods.

All the magic went into the `auth.go` file.

This feature introduces an `Authorizer` which acts as an
`Authenticator` factory. Under the hood it creates an authenticator
shim per request, which delegates the authentication flow
to our authenticators.

The authentication flow itself is broken down into `Authorize` and
`Verify' steps to encapsulate and control complex authentication challenges.

Furthermore, the default `NewAutoAuth` authenticator can be overridden
by a custom implementation for more control over flow and resources.
The `NewBacisAuth` Authorizer gives us the feel of the old days.
@chripo chripo linked a pull request Feb 11, 2023 that will close this issue
chripo added a commit that referenced this issue Feb 13, 2023
The changes simplify the `req` method by moving the
authentication-related code into the API.
This makes it easy to add additional authentication methods.

The API introduces an `Authorizer` that acts as an
authenticator factory. The authentication flow itself
is divided down into `Authorize` and `Verify` steps in order
to encapsulate and control complex authentication challenges.

The default `NewAutoAuth` negotiates the algorithms.
Under the hood, it creates an authenticator shim per request,
which delegates the authentication flow to our authenticators.

The `NewEmptyAuth` and `NewPreemptiveAuth` authorizers
allow you to have more control over algorithms and resources.

The API also allows interception of the redirect mechanism by setting
the `XInhibitRedirect` header.

This closes: #15 #24 #38
chripo added a commit that referenced this issue Jun 22, 2023
The changes simplify the `req` method by moving the
authentication-related code into the API.
This makes it easy to add additional authentication methods.

The API introduces an `Authorizer` that acts as an
authenticator factory. The authentication flow itself
is divided down into `Authorize` and `Verify` steps in order
to encapsulate and control complex authentication challenges.

The default `NewAutoAuth` negotiates the algorithms.
Under the hood, it creates an authenticator shim per request,
which delegates the authentication flow to our authenticators.

The `NewEmptyAuth` and `NewPreemptiveAuth` authorizers
allow you to have more control over algorithms and resources.

The API also allows interception of the redirect mechanism by setting
the `XInhibitRedirect` header.

This closes: #15 #24 #38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants