Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PutObject takes lot of memory when client logmode aws.LogRequest is enabled #1618

Closed
3 tasks done
prembhaskal opened this issue Mar 9, 2022 · 6 comments
Closed
3 tasks done
Assignees
Labels
bug This issue is a bug. dependencies This issue is a problem in a dependency.

Comments

@prembhaskal
Copy link

Documentation

Describe the bug

Hi,
With the client having ClientLogMode set to aws.LogRequest and putObject request , the client consumes memory equal to the size of the data. When log mode is not set, the client consumes only minimal memory.

Note that I understand that we should not use PutObject for files bigger than 100MB and use upload instead. But this issue exists nevertheless.

Expected behavior

It is expected that when ClientLogMode is set to LogRequestWithBody, it will consume memory equal to size of the data.
but when ClientLogMode is set to LogRequest, it should not consume memory as per the data size.

Current behavior

When clientMode is set to aws.LogRequest and putObject is done using the client, the client consumes memory equal to the size of the data(file in my case). Since only log request is set, (and not logRequestWithBody), this is somewhat unexpected.

Steps to Reproduce

code:
go code with github.com/aws/aws-sdk-go-v2 v1.13.0

package main

import (
	"context"
	"errors"
	"flag"
	"fmt"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	s3cred "github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/aws-sdk-go-v2/service/s3/types"
)

var testbucket = "large-file-test"

func main() {
	fmt.Printf("start log request test\n")

	var url string
	var accessKey string
	var secretKey string
	var logrequest bool
	var filePath string
	var key string

	flag.StringVar(&url, "url", "", "s3 url")
	flag.StringVar(&accessKey, "access", "", "access key")
	flag.StringVar(&secretKey, "secret", "", "secret key")
	flag.BoolVar(&logrequest, "logrequest", false, "if you want to log request")
	flag.StringVar(&filePath, "filepath", "", "file to be put")
	flag.StringVar(&key, "key", "", "name of key")
	flag.Parse()

	var logmode aws.ClientLogMode
	if logrequest {
		logmode |= aws.LogRequest
	}

	client, err := createS3Client(url, accessKey, secretKey, logmode)
	if err != nil {
		fmt.Printf("error client create: %v\n", err)
		return
	}

	putObject(client, filePath, key)
}

func putObject(client *s3.Client, filePath, key string) {
	err := createBucketIfNotExists(client, testbucket)
	if err != nil {
		fmt.Printf("error creating bucket: %v\n", err)
		return
	}

	largefile, err := os.Open(filePath)
	if err != nil {
		fmt.Printf("error open file: %v", err)
		return
	}
	defer largefile.Close()

	putObjectInput := &s3.PutObjectInput{
		Bucket:        &testbucket,
		Key:           &key,
		Body:          largefile,
		ContentLength: -1,
	}
	_, err = client.PutObject(context.TODO(), putObjectInput)
	if err != nil {
		fmt.Printf("error in putobject: %v\n", err)
		return
	}
	fmt.Printf("putobject success\n")
}

func createBucketIfNotExists(client *s3.Client, bucket string) error {
	params := &s3.CreateBucketInput{
		Bucket: &bucket,
	}
	_, err := client.CreateBucket(context.TODO(), params)

	var errExists *types.BucketAlreadyExists
	var errOwned *types.BucketAlreadyOwnedByYou
	if err != nil {
		if errors.As(err, &errExists) || errors.As(err, &errOwned) {
			return nil
		}
		return fmt.Errorf("error create bucket: %w", err)
	}
	return nil
}

func createS3Client(url, accessKey, secretKey string, logmode aws.ClientLogMode) (*s3.Client, error) {
	customCredentialResolver := s3cred.NewStaticCredentialsProvider(
		accessKey,
		secretKey,
		"")

	customEndpointResolver := aws.EndpointResolverWithOptionsFunc(
		func(service, region string, options ...interface{}) (aws.Endpoint, error) {
			return aws.Endpoint{
				PartitionID:       "aws",
				URL:               url,
				HostnameImmutable: true,
			}, nil
		})

	cfg, err := config.LoadDefaultConfig(context.Background(),
		config.WithRegion(""),
		config.WithCredentialsProvider(customCredentialResolver),
		config.WithEndpointResolverWithOptions(customEndpointResolver),
		config.WithClientLogMode(logmode),
	)
	if err != nil {
		return nil, err
	}

	return s3.NewFromConfig(cfg), nil
}

run using any of the S3 service, eg. minio or AWS-S3
steps to reproduce issues

$ # create 1GB file
$ head -c 1G /dev/urandom > /var/tmp/large_file_1gb.out
$ # running locally using minio, for reproducing use any of the S3 service 
$ # limit memory for session to reproduce issue
$ ulimit -d 512000
$ ulimit -m 512000

$ # first run with clientmode not set, program runs fine.
$ go run main.go --url http://localhost:9000 --access access_key --secret secret_key --filepath /var/tmp/large_file_1gb.out --key testfile
start log request test
putobject success

$ # run with clientmode set to logRequest. program runs out of memory (output is truncated , whole output attached to the issue)
$ go run main.go --url http://localhost:9000 --access access_key --secret secret_key --filepath /var/tmp/large_file_1gb.out --key testfile1 --logrequest
start log request test
SDK 2022/03/09 19:42:37 DEBUG Request
PUT /large-file-test HTTP/1.1
Host: localhost:9000
User-Agent: aws-sdk-go-v2/1.13.0 os/linux lang/go/1.17.6 md/GOOS/linux md/GOARCH/amd64 api/s3/1.24.0
Content-Length: 0
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: 20beb94c-84df-470f-aa9b-82d83c3a6a9e
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=access_key/20220309//s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date, Signature=9d853d93cde4b41059aca291d3e2443df36f5b4aaeda4b20ec5d749285660df8
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220309T141237Z

fatal error: out of memory allocating heap arena metadata
...
goroutine 1 [select]:
net/http.(*persistConn).roundTrip(0xc0000e2900, 0xc0000d4940)
	/usr/local/go/src/net/http/transport.go:2614 +0x97d
net/http.(*Transport).roundTrip(0xc0000ec140, 0xc0000f2300)
	/usr/local/go/src/net/http/transport.go:594 +0x7d1
net/http.(*Transport).RoundTrip(...)
	/usr/local/go/src/net/http/roundtrip.go:18
net/http/httputil.DumpRequestOut(0xc0000f2300, 0x20)
	/usr/local/go/src/net/http/httputil/dump.go:146 +0x6b7
github.com/aws/smithy-go/transport/http.(*RequestResponseLogger).HandleDeserialize(0xc00009a548, {0x811020, 0xc00009cff0}, {{0x783a80, 0xc00009ce40}}, {0x809440, 0xc0000bc680})
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/smithy-go/transport/http/middleware_http_logging.go:42 +0xfb
github.com/aws/smithy-go/middleware.decoratedDeserializeHandler.HandleDeserialize(...)
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/smithy-go/middleware/step_deserialize.go:206
github.com/aws/aws-sdk-go-v2/aws/middleware.RecordResponseTiming.HandleDeserialize({}, {0x811020, 0xc00009cff0}, {{0x783a80, 0xc00009ce40}}, {0x8093a0, 0xc0000be7e0})
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/aws-sdk-go-v2/aws/middleware/middleware.go:57 +0x7a
github.com/aws/smithy-go/middleware.decoratedDeserializeHandler.HandleDeserialize(...)
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/smithy-go/middleware/step_deserialize.go:206
github.com/aws/aws-sdk-go-v2/service/s3.(*awsRestxml_deserializeOpPutObject).HandleDeserialize(0x790858, {0x811020, 0xc00009cff0}, {{0x783a80, 0xc00009ce40}}, {0x8093a0, 0xc0000be800})
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/aws-sdk-go-v2/service/s3/deserializers.go:10360 +0x68
github.com/aws/smithy-go/middleware.decoratedDeserializeHandler.HandleDeserialize(...)
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/smithy-go/middleware/step_deserialize.go:206
github.com/aws/aws-sdk-go-v2/service/internal/s3shared.(*metadataRetriever).HandleDeserialize(0xc0000ba480, {0x811020, 0xc00009cff0}, {{0x783a80, 0xc00009ce40}}, {0x8093a0, 0xc0000be820})
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/aws-sdk-go-v2/service/internal/s3shared/metadata_retriever.go:31 +0x5a
github.com/aws/smithy-go/middleware.decoratedDeserializeHandler.HandleDeserialize(...)
	/home/prem/go/src/repo/testcode/vendor/github.com/aws/smithy-go/middleware/step_deserialize.go:206
github.com/aws/aws-sdk-go-v2/service/internal/s3shared.(*errorWrapper).HandleDeserialize(0x10, {0x811020, 0xc00009cff0}, {{0x783a80, 0xc00009ce40}}, {0x8093a0, 0xc0000be840})

The complete command with output is attached here.
aws_sdk_go_logrequest_issue.txt

Thanks,
Prem

Possible Solution

I am not really sure, some fix needed in file vendor/github.com/aws/smithy-go/transport/http/middleware_http_logging.go and function func (r *RequestResponseLogger) HandleDeserialize( ctx context.Context, in middleware.DeserializeInput, next middleware.DeserializeHandler, )

AWS Go SDK version used

1.13.0

Compiler and Version used

go version go1.17.6 linux/amd64

Operating System and version

Linux prem-lubuntu 5.4.0-100-generic #113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

@prembhaskal prembhaskal added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 9, 2022
@vudh1 vudh1 self-assigned this Mar 9, 2022
@skmcgrail
Copy link
Member

skmcgrail commented Mar 9, 2022

From a quick glance at the net/http/httputil standard library code for DumpRequestOut implementation, I believe you are running into this comment: https://github.com/golang/go/blob/release-branch.go1.17/src/net/http/httputil/dump.go#L157-L161

If we used a dummy body above, remove it now.
TODO: if the req.ContentLength is large, we allocate memory
unnecessarily just to slice it off here. But this is just
a debug function, so this is acceptable for now. We could
discard the body earlier if this matters.

While it doesn't actually read the body it still fills in the body of the dummy request with the same amount of content just replace with all x runes.

At first glance it looks like you setting -1 on the ContentLength, as potentially trying to work around this behavior, but the SDK treats -1 as an uninitialized request ContentLength, and given you are providing a seekable body as part of the request, it will use the provided bodies true content length. That is determined by the SeekCurrent position of the file cursor when it's passed in to the SeekEnd point of the file (EOF). To not do so would be an HTTP error if the content-length header didn't match the actual body of the request.

So TLDR, it looks like this is an issue with how the standard library handles debug logging of requests, and is more likely a request to the upstream Go compiler on trying to improve this code path for logging the request without the body and additional memory allocations.

@skmcgrail skmcgrail added dependencies This issue is a problem in a dependency. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Mar 9, 2022
@skmcgrail skmcgrail assigned skmcgrail and unassigned vudh1 Mar 10, 2022
@prembhaskal
Copy link
Author

prembhaskal commented Mar 10, 2022

So TLDR, it looks like this is an issue with how the standard library handles debug logging of requests, and is more likely a request to the upstream Go compiler on trying to improve this code path for logging the request without the body and additional memory allocations.

Then can we write a DumpRequest of our own in aws-sdk-go-v2 and not use the standard library.
I mean because of this bug I cannot enable the RequestLog in the production for debugging, it can potentially kill my pod/container.

Meanwhile, let me try to raise an issue to golang and see if I can convince them to fix this issue somehow 😄

@skmcgrail
Copy link
Member

I think the issue is better solved in the standard library, and I’ve produced a fix that I can work on getting upstreamed, but if accepted I’m not sure when it would make it into the compiler release.

@skmcgrail skmcgrail removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 10, 2022
@prembhaskal
Copy link
Author

hmm ok, let us take it to Go. Rewriting all that here in AWS client will be a much bigger task anyway. I hope we can make it before Go 19. 🤞

@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@skmcgrail
Copy link
Member

golang/go#51662 is the PR I have put together for this as an FYI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. dependencies This issue is a problem in a dependency.
Projects
None yet
Development

No branches or pull requests

3 participants