Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid memory address or nil pointer dereference #48

Closed
adri opened this issue Sep 4, 2020 · 13 comments
Closed

Invalid memory address or nil pointer dereference #48

adri opened this issue Sep 4, 2020 · 13 comments

Comments

@adri
Copy link
Contributor

adri commented Sep 4, 2020

I'm seeing this error in production in combination with skipper. Unfortunately I wasn't able to get the exact request yet.
Do you maybe have some pointers (pun intended) how to debug this?

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xa9dcd9]
goroutine 57768117 [running]:
github.com/yookoala/gofast.(*client).readResponse.func1(0xc00ac4e000, 0xc00a84aea0, 0xc00a81d6c0, 0xc004484540)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:211 +0x29
created by github.com/yookoala/gofast.(*client).readResponse
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:208 +0xad

Relevant code segment:
https://github.com/yookoala/gofast/blob/v0.4.0/client.go#L199-L239

@adri
Copy link
Contributor Author

adri commented Sep 4, 2020

@yookoala
Copy link
Owner

yookoala commented Sep 9, 2020

This seems strange. Does it happen occasionally or does it happen to every requests?

@yookoala
Copy link
Owner

yookoala commented Sep 9, 2020

The file did not show how you used your RoundTripper created by NewRoundTripper. Is it use-and-throw or do you reuse it multiple times?

As you can see, you'd close the client after each RoundTripper.RoundTrip() call. If you reuse that RoundTripper, you'd probably be using a handler on an already closed client connection, which would become nil after Client.Close().

The design of gofast is to have the ClientFactory reused, not the Client.

@yookoala
Copy link
Owner

yookoala commented Sep 9, 2020

I'll have a more friendly error for this situation in the future.

@ruudk
Copy link
Contributor

ruudk commented Sep 9, 2020

@yookoala Thanks for your thorough investigation on this issue. I'm the author of the gofast implementation in Skipper. I will try to create a PR to include your suggestion.

@ruudk
Copy link
Contributor

ruudk commented Sep 9, 2020

I checked, and the RoundTripper is actually created for every single request. It is never reused.

@yookoala
Copy link
Owner

yookoala commented Sep 9, 2020

Hmm... I'll do some further check.

@yookoala yookoala added the bug label Sep 9, 2020
@adri
Copy link
Contributor Author

adri commented Sep 10, 2020

I'm not sure if it's related, but before the segmentation violation there are sometimes timeouts.
The skipper instance runs in Kubernetes where the memory usage keeps rising over time. I also see sometimes out of memory kills (OOM). There might be a memory leak somewhere. I did not profile it yet, hope to get to it next week.

[APP]time="2020-09-10T08:48:14Z" level=error msg="Failed to roundtrip to fastcgi: gofast: error stream from application process gofast: timeout or canceled"
[APP]time="2020-09-10T08:48:14Z" level=error msg="error while proxying after 289.780249ms, route kube_some_route with backend lb fastcgi://10.0.84.101:9000, status code 500: dialing failed false: gofast: error stream from application process gofast: timeout or canceled, remote host: some.ip.address, request: \"GET /search/autocomplete HTTP/1.1\", user agent: \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/some.ip.address Safari/537.36\""
some.ip.address - - [10/Sep/2020:08:48:14 +0000] "GET /search/autocomplete?query=mill%C3%A9si HTTP/1.1" 500 22 "https://some-domain" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/some.ip.address Safari/537.36" 289 some-domain - -
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xa9dcd9]
goroutine 14961651 [running]:
github.com/yookoala/gofast.(*client).readResponse.func1(0xc007976000, 0xc006f16e80, 0xc006c49400, 0xc006357bc0)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:211 +0x29
created by github.com/yookoala/gofast.(*client).readResponse
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:208 +0xad

@adri
Copy link
Contributor Author

adri commented Sep 15, 2020

I'm not working often in Golang, please forgive me if what I say makes no sense. I've been looking for potential memory that is not cleaned up.

  1. Maybe c.ids.Release(reqID) is not called in certain error cases
    The ID is allocated and later released. What if there is an error like this, will the ID be released back in the pool?

  2. Looking at one Skipper pod I see the typical saw-tooth graph of a memory leak.
    I'm not sure if this has anything to do with the gofast usage or something in Skipper. If you have a tip how I can find out more please let me know.
    Screen Shot 2020-09-15 at 09 14 19
    I can see goroutines increasing:
    Screen Shot 2020-09-15 at 09 07 56
    And live objects:
    Screen Shot 2020-09-15 at 09 15 02

@adri
Copy link
Contributor Author

adri commented Sep 15, 2020

The design of gofast is to have the ClientFactory reused, not the Client.

@yookoala mentioned that the ClientFactory should be reused in the comment.

However, if I read it right, the ClientFactory is created on every request.

@yookoala @ruudk Should this not be changed in Skipper to only create the ClientFactory once?
See this line in skipper.


I think the problem is with creating new request IDs. When the ClientFactory is not reused, I guess that there are new ID pools created on every request?

Explanation
When getting a diff between two goroutine dumps I see many goroutines in newIDs.func1:

goroutine 75451 [chan send, 10 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001323020, 0xffff)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:100 +0x78

goroutine 68511 [chan send, 11 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc002043500, 0xffff)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:100 +0x78

goroutine 53003 [chan send, 12 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001f52f60, 0xffff)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:100 +0x78

goroutine 120002 [chan send, 6 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001a0cba0, 0xffff)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:100 +0x78

goroutine 169080 [chan send, 2 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc000039bc0, 0xffff)
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
	/go/pkg/mod/github.com/yookoala/gofast@v0.4.0/client.go:100 +0x78

How to dump

  1. Configure -enable-profile for Skipper
  2. Proxy to a Skipper pod kubectl port-forward skipper-ingress-pzx2b 9911:9911
  3. Go to http://127.0.0.1:9911/debug/pprof/ and download a dump full goroutine stack dump, wait a few minutes, and download another dump
  4. Install goroutine-inspect and run it $GOPATH/bin/goroutine-inspect
  5. Make a diff, (l = only in original, c = in both, r = in original 2)
    original = load("goroutine.dump")
    original2 = load("goroutine2.dump")
    l, c, r = original.diff(original2)
    c.show()
    

@ruudk
Copy link
Contributor

ruudk commented Sep 15, 2020

Skipper creates a new client for every request. This is because Skipper handles a lot of different traffic to different servers. So only when the requests comes in, it knows where to send it to. It might be something that we can optimize in the future, but it is what it is right now.

We found out that when gofast creates a new client, it also spawns a goroutine that handles the ID pool. These are the leaking goroutines.

When Skipper sends the request to gofast, it never calls client.Close(). We need to fix that in Skipper for sure, but even if it would properly close the client, gofast wouldn't close the idpool. Therefore, we created a PR to address that issue.

When that is merged and tagged, we can create a PR for Skipper to properly call client.Close() when the request is done.

@yookoala
Copy link
Owner

@adri: The memory leak issue is now tracking at #69.

@yookoala
Copy link
Owner

yookoala commented Feb 20, 2021

I think I can stably reproduce this "Invalid memory address or nil pointer dereference" issue with this test:

func TestClient_close_before_readRequest(t *testing.T) {
	// Test closing client before writeRequest
	//
	// Note: closing connection could cause client.conn nil
	//       writing after close could cause error

	// create a temp dummy fastcgi application server
	p, err := newAppServer("client.test.sock", func(w http.ResponseWriter, r *http.Request) {
		t.Logf("accessing FastCGI process")
		fmt.Fprintf(w, "hello world")
	})
	if err != nil {
		t.Fatalf("Unable to start server: %s", err.Error())
	}
	defer p.Close()

	// create reusable client factory
	c, err := gofast.SimpleClientFactory(
		gofast.SimpleConnFactory(p.Network(), p.Address()),
	)()
	if err != nil {
		t.Fatalf("Unable for client factory to connect to server: %s", err.Error())
	}

	// Mock creating a gofast.Request that still
	// pending to write a body.
	pr, pw := io.Pipe()
	r, err := http.NewRequest("POST", "/add", pr)
	r.Header.Add("Content-Length", "11")
	if err != nil {
		t.Errorf("unexpected error: %#v", err.Error())
	}
	writeBody := func() {
		c.Close()
		pw.Write([]byte("hello world"))
	}
	req := gofast.NewRequest(r)

	// handle the result
	go writeBody()

	log.Printf("checkpoint 1")
	resp, err := c.Do(req)
	if err != nil {
		t.Fatalf("web server: unable to connect to process request: %s", err.Error())
		return
	}

	w := httptest.NewRecorder()
	errBuffer := new(bytes.Buffer)
	resp.WriteTo(w, errBuffer)
	if want, have := "", w.Body.String(); want != have {
		t.Errorf("expected %v, got %v", want, have)
	}
}

Still working on way to resolve it.

yookoala added a commit that referenced this issue Aug 3, 2022
possible fix of invalid memory address dereference (#74 and #48)
@adri adri closed this as not planned Won't fix, can't repro, duplicate, stale Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants