Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: TimeoutHandler hides panic locations #27375

Open
pvhau opened this issue Aug 30, 2018 · 11 comments
Open

net/http: TimeoutHandler hides panic locations #27375

pvhau opened this issue Aug 30, 2018 · 11 comments

Comments

@pvhau
Copy link

@pvhau pvhau commented Aug 30, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

v1.10.3

Does this issue reproduce with the latest release?

Not sure, I am finding a solution with my current version

What operating system and processor architecture are you using (go env)?

OS: linux, arch: amd64

What did you do?

This is a demo


import (
	"log"
	"net/http"
	"time"
)

type sampleHandler string
func (s sampleHandler) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
	var mm interface{}
	log.Print("This will be panic")
	log.Print(mm.(string))
}

func main() {
	go func() {
		http.ListenAndServe("localhost:8888", http.TimeoutHandler(sampleHandler("sample"), time.Second*10, "Timeout"))
	}()

	time.Sleep(time.Second * 2)
	http.Get("http://localhost:8888/")
}

What did you expect to see?

I expect to see stack trace point to where cause panic (log.Print(mm.(string))) so that I can debug app easily

What did you see instead?

Stack trace point me panic(p) statement in net/http.timeoutHandler (go/src/net/http/server.go:3144). I know it make sense because that is the place that call panic. Are there any good solution to passing stack trace from handler inside TimeoutHandler?

@pvhau pvhau changed the title Passing panic stack trace inside TimeoutHandler Passing panic's stacktrace inside TimeoutHandler Aug 30, 2018
@FiloSottile FiloSottile added this to the Unplanned milestone Aug 31, 2018
@FiloSottile FiloSottile changed the title Passing panic's stacktrace inside TimeoutHandler net/http: TimeoutHandler hides panic locations Aug 31, 2018
@FiloSottile
Copy link
Member

@FiloSottile FiloSottile commented Aug 31, 2018

/cc @bradfitz

@slon
Copy link
Contributor

@slon slon commented Oct 11, 2019

Hello,

I would like to contribute fix for this issue.

I intend to change role of the two goroutines in net/http.(*timeoutHandler).ServeHTTP. That way, user handler panicking is still happening on the main goroutine calling the handler and stack trace is preserved for all defer-ed functions higher in the stack.

@bradfitz is this fix OK?

@pam4
Copy link

@pam4 pam4 commented Oct 12, 2019

Running the inner handler in the main goroutine would also solve more serious problems like #34608.
But currently TimeoutHandler.ServeHTTP doesn't wait for the inner handler to complete in case of timeout, which means that a "timed out" response is never late, even if the inner handler is taking a long time.
By switching goroutines we cannot prevent the "timed out" response to be delayed by the inner handler (even if we write it as soon as possible, Flush is not guaranteed, and the response is not complete until TimeoutHandler.ServeHTTP returns).
So if someone's inner handler runs too long after timeout, this change would break them.

On the other hand, the current situation is quite messy: wrong panic location, panic disappearing after timeout, and broken (racy) Push.
I suspect TimeoutHandler has a very limited user base, and the core team doesn't seem much interested either.
I would like to hear from someone who is actually using TimeoutHandler whether the proposed change is acceptable to them.

To be clear, I'm thinking about a change of this kind in TimeoutHandler.ServeHTTP (untested!):

go func() {
    <-ctx.Done()
    tw.mu.Lock()
    defer tw.mu.Unlock()
    tw.timedOut = true
}
h.handler.ServeHTTP(tw, r)
tw.mu.Lock()
defer tw.mu.Unlock()
if tw.timedOut {
    w.WriteHeader(StatusServiceUnavailable)
    io.WriteString(w, h.errorBody())
    return
}
dst := w.Header()
for k, vv := range tw.h {
    dst[k] = vv
}
if !tw.wroteHeader {
    tw.code = StatusOK
}
w.WriteHeader(tw.code)
w.Write(tw.wbuf.Bytes())

Writing the "timed out" response earlier (in the spawned goroutine) won't make any difference, because of buffering, and because there's no way (that I know of) to tell the server "I'm done" without returning from TimeoutHandler.ServeHTTP.
Also, it's good to keep the spawned goroutine as simple as possible, unless we recover panic in there too.

@CfirTsabari
Copy link

@CfirTsabari CfirTsabari commented Jan 30, 2020

@pam4
Hi, we are using the timeout handler and this issue is a pain in the code for us as well.

But delaying the response is not an acceptable solution for most cases, it would be fatal if the inner handler doesn't return, and will cause big delay on response those missed the intent of a timeout handler.

Also your solution could be implemented simply by checking the amount of time that took for the inner handler to process the request.

I think that the approach should be printing\saving the stack trace from the go routine.

@pam4
Copy link

@pam4 pam4 commented Jan 30, 2020

@CfirTsabari

Also your solution could be implemented simply by checking the amount of time that took for the inner handler to process the request.

It could be implemented without a separate goroutine, but we still want writes from the inner handler to fail on time (returning error), so that the inner handler can detect the timeout on its own, otherwise it would be completely useless.
Of course it *is* less useful, but I don't know of many real life use cases for TimeoutHandler, that's why I was asking in response to @slon proposal.

I think that the approach should be printing\saving the stack trace from the go routine.

Yes, the separate goroutine here could be changed to log the stack trace directly to the server logger in case of panic (same as here). And the main goroutine would have to run panic(ErrAbortHandler) on receive from panicChan since the panic has already been logged.
If I'm not missing something, this could solve both the wrong panic location problem (before timeout) and the panic disappearing problem (after timeout).

@CfirTsabari
Copy link

@CfirTsabari CfirTsabari commented Feb 3, 2020

@pam4

Yes, the separate goroutine here could be changed to log the stack trace directly to the server logger in case of panic (same as here). And the main goroutine would have to run panic(ErrAbortHandler) on receive from panicChan since the panic has already been logged.
If I'm not missing something, this could solve both the wrong panic location problem (before timeout) and the panic disappearing problem (after timeout).

Yes that exactly what I meant, but this requires changes in the golang net/http package so I have created an handler that intermediate between the TimeoutHandler and the inner handler that log the panic in edition to the default logging of the panic that is called by the TimeoutHandler.
This is not ideal since:

  1. Some information is unknown at the handler level (server logger, remote address and so on)
  2. I Don't completely like the idea of calling panic after recover.
  3. 2 different stacks will be printed.

UPDATE:
a better version of the workaround is here:
https://gist.github.com/CfirTsabari/786c24685189b7569500b36fe8c5da7b

But in the end it's kind of does the job:
if anyone need it here it is:
Just use PanicReportTimeoutHandler instead of http.TimeoutHandler

func PanicReportTimeoutHandler(h http.Handler, dt time.Duration, msg string) http.Handler{
	return http.TimeoutHandler(&panicReporterHandler{handler:h,logFunc:log.Printf},dt,msg)
}

type panicReporterHandler struct{
	handler http.Handler
	logFunc func( string, ...interface{})
}

func (h *panicReporterHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	defer func() {
		if err := recover(); err != nil && err != http.ErrAbortHandler {
			const size = 64 << 10
			buf := make([]byte, size)
			buf = buf[:runtime.Stack(buf, false)]
			h.logFunc("http: panic serving: (%v)\n%s", err, buf)
			panic(err)
		}

	}()
	h.handler.ServeHTTP(w,r)
}

@telendt
Copy link

@telendt telendt commented Feb 3, 2020

@CfirTsabari Thanks for the workaround. It does the job but as you noticed there are many drawbacks of this solution.

The bottom line is - http.Server has its logger and a mechanism of logging recovered errors/values together with the stacktrace but this does not work if panic happened in a different goroutine:

go/src/net/http/server.go

Lines 1769 to 1774 in 753d56d

if err := recover(); err != nil && err != ErrAbortHandler {
const size = 64 << 10
buf := make([]byte, size)
buf = buf[:runtime.Stack(buf, false)]
c.server.logf("http: panic serving %v: %v\n%s", c.remoteAddr, err, buf)
}

I believe that the only way to solve it nicely (and in backward compatible way) would be to wrap errors recovered in http.TimeoutHandler into "errors with stack" (like in https://godoc.org/github.com/pkg/errors#WithStack) before re-panicking them and using these frames for logging if present.

I know there was a proposal for extending errors with stack frames and wrapping. I also know that the wrapping/unwrapping part got accepted and we have them since Go 1.13, but I don't know what happened to errors.Frame and errors.Caller part of that proposal. Was it rejected? Are there any active discussions around tying stack traces/frames into error values?

@CfirTsabari
Copy link

@CfirTsabari CfirTsabari commented Feb 3, 2020

@telendt

I believe that the only way to solve it nicely (and in backward compatible way) would be to wrap errors recovered in http.TimeoutHandler into "errors with stack", like in https://godoc.org/github.com/pkg/errors#WithStack and using these frames for logging if present.

The error handling in go is a big issue #29934, yes adding a mechanism of error handling will most likely solve the hidden panic issue, but this is behind the scope of this issue IMHO.

@pam4
Copy link

@pam4 pam4 commented Feb 3, 2020

@CfirTsabari

Yes that exactly what I meant, but this requires changes in the golang net/http package

As far as I know, http.TimeoutHandler does not use any internals, therefore it should be possible to just copy it in your own package and make the changes.
But your workaround should have the same effect, if you fix the following points.

  1. Some information is unknown at the handler level (server logger, remote address and so on)

You can get some via Context, see logf function and exported keys.

  1. I Don't completely like the idea of calling panic after recover.

Is there a specific problem you have in mind?

  1. 2 different stacks will be printed.

You should probably just call panic(http.ErrAbortHandler) instead of panic(err) in your panicReporterHandler.ServeHTTP, to avoid printing the incorrect one.

@CfirTsabari
Copy link

@CfirTsabari CfirTsabari commented Feb 3, 2020

@pam4

As far as I know, http.TimeoutHandler does not use any internals, therefore it should be possible to just copy it in your own package and make the changes.

Yea but copying the timeout handler is also a bad practice, for example lets say someone found a bug in the net/http timeout handler and fixed, you won't be getting this bug fix.

You can get some via Context, see logf function and exported keys.

great idea 👍

Is there a specific problem you have in mind?

this hide the panic location, so as long as we use it only as addition to the timeout handler that kind of fine, but without this timeout handler the new handler will cause the same problem it was supposed to solve.

You should probably just call panic(http.ErrAbortHandler) instead of panic(err) in your panicReporterHandler.ServeHTTP, to avoid printing the incorrect one.

great idea 👍

so i have created a gist updated with your suggestions

@pam4
Copy link

@pam4 pam4 commented Feb 4, 2020

@CfirTsabari

Yea but copying the timeout handler is also a bad practice, for example lets say someone found a bug in the net/http timeout handler and fixed, you won't be getting this bug fix.

Of course. I just wanted to point out that it does not use any internals so you can do about the same things from outside. I certainly hope it *will* get fixed.

this hide the panic location, so as long as we use it only as addition to the timeout handler that kind of fine, but without this timeout handler the new handler will cause the same problem it was supposed to solve.

If you re-throw a panic from the recovering function itself, you get one more stack frame for the rethrow, but the stack trace up to the original panic location should remain unchanged. Not exactly the same thing, but it doesn't really "hide" the panic location as TimeoutHandler does.

so i have created a gist updated with your suggestions

Looks good 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.