runtime: Crash in autogenerated function #43934
Comments
Can you please test with Go 1.15.7 or 1.16beta1? Thank you.
Have you tried running your program under the race detector? See https://blog.golang.org/race-detector.
@davecheney We'll go up to 1.16beta1, and if that doesn't produce a crash we'll also run it with the race detector. Are there any other tricks we need to do to get a core dump, in case that would help?
That almost certainly means the thing being called corrupted the stack pointer somehow. This is the first use of SP after the call. Do you know what code it called? You show the stack trace, but what was the error that was printed above the trace ("segmentation fault at ..." or something)?
That's a good question. I didn't get that from our tester and the system has been cleaned up since. If we can recreate the crash I will be sure they capture it this time.
Yes, the thing being called is the Write method of Lumberjack's Logger:

// Write implements io.Writer. If a write would cause the log file to be larger
// than MaxSize, the file is closed, renamed to include a timestamp of the
// current time, and a new log file is created using the original log file name.
// If the length of the write is greater than MaxSize, an error is returned.
func (l *Logger) Write(p []byte) (n int, err error) {
	l.mu.Lock()
	defer l.mu.Unlock()

	writeLen := int64(len(p))
	if writeLen > l.max() {
		return 0, fmt.Errorf(
			"write length %d exceeds maximum file size %d", writeLen, l.max(),
		)
	}

	if l.file == nil {
		if err = l.openExistingOrNew(len(p)); err != nil {
			return 0, err
		}
	}

	if l.size+writeLen > l.max() {
		if err := l.rotate(); err != nil {
			return 0, err
		}
	}

	n, err = l.file.Write(p)
	l.size += int64(n)

	return n, err
}

I don't see anything special here that should be corrupting the stack. I also thought the stack was generally inaccessible to Go programs and not generally susceptible to corruption.
Try env GOTRACEBACK=crash
Isn't that what happens when the code calls
I have finally been able to consistently reproduce our code crashing. This time though it crashes on:
The block of code it crashes in does not seem like it should be possible to crash in this way:

if p.h == nil {
	p.h = h
}
plog.LogDebug(class, "ready",
	zap.Int64("idx", p.idx),
	zap.Int("num", num),
	zap.Int("remaining", remaining), // <<< crash here
	zap.Int("offset", offset),
)

Because I can now crash consistently, I ran under dlv with the race detector on, and we have no races prior to the crash. However, this is interesting:
It seems impossible to me that our code should be able to do this given that we pass the if statement which dereferences the pointer that is now pointing at
Given that the race detector didn't trip leading up to the crash, I really don't know where to take this next.
If this is the last reference to
Did you explicitly request
It's not the last reference; there are many more references beyond that line. The binary is compiled with

I did find a sequence of events in our code and in how we receive data from FUSE, along with a subtle bug in our code, which leads to a situation I didn't think possible with Go. The function we crash in goes roughly like this
The subtle bug in our code lies in the helper function, which prepares a blank data structure if there isn't a matching one already. In some situations we were not taking the path that prepares the blank data structure, leading the crashing function to think it had a valid structure when it did not.

Here's the situation I didn't think possible: the helper function returns several pointers, including this necessary data structure. When the creation path wasn't taken, the page output variable wasn't assigned by the helper function, which I had thought would result in return of a

So now my question is, when you have a function of the form:

func (t *T) helper(... input params) (page *ioPage, t2 *T2, ... err error)

Shouldn't
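On the named-result question: the Go specification says a function's result parameters are initialized to the zero values for their types on entry to the function, so a named pointer result that is never assigned comes back as nil. A minimal sketch of that rule, assuming hypothetical stand-ins for the real code (the ioPage field, the empty T2, and the prepare flag are invented for illustration and are not taken from the project):

```go
package main

import "fmt"

// ioPage and T2 are hypothetical stand-ins for the types named in the
// helper signature above; their contents are invented for illustration.
type ioPage struct{ idx int64 }
type T2 struct{}

// helper mirrors the shape of the signature above. When prepare is false
// it never assigns page, so page keeps the zero value it was given on
// entry to the function, which is nil for a pointer type.
func helper(prepare bool) (page *ioPage, t2 *T2, err error) {
	t2 = &T2{}
	if prepare {
		page = &ioPage{idx: 1}
	}
	return page, t2, err
}

func main() {
	page, _, err := helper(false)
	// Neither page nor err was assigned inside helper.
	fmt.Println(page == nil, err == nil) // prints "true true"
}
```

If a helper of this shape appears to hand back a non-nil page on the unassigned path, that value did not come from zero-initialization, which would suggest something else wrote to that memory.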
No, I didn't dig too deep into that, so I didn't realize it was behind an environment variable. I wonder where the beef is coming from, then?
Ok, that's an optimized binary. Try using
Yes. Can you share the code and the assembly for that helper function?
I will give that a try later on this afternoon.
I'll need to explore whether that's a possibility and what approvals I might need. I will hopefully get back to you soon on that.
We need to pare this down to a minimal reproduction first, so that we don't inadvertently expose confidential information. I'm booked for the rest of this sprint; however, our next sprint begins next week, so hopefully in the next few weeks we'll have what you're requesting ready.
Also, nothing changed when I used the gcflags argument.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
We have a project that consumes a number of external projects. As part of this, we log key points of interest using Zap (https://github.com/uber-go/zap) and Lumberjack (https://github.com/natefinch/lumberjack) during the execution of the program.
We received a report of a crash from a tester with the following call stack:
Unfortunately, no core dump was found, despite ulimit -c unlimited having been set beforehand, as well as a call to the following upon startup of our program: debug.SetTraceback("crash")
go tool objdump shows the following compiler-generated assembly in which we crashed:

So it looks to me like we crashed on:
MOVQ 0x20(SP), AX
We create the zap logger with:
Thus, according to the zap code, zapcore.AddSync will wrap the Lumberjack logger with a writerWrapper, which is merely a struct embedding an io.Writer plus a Sync method, which is supplied by zap.

It looks to me like the Go compiler is autogenerating some glue to make the writerWrapper type compatible with the desired WriteSyncer, which is an interface specifying an io.Writer and a Sync method.

I'm opening the issue here because it dies in between Zap and Lumberjack and seems like we have stack corruption somehow, rather than squarely in one of those two projects. It also looks like, although this is not PPC64LE, Go has previously had issues with stack corruption in or near autogenerated functions; see #10628.
If it matters, we have attempted to log a string, a zap.Uint64 (7 character key and uint64 value), and a zap.Bool (7 character key and bool value).
What did you expect to see?
No crash.
What did you see instead?
Crash without a core file.