Possible weird garbage collection bug, not sure - confused #8

Closed
cerk opened this issue Jan 16, 2015 · 11 comments

@cerk

cerk commented Jan 16, 2015

Hi, I am running gothic on Windows (via Horkonaut fork). I am running into a weird problem. When I run the following program, I click on the "Hello World" button a few times. Then I wait 8+ minutes and click on the button about 10 times. Then I wait a minute or two and the program crashes with:

fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0xffffffffffffffff pc=0x40a057]

Note that if I turn off garbage collection, the program does not crash.

I have tried to strip the program down to a minimal form:

package main

import (
    // For Windows, "github.com/Horkonaut/gothic"
    "github.com/nsf/gothic"
    // Uncomment to prevent garbage collection and eliminate bug
    //"runtime/debug"
)

type bug struct {
    *gothic.Interpreter
}

func (b *bug) TCL_OnButtonPress() {
}

func main() {
    // Uncomment to prevent garbage collection and eliminate bug
    //debug.SetGCPercent(-1)
    ir := gothic.NewInterpreter(`
        wm title . PossibleBug
        ttk::button .button -text {Hello World} -command go::OnButtonPress
        pack .button
    `)

    ir.RegisterCommands("go", &bug{Interpreter: ir})
    <-ir.Done
}

Note that if I eliminate the "-command go::OnButtonPress" portion of the code, it does not crash either.

Could someone try to reproduce what I am seeing on a linux machine? Any ideas on what may be going on? It looks like there is some memory interaction between cgo and go that is causing a garbage collection crash.

I am running:

go1.3.2 windows/amd64
tcl8.6 on windows

Thanks for reading!
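
Since the crash disappears with collection disabled and otherwise takes minutes to show up, one way to probe a suspected GC interaction might be to force collections far more often than the runtime would on its own. The following is a minimal sketch using only the standard `runtime` and `time` packages; `gcCycles` and `forceGCEvery` are hypothetical helper names, and nothing here is confirmed to reproduce this particular crash.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// gcCycles (hypothetical helper) forces n collections and reports how many
// GC cycles the runtime completed while doing so.
func gcCycles(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until the collection has finished
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

// forceGCEvery (hypothetical helper) starts a goroutine that forces a
// collection on every tick. Calling the returned function stops it.
func forceGCEvery(interval time.Duration) (stop func()) {
	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				runtime.GC()
			case <-done:
				return
			}
		}
	}()
	return func() { close(done) }
}

func main() {
	stop := forceGCEvery(10 * time.Millisecond)
	defer stop()

	// In the reproduction program, the GUI event loop (<-ir.Done) would
	// run here instead of this short sleep.
	time.Sleep(50 * time.Millisecond)
	fmt.Println("cycles completed during 3 forced collections:", gcCycles(3))
}
```

If a memory-safety problem is tied to collection, running with frequent forced collections would be expected to surface it sooner than waiting for the runtime's own schedule, though that is an assumption rather than something verified against this report.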

@nsf
Owner

nsf commented Jan 16, 2015

The key is to figure out what "unexpected signal" means on Windows, because Windows has no POSIX signals. I'll take a look at it in a few days.

@cerk
Author

cerk commented Jan 17, 2015

I tried to do some research on the signal 0xc0000005 error on Windows, and all I could come up with is that it may be an access violation ("usually caused by heap memory corruption"):

http://stackoverflow.com/questions/5303524/what-exactly-is-the-scope-of-access-violation-0xc0000005

@nsf
Owner

nsf commented Jan 17, 2015

Interesting. I tried reproducing the issue on a linux machine, no luck. Will try it on windows tomorrow.

@nsf
Owner

nsf commented Jan 17, 2015

Also try using Go 1.4.1, the latest Go version, if possible, so that we stay in sync here.

@cerk
Author

cerk commented Jan 18, 2015

Just installed 1.4.1 and I am still getting the fatal error.

@nsf
Owner

nsf commented Jan 19, 2015

Yes, I can reproduce the issue on Windows. I'll see what I can do to fix it within the next few days (this week for sure).

@nsf
Owner

nsf commented Jan 19, 2015

Correction: it happened on Linux too. So I'm pretty sure it's a bug in gothic.

@nsf
Owner

nsf commented Jan 19, 2015

Here's the thing. I do a lot of nasty stuff in there, and it was written ages ago against a different Go runtime. I think I'll just rewrite everything following rsc's advice to never pass a Go pointer to C. It may take a few more days, but it's nothing really complicated; the code is small after all.

Because it's really hard to say where it fails. Apparently my code doesn't play well with Go's precise GC.
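
One common way to follow the "never pass a Go pointer to C" advice mentioned above is a handle table: Go values stay in a Go-side map, and only an integer ID crosses the cgo boundary. The sketch below is illustrative only and is not taken from gothic or from the eventual fix; `handleTable`, `register`, `lookup`, and `release` are hypothetical names, and the cgo calls themselves are omitted so the example runs as plain Go.

```go
package main

import (
	"fmt"
	"sync"
)

// handleTable (hypothetical) maps integer IDs to Go values. Only the ID
// would be handed to C; the value itself stays in Go-managed memory,
// where the map entry keeps it reachable for the garbage collector.
type handleTable struct {
	mu     sync.Mutex
	next   uintptr
	values map[uintptr]interface{}
}

func newHandleTable() *handleTable {
	return &handleTable{next: 1, values: make(map[uintptr]interface{})}
}

// register stores v and returns an opaque ID that can be passed to C
// in place of a Go pointer.
func (t *handleTable) register(v interface{}) uintptr {
	t.mu.Lock()
	defer t.mu.Unlock()
	id := t.next
	t.next++
	t.values[id] = v
	return id
}

// lookup returns the value stored under id; ok is false if the ID is
// unknown or has already been released.
func (t *handleTable) lookup(id uintptr) (v interface{}, ok bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	v, ok = t.values[id]
	return v, ok
}

// release removes the entry so the value can be collected.
func (t *handleTable) release(id uintptr) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.values, id)
}

func main() {
	table := newHandleTable()
	id := table.register(func() string { return "button pressed" })

	// In a cgo program, this lookup would happen inside the exported Go
	// callback that C invokes with the ID it was given earlier.
	if v, ok := table.lookup(id); ok {
		fmt.Println(v.(func() string)())
	}
	table.release(id)
}
```

The trade-off is an extra map lookup per callback and the need to release handles explicitly, in exchange for C never holding an address that the Go runtime manages.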

@cerk
Author

cerk commented Jan 19, 2015

Yes, I tried debugging what was happening myself (before filing this issue) but could not gain any traction with the error messages I was getting. The only reason I found this issue was that I would leave a (larger) Tcl/Tk program running, come back half an hour later, press a button, and it would crash mysteriously. I really appreciate you taking a look at it, and there's no rush on the fixes.

nsf closed this as completed in d7e6351 on Jan 23, 2015
@nsf
Owner

nsf commented Jan 23, 2015

It seems to be fixed. I checked it only on Linux; feel free to reopen the issue if the bug is still there.

@cerk
Author

cerk commented Jan 23, 2015

I checked it on Windows and it looks like you fixed it. Thanks!
