Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

map: invalid memory address or nil pointer dereference #5207

Closed
ugorji opened this issue Apr 4, 2013 · 19 comments
Closed

map: invalid memory address or nil pointer dereference #5207

ugorji opened this issue Apr 4, 2013 · 19 comments
Milestone

Comments

@ugorji
Copy link
Contributor

ugorji commented Apr 4, 2013

Previously, I was getting this error every time.

However, with the fix issue #5120, I only get it after 
a few requests in rapid sequence (ie making same http
request in a tight loop will trigger it in my application).
For example, first 5 requests are fine, and error starts
from 6th request.

  https://golang.org/issue/5120

The application runs fine in Go 1.0.3.

I tried writing a reproducer but all my reproducers work
fine. I only see the error when running my application.

In my app, SafeStore uses a RWMutex around access to the map.

runtime error: invalid memory address or nil pointer dereference
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:413 (0x4bdc2e)
    net/baseapp.(*BaseApp).derr: debug.PrintStack()
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:264 (0x4c1753)
    net/baseapp.func.003: gapp.derr(x, c, w, r, h.OnErrorFn)
/opt/go-tip/src/pkg/runtime/panic.c:229 (0x413531)
    panic: reflect·call(d->fn, (byte*)d->args, d->siz);
/opt/go-tip/src/pkg/runtime/panic.c:487 (0x413cd3)
    panicstring: runtime·panic(err);
/opt/go-tip/src/pkg/runtime/os_linux.c:236 (0x412c18)
    sigpanic: runtime·panicstring("invalid memory address or nil pointer dereference");
/opt/go-tip/src/pkg/runtime/hashmap.c:-145 (0x4067b5)
    hash_lookup: ???
/opt/go-tip/src/pkg/runtime/hashmap.c:1150 (0x408c62)
    mapaccess: res = hash_lookup(t, h, &ak);
/opt/go-tip/src/pkg/runtime/hashmap.c:1207 (0x408e17)
    mapaccess2: runtime·mapaccess(t, h, ak, av, ap);
/home/ugorji/depot/golang/src/ugorji.net/util/safe_store.go:76 (0x4f2f32)
    net/util.(*SafeStore).Get: return rq.get(key)

Which compiler are you using (5g, 6g, 8g, gccgo)?
6g 

Which operating system are you using?
Linux 3.5.0-26-generic #42-Ubuntu SMP x86_64 GNU/Linux

Which version are you using?  (run 'go version')
go version devel +d646225cb410 Wed Apr 03 17:03:42 2013 -0700 linux/amd64
@bradfitz
Copy link
Contributor

bradfitz commented Apr 4, 2013

Comment 1:

Have you run the race detector on your code?

@ugorji
Copy link
Contributor Author

ugorji commented Apr 4, 2013

Comment 2:

Yes. Forgot to add that. I ran the race detector on my code once I first saw the 
issue last week. One race was detected which I fixed. But the issue still persists.

@ianlancetaylor
Copy link
Contributor

Comment 3:

We should understand what is happening here before the Go 1.1 release.

Labels changed: added priority-soon, go1.1, removed priority-triage.

@robpike
Copy link
Contributor

robpike commented Apr 4, 2013

Comment 4:

Can you tweak your program so I see the actual arguments to the functions? I'm unsure
what's printing your stack trace but this:
    mapaccess2: runtime·mapaccess(t, h, ak, av, ap);
isn't what comes from crashes I see. I see actuals, not formals.

@ugorji
Copy link
Contributor Author

ugorji commented Apr 4, 2013

Comment 5:

It gets worse.
type SafeStore struct {
    Container          map[interface{}]interface{}
    safestoreHelper
}
type safestoreHelper struct {
    Lock               sync.RWMutex
    UseLock            bool
    tickerIntervalNs   int64
    ticker             *time.Ticker
    tickerLock         sync.Mutex
    onceStartPurgeLoop sync.Once
}
My code basically acquires a lock, and then calls this:
func (rq *SafeStore) get(key interface{}) (v interface{}) {
    //TODO: remove debugging
    //fmt.Printf("About to get for rq: [isNil: %v], rq.Container: [isNil: %v] %v, key: %v\n", rq==nil, rq.Container==nil, rq.Container, v))
    fmt.Printf("About to get for rq: [isNil: %v], rq.Container: [isNil: %v], key: %v\n", rq==nil, rq.Container==nil, v)
    if val, ok := rq.Container[key]; ok {
        <snip>
this method is called a lot, as I use a mutex-protected map.
Notice the two different print statements I have above:
- One prints out rq.Container (ie map[interface{}]interface{}), and the other doesn't.
- When printing out rq.Container, my app crashes. 
- When not print out rq.Container, I get the previous error everytime (from first time).
Without the prints (as before), I got the error after like 6 http requests.
With the un-commented prints introduced early here, I get the error from the 1st
request. 
This suggests some timing issues.
The key was nil, the map is not.
About to get for rq: [isNil: false], rq.Container: [isNil: false], key: <nil>
ERROR    1177 1 1 2013-04-04 18:59:57.742 - [baseapp.(*BaseApp).derr] [baseapp.go:411]
Error handling request: runtime error: invalid memory address or nil pointer dereference
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:413 (0x4bdf0e)
    net/baseapp.(*BaseApp).derr: debug.PrintStack()
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:264 (0x4c1a33)
    net/baseapp.func.003: gapp.derr(x, c, w, r, h.OnErrorFn)
/opt/go-tip/src/pkg/runtime/traceback_x86.c:211 (0x41d7f5)
    callers: return runtime·gentraceback(pc, sp, nil, g, skip, pcbuf, m, nil, nil);
/opt/go-tip/src/pkg/runtime/runtime.c:275 (0x419d72)
    Caller: if(runtime·callers(1+skip-1, rpc, 2) < 2) {
/opt/go-tip/src/pkg/runtime/debug/stack.go:46 (0x5b131c)
    stack: pc, file, line, ok := runtime.Caller(i)
/opt/go-tip/src/pkg/runtime/debug/stack.go:25 (0x5b125c)
    PrintStack: os.Stderr.Write(stack())
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:413 (0x4bdf0e)
    net/baseapp.(*BaseApp).derr: debug.PrintStack()
/home/ugorji/depot/golang/src/ugorji.net/baseapp/baseapp.go:264 (0x4c1a33)
    net/baseapp.func.003: gapp.derr(x, c, w, r, h.OnErrorFn)
/opt/go-tip/src/pkg/runtime/panic.c:229 (0x413531)
    panic: reflect·call(d->fn, (byte*)d->args, d->siz);
/opt/go-tip/src/pkg/runtime/panic.c:487 (0x413cd3)
    panicstring: runtime·panic(err);
/opt/go-tip/src/pkg/runtime/os_linux.c:236 (0x412c18)
    sigpanic: runtime·panicstring("invalid memory address or nil pointer dereference");
/opt/go-tip/src/pkg/runtime/hashmap.c:-145 (0x4067b5)
    hash_lookup: ???
/opt/go-tip/src/pkg/runtime/hashmap.c:1150 (0x408c62)
    mapaccess: res = hash_lookup(t, h, &ak);
/opt/go-tip/src/pkg/runtime/hashmap.c:1207 (0x408e17)
    mapaccess2: runtime·mapaccess(t, h, ak, av, ap);
/home/ugorji/depot/golang/src/ugorji.net/util/safe_store.go:108 (0x4f3574)
    net/util.(*SafeStore).get: if val, ok := rq.Container[key]; ok {
/home/ugorji/depot/golang/src/ugorji.net/util/safe_store.go:78 (0x4f3169)
    net/util.(*SafeStore).Get: return rq.get(key)
<<snip>>
===============================================
However, with the print that I commented out, the program crashes. My recover() up at 
the top of the http Request call doesn't even intercept it. 
The stack trace is below. Basically, there's a crash printing a non-nil map.
This may be a different issue.
 
The stack trace from this is like:
unexpected fault address 0x7604a9
fatal error: fault
[signal 0xb code=0x2 addr=0x7604a9 pc=0x4091ea]
goroutine 17 [running]:
[fp=0x7f955a450438] runtime.throw(0x9921f7)
    /opt/go-tip/src/pkg/runtime/panic.c:473 +0x67
[fp=0x7f955a450450] runtime.sigpanic()
    /opt/go-tip/src/pkg/runtime/os_linux.c:239 +0xe7
[fp=0x7f955a450478] reflect.mapiterinit(0x67d220, 0x7604a0, 0x156)
    /opt/go-tip/src/pkg/runtime/hashmap.c:1384 +0x8a
[fp=0x7f955a4504f8] reflect.Value.MapKeys(0x67d220, 0x7f955a5607a8, 0x156, 0x280,
0x97a260, ...)
    /opt/go-tip/src/pkg/reflect/value.go:1140 +0xe0
[fp=0x7f955a450f80] fmt.(*pp).printReflectValue(0xc20013b270, 0x67d220, 0x7f955a5607a8,
0x156, 0x76, ...)
    /opt/go-tip/src/pkg/fmt/print.go:891 +0x1f3f
----- stack segment boundary -----
[fp=0x7f955a4514f8] fmt.(*pp).printValue(0xc20013b270, 0x67d220, 0x7f955a5607a8, 0x156,
0x76, ...)
    /opt/go-tip/src/pkg/fmt/print.go:850 +0x2f5
[fp=0x7f955a451f80] fmt.(*pp).printReflectValue(0xc20013b270, 0x725540, 0x7f955a5607a8,
0x196, 0x76, ...)
    /opt/go-tip/src/pkg/fmt/print.go:930 +0x325b
----- stack segment boundary -----
[fp=0x7f955a4434f8] fmt.(*pp).printValue(0xc20013b270, 0x725540, 0x7f955a5607a8, 0x196,
0x76, ...)
    /opt/go-tip/src/pkg/fmt/print.go:850 +0x2f5
[fp=0x7f955a443f80] fmt.(*pp).printReflectValue(0xc20013b270, 0x7604a0, 0x7f955a5607a8,
0x160, 0x76, ...)
    /opt/go-tip/src/pkg/fmt/print.go:1003 +0x2dc3
----- stack segment boundary -----
[fp=0x7f955a5602f0] fmt.(*pp).printField(0xc20013b270, 0x7604a0, 0x7f955a5607a8, 0x76,
0x0, ...)
    /opt/go-tip/src/pkg/fmt/print.go:812 +0x3d1
[fp=0x7f955a560670] fmt.(*pp).doPrintf(0xc20013b270, 0x7f6a50, 0x34, 0x7f955a560798,
0x4, ...)
    /opt/go-tip/src/pkg/fmt/print.go:1110 +0x1270
[fp=0x7f955a5606c0] fmt.Fprintf(0xc2000c97b0, 0xc200000008, 0x7f6a50, 0x34,
0x7f955a560798, ...)
    /opt/go-tip/src/pkg/fmt/print.go:213 +0x80
[fp=0x7f955a560718] fmt.Printf(0x7f6a50, 0x34, 0x7f955a560798, 0x4, 0x4, ...)
    /opt/go-tip/src/pkg/fmt/print.go:222 +0x8c
[fp=0x7f955a5607e0] ugorji.net/util.(*SafeStore).get(0x7f955a5607a8, 0x688800,
0xc2001bf7d0, 0x0, 0x0, ...)
    /home/ugorji/depot/golang/src/ugorji.net/util/safe_store.go:106 +0x18a
<<snip>>

@ugorji
Copy link
Contributor Author

ugorji commented Apr 4, 2013

Comment 7:

I did some more testing. 
Within each put (which is already protected by a mutex), I wrote out the value of a
field.
var safestoreSeq uint64 = 0
func NewSafeStore(useLock bool) *SafeStore {
    oldSafestoreSeq := safestoreSeq
    a := SafeStore{
        id: atomic.AddUint64(&safestoreSeq, 1),
        Container: make(map[interface{}]interface{}),
    }
    fmt.Printf(">>>> New SafeStore: last seq: %v, New Safestore Id: %v\n", oldSafestoreSeq, a.id)
<<snip>>
func (rq *SafeStore) put(key interface{}, val interface{}, ttlNs int64) {
    fmt.Printf(">>>> Putting SafeStore: %v, key: %#v, val: %#v\n", rq.id, key, val)
ugorji@ugorji-n53sv:~/depot/golang/src/ugorji.net/ndb$ cat
~/depot/dist/blackannex/out.txt | grep '>> New Safe'
>>>> New SafeStore: last seq: 0, New Safestore Id: 1
>>>> New SafeStore: last seq: 1, New Safestore Id: 2
>>>> New SafeStore: last seq: 2, New Safestore Id: 3
>>>> New SafeStore: last seq: 3, New Safestore Id: 4
Doing same for Putting Safestore got: 
>>>> Putting SafeStore: 139755915567960, key: "2339338531442364162", val:
<nil>
>>>> Putting SafeStore: 3, key: "2339338531442364162", val: <nil>
>>>> Putting SafeStore: 139755915567960, key: "2316257583352081154", val:
<nil>
>>>> Putting SafeStore: 3, key: "2316257583352081154", val: <nil>
See how every other time, the value of the safestore id is changed to like
139755915567960,
where it should be 1, 2, 3 or 4.
Seems there's some data corruption somewhere.
BTW, All access to the map here is protected by mutex Lock/defer Unlock. 
The Put/Get manages lock and then calls put/get respectively. I also ran 
race detector successfully.

@ugorji
Copy link
Contributor Author

ugorji commented Apr 4, 2013

Comment 8:

On #5, I initially printed out v (return value) instead of key (argument).
I've corrected that and confirm that the key is never nil. 
For each map, the keys were either strings or *reflect.Type.
P.S. I had to use interface{} due to unavailability of generics.

@robpike
Copy link
Contributor

robpike commented Apr 4, 2013

Comment 9:

How often does it fail? Can you trigger a failure with GOGC=off? I'd like to narrow down
the culprits.

@ugorji
Copy link
Contributor Author

ugorji commented Apr 4, 2013

Comment 10:

It fails everytime. It's pretty consistent now with the print statements I put. 
Without the print statements, I start getting the error after like 6 tries.
However, now, without GC (ie with export GOGC=off), it happens much earlier in the
request. 
>>>> Putting SafeStore: 0, key: "2328923957304084738", val: <nil>
ERROR    488 1 3 2013-04-04 20:51:30.109 - [baseapp.(*BaseApp).derr] [baseapp.go:411]
Error handling request: runtime error: invalid memory address or nil pointer dereference
It feels more like data corruption issue.
With GOGC off, the id of my safestore as seen above is now always 0. Before, it would
toggle between it's real value (1..4) and some very large number. Also, I see some
values of some fields in pointers changing randomly e.g. from true to false.

@bradfitz
Copy link
Contributor

bradfitz commented Apr 5, 2013

Comment 11:

If you don't want to put your application's code online, could you email it to the Go
team with instructions on how to run it and make the code crash?  We won't share it and
will delete it when we're done debugging.
We really need a way to reproduce these crashes.

@ugorji
Copy link
Contributor Author

ugorji commented Apr 5, 2013

Comment 12:

Hi Brad. Just got back home now and saw your message. Lemme do this when I wake up
tomorrow (I'm on the East Coast). I'd spend like an hour trying again to get a
standalone reproducer, and if so, I'd just send the whole thing. Please let me know an
email I can send it to. (Yours?)

@ugorji
Copy link
Contributor Author

ugorji commented Apr 5, 2013

Comment 13:

Hi guys, 
I wouldn't be able to send out the application for testing today. I have to tease it out
of the datastore so that it can be run standalone, and haven't finished that yet.
For now, I'd attach three files to the Issue.
- Old implementation of SafeStore (*.bak.go)
- New Implementation of SafeStore (still causing invalid memory error)
- Test I had, which cannot reproduce the error
Maybe something will jump out. BTW, with the new safe_store.go, I don't get the hang
anymore that I reported in https://golang.org/issue/5207
However, I still see that the value of SafeStore.useLock is seeing false (even though I
explicitly set it to true at creation time, and never change it).

Attachments:

  1. safe_store.go (7143 bytes)
  2. safe_store_test.go (2752 bytes)
  3. safe_store.go__04052013_104845.bak.go (7003 bytes)

@ugorji
Copy link
Contributor Author

ugorji commented Apr 6, 2013

Comment 14:

Remy helped narraw this down in #5206.
It was passing the Mutex by value somewhere. I was "copying" SafeStore, which copied the
Mutex it contained. 
I eliminated that and the error is gone. 
I'm not sure why it works with Go 1.0.3, or why I saw the symptoms previously, but it is
a programmer error.
Thanks much for your help narrowing this down.

@bradfitz
Copy link
Contributor

bradfitz commented Apr 6, 2013

Comment 15:

Status changed to Retracted.

@remyoudompheng
Copy link
Contributor

Comment 16:

The race detector should have emitted an error because of the incorrect locking. It
would deserve a more in-depth analysis to fix it but I fear the example is too complex.

@dvyukov
Copy link
Member

dvyukov commented Apr 7, 2013

Comment 17:

Please try on current tip, we've fixed some GC-related issues.

@dvyukov
Copy link
Member

dvyukov commented Apr 7, 2013

Comment 18:

Race detector currently catches the following:
func TestRaceMutexOverwrite(t *testing.T) {
        c := make(chan bool, 1)
        var mu sync.Mutex
        go func() {
                mu = sync.Mutex{}
                c <- true
        }()
        mu.Lock()
        <-c
}
Concurrent copy and lock/unlock is not easy to implement. If mutex is copied as a part
of bigger struct, then the race will most likely be caught on other members.

@ugorji
Copy link
Contributor Author

ugorji commented Apr 7, 2013

Comment 19:

Hi Dmitri,
Yes, I tested with your changes against my old code where I was 
"copying" the struct which held a mutex value, and I don't see 
any of the crashes, or memory corruption anymore.
i.e. I had SafeStore { mu sync.Mutex; m map[...] } and I copied SafeStore.
Great fix. 
(Luckily for me, the issues with memory corruption, etc exposed 
a bug in my code so it's good for me all around.
BTW, I ran the race detector previously and it didn't detect this race 
i.e. with the mutex as part of a bigger struct which was copied.

@dvyukov
Copy link
Member

dvyukov commented Apr 7, 2013

Comment 20:

Fixed by:
https://code.google.com/p/go/source/detail?r=8580ea3506e434fe2d959efe9063b165c1966663

Owner changed to @dvyukov.

Status changed to Fixed.

@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants