Changefeed crash #345

gthmac · 2016-07-26T01:44:35Z

I am seeing recurring crashes with changefeed cursors (stack trace below).

Thanks in advance for looking into it, Dan. :-)

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x62df73]

goroutine 7932 [running]:
github.com/dancannon/gorethink.(*Cursor).bufferNextResponse(0xc820517b80, 0x0, 0x0)
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:632 +0x263
github.com/dancannon/gorethink.(*Cursor).seekCursor(0xc820517b80, 0x100000001, 0x0, 0x0)
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:570 +0xe7
github.com/dancannon/gorethink.(*Cursor).nextLocked(0xc820517b80, 0xd9bfe0, 0xc8200769a0, 0xfeb601, 0xc8200769a0, 0x0, 0x0)
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:205 +0x3c
github.com/dancannon/gorethink.(*Cursor).Next(0xc820517b80, 0xd9bfe0, 0xc8200769a0, 0xd9bfe0)
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:188 +0xb0
github.com/dancannon/gorethink.(*Cursor).Listen.func1(0xdd3720, 0xc8206341e0, 0xc820517b80)
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:447 +0x19d
created by github.com/dancannon/gorethink.(*Cursor).Listen
    /home/jenkins/.gvm/pkgsets/go1.5.1/global/src/bitbucket.org/cloudintel/vatomizer/Godeps/_workspace/src/github.com/dancannon/gorethink/cursor.go:456 +0x49


dancannon · 2016-07-26T08:23:55Z

Hey, thanks for reporting the crash. I will try to get it fixed as soon as possible. Since you are vendoring the library, could you confirm which version of GoRethink you are currently using?

gthmac · 2016-07-26T22:08:15Z

Hi Dan, thanks for the super fast reply, and please excuse my delay in answering as well as not having included this information in the first place.

This is what I am using:
{
"ImportPath": "github.com/dancannon/gorethink",
"Comment": "v2.1.1",
"Rev": "d970d3cce3e907bd864200d4fb7410bca05b9264"
},

dancannon · 2016-07-27T19:56:01Z

Sorry to keep asking for more information, but I am having some trouble replicating this issue. Would it be possible to see the query that is causing the panic and the code you are using to listen to and process the changefeed (especially if you are closing the cursor)? If you can think of anything else that might help replicate the issue, please post that as well.

gthmac · 2016-07-27T20:13:50Z

Hi Dan, here is the code segment that produces this behavior. I upgraded to this version of gorethink from an older one that did not have the problem (i.e. the code snippet below worked just fine). The version change was as follows:
-Current version: v1.3.2 (RethinkDB v2.2)
+Current version: v2.1.1 (RethinkDB v2.3)

if actionEntry.Wait {

        mychan := make(chan events.DocChange)
        // start the changeFeed from DB
        cursor, dberr := r.Table(t.EventTable).Get(id).Changes().Run(cfg.EventDB)
        if dberr != nil {
            errmsg := fmt.Sprintf("%+v", dberr)
            cilog.Log(sys.ERROR, cierrors.LogError(&API_ERR_DB_GET, errmsg))
            replyWithError(c, &API_ERR_DB_GET)
            return
        }
        cursor.Listen(mychan)
        defer cursor.Close()

        var change events.DocChange

        var res t.VEvent
        for done := false; !done; {
            select {

            case change = <-mychan:
                _ = change

            case <-time.After(time.Duration(50) * time.Millisecond):
                _ = done
            }
            resp, dberr := r.Table(t.EventTable).Get(id).Run(cfg.EventDB)
            if dberr != nil {
                errmsg := fmt.Sprintf("%+v", dberr)
                cilog.Log(sys.ERROR, cierrors.LogError(&API_ERR_DB_GET, errmsg))
                replyWithError(c, &API_ERR_DB_GET)
                return
            }
            defer resp.Close()
            if resp.IsNil() {
                continue
            }
            dberr = resp.One(&res)
            if dberr != nil {
                errmsg := fmt.Sprintf("%+v", dberr)
                cilog.Log(sys.ERROR, cierrors.LogError(&API_ERR_DB_GET, errmsg))
                replyWithError(c, &API_ERR_DB_GET)
                return
            }
            if res.Status != string(events.EventClosed) {
                continue
            }
            done = true

        }

gthmac · 2016-07-27T23:01:01Z

In the meantime I changed the code above to poll rather than wait on the changefeed until the driver is fixed. But apparently I am still crashing in another section that relies on changefeeds, shown below. Again, both code paths worked perfectly before the upgrade (see above).
Thanks again for your help, Dan. Really appreciated, as always.

func (cic *CiCache) listenForChanges(cursor *r.Cursor) {

    ch := make(chan docChange)
    defer cursor.Close()
    cursor.Listen(ch)

    cilog.Log(syslog.LOG_INFO,
        "Cache %s: starting to listen for changes...",
        cic.tableName)
    for change := range ch {
        newkeyval := change.NewVal[cic.key]
        oldkeyval := change.OldVal[cic.key]
        var keyval interface{}
        if newkeyval == nil {
            keyval = oldkeyval
        } else {
            keyval = newkeyval
        }
        if keyval != nil {
            cic.cache.Delete(keyval.(string))
            cilog.Log(syslog.LOG_DEBUG,
                "Cache %s: received change for key=%s",
                cic.tableName,
                keyval)
        }
    }

}
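
A small observation on the listener above, again independent of the driver panic: `keyval.(string)` is an unchecked type assertion, so a key that is present but not a string would panic in application code. Below is a hedged sketch of the same key selection (prefer the new value, fall back to the old one) using the comma-ok form. The `docChange` shape is inferred from the snippet, and `changedKey` is a hypothetical helper name.

```go
package main

import "fmt"

// docChange mirrors the shape implied by the snippet above: the old and new
// versions of the document as generic maps. The exact type is an assumption.
type docChange struct {
	NewVal map[string]interface{}
	OldVal map[string]interface{}
}

// changedKey returns the string stored under field, preferring the new value
// and falling back to the old one (for example, a delete has no new value).
// ok is false when the field is missing from both or is not a string.
func changedKey(change docChange, field string) (key string, ok bool) {
	v := change.NewVal[field] // indexing a nil map is safe and yields nil
	if v == nil {
		v = change.OldVal[field]
	}
	key, ok = v.(string) // comma-ok form never panics
	return key, ok
}

func main() {
	update := docChange{
		NewVal: map[string]interface{}{"id": "a1"},
		OldVal: map[string]interface{}{"id": "a1"},
	}
	deleted := docChange{OldVal: map[string]interface{}{"id": "b2"}}
	numeric := docChange{NewVal: map[string]interface{}{"id": 42}}

	for _, c := range []docChange{update, deleted, numeric} {
		key, ok := changedKey(c, "id")
		fmt.Printf("key=%q ok=%v\n", key, ok)
	}
}
```

With a helper like this, the loop body could skip the cache delete whenever `ok` is false instead of asserting directly.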

dancannon · 2016-07-27T23:31:22Z

~~Ah so this is something that has recently changed, that should help track this down, do you remember what version you were on before?~~

Sorry, I didn't see the comment with the versions at first!

gthmac · 2016-07-27T23:56:27Z

My previous version was: v1.3.2 (RethinkDB v2.2)

dancannon · 2016-07-28T10:36:51Z

I am still not able to replicate the issue, but I will keep trying this evening. It's possible that this is caused by the keep-alive issue fixed in v2.1.2, which caused connections to get stuck.

Also, does the panic occur as soon as your application starts, or only after it has been running for a while? Thanks for being so patient while I look into this, and sorry again for the trouble this is causing.

gthmac · 2016-07-29T01:57:36Z

Hi Dan,

I personally do not think that this crash is related to the connection loss issue, for the following reason. The main crashing code is based on an event bus mechanism: an event gets created, a changefeed is opened on that single event record, the event is processed (which takes from milliseconds to, rarely, seconds), and then the event record is updated, which triggers the changefeed. Hence, I don't believe I lose the connection in this time frame.

Secondly, the application does not crash at startup but after a certain amount of time (minutes to hours). It looks to me as if the result structure of the feed may be causing the problem.

I can add additional debug info or try to come up with a small code piece to reproduce the problem on your end, if that would help.

dancannon · 2016-07-29T22:37:17Z

I think I may have figured out what is causing this issue. I have pushed a possible fix to the branch hotfix/cursor-panic. Would it be possible for you to test your application with this branch, since I have not been able to reproduce the problem myself?

gthmac · 2016-07-31T20:24:09Z

Awesome, Dan - thank you very much. I will be able to try this on Tuesday and will give you immediate feedback. Thanks again.

dancannon · 2016-08-09T20:20:13Z

Hey @gthmac, did the new release fix the issue for you? (I assume so but it never hurts to double check 😄 )

gthmac · 2016-08-11T09:44:17Z

Hi Dan, my sincere apologies for not having gotten back to you yet. I have been sick and haven't done the testing. I hope to do it at the end of this week.

Thanks again for your efforts, Dan.

dancannon · 2016-08-11T10:13:58Z

Sorry to hear that, hope you recover soon! There's no rush to do the testing if you are not well, take your time 😄 Thanks again.

dancannon · 2016-08-17T21:03:10Z

I will close this issue as it has been open for quite a while now. If you see the issue again, let me know and we can reopen it.

dancannon added t:bug p:high s:investigating labels Jul 26, 2016

dancannon modified the milestone: v2.2.0 Aug 4, 2016

dancannon added s:complete and removed s:investigating labels Aug 17, 2016

dancannon closed this as completed Aug 17, 2016
