
Crash #20

Closed
ReneKroon opened this issue Jun 17, 2019 · 3 comments

Comments

@ReneKroon
Owner

commented Jun 17, 2019

Jun 17 03:32:09 pro-boqs-app-003 boqs: goroutine 484637918 [running]:
Jun 17 03:32:09 pro-boqs-app-003 boqs: net/http.(*conn).serve.func1(0xc029a32a00)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/net/http/server.go:1769 +0x139
Jun 17 03:32:09 pro-boqs-app-003 boqs: panic(0xc8b580, 0x1517740)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/runtime/panic.go:522 +0x1b5
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.priorityQueue.Less(0xc0009cd800, 0x47, 0x100, 0x47, 0x23, 0x5d06ed99)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/priority_queue.go:43 +0x16e
Jun 17 03:32:09 pro-boqs-app-003 boqs: container/heap.up(0xef6c60, 0xc000968480, 0x47)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/container/heap/heap.go:93 +0x9f
Jun 17 03:32:09 pro-boqs-app-003 boqs: container/heap.Fix(0xef6c60, 0xc000968480, 0x47)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /usr/local/go/src/container/heap/heap.go:86 +0x94
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*priorityQueue).update(...)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/priority_queue.go:18
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*Cache).getItem(0xc0001852d0, 0xc0162dbd60, 0x1c, 0x0, 0xed0f58cf0)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/cache.go:45 +0xeb
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache.(*Cache).Get(0xc0001852d0, 0xc0162dbd60, 0x1c, 0x0, 0x0, 0xeeea00)
Jun 17 03:32:09 pro-boqs-app-003 boqs: /go/src/github.com/bolcom/boqs/vendor/github.com/ReneKroon/ttlcache/cache.go:173 +0x50
Jun 17 03:32:09 pro-boqs-app-003 boqs: github.com/bolcom/boqs/internal/dao/cluster.(*PgTopologyCache).SubscribedQueues(0xc000972200, 0xc0162dbd60, 0x1c, 0x183ad9b0, 0xed176645f, 0x152b840, 0xc034089290, 0x3, 0x0, 0x120, ...)
@ReneKroon

Owner Author

commented Jun 17, 2019

Crash in a configuration that was not explicitly tested. That configuration has now been added to the tests.

As it turns out, the switch to RWLock was not a good one. There are data races all over the place if the TTL is extended on a cache hit. Another issue comes from

cache.expirationNotification <- true

This channel is intended to be asynchronous, but the send is blocking, and can therefore cause blocking conditions in conjunction with the locks on Get/Set calls and the cleanup routine.
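For illustration, a minimal sketch (not the library's actual code; the cache and field names only mirror the snippet above) of making that send non-blocking with select/default, so it never stalls while the cache lock is held:

package main

import "sync"

type cache struct {
	mutex                  sync.Mutex
	expirationNotification chan bool
}

func (c *cache) touch() {
	c.mutex.Lock()
	defer c.mutex.Unlock()

	// Blocking variant (problematic while the lock is held):
	// c.expirationNotification <- true

	// Non-blocking variant: notify if anyone can receive, otherwise move on.
	select {
	case c.expirationNotification <- true:
	default:
	}
}

func main() {
	c := &cache{expirationNotification: make(chan bool)}
	c.touch() // does not block even though nobody is receiving
}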

All races have been fixed.
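For context, a minimal sketch of the kind of race the RWLock switch allows when the read path extends the TTL on a hit. The types here are illustrative, not the library's internals; running this under go run -race (or go test -race) reports the concurrent write:

package main

import (
	"sync"
	"time"
)

type item struct {
	data    string
	expires time.Time
}

type rwCache struct {
	mu    sync.RWMutex
	items map[string]*item
}

// Get extends the TTL on a hit while only holding the read lock.
// Two concurrent Gets may both write item.expires at the same time.
func (c *rwCache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	it, ok := c.items[key]
	if !ok {
		return "", false
	}
	it.expires = time.Now().Add(time.Minute) // write under a read lock: racy
	return it.data, true
}

func main() {
	c := &rwCache{items: map[string]*item{"k": {data: "v"}}}
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Get("k")
		}()
	}
	wg.Wait()
}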

@ReneKroon ReneKroon referenced this issue Jun 17, 2019
@froodian

commented Jun 17, 2019

sorry to introduce instability, thanks for the fix

@ReneKroon

Owner Author

commented Jun 18, 2019

No problem, as I didn't see the issues in the change either, but it did surface some design guidelines:

  • Since there are no recursive locks, locking should always be done over whole 'top-level' operations on the Cache struct
  • Some parts of the code released a lock and then acquired a second one during the same operation. This is also a bad pattern, as it effectively invalidates the data retrieved under the first lock (see the sketch below).
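A minimal sketch of that second guideline, with hypothetical names (kvCache, increment), contrasting release-and-reacquire with holding one lock over the whole top-level operation:

package main

import "sync"

type kvCache struct {
	mu    sync.Mutex
	items map[string]int
}

// Problematic pattern: release the lock between the read and the write.
// Another goroutine can modify the map in between, so the value read
// under the first lock is stale by the time the second lock is taken.
func (c *kvCache) incrementRacy(key string) {
	c.mu.Lock()
	v := c.items[key]
	c.mu.Unlock()

	c.mu.Lock()
	c.items[key] = v + 1 // may overwrite a concurrent update
	c.mu.Unlock()
}

// Preferred pattern: hold the lock across the whole top-level operation.
func (c *kvCache) increment(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = c.items[key] + 1
}

func main() {
	c := &kvCache{items: map[string]int{}}
	c.increment("hits")
	c.incrementRacy("hits")
}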
@ReneKroon ReneKroon closed this Jun 18, 2019