Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

page already freed #72

Closed
djadala opened this issue Dec 22, 2017 · 11 comments
Closed

page already freed #72

djadala opened this issue Dec 22, 2017 · 11 comments

Comments

@djadala
Copy link

djadala commented Dec 22, 2017

Openig same issue as boltdb/bolt#731, because coreos/bbolt also crashes in same way:

package main

import (
	"log"
	"math/rand"

	"github.com/coreos/bbolt"
	// "github.com/boltdb/bolt"
)

var BC = make(chan []byte)

func main() {

	go Add(BC, "/opt/test.db")

	s := make([]byte, 30)
	for i := 0; i < 100000000; i++ {
		rand.Read(s)
		BC <- s
	}

	close(BC)

}

func Add(c chan []byte, bfile string) {
	const bname = `bc41`

	db, err := bolt.Open(bfile, 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	err = db.Update(func(tx *bolt.Tx) error {
		_, err := tx.CreateBucketIfNotExists([]byte(bname))
		if err != nil {
			return err //fmt.Errorf("create bucket: %s", err)
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
		panic(err)
	}

	more := true
	for more {
		err := db.Update(func(tx *bolt.Tx) error {
			b := tx.Bucket([]byte(bname))
			for i := 0; i < 100000; i++ {
				v := <-c
				if v == nil {
					more = false
					return nil
				}
				err = b.Put(v, []byte("0"))
				if err != nil {
					return err
				}
			}
			return nil
		})
		if err != nil {
			log.Fatal(err)
			panic(err)
		}
	}
}
panic: page 3955 already freed

goroutine 5 [running]:
github.com/coreos/bbolt.(*freelist).free(0xc420074210, 0x7, 0x7fbb550ef000)
	/home/jambo/golang/src/github.com/coreos/bbolt/freelist.go:143 +0x4ad
github.com/coreos/bbolt.(*node).spill(0xc42163d810, 0xc4209c7800, 0x55f920)
	/home/jambo/golang/src/github.com/coreos/bbolt/node.go:363 +0x210
github.com/coreos/bbolt.(*node).spill(0xc42163d7a0, 0x0, 0x0)
	/home/jambo/golang/src/github.com/coreos/bbolt/node.go:350 +0xbf
github.com/coreos/bbolt.(*node).spill(0xc42163d490, 0xc4229c36e0, 0xc420045ab0)
	/home/jambo/golang/src/github.com/coreos/bbolt/node.go:350 +0xbf
github.com/coreos/bbolt.(*Bucket).spill(0xc420052580, 0xc4229c3600, 0xc420045d28)
	/home/jambo/golang/src/github.com/coreos/bbolt/bucket.go:568 +0x4d3
github.com/coreos/bbolt.(*Bucket).spill(0xc4200900f8, 0x2116cdd17, 0x5721a0)
	/home/jambo/golang/src/github.com/coreos/bbolt/bucket.go:535 +0x417
github.com/coreos/bbolt.(*Tx).Commit(0xc4200900e0, 0x0, 0x0)
	/home/jambo/golang/src/github.com/coreos/bbolt/tx.go:160 +0x129
github.com/coreos/bbolt.(*DB).Update(0xc420088000, 0xc42006bf98, 0x0, 0x0)
	/home/jambo/golang/src/github.com/coreos/bbolt/db.go:674 +0xf2
main.Add(0xc420072060, 0x4ea01b, 0x12)
	/home/jambo/go/src/jambo/tests/error1/error1.go:50 +0x1ac
created by main.main
	/home/jambo/go/src/jambo/tests/error1/error1.go:15 +0x5a
go version 
go version go1.9.1 linux/amd64

uname -a
Linux bee 4.9.0-0.bpo.4-amd64 #1 SMP Debian 4.9.51-1~bpo8+1 (2017-10-17) x86_64 GNU/Linux
@xiang90
Copy link
Contributor

xiang90 commented Dec 22, 2017

/cc @SaranBalaji90

would you like to investigate into this one? this might be a good start for you to understand the storage layer of etcd.

@iaburton
Copy link

iaburton commented Jan 2, 2018

Hello! Just a random passerby, but I wanted to point something out (disclaimer, I haven't run this code).

It looks like there is a race condition in the code that could be causing this. The byte slice being sent/received from the channel, appears to be racing between rand.Read and b.Put. While the channel send makes a copy of the slice header, the underlying array is not copied, so rand.Read is writing into the slice while b.Put is reading from it.

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

@iaburton

Even if there is a race in application code, bolt should not panic right? Blotdb panics with an internal inconsistent state.

@djadala
Copy link
Author

djadala commented Jan 3, 2018

@iaburton
Oh, i am blind, of course there is race condition.
And i should have listened to my grandmother when she told me to run under race detector.

@xiang90

Even if there is a race in application code, bolt should not panic right?

When internal state is inconsistent, panic is best thing to do.

There is modified example without race, that works fine:

package main

import (
	"log"
	"math/rand"

	"github.com/coreos/bbolt"
	// "github.com/boltdb/bolt"
)

var BC = make(chan []byte)

func main() {

	go Add(BC, "/opt/test.db")

	s1 := make([]byte, 30)
	s2 := make([]byte, 30)
	for i := 0; i < 100000000; i++ {
		rand.Read(s1)
		BC <- s1
		s1,s2 = s2,s1
	}

	close(BC)

}

func Add(c chan []byte, bfile string) {
	const bname = `bc41`

	db, err := bolt.Open(bfile, 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	err = db.Update(func(tx *bolt.Tx) error {
		_, err := tx.CreateBucketIfNotExists([]byte(bname))
		if err != nil {
			return err //fmt.Errorf("create bucket: %s", err)
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
		panic(err)
	}

	more := true
	for more {
		err := db.Update(func(tx *bolt.Tx) error {
			b := tx.Bucket([]byte(bname))
			for i := 0; i < 100000; i++ {
				v := <-c
				if v == nil {
					more = false
					return nil
				}
				err = b.Put(v, []byte("0"))
				if err != nil {
					return err
				}
			}
			return nil
		})
		if err != nil {
			log.Fatal(err)
			panic(err)
		}
	}
}

Please close this issue.

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

Ideally races happen in application code will not panic bolt, and lead it to a corrupted and unrecoverable state. This issue is a common user buffer problem. The proper way to handle it is to copy, check before use any user buffer. At least, current behavior should be documented. @djadala

@iaburton
Copy link

iaburton commented Jan 3, 2018

@xiang90

I'm not sure what else bolt should do here in this kind of race, but I think the current behavior is fine, it certainly makes the problem more visible.

The implementation of b.Put does copy the key before storing it, which isn't mentioned in the documentation, perhaps that's worthy of including. The docs do say that the value must be valid the entire time, so it sort of implies that the key doesn't have to.

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

I'm not sure what else bolt should do here in this kind of race, but I think the current behavior is fine, it certainly makes the problem more visible.

The problem is that no one will enable race detector for production usage. I am fine with boltdb panics itself when there is a race. I am not so happy about the state being corrupted with page 3955 already freed, which is hard to recover from.

@xiang90
Copy link
Contributor

xiang90 commented Jan 3, 2018

The docs do say that the value must be valid the entire time, so it sort of implies that the key doesn't have to.

I re-read the doc. It does mention this accurately. Thanks for pointing this out.

@iaburton
Copy link

iaburton commented Jan 3, 2018

The problem is that no one will enable race detector for production usage. I am fine with boltdb panics itself when there is a race. I am not so happy about the state being corrupted with page 3955 already freed, which is hard to recover from.

True they won't for production, but they should during testing. And that is unfortunate I agree. There is another issue on the original bolt about an integrity checking function, perhaps that might be a solution. I cannot think of a way to keep this from happening (at any point during runtime) while also not penalizing correctly behaving applications. I don't think most packages (depending on what they do) should concern themselves with the application racing, so long as they don't race (the packages), and their API is reasonable. If the application races and not the package, that's up to the programmer of the "main" application to detect/fix/handle (this includes if the application race adversely effects the packages it imports, as it did here).

Granted, I haven't used bolt since the fork so I'm just brainstorming, something may have changed I'm not aware of 😄

@djadala
Copy link
Author

djadala commented Jan 3, 2018

panic message can be changed from:
page x already freed
to something like:
page x already freed, most probably race in program, please run under race detector
?

@djadala
Copy link
Author

djadala commented Jan 3, 2019

As this is not bug in bolt, but race condition in programs that use bolt, i close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants