Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add surgery clear-page command #389

Merged
merged 1 commit into from
Jan 20, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
59 changes: 58 additions & 1 deletion cmd/bbolt/surgery_commands.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,8 @@ func (cmd *surgeryCommand) Run(args ...string) error {
return newRevertMetaPageCommand(cmd).Run(args[1:]...)
case "copy-page":
return newCopyPageCommand(cmd).Run(args[1:]...)
case "clear-page":
return newClearPageCommand(cmd).Run(args[1:]...)
default:
return ErrUnknownCommand
}
Expand Down Expand Up @@ -225,7 +227,7 @@ func (cmd *copyPageCommand) Run(args ...string) error {
return fmt.Errorf("copyPageCommand failed: %w", err)
}

fmt.Fprintf(cmd.Stdout, "The page %d was copied to page %d", srcPageId, dstPageId)
fmt.Fprintf(cmd.Stdout, "The page %d was copied to page %d\n", srcPageId, dstPageId)
return nil
}

Expand All @@ -241,3 +243,58 @@ at dstPageId in DST.
The original database is left untouched.
`, "\n")
}

// clearPageCommand represents the "surgery clear-page" command execution.
type clearPageCommand struct {
*surgeryCommand
}

// newClearPageCommand returns a clearPageCommand.
func newClearPageCommand(m *surgeryCommand) *clearPageCommand {
c := &clearPageCommand{}
c.surgeryCommand = m
return c
}

// Run executes the command.
func (cmd *clearPageCommand) Run(args ...string) error {
// Parse flags.
fs := flag.NewFlagSet("", flag.ContinueOnError)
help := fs.Bool("h", false, "")
if err := fs.Parse(args); err != nil {
return err
} else if *help {
fmt.Fprintln(cmd.Stderr, cmd.Usage())
return ErrUsage
}

if err := cmd.parsePathsAndCopyFile(fs); err != nil {
return fmt.Errorf("clearPageCommand failed to parse paths and copy file: %w", err)
}

// Read page id.
pageId, err := strconv.ParseUint(fs.Arg(2), 10, 64)
if err != nil {
return err
}

if err := surgeon.ClearPage(cmd.dstPath, guts_cli.Pgid(pageId)); err != nil {
return fmt.Errorf("clearPageCommand failed: %w", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make it make it another command in surgeon ?

I imagine it exactly as a place to land this types of 'operations' and CLI should be free of deep business logic assumptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean command surgery write-page?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should put the logic within this file:

internal/surgeon/surgeon.go

So have there a method:

func ClearPage(path string, pgid pgId) error {...}

and call it from the CLI rather than move the WritePage method into guts_cli.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partially makes sense to me.

Added ClearPage in surgeon package. But I still think it makes more sense to get both ReadPage and WritePage in package guts_cli.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ReadPage and WritePage are more low-level, and it'd better to put them in guts_cli. While CopyPage and ClearPage is relatively high-level as compared to ReadPage & WritePage, so it's OK to put them in surgeon package. .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not proud of ReadPage & WritePage implementation to make them part of shared library.

They are very inefficient in generic usage scenario - i.e. they shouldn't be used to read/write multiple pages. For this reason (and their simplicity) I though about them as private (yes: should be lower case) part of surgeon - not something for a wider consumption in other use-cases.

Second idea is that some day guts_cli.go and page.go should get merged in a single file instead of 2 sibling implementations of the same data model. Thus I would prefer avoid further divergence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second idea is that some day guts_cli.go and page.go should get merged in a single file instead of 2 sibling implementations of the same data model. Thus I would prefer avoid further divergence.

Thanks @ptabor . This point totally makes sense to me. We need to refactor the design/implementation, of course in separate PRs.

They are very inefficient in generic usage scenario - i.e. they shouldn't be used to read/write multiple pages. For this reason (and their simplicity) I though about them as private (yes: should be lower case) part of surgeon - not something for a wider consumption in other use-cases.

Yes, I am thinking the same thing, I mean the behavior of reading/writing single or multiple pages. It should be fine for now because the implementation of ReadPage and WritePage is Symmetrical. Both of them supporting operating multiple pages.

I'm not proud of ReadPage & WritePage implementation to make them part of shared library.

The implementation of gut_cli package has too much overlap with the existing bbolt package, we really need to merge them. It's the next next step we can do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the page.overflow > 0 , then it's dangerous to write only one of the pages. If we really need to do this, it definitely should be internal functions/methods. Let's investigate/do this separately.


fmt.Fprintf(cmd.Stdout, "Page (%d) was cleared\n", pageId)
return nil
}

// Usage returns the help message.
func (cmd *clearPageCommand) Usage() string {
return strings.TrimLeft(`
usage: bolt surgery clear-page SRC DST pageId

ClearPage copies the database file at SRC to a newly created database
file at DST. Afterwards, it clears all elements in the page at pageId
in DST.

The original database is left untouched.
`, "\n")
}
35 changes: 34 additions & 1 deletion cmd/bbolt/surgery_commands_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ func TestSurgery_CopyPage(t *testing.T) {

defer requireDBNoChange(t, dbData(t, srcPath), srcPath)

// revert the meta page
// copy page 3 to page 2
t.Log("copy page 3 to page 2")
dstPath := filepath.Join(t.TempDir(), "dstdb")
m := NewMain()
Expand All @@ -87,6 +87,39 @@ func TestSurgery_CopyPage(t *testing.T) {
assert.Equal(t, pageDataWithoutPageId(srcPageId3Data), pageDataWithoutPageId(dstPageId2Data))
}

// TODO(ahrtr): add test case below for `surgery clear-page` command:
// 1. The page is a branch page. All its children should become free pages.
func TestSurgery_ClearPage(t *testing.T) {
pageSize := 4096
db := btesting.MustCreateDBWithOption(t, &bolt.Options{PageSize: pageSize})
srcPath := db.Path()

// Insert some sample data
t.Log("Insert some sample data")
err := db.Fill([]byte("data"), 1, 20,
func(tx int, k int) []byte { return []byte(fmt.Sprintf("%04d", k)) },
func(tx int, k int) []byte { return make([]byte, 10) },
)
require.NoError(t, err)

defer requireDBNoChange(t, dbData(t, srcPath), srcPath)

// clear page 3
t.Log("clear page 3")
dstPath := filepath.Join(t.TempDir(), "dstdb")
m := NewMain()
err = m.Run("surgery", "clear-page", srcPath, dstPath, "3")
require.NoError(t, err)

// The page 2 should have exactly the same data as page 3.
t.Log("Verify result")
dstPageId3Data := readPage(t, dstPath, 3, pageSize)

p := guts_cli.LoadPage(dstPageId3Data)
assert.Equal(t, uint16(0), p.Count())
assert.Equal(t, uint32(0), p.Overflow())
}

func readPage(t *testing.T, path string, pageId int, pageSize int) []byte {
dbFile, err := os.Open(path)
require.NoError(t, err)
Expand Down
31 changes: 31 additions & 0 deletions internal/guts_cli/guts_cli.go
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,10 @@ func (p *Page) Overflow() uint32 {
return p.overflow
}

func (p *Page) String() string {
return fmt.Sprintf("ID: %d, Type: %s, count: %d, overflow: %d", p.id, p.Type(), p.count, p.overflow)
}

// DO NOT EDIT. Copied from the "bolt" package.

// TODO(ptabor): Make the page-types an enum.
Expand Down Expand Up @@ -178,6 +182,14 @@ func (p *Page) SetId(target Pgid) {
p.id = target
}

func (p *Page) SetCount(target uint16) {
p.count = target
}

func (p *Page) SetOverflow(target uint32) {
p.overflow = target
}

// DO NOT EDIT. Copied from the "bolt" package.
type BranchPageElement struct {
pos uint32
Expand Down Expand Up @@ -276,6 +288,25 @@ func ReadPage(path string, pageID uint64) (*Page, []byte, error) {
return p, buf, nil
}

func WritePage(path string, pageBuf []byte) error {
page := LoadPage(pageBuf)
pageSize, _, err := ReadPageAndHWMSize(path)
if err != nil {
return err
}
expectedLen := pageSize * (uint64(page.Overflow()) + 1)
if expectedLen != uint64(len(pageBuf)) {
return fmt.Errorf("WritePage: len(buf):%d != pageSize*(overflow+1):%d", len(pageBuf), expectedLen)
}
f, err := os.OpenFile(path, os.O_WRONLY, 0)
if err != nil {
return err
}
defer f.Close()
_, err = f.WriteAt(pageBuf, int64(page.Id())*int64(pageSize))
return err
}

// ReadPageAndHWMSize reads Page size and HWM (id of the last+1 Page).
// This is not transactionally safe.
func ReadPageAndHWMSize(path string) (uint64, Pgid, error) {
Expand Down
29 changes: 13 additions & 16 deletions internal/surgeon/surgeon.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,6 @@ package surgeon

import (
"fmt"
"os"

"go.etcd.io/bbolt/internal/guts_cli"
)

Expand All @@ -13,25 +11,24 @@ func CopyPage(path string, srcPage guts_cli.Pgid, target guts_cli.Pgid) error {
return err1
}
p1.SetId(target)
return WritePage(path, d1)
return guts_cli.WritePage(path, d1)
}

func WritePage(path string, pageBuf []byte) error {
page := guts_cli.LoadPage(pageBuf)
pageSize, _, err := guts_cli.ReadPageAndHWMSize(path)
func ClearPage(path string, pgId guts_cli.Pgid) error {
// Read the page
p, buf, err := guts_cli.ReadPage(path, uint64(pgId))
if err != nil {
return err
return fmt.Errorf("ReadPage failed: %w", err)
}
if pageSize != uint64(len(pageBuf)) {
return fmt.Errorf("WritePage: len(buf)=%d != pageSize=%d", len(pageBuf), pageSize)
}
f, err := os.OpenFile(path, os.O_WRONLY, 0)
if err != nil {
return err

// Update and rewrite the page
p.SetCount(0)
p.SetOverflow(0)
if err := guts_cli.WritePage(path, buf); err != nil {
return fmt.Errorf("WritePage failed: %w", err)
}
defer f.Close()
_, err = f.WriteAt(pageBuf, int64(page.Id())*int64(pageSize))
return err

return nil
}

// RevertMetaPage replaces the newer metadata page with the older.
Expand Down
2 changes: 1 addition & 1 deletion internal/tests/tx_check_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ func TestTx_RecursivelyCheckPages_CorruptedLeaf(t *testing.T) {
require.NoError(t, err)
require.Positive(t, p.Count(), "page must be not empty")
p.LeafPageElement(p.Count() / 2).Key()[0] = 'z'
require.NoError(t, surgeon.WritePage(db.Path(), pbuf))
require.NoError(t, guts_cli.WritePage(db.Path(), pbuf))

db.MustReopen()
require.NoError(t, db.Update(func(tx *bolt.Tx) error {
Expand Down