Reformat journal index #7780
Conversation
Generally seems fine. A few comments about recovery and the bootstrap process.
As discussed offline, I'm not sure how this will impact existing databases with large journals – the loss of their index will cause a high one-time startup cost on upgrade.
go/store/nbs/journal_index_record.go
Outdated
recTag, err := rd.ReadByte()
if err != nil {
    if errors.Is(err, io.EOF) {
        return nil
We need to get the number of bytes read after the last successful indexRecMeta callback back to the caller, so that they can file.Seek() and file.Truncate() the output file to start at that point. Then we can start writing new records without causing a CRC failure in a later bootstrap.
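The seek/truncate recovery described above can be sketched as follows. This is a minimal illustration, not Dolt's actual on-disk format: the record layout (1-byte tag, 4-byte big-endian length, payload, 4-byte CRC) and the function names are hypothetical, but the idea is the same — scan forward remembering the offset just past the last record whose CRC checks out, then truncate the torn tail before appending new records.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"os"
)

// validPrefixLen scans records of a hypothetical layout
// [1-byte tag][4-byte big-endian length][payload][4-byte CRC of payload]
// and returns the number of bytes up to and including the last record
// whose CRC checks out. Anything beyond that is a torn tail.
func validPrefixLen(buf []byte) int64 {
	var good int64
	off := int64(0)
	for off+5 <= int64(len(buf)) {
		n := int64(binary.BigEndian.Uint32(buf[off+1 : off+5]))
		end := off + 5 + n + 4
		if end > int64(len(buf)) {
			break // header promises more bytes than exist: truncated record
		}
		payload := buf[off+5 : off+5+n]
		crc := binary.BigEndian.Uint32(buf[off+5+n : end])
		if crc32.ChecksumIEEE(payload) != crc {
			break // corrupt record
		}
		off = end
		good = off
	}
	return good
}

func main() {
	// Build two well-formed records plus a torn tail.
	rec := func(payload []byte) []byte {
		out := []byte{1} // hypothetical "lookup" tag
		out = binary.BigEndian.AppendUint32(out, uint32(len(payload)))
		out = append(out, payload...)
		return binary.BigEndian.AppendUint32(out, crc32.ChecksumIEEE(payload))
	}
	data := append(rec([]byte("a")), rec([]byte("bb"))...)
	file := append(append([]byte{}, data...), 1, 0, 0) // torn tail

	f, _ := os.CreateTemp("", "journal-index")
	defer os.Remove(f.Name())
	f.Write(file)

	good := validPrefixLen(file)
	// Drop the torn tail so new appends can't trip a CRC check
	// in a later bootstrap, then position the writer at the cut.
	f.Truncate(good)
	f.Seek(good, 0)
	fmt.Println(good == int64(len(data))) // true
}
```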
Same for the points down below where we return nil from ErrUnexpectedEOF.
A bit of context that I was missing: the way this used to work was, if anything went wrong, we deleted the index and exited the process. The next startup loads all chunks into novel, and the next batch flush writes all chunks into a fresh index. Startup/clear/exit is the repeatable retry loop if anything goes wrong. Shitty for the next index flush, but it works.
The rewrite has the different semantics you pointed out. Now we are concerned with hanging index lookups after the last metadata record. I added a seek/truncate, which clears the hanging lookups. We have to do an extra step where we add the hanging lookups back to the index for consistency. So I basically ignore most errors here – missing/malformed/io.EOF – and we just truncate and rebuild. I think that should be equally repeatable, as long as there isn't a pathological loop where the index can't get a foothold for some reason.
One thing that is maybe a bit annoying in both versions is that we could rewrite the entire index, the server quits before writing a root value, and then the next startup has to do it all over again: reread all of the lookups, delete the index because there's no batch metadata, and then rebuild the same index again. A 45-minute startup becoming like 3 hours would be annoying.
go/store/nbs/journal_index_record.go
Outdated
batch = nil
batchCrc = 0
default:
    return fmt.Errorf("expected record to start with a chunk or metadata type tag")
In some ways different, in some ways...maybe not so different? Why is this an error here, but ErrUnexpectedEOF is not one? If there could be garbage beyond where we read...
go/store/nbs/journal_writer.go
Outdated
if _, err = wr.index.Write(buf); err != nil {
    return err
}
writeJournalIndexMeta(wr.indexWriter, root, wr.indexed, end, wr.batchCrc)
Update the comment here to be more accurate.
Change the way we write journal index lookups. Each write appends a lookup to a bufio.Writer that lazily writes to disk, and after some increment we flush a CRC/root value record for consistency-checking the index during bootstrap. This avoids big stalls from flushing a batch of index records. We also only write an addr16 now, because that's what we load into the default chunk address map.
Databases with the older format will pay a one-time startup penalty to rewrite the journal index. In testing this appears to be 5-10% of the import time for the database.