Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Write-ahead Logging #261

Merged
merged 43 commits into from
Jul 17, 2023
Merged

Implement Write-ahead Logging #261

merged 43 commits into from
Jul 17, 2023

Conversation

HHoflittlefish777
Copy link
Contributor

@HHoflittlefish777
Copy link
Contributor Author

HHoflittlefish777 commented Mar 19, 2023

There are some things that need help:

  • During disk flushing, in order to avoid blocking writes, I cloned the buffer so that I can continue to accept write requests while flushing the disk. However, during asynchronous callbacks, all channels will be notified. At this time, write requests that only write to the buffer should not receive this request, but they receive it.how can I resolve this issue?

@codecov-commenter
Copy link

codecov-commenter commented Mar 19, 2023

Codecov Report

Merging #261 (56ae9db) into main (3c3ef3c) will decrease coverage by 4.32%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main     #261      +/-   ##
==========================================
- Coverage   44.25%   39.94%   -4.32%     
==========================================
  Files          93      100       +7     
  Lines        9814    10868    +1054     
==========================================
- Hits         4343     4341       -2     
- Misses       5053     6109    +1056     
  Partials      418      418              
Impacted Files Coverage Δ
pkg/run/channel_closer.go 0.00% <0.00%> (ø)
pkg/run/channel_group_closer.go 0.00% <0.00%> (ø)
pkg/wal/wal.go 0.00% <0.00%> (ø)

... and 5 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@hanahmily
Copy link
Contributor

There are some things that need help:

  • During disk flushing, in order to avoid blocking writes, I cloned the buffer so that I can continue to accept write requests while flushing the disk. However, during asynchronous callbacks, all channels will be notified. At this time, write requests that only write to the buffer should not receive this request, but they receive it.how can I resolve this issue?

You should have switched to a new flushChannel on switching the buffer. Theoretically, a flush channel sends a notice which indicates its corresponding buffer gets flushed. Based on that, the flush channel seems a field in the buffer structure instead.

@hanahmily
Copy link
Contributor

@HHoflittlefish777 Please fix test failures.

@HHoflittlefish777
Copy link
Contributor Author

HHoflittlefish777 commented Mar 20, 2023

There are some things that need help:

  • During disk flushing, in order to avoid blocking writes, I cloned the buffer so that I can continue to accept write requests while flushing the disk. However, during asynchronous callbacks, all channels will be notified. At this time, write requests that only write to the buffer should not receive this request, but they receive it.how can I resolve this issue?

You should have switched to a new flushChannel on switching the buffer. Theoretically, a flush channel sends a notice which indicates its corresponding buffer gets flushed. Based on that, the flush channel seems a field in the buffer structure instead.

I have two channel field in the Log struct, should I move them into buffer struct?

@hanahmily
Copy link
Contributor

I have two channel field in the Log struct, should I move them into buffer struct?

Make sense to me. I submitted the Buffer patch(#262), in which you could find a similar structure. I hope you could get some inspiration from it.

pkg/wal/wal.go Outdated Show resolved Hide resolved
Copy link
Contributor

@hanahmily hanahmily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a quick look at the implementation. There are several channels to sync operations. The std testing framework can't handle this pattern appropriately. I recommend :

There is a comprehensive test case which you could get some inspiration from:
https://github.com/apache/skywalking-banyandb/blob/359fd2a84a525351a8fc0e2313b31f7e601f88e7/banyand/tsdb/buffer_test.go

pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal_test.go Outdated Show resolved Hide resolved
pkg/wal/wal.go Outdated Show resolved Hide resolved
pkg/wal/wal.go Show resolved Hide resolved
pkg/wal/wal.go Outdated
timer := time.NewTimer(bufferBatchInterval * time.Millisecond)
select {
case request := <-log.writeChannel:
bufferSize += seriesIDLength + timestampLength + len(request.data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of seriesID is not fixed.

pkg/wal/wal.go Outdated
buf := log.buffer
// Clear buffer to receive Log request.
log.newBuffer()
if log.closeFlag {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't archive synchronization through a local varable. To safely share data between goroutines is by using channel or context.

pkg/wal/wal.go Outdated
}()
}

func (log *Log) asyncBatchflush(buffer buffer, flushCh chan string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (log *Log) asyncBatchflush(buffer buffer, flushCh chan string) {
func (log *Log) asyncBatchFlush(buffer buffer, flushCh chan string) {

pkg/wal/wal.go Outdated
@@ -43,18 +128,450 @@ type WAL interface {
// Write a logging entity.
// It will return immediately when the data is written in the buffer,
// The returned function will be called when the entity is flushed on the persistent storage.
Write(seriesID common.SeriesID, timestamp time.Time, data []byte) (func(), error)
Write(seriesID []byte, timestamp time.Time, data []byte) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you drop off the returned function to get flushed event?

pkg/wal/wal.go Outdated
}

// Compression and flush.
compressionData := snappy.Encode(nil, bytesBuffer.Bytes())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of compressing the entire segment, it is recommended to separately encode/compress timestamps and values.

The pkg/encoding/xor package offers methods for encoding uint64. These can be used to encode/decode the UNIX epoch (nanosecond) of a timestamp.

For quick compression of values, snappy can be utilized.

By following this approach, there is no need for bytesBuffer during flushing. Instead, data can be written directly to workSegment.file.

@wu-sheng
Copy link
Member

wu-sheng commented May 7, 2023

This PR seems not to update for weeks. Is this still being processed?

@hanahmily
Copy link
Contributor

This PR seems not to update for weeks. Is this still being processed

Certainly, I will processed with it later, after completing load testing.

@hailin0
Copy link
Contributor

hailin0 commented May 17, 2023

I want to try to continue, Thanks

@HHoflittlefish777
Copy link
Contributor Author

@hailin0 Thanks for you interest in it, but WAL doesn't have much work to continue with, just fix the review. There is still a lot of work to be done in BanyanDB, and you can do some other work you like.

@hanahmily
Copy link
Contributor

@hailin0 Please address the issues outlined in the review. If you encounter any difficulties or have questions, feel free to reach out for assistance.

@HHoflittlefish777
The remaining tasks will be assigned to @hailin0, who is interested in contributing to the project. Completing this PR will serve as an excellent introduction for him to become familiar with the project.

@HHoflittlefish777
Copy link
Contributor Author

That is fine, Looking forward to your further contributions to BanyanDB @hailin0.

@hailin0
Copy link
Contributor

hailin0 commented May 21, 2023

thanks, i am doing

@hanahmily
Copy link
Contributor

@hailin0 any update?

pkg/run/channel_closer.go Outdated Show resolved Hide resolved
pkg/run/channel_closer_test.go Show resolved Hide resolved
Copy link
Contributor

@hanahmily hanahmily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you add benchmark to the write and rotate operations?

CHANGES.md Outdated Show resolved Hide resolved
api/common/id.go Outdated Show resolved Hide resolved
@hailin0
Copy link
Contributor

hailin0 commented Jul 6, 2023

Would you add benchmark to the write and rotate operations?

Yes, I would add benchmark testing

@hailin0
Copy link
Contributor

hailin0 commented Jul 16, 2023

image

mem
image

cpu
image
image
image

profile.zip

Copy link
Contributor

@hanahmily hanahmily left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you to everyone who participated.

@hanahmily hanahmily merged commit 482831f into apache:main Jul 17, 2023
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants