Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Compute WAL size and use it during retention size checks #651

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

dipack95
Copy link

@dipack95 dipack95 commented Jul 3, 2019

Signed-off-by: Dipack P Panjabi dpanjabi@hudson-trading.com

Compute the size of the WAL and use it while calculating if exceeding the max retention size limit.

(PR #650 broke somehow, and so this is a new one with the same changes)

wal/wal.go Outdated
if first == -1 || last == -1 {
return 0, fmt.Errorf("no segments found in WAL")
}
return int64((last - first + 1) * w.segmentSize), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about if the last segment is not complete?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, the WAL.Segments() call returns the indices of the first and last segments on disk. Considering the fact that segment files are created on disk when they're initialised, means that the function will return the index of the current, incomplete segment as last.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is right and the last segment will not be the full w.segmentSize

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I do not need to modify my WAL size calculations right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I mean here you assume that all segments are of size w.segmentSize, but in reality the last segment is not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've modified the code to now read the contents of the data/wal/ directory and add the sizes of all the files, including the checkpoints.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I preffered the other way , just need to figure out how to calculate it for the active segment.

Copy link
Author

@dipack95 dipack95 Jul 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When calculating the WAL size using the segments, it too scans the WAL dir and returns the names of the segment files, the size of which we assume to be w.segmentSize, but as @brian-brazil said, its not necessary that each of the segments is fully occupied. This way we get the true occupied size on disk, which would be more accurate, and we wouldn't have to account for the active segments' pages sizes either.

Of course, I'm open to suggestions if there's a better way of accurately determining the size that you can think of!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krasi-georgiev Were you able to take a look at an alternative approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not yet, probably next week. I haven't forgotten , just need to finish some other things I have started.

@krasi-georgiev
Copy link
Contributor

just had a quick look:

the wal.Reader tracks the total bytes processed https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/reader.go#L33.

After loading the wal with head.Init you will have the total wal size so far on disk.

after that in wal.Log can add to that total the additional added bytes

so total wal size would be the size when loading the wal + any additional added samples with wal.Log

@codesome @gouthamve what do you think?

@brian-brazil
Copy link
Contributor

Will that still work with compression?

@krasi-georgiev
Copy link
Contributor

I just double checked, and YES it will work as Reader.total tracks total bytes read from the disk and the decompression happens after that.

@krasi-georgiev
Copy link
Contributor

and for the wal.Log can track the additional bytes that go in after this:
https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L571-L581

@dipack95
Copy link
Author

Do you think it makes sense to update the WAL size after the pages have been flushed to disk as part of wal.flushPage(..)?

@krasi-georgiev
Copy link
Contributor

Why than and not when doing the wal.Log?

@dipack95
Copy link
Author

There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count. Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..) call.

@krasi-georgiev
Copy link
Contributor

There could be a possibility that a page cannot be flushed to disk, in which case the byte count that we add (I assume) immediately after optionally compressing the write buffer would differ from the actual written count.

When a flush fails Prometheus will exit with an error so the next time it start the wal size will be populated correctly. This means that we shouldn't worry about failed flushed.

Similarly, the actual number of bytes written could differ from the size of the buffer actually passed to the wal.segment.Write(..) call.

yes you are right here. Best to use n to populate the total written to disk.

https://github.com/prometheus/tsdb/blob/b40cc43958a563aa06a73b2c8d9090e66109ad1b/wal/wal.go#L471

@dipack95
Copy link
Author

When a flush fails Prometheus will exit with an error so the next time it start the wal size will be populated correctly. This means that we shouldn't worry about failed flushed.

Makes sense.

And then we could subtract the size of newly created block from the WAL size, at this line:

https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/db.go#L459

@krasi-georgiev
Copy link
Contributor

aaah , wait checkpointing deletes wal files so need to look at the code to see how to handle that as well.

@dipack95
Copy link
Author

I was unable to find a direct reference to the size of the chunks written to the block on disk. I was thinking maybe we could modify this function to return the number of chunk bytes written, the total number of bytes written (including meta, tombstone, index), and error?

https://github.com/prometheus/tsdb/blob/7dd5e177aa89828b443e7a331610958d25dc8355/compact.go#L526

This way, we could pass it up the call chain.

@krasi-georgiev
Copy link
Contributor

compaction hasn't got much relation to WAL size.
I think the wal checkpointing is the only blocker here.

@dipack95
Copy link
Author

The way that I understand it, the data is read from the head (and in turn the WAL) using the same compact(..) method, which we must take into account as it removing data from the WAL and writing it to persistent storage. Or have I misunderstood the structure?

@krasi-georgiev
Copy link
Contributor

I think there is always some duplicate data in the wal and last block. IIRC when loading the wal all samples after the maxt of the last block are ignored.

@dipack95
Copy link
Author

Okay. Based on this fact, maybe we could adopt the current approach of just reading the data/wal/ directory and using its size? As it currently seems like there a lot of forces in play when it comes to interacting with the WAL.

@krasi-georgiev
Copy link
Contributor

krasi-georgiev commented Jul 19, 2019

There would be a way to handle all cases, but it would def be more complicated.
I am not strongly against it the DirSize(.. approach and would like to hear what @bwplotka @codesome @gouthamve think.

@dipack95
Copy link
Author

dipack95 commented Aug 1, 2019

@krasi-georgiev Gentle nudge :)

@krasi-georgiev
Copy link
Contributor

Still waiting for some input from @bwplotka @codesome @gouthamve

Copy link
Contributor

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks fine in general, but I am wondering if it would error out if we are calculating size while files are being added or removed. I haven't given much thought yet it, will have a look.

wal/wal.go Outdated Show resolved Hide resolved
@dipack95
Copy link
Author

dipack95 commented Aug 2, 2019

Looks like the Windows and Linux Travis builds failed due to some permission issues.

@krasi-georgiev
Copy link
Contributor

restarted the tests.

Why not remove the duplicate DirSize and use this one for the other tests as well?

@dipack95
Copy link
Author

dipack95 commented Aug 6, 2019

@krasi-georgiev I can do that once it passes review I suppose.

Signed-off-by: Dipack P Panjabi <dpanjabi@hudson-trading.com>
@dipack95
Copy link
Author

dipack95 commented Aug 9, 2019

Considering there was no more activity on this PR, I've followed @krasi-georgiev's suggestion and converted both the WAL and the test code to use a singular DirSize(..) impl. :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants