Truncating queue files #374
Comments
Hi Rupinder,
Thank you for the question.
However, please note that this is not a supported API; it relies on the internal format of the queue-file header. The code works with the current version of chronicle-queue, but may break in future versions. If you decide to use such a method, you should test extensively to make sure that you are not accidentally truncating data from the end of the queue files. A safer way to solve the problem is simply to acquire more disk space.
Best Regards,
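The code sample referred to above did not survive in this thread, so here is a minimal, stdlib-only sketch of the general shape of such a cleanup. Note the assumptions: the `StoreFileListener` interface is stubbed locally (the real one lives in chronicle-queue), and `lastUsedByte` is a hypothetical placeholder for the unsupported header parsing the maintainer warns about.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TruncateOnRelease {

    // Stand-in for Chronicle Queue's StoreFileListener callback, declared
    // locally so this sketch compiles without the chronicle-queue dependency.
    interface StoreFileListener {
        void onReleased(int cycle, File file);
    }

    // Truncate the pre-allocated tail once the last written byte is known.
    // Swallows I/O errors so a cleanup failure never breaks the release path.
    static void truncateIfOversized(File file, long usedBytes) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            if (usedBytes < raf.length()) {
                raf.setLength(usedBytes);
            }
        } catch (IOException e) {
            System.err.println("truncate failed: " + e);
        }
    }

    // Demo helper: create a temp file pre-sized like a roll file.
    static File makeDemoFile(long length) {
        try {
            File f = File.createTempFile("queue-demo", ".cq4");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                raf.setLength(length);
            }
            return f;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // lastUsedByte would have to come from parsing the unsupported,
        // version-specific queue-file header; here we just pretend it is 4096.
        long lastUsedByte = 4096;
        File f = makeDemoFile(64 * 1024);
        StoreFileListener listener = (cycle, file) -> truncateIfOversized(file, lastUsedByte);
        listener.onReleased(0, f);
        System.out.println(f.length()); // 4096
    }
}
```

As the maintainer says above, anything that derives `lastUsedByte` from the queue-file header is version-specific and unsupported, which is exactly why this needs extensive testing before production use.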
The simplest way to reduce the waste at the end is to reduce the block size. With the default block size of 64 MB, 1 TB of disk, and daily rolling files, this waste alone would only exhaust the disk after roughly 34 years of retained data. If you reduce the block size to, say, 2 MB, you could run for about 1000 years. Note: Linux uses sparse files, so there you only ever waste about 4 KB per day.
The space wasted is worth about 8 cents per day retained on high-end redundant SSDs, so I suggest you not spend more than a few dollars of your time on it.
Peter.
Peter,
Your points are very valid. I was debating whether to do it or not, and this certainly gives me some arguments to justify not doing it.
A follow-up question, then, about the block size: what is the impact of selecting a small block size when the files have to grow? And what if the files are written using a certain block size but read with a different one? Does that cause issues? I ask because, in some cases, files are copied around systems and the reader may not use the same block size. One way would be to wrap it in another API that does not allow them to change the block size.
The block size indirectly determines the maximum safe message size. Some operations need to be written within a single block, including overlap. To be completely safe, make the block size at least 4x the maximum message size. Windows requires the mapping to be a multiple of at least 64 KB.
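The two rules just stated (at least 4x the largest message, rounded up to the 64 KB Windows mapping granularity) can be sketched as a small helper. This is an illustration of the arithmetic from this thread, not a Chronicle Queue API:

```java
public class BlockSizeHint {

    // Windows memory-maps in multiples of the 64 KB allocation granularity.
    static final long MAP_GRANULARITY = 64 * 1024;

    // Rule of thumb from the discussion: block size >= 4x the largest
    // message, rounded up to the next 64 KB multiple.
    static long suggestedBlockSize(long maxMessageBytes) {
        long atLeast = 4 * maxMessageBytes;
        return ((atLeast + MAP_GRANULARITY - 1) / MAP_GRANULARITY) * MAP_GRANULARITY;
    }

    public static void main(String[] args) {
        // Messages up to 100 KB -> 400 KB, rounded up to 448 KB (458752 bytes).
        System.out.println(suggestedBlockSize(100 * 1024)); // 458752
    }
}
```

The actual block size would then be passed to the queue builder; whether such a tight bound is wise given the jitter note below is a separate trade-off.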
As for differing block sizes, the files will end up sized according to the largest block size used. Mixing them is likely to defeat the whole purpose of doing this. By the way, if you use read-only mode, Windows won't let you memory-map a region larger than what is on disk.
Given disk space is cheap, I rarely find good cause to change the block size, and I haven't benchmarked its impact on Windows. On Linux, jitter increases if you go outside 16 to 256 MB, which is why we picked a mid-point that doesn't waste much space.
Regards, Peter.
Another thing to note: queue files compress fairly well, especially the empty sections at the end. So one could implement a cron job to zip the files from old roll cycles if space is an issue.
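The point above is easy to verify: a mostly-zero tail deflates to almost nothing. A small stdlib demonstration (the 1 MB buffer and its contents are made up for the demo, not taken from a real roll file):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class RollFileCompression {

    // Gzip a byte array in memory and return the compressed size in bytes.
    static int gzippedSize(byte[] data) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(data);
            }
            return bos.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Simulate a roll file: a little real data followed by a large
        // pre-allocated tail of zeros.
        byte[] file = new byte[1 << 20]; // 1 MB
        for (int i = 0; i < 4096; i++) {
            file[i] = (byte) (i * 31); // "real" content at the front
        }
        int compressed = gzippedSize(file);
        // The zero tail compresses away almost entirely.
        System.out.println(compressed < file.length / 100); // true
    }
}
```

In practice the cron job would just run something like `gzip` over files from roll cycles older than some cutoff; the demo only shows why that reclaims nearly all of the wasted tail.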
I have a situation where the rolling daily files see periods of high activity, so some files are large and some relatively very small. I would like to reclaim the wasted space in the files that don't use it. Is there a way to do this in the StoreListener so that when we release a file we can truncate its length to only the needed size?
This is on Windows, so rapidly growing files cause issues with the team that manages the file system.