Adds gzip compression to backup log files #16
Conversation
donovansolms
commented
Oct 27, 2015

I see the failed test on AppVeyor. I'll investigate the "The process cannot access the file because it is being used by another process" error and resolve it.

natefinch
commented
Oct 27, 2015

That's super common if you have code that tries to delete or rename a file.
donovansolms
commented
Oct 28, 2015

I've fixed the issue on Windows. The original file was being closed with a defer but renamed before the defer could execute. Let me know if you would like more tests for this feature; I'm happy to add them.
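A minimal sketch of the failure mode described above, under the assumption that the rotation code looked roughly like this (all names here are illustrative, not the PR's actual code):

```go
package rotation

import "os"

// The deferred Close only runs when the function returns, so the
// rename below happens while the handle is still open. Per the
// author's description, that ordering is what produced the Windows
// "being used by another process" error on AppVeyor.
func rotate(name, backup string) error {
	f, err := os.Open(name)
	if err != nil {
		return err
	}
	defer f.Close() // buggy ordering: file is still open at the rename

	return os.Rename(name, backup)

	// Fix: close explicitly before renaming:
	//   if err := f.Close(); err != nil { return err }
	//   return os.Rename(name, backup)
}
```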
natefinch
reviewed
Oct 28, 2015
+
+	// Create the buffer
+	var size int64 = info.Size()
+	rawbytes := make([]byte, size)
natefinch
Oct 28, 2015
Owner
You can't do that. What if the file is 500MB? You'll allocate 500MB of RAM on this one line.
You should just open a new file, wrap it in a gzip writer, and use io.Copy to copy from the old file to the new one.
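Concretely, a sketch of that streaming approach; the compressLog name matches the helper this PR adds later, but the body here is illustrative:

```go
package rotation

import (
	"compress/gzip"
	"io"
	"os"
)

// compressLog gzips filename into filename+".gz" with constant
// memory use: io.Copy streams through a small internal buffer
// instead of reading the whole file into RAM.
func compressLog(filename string) error {
	src, err := os.Open(filename)
	if err != nil {
		return err
	}
	defer src.Close()

	dst, err := os.Create(filename + ".gz")
	if err != nil {
		return err
	}
	defer dst.Close()

	gz := gzip.NewWriter(dst)
	if _, err := io.Copy(gz, src); err != nil {
		return err
	}
	// Close flushes any buffered compressed data out to dst.
	return gz.Close()
}
```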
natefinch
reviewed
Oct 28, 2015
+
+	// If a blank filename is passed
+	// return false
+	if filename == "" {
natefinch
Oct 28, 2015
Owner
You don't really need this. This is not a general function that will be used by third parties... it's used in exactly one place, and we know the filename won't be empty. Plus, os.Open will return an error if you try to open an empty filename.
natefinch
reviewed
Oct 28, 2015
@@ -205,6 +214,14 @@ func (l *Logger) openNew() error {
 		return fmt.Errorf("can't rename log file: %s", err)
 	}
+	// Compress the backup if enabled
+	if l.CompressBackups == true {
+		err := compressLog(newname)
natefinch
Oct 28, 2015
Owner
please use the single-line form of the if statement here:
if err := compressLog(newname); err != nil {
donovansolms
commented
Oct 28, 2015
I've fixed the large allocation issue, thanks for pointing that out. I also updated the code style to match the rest, using the single-line if form. Finally, I updated the test to verify that the compressed file actually contains the content originally written to it. Please let me know if I misunderstood anything.
s7v7nislands
commented
Oct 19, 2016
ping, will this be merged?

natefinch replied:

I'll take a look at this today. Sorry for the delay.
natefinch
requested changes
Oct 24, 2016
This looks mostly ok (with a couple of tweaks), but there's a piece that is missing. The problem comes with robustness - if the program goes down before finishing compressing the backup, it should figure that out when it starts back up and try again. This is a little tricky, but seems doable (see the sketch below).
Also, I have one general style comment - many of the comments you have added state what the code is doing, rather than why it's doing it. Most of the time, the what should be pretty obvious (there are exceptions, but not usually in simple code like this). I suggest removing most of the comments you added about what the code is doing, unless you think it's unclear what the code intends to do.
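Referring to the robustness point above, one sketch of a startup recovery pass; it reuses the compressLog sketch from the earlier comment, and every other name here is an assumption, not code from this PR:

```go
package rotation

import (
	"io/ioutil"
	"path/filepath"
	"strings"
)

// compressLeftovers finds backups that never finished compressing
// (e.g. the process died mid-gzip) and compresses them now. Callers
// would run this once at startup. Re-running compressLog is safe
// even if a partial .gz exists, since os.Create truncates it.
func compressLeftovers(dir, backupExt string) error {
	entries, err := ioutil.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		name := e.Name()
		// A plain backup still on disk was never fully compressed.
		if strings.HasSuffix(name, backupExt) {
			if err := compressLog(filepath.Join(dir, name)); err != nil {
				return err
			}
		}
	}
	return nil
}
```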
@@ -205,6 +211,13 @@ func (l *Logger) openNew() error {
 		return fmt.Errorf("can't rename log file: %s", err)
 	}
+	// Compress the backup if enabled
+	if l.CompressBackups == true {
natefinch
Oct 24, 2016
Owner
this should really just be if l.CompressBackups { - no need to check == true.
@@ -205,6 +211,13 @@ func (l *Logger) openNew() error {
 		return fmt.Errorf("can't rename log file: %s", err)
 	}
+	// Compress the backup if enabled
+	if l.CompressBackups == true {
+		if err := compressLog(newname); err != nil {
natefinch
Oct 24, 2016
Owner
The problem with this is that a simple log statement is now going to be blocked on gzipping a multi-megabyte log file. This should probably be run in a goroutine so it won't block the program... probably making this part of the deleteAll method (and renaming it to cleanup() or something) would be a good idea.
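A hedged sketch of that shape; cleanup is the name natefinch suggests, while the mutex field and the rest are assumptions, and compressLog is the sketch from earlier:

```go
package rotation

import "sync"

// Logger is a trimmed stand-in for lumberjack's Logger, showing only
// what this sketch needs.
type Logger struct {
	cmu sync.Mutex // serializes background compression runs
}

// cleanup compresses the freshly rotated backup off the hot path, so
// a plain log write never blocks on gzipping a large file.
func (l *Logger) cleanup(newname string) {
	go func() {
		l.cmu.Lock()
		defer l.cmu.Unlock()
		if err := compressLog(newname); err != nil {
			// Where this error should go is discussed later in the thread.
			_ = err
		}
	}()
}
```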
+	if err != nil {
+		return err
+	}
+
natefinch
Oct 24, 2016
Owner
You can (and should) defer reader.Close() here even if you explicitly close it later... multiple closes are OK on files.
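That pattern, sketched out: a second Close on an *os.File just returns an error, which the deferred call silently discards, so the defer can safely back up the explicit Close on every early-return path (the function name here is illustrative):

```go
package rotation

import "os"

func compressAndRemove(filename string) error {
	in, err := os.Open(filename)
	if err != nil {
		return err
	}
	defer in.Close() // covers every early-return path below

	// ... gzip in to filename+".gz" ...

	// Close explicitly before removing: Windows won't delete an open file.
	if err := in.Close(); err != nil {
		return err
	}
	return os.Remove(filename)
}
```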
newhook
commented
Oct 27, 2016
@donovansolms Are you planning on addressing Nate's comments? We need this code for our production systems, so if you are not going to be able to address them in short order I can take a look.
donovansolms
commented
Oct 28, 2016
Hi, yes @newhook, apologies, will get to this over the weekend.
donovansolms
commented
Oct 31, 2016
@natefinch Thanks for the review. I've made the style changes and added the robustness. I moved the compression of logs to … When a log file is missed due to the application going down while compressing, it will be compressed on the next rotate. For my use case that is acceptable; what is your opinion?

natefinch replied:

I don't think it's ok to wait until the next rotate to compress the file. The problem is that instead of a nicely compressed backup (usually, what, like 10-20% of uncompressed size?), you'll have the full-size log file there... which, if you need that space, could screw you up.

And actually, this brings up an interesting point about compressed backups... it changes the math you need to do for how much space you're using. Before, 100MB max size with 3 backups meant the max you'd use is 400MB of disk space (main log + 3 backups)... but now that same config would generally use far less... but it's nondeterministic just how much space you'd be using, since it depends on the compression ratio. I'm not a huge fan of nondeterministic results, but I'm not exactly sure how to fix that problem so that people can know the max amount of disk space their configuration will use. Any suggestions for a nice user experience here that doesn't totally break backwards compatibility would be good.

I'll review the actual changes probably tomorrow, maybe tonight if I have more time than I expect.
donovansolms
commented
Nov 1, 2016
No problem. I'll add compressing the missed file(s) on startup (probably around …). I get what you are saying about calculating the max amount of space the configuration will use; I also don't have a clear answer to that at the moment.
newhook
commented
Nov 1, 2016
I'm not sure I see a huge need to compress immediately, especially if it makes the code more complex and potentially brittle. If you change the logging parameters (unless this has changed), log files are not cleaned up until rotation, so this seems to be a similar situation.
donovansolms
commented
Nov 2, 2016
The change sounded more complicated than it turned out to be. After the log file is created or opened on startup, …
newhook
commented
Nov 15, 2016
Is there something more to do for this PR?

natefinch replied:

I'll take a look this weekend. Sorry, been slammed with other things lately.
newhook
commented
Nov 17, 2016
I've merged this and have been using it in production for a few days now. Seems to work! I think I'd prefer, though, if Close would block until any outstanding compression is complete.
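A sketch of what that could look like; the WaitGroup field is hypothetical, not part of lumberjack or this PR:

```go
package rotation

import "sync"

type Logger struct {
	compressWG sync.WaitGroup // incremented before each background gzip
}

// Close blocks until any in-flight compression finishes, so callers
// can exit knowing no half-written .gz files are left behind.
func (l *Logger) Close() error {
	l.compressWG.Wait()
	// ... then close the underlying log file ...
	return nil
}
```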
natefinch
requested changes
Nov 24, 2016
Looking better, still a few niggles. Sounds like this is something Canonical is going to want, so I may be able to work on this during work hours, which would greatly increase my ability to spend time on it, hopefully starting next week.
+	defer l.cmu.Unlock()
+	files, err := l.oldLogFiles(false)
+	if err != nil {
+		fmt.Errorf("Unable to compress log files: %s", err)
natefinch
Nov 24, 2016
Owner
This doesn't actually do anything. It returns a new error that is then discarded. Ideas on how to handle errors from here are welcome, for sure. Generally one would just log them, but... we're kind of low-level for that. Maybe let the user pass in a method to handle compression failures? I don't know.
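One possible shape for that idea, purely illustrative; this field was never adopted by lumberjack or this PR:

```go
package rotation

// Logger sketches an opt-in callback for background failures, since
// a library this low-level has no natural place to log them itself.
type Logger struct {
	// CompressError, if non-nil, receives errors from background
	// compression. Hypothetical field, not real API.
	CompressError func(error)
}

func (l *Logger) reportCompressError(err error) {
	if err != nil && l.CompressError != nil {
		l.CompressError(err)
	}
}
```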
+// directory as the current log file, sorted by ModTime. Setting
+// includeCompressed to true will include files with the given
+// compressFileExtension in the returned list
+func (l *Logger) oldLogFiles(includeCompressed bool) ([]logInfo, error) {
natefinch
Nov 24, 2016
Owner
includeCompressed is not doing what it seems like you're saying it's doing. It's not one or both, it's one or the other. When you use includeCompressed, it'll never return uncompressed files, because they don't have a matching extension. That's not exactly ok, since this could skip a file that hasn't been compressed yet.
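To make the intent concrete, a sketch of a listing that keeps both kinds of backup rather than one or the other; all names here are illustrative, not the PR's:

```go
package rotation

import (
	"os"
	"strings"
)

// filterBackups keeps plain backups unconditionally, since a plain
// file may simply not have been compressed yet, and additionally
// keeps compressed ones when includeCompressed is set.
func filterBackups(files []os.FileInfo, prefix, ext, gzExt string, includeCompressed bool) []os.FileInfo {
	var keep []os.FileInfo
	for _, f := range files {
		name := f.Name()
		if !strings.HasPrefix(name, prefix) {
			continue
		}
		switch {
		case strings.HasSuffix(name, ext):
			keep = append(keep, f) // uncompressed backup
		case includeCompressed && strings.HasSuffix(name, ext+gzExt):
			keep = append(keep, f) // compressed backup
		}
	}
	return keep
}
```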
+// compressLog compresses the log with given filename using Gzip compression
+func compressLog(filename string) error {
+
+	reader, err := os.Open(filename)
natefinch
Nov 24, 2016
Owner
This should probably not be called reader. Input or something would be better. Reader I generally reserve only for actual io.Reader values.
+	defer gzwriter.Close()
+
+	if _, err := io.Copy(gzwriter, reader); err != nil {
+		return err
donovansolms
commented
Nov 24, 2016
Thanks for the update @natefinch. We're really busy in December, so I might not get to this before the end of the year. If you say Canonical might want it, is it not worth rather going with the suggestions in Issue #13 (Compressing rolled files) so as to not lock this to gzip?

natefinch replied:

That's ok, I may tackle this myself next week.
howbazaar
commented
Apr 5, 2017
ping... any progress?

natefinch replied:

I'll take a look in the morning.

It was late when I was looking at this, so I forgot where we were. I haven't really had the time to spend on this. I think this PR could be a starting-off point, but it still needs a lot of thought. There are basically three things you have to think about: …
howbazaar
commented
Apr 5, 2017
How would you feel about juju devs picking this up?

natefinch replied:

Please feel free. I'll give reviews high priority.

This is implemented in #43.
donovansolms commented Oct 27, 2015
This PR adds gzip compression to backup log files via the CompressBackups config option. Tests added and passed. Please let me know if I missed anything.
Note: We're using this library in production (including gzipped backups) with max size as 100MB. Compression doesn't have any noticeable performance impact aside from saving a lot of drive space.
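For reference, a hedged usage sketch of the configuration described in this PR: CompressBackups is the field the diff above adds, the other fields are standard lumberjack configuration, and the filename is a placeholder:

```go
package main

import (
	"log"

	"gopkg.in/natefinch/lumberjack.v2"
)

func main() {
	log.SetOutput(&lumberjack.Logger{
		Filename:        "/var/log/myapp/app.log",
		MaxSize:         100,  // megabytes per file before rotation
		MaxBackups:      3,    // rotated files to keep
		CompressBackups: true, // gzip backups, per this PR
	})
	log.Println("hello") // writes through the rotating, compressing logger
}
```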