Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rotation may overwrite existing log file #150

Closed
newpavlov opened this issue Sep 22, 2023 · 9 comments
Closed

Rotation may overwrite existing log file #150

newpavlov opened this issue Sep 22, 2023 · 9 comments
Labels

Comments

@newpavlov
Copy link

I have an app which allow to remotely manage logs. It uses Cleanup::KeepCompressedFiles, i.e. it compresses logs immediately after rotation. It has commands to list, rotate, and copy (rotated) logs.

After cold start of the application with existing logs, the list command produces the following output:

$ log list
app_r2023-09-22_16-41-41.log.gz	993 B
app_r2023-09-22_16-41-49.log.gz	349 B
app_r2023-09-22_16-42-44.log.gz	381 B
app_r2023-09-22_16-43-10.log.gz	383 B
app_rCURRENT.log		5165 B

Next, the rotate command is executed with the following result (it returns only rotated and compresses logs):

$ log rotate
app_r2023-09-22_16-41-41.log.gz	993 B
app_r2023-09-22_16-41-49.log.gz	349 B
app_r2023-09-22_16-42-44.log.gz	381 B
app_r2023-09-22_16-43-10.log.gz	383 B
app_r2023-09-22_16-43-38.log.gz	2026 B

Listing logs result in expected result:

$ log list
app_r2023-09-22_16-41-41.log.gz	993 B
app_r2023-09-22_16-41-49.log.gz	349 B
app_r2023-09-22_16-42-44.log.gz	381 B
app_r2023-09-22_16-43-10.log.gz	383 B
app_r2023-09-22_16-43-38.log.gz	2026 B
app_rCURRENT.log		582 B

So far so good. But if I manually rotate again (more than one second later), the last 2 KB log file gets replaced with new data:

$ log rotate
app_r2023-09-22_16-41-41.log.gz	993 B
app_r2023-09-22_16-41-49.log.gz	349 B
app_r2023-09-22_16-42-44.log.gz	381 B
app_r2023-09-22_16-43-10.log.gz	383 B
app_r2023-09-22_16-43-38.log.gz	398 B

$ log list
app_r2023-09-22_16-41-41.log.gz	993 B
app_r2023-09-22_16-41-49.log.gz	349 B
app_r2023-09-22_16-42-44.log.gz	381 B
app_r2023-09-22_16-43-10.log.gz	383 B
app_r2023-09-22_16-43-38.log.gz	398 B
app_rCURRENT.log		581 B

Parts of relevant code:

fn log_list_inner(&self) -> Result<Vec<(String, u64)>, Error> {
    let selector = LogfileSelector::default()
        .with_compressed_files()
        .with_r_current();
    self.logger
        .existing_log_files(&selector)?
        .into_iter()
        .map(path2name_size)
        .collect()
}

fn log_rotate_inner(&self) -> Result<Vec<(String, u64)>, Error> {
    self.logger.flush();
    self.logger.trigger_rotation()?;
    let mut res = Vec::new();
    let selector = LogfileSelector::default().with_compressed_files();
    for _ in 0..ROTATE_SLEEP_NUM {
        std::thread::sleep(ROTATE_SLEEP);
        res = self.logger.existing_log_files(&selector)?;
        if res.iter().all(is_log_gz) {
            break;
        }
    }
    res.into_iter().map(path2name_size).collect()
}

Rotating logs after that without restarting the application works without issues.

I can not provide a minimal reproduction of the issue right now. Hopefully, you will be able to resolve the issue with provided information.

@emabee
Copy link
Owner

emabee commented Sep 22, 2023

I think you ran through the following within one sec:

Rcurrent.log
——- rotate:
Ts1.log
——- compress:
Ts1.log.gz
——- next log line is written:
Rcurrent.log
Ts1.log.gz
——- rotate again:
Ts1.log
Ts1.log.gz
——- compress - boom
Ts1.log.gz

This wasn‘t possible without the explicitly triggered rotation. Fix should not be difficult.

@newpavlov
Copy link
Author

newpavlov commented Sep 22, 2023

I am not sure I follow. So during the first rotate Ts1 is timestamp of rotation and for the second one timestamp of the first log line? Why such discrepancy?

It does not look like the issue gets triggered for further rotation as long as the app does not get restarted.

I think I will try the TimestampsDirect naming to work around this issue.

@emabee
Copy link
Owner

emabee commented Sep 23, 2023

Ts1 is meant to be in all cases the timestamp of the infix in the filename. It has precision of second, that‘s why the collision can happen if all actions are done in a single second.

Collisions during rotate are detected and escaped with adding restartXXX to the infix.
The same is missing in the compress step.

@newpavlov
Copy link
Author

newpavlov commented Sep 23, 2023

Yes, lack of collision detection during compression is an issue, you probably should use something like NamedTempFile::persist_noclobber. But there is a different issue with timestamps used to name rotatef files. Note that between manual rotations passes more than one second.

Let's say I have app_rCURRENT.log created at 18:25 (i.e. it's creation/birth time, not access or modification time) and nothing more. On 18:30 I launch my app and pass "rotate" command after a bunch of logs were written. Now I have two log files: current and 18-25-xx.log.gz. The current file has creation time 18:30.

After 5 minutes during which a bunch of logs were written into the current file I execute rotation command again. I would expect to get 3 files: current, 18-30-yy.log.gz, and 18-25-xx.log.gz, but I get only two: current and 18-25-xx.log.gz, with data which should've been saved into 18-30-y.log.gz overwriting 18-25-xx.log.gz.

The third manual rotation at 18:40 which contains logs from 18:35 to 18:40 gets rotated with name 18:30-zz.log.gz.

So it looks like the code chooses timestamp for rotated files incorrectly. The first rotation after cold start uses "current" file creation timestamp, while the subsequent ones use incorrectly cached values.

The issue may not be even specific to manual rotation, it's just the easiest way to trigger it.

@newpavlov newpavlov changed the title Manual rotation may overwrite existing log file Rotation may overwrite existing log file Sep 26, 2023
@newpavlov
Copy link
Author

I think the issue is caused by this line. You use creation date of rotated file for naming next rotated file. It's wrong by itself as can be seen in my previous comment (you should use current time instead), but it causes name collision when appending is enabled. During initialization you use creation date of current file, which inevitable becomes creation date of next rotation file since you use fs::rename which IIUC preserves creation date.

emabee added a commit that referenced this issue Sep 27, 2023
- with sub-second rotations
- with cleanup when all logfiles should be compressed
@emabee
Copy link
Owner

emabee commented Sep 27, 2023

Issues are now hopefully all fixed with 0.27.1.

Thanks for reporting and the discussions!

@emabee emabee closed this as completed Sep 27, 2023
@newpavlov
Copy link
Author

newpavlov commented Sep 27, 2023

As per my two previous comments, I don't think this issue is fully fixed. Yes, v0.27.1 no longer overwrites previously rotated log files on compression, but it still uses arguably incorrect timestamps for naming rotated files.

After rotating twice today with existing current file created two days ago I got the following files:

app_r2023-09-25_15-20-38.log.gz	2.46 KiB
app_r2023-09-25_15-20-38.restart-0000.log.gz	344 B

The second file contains records only from 2023-09-27 16:17:42 to 2023-09-27 16:17:48.

@emabee
Copy link
Owner

emabee commented Sep 27, 2023

Sorry for not replying to this part directly. I was distracted by testing and resolving the naming conflicts that also existed.
I Indeed I did not understand your point with the second rotation, your last comment changed that.
It should be solved now with 0.27.2.

@newpavlov
Copy link
Author

Thank you! I confirm that 0.27.2 has fixed the issue.

@emabee emabee added the bug label Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants