Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incremental mode with ssh does not work until log file rotates #723

Closed
fluca1978 opened this issue Mar 23, 2022 · 8 comments
Closed

incremental mode with ssh does not work until log file rotates #723

fluca1978 opened this issue Mar 23, 2022 · 8 comments

Comments

@fluca1978
Copy link
Contributor

Hi all,
I'm observing a strange, at least to me, problem with pgbadger 11.7 and SSH connection.

$ pgbadger ssh://luca@miguel//var/log/postgresql/* -I -O /var/www/html/pgbadger/miguel/ -X --start-monday -v
DEBUG: Looking for remote filename using command: ssh luca@miguel "ls '/var/log/postgresql/'postgresql-*.log"
DEBUG: pgBadger version 11.7.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Found log offset 252798162 in file /var/www/html/pgbadger/miguel//LAST_PARSED
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Fri.log'" | awk '{print $5}'
DEBUG: Remote file size: 243912552
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Mon.log'" | awk '{print $5}'
DEBUG: Remote file size: 286059956
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Sat.log'" | awk '{print $5}'
DEBUG: Remote file size: 138413444
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Sun.log'" | awk '{print $5}'
DEBUG: Remote file size: 117596970
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Thu.log'" | awk '{print $5}'
DEBUG: Remote file size: 235139815
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Thu_old.log'" | awk '{print $5}'
DEBUG: Remote file size: 152461209
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Tue.log'" | awk '{print $5}'
DEBUG: Remote file size: 294148883
DEBUG: Looking for file size using command: ssh luca@miguel "ls -l '/var/log/postgresql/postgresql-Wed.log'" | awk '{print $5}'
DEBUG: Remote file size: 264510327
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Fri.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Fri.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Fri.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Fri.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Mon.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Mon.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Mon.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Mon.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Sat.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Sat.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Sat.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Sat.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Sun.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Sun.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Sun.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Sun.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Thu.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Thu.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Thu.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Thu.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Thu_old.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Thu_old.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Thu_old.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Thu_old.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Tue.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Tue.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Tue.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Tue.log, there no new lines in this file
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Wed.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql-Wed.log
DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Wed.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql-Wed.log, there no new lines in this file
DEBUG: Starting progressbar writer process
DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED
DEBUG: the log statistics gathering took: 5 wallclock secs ( 0.02 usr  0.02 sys +  0.29 cusr  0.13 csys =  0.46 CPU)
DEBUG: no new entries in your log(s) since last run.
DEBUG: building reports took:  0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
DEBUG: the total execution time took:  5 wallclock secs ( 0.02 usr  0.02 sys +  0.29 cusr  0.13 csys =  0.46 CPU)

if I keep executing the pgbadger no new logs entries are found, while the log file (in this case postgresql-Wed.log) is increasing.
I've already checkd that the ssh luca@miguel "cat '/var/log/postgresql/postgresql-Wed.log'" command is working and is outputting all the entries in the log file, so it seems there is no problem in retrieving the log content.
The LAST_PARSED file is fine and contains:

$ cat /var/www/html/pgbadger/miguel/LAST_PARSED
2022-03-23 17:26:43     252798162       2022-03-23 17:26:43 ...

If I rsync locally the file then pgbadger builds incrementally on the very same file, while via SSH keeps stating that the file has been already parsed.

@fluca1978
Copy link
Contributor Author

After digging, I found that the problem seems to be in the check_file_changed routine at line 3786: it seems the routine inspects the first line of the file, that clearly has a date less than the last saved one, and so it exits, reporting no change at all.
Apparently, this seems to be a problem related to ssh URI, but it seems to work the following not-URI invocation:

$ pgbadger --ssh-user luca -r miguel  /var/log/postgresql/* -I -O /var/www/html/pgbadger/miguel/ -X --start-monday -v

It could be that piping ssh into cat has some differences. What I see, from the debug output, is that when using the ssh URI the command used to retrieve the log is:

DEBUG: Retrieving log entries using command: ssh luca@miguel "cat '/var/log/postgresql/postgresql-Thu.log'" |

while when using -r is

DEBUG: Retrieving log entries using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel " cat  '/var/log/postgresql/postgresql-Thu.log''' |

The only difference, to me, are the connection options that should not cause the problem.

@fluca1978
Copy link
Contributor Author

Apparently the key-changing fact is the -r option, since with such specified it seems to work even if an ssh URI is specified:

$ pgbadger  -r miguel  'ssh://luca@miguel//var/log/postgresql/*' -I -O /var/www/html/pgbadger/miguel/ -X --start-monday -v

@darold
Copy link
Owner

darold commented Mar 25, 2022

Hi Luca,

Thanks for the report, commit 093e233 adds the ssh options to the command. Please let us know if that's solve your issue.

@fluca1978
Copy link
Contributor Author

Uhm, it seems not working to me without the -r option.
I see that the ssh options are now there in the URI only invocation, but the file is not detected as "new", while it is:

% git log -n 3
commit 3b03a6cc529ae8cc9c74c8bf007571f252a2eee0
commit 0ac9817f406361fd2900abe92fdbc9f9f731b470
commit 093e2339513a0c0239af3877cde74993f6c3c8ef

% ./pgbadger 'ssh://luca@miguel//var/log/postgresql/*.log'   -I -O ~/tmp -v 
DEBUG: Looking for remote filename using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "ls '/var/log/postgresql'*.log"
DEBUG: pgBadger version 11.7.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Found log offset 227658987 in file /home/luca/tmp/LAST_PARSED
DEBUG: Looking for file size using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "ls -l '/var/log/postgresql/postgresql.log'" | awk '{print $5}'
DEBUG: Remote file size: 182512807
DEBUG: Retrieving log entries using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "cat '/var/log/postgresql/postgresql.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql.log
DEBUG: Retrieving log entries using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "cat '/var/log/postgresql/postgresql.log'" |
DEBUG: this file has already been parsed: ssh://luca@miguel//var/log/postgresql/postgresql.log, there are no new lines in this file
DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED
DEBUG: the log statistics gathering took: 1 wallclock secs ( 0.01 usr  0.01 sys +  0.05 cusr  0.02 csys =  0.09 CPU)
DEBUG: no new entries in your log(s) since last run.
DEBUG: building reports took:  0 wallclock secs ( 0.00 usr +  0.00 sys =  0.00 CPU)
DEBUG: the total execution time took:  1 wallclock secs ( 0.01 usr  0.01 sys +  0.05 cusr  0.02 csys =  0.09 CPU)

while if, immediatly after, I do the same with -r

% ./pgbadger 'ssh://luca@miguel//var/log/postgresql//*.log'   -I -O ~/tmp -v -r miguel
DEBUG: Looking for remote filename using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "ls '/var/log/postgresql//'*.log"
DEBUG: pgBadger version 11.7.
DEBUG: Output 'html' reports will be written to out.html
DEBUG: Found log offset 227658987 in file /home/luca/tmp/LAST_PARSED
DEBUG: Looking for file size using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "ls -l '/var/log/postgresql/postgresql.log'" | awk '{print $5}'
DEBUG: Remote file size: 182512807
DEBUG: Starting progressbar writer process
DEBUG: Retrieving log sample using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "tail -n 100 '/var/log/postgresql/postgresql.log'" |
DEBUG: Autodetected log format 'default' from ssh://luca@miguel//var/log/postgresql/postgresql.log
DEBUG: pgBadger will use log format default to parse ssh://luca@miguel//var/log/postgresql/postgresql.log.
DEBUG: timezone not specified, using -3600 seconds
DEBUG: Processing remote log file: miguel:ssh://luca@miguel//var/log/postgresql/postgresql.log
DEBUG: Retrieving log entries using command: ssh -o ConnectTimeout=10 -o PreferredAuthentications=hostbased,publickey luca@miguel "cat '/var/log/postgresql/postgresql.log'" |
DEBUG: Starting offset 227658987 is greater than total size 182512807 for file ssh://luca@miguel//var/log/postgresql/postgresql.log
DEBUG: Reverting start offset 227658987 to 0 for file ssh://luca@miguel//var/log/postgresql/postgresql.log, stoppping offset is 182512807
DEBUG: Starting reading file "miguel:ssh://luca@miguel//var/log/postgresql/postgresql.log"...
DEBUG: Start parsing postgresql log at offset 0 of file "ssh://luca@miguel//var/log/postgresql/postgresql.log" to 182512807

DEBUG: Unknown stderr log line format:  ORDER BY 1,2
DEBUG: setting current position in log to 182512807of 182512807 (99.60%), queries: 422990, events: 3s: 72992, events: 2
DEBUG: Saving last parsed line into /tmp/tmp_LAST_PARSED (2022-03-25 19:25:45eri182512807), events: 3

DEBUG: Reading temporary last parsed line from /tmp/tmp_LAST_PARSED
DEBUG: Saving last parsed line into /home/luca/tmp/LAST_PARSED
DEBUG: the log statistics gathering took:79 wallclock secs ( 0.00 usr  0.01 sys + 78.64 cusr  2.78 csys = 81.43 CPU)
DEBUG: Building incremental report for 2022-03-25
LOG: Ok, generating HTML daily report into /home/luca/tmp/2022/03/25/...
LOG: Ok, generating HTML weekly report into /home/luca/tmp/2022/week-13/...
LOG: Ok, generating global index to access incremental reports...
DEBUG: writing global index into /home/luca/tmp/index.html
DEBUG: building reports took:  5 wallclock secs ( 4.67 usr +  0.02 sys =  4.69 CPU)
DEBUG: the total execution time took: 84 wallclock secs ( 4.67 usr  0.03 sys + 78.64 cusr  2.78 csys = 86.12 CPU)

so the r option still triggers something different in the behavior.

@darold
Copy link
Owner

darold commented Mar 26, 2022

Ok, I was able to reproduce, the problem was that I was testing change with a new log file not with new entries in the last log. Commit 6a4750b fixes this issue.

@fluca1978
Copy link
Contributor Author

Great job! It works for me now.

@darold
Copy link
Owner

darold commented Apr 7, 2022

@fluca1978 please give a try to latest commit 44f0be5. I have done a full review of the log file detection so please confirm that is not break your reports. It should not impact remote log file but just to be sure.

@fluca1978
Copy link
Contributor Author

It is working for me. I've tested against v11.8 at 996c199.

I see messages in the debug output that state what is happening, I don't remember having seen such messages before:

DEBUG: This file should be parsed from the beginning: ssh://luca@venkman//var/postgresql//log/postgresql.log, the file is new, the entire file must be parsed
DEBUG: Reverting start offset 9002266 to 0 for file ssh://luca@venkman//var/postgresql//log/postgresql.log, stopping offset is 15627864

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants