Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log Output gone mad in 2.5.2 #1368

Closed
richiereynolds opened this issue Jul 31, 2015 · 30 comments
Closed

Log Output gone mad in 2.5.2 #1368

richiereynolds opened this issue Jul 31, 2015 · 30 comments
Labels
Milestone

Comments

@richiereynolds
Copy link

Hi, Just upgraded to 2.5.2 from 2.4.2 and some of the Log Output now have newlines after virtually every character.
E.g. where I used to have:

20:19:29 LOG_TRACE : "Tue Jul 21 20:19:29.521 2015" [App:App_Initialise]
20:19:29 Process id: 39913
etc.

I now have:

15:56:10 LO.LO1.tpdev-perf1l.tpsecm 1. Starts the server if currently CLOSED. Starting server
15:56:10
15:56:10 L
15:56:10 O
15:56:10 G
15:56:10 _
15:56:10 T
15:56:10 RA
etc.

I think it's not the display as the new version displays the old version's log files correctly so it must be how the server is storing the logs. Logs form the rundeck server itself are ok too so I suspect it's something to do with capturing remote output.

@richiereynolds
Copy link
Author

Yes, it's the log files that now have all the extra lines:
^text/x-rundeck-log-v2.0^
^2015-07-31T14:56:08Z|stepbegin||{node=localhost|step=1|stepctx=1|user=Richard.Reynolds}|^
^2015-07-31T14:56:08Z|nodebegin||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|Starting server^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|L^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|O^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|G^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|_^
^2015-07-31T14:56:10Z|||{node=LO.LO1.tpdev-perf1l.tpsecm|step=1|stepctx=1|user=tpsecm}|T^

This makes the logs unreadable. I'm running on linux RHEL 6.5 and the servers I'm installing on are the same.

@gschueler
Copy link
Member

very odd. does it happen for all executions?

@richiereynolds
Copy link
Author

Yes, but it seems to be certain steps, I'm wondering if it's triggered by a certain string? it seems to be when capturing the output of my StartServer and CloseServer scripts (which haven't changed since RD 2.4.2 which was fine with them)
These are /bin/csh scripts. bash scripts don't seem to be having the same problem. Some shell interaction?

@gschueler
Copy link
Member

any way to isolate a test case which displays the behavior? eg. a standalone csh script which does the same thing?

@richiereynolds
Copy link
Author

Yep, it seems that rundeck interprets every "write" as a newline rather than waiting for newline chars. The fact that the output is not deterministic (don't always get a single char on a line) seems to suggest there's timing involved in there too, is it just polling and always putting a newline on the end of whatever it finds?

#!/bin/csh
# \
cd `dirname $0`
# \
exec tclsh `basename $0` $*

puts -nonewline "1"
flush stdout
puts -nonewline "2"
flush stdout
puts -nonewline "3"
flush stdout
puts -nonewline "a b c"
puts "done"

exit 0

give the opuput:

12:11:57 tpaqu_dev_live_1l 1
12:11:57 2
12:11:57 3
12:11:57 a b cdone

If I comment out the flushes I get:
12:11:43 123a b cdone

Any ideas on fixing this? It's a showstopper for us, logs need to be reliable and I really don't want to downgrade again.

@SentinelDoorbell
Copy link

+1

@dmalnati
Copy link

dmalnati commented Aug 4, 2015

same

@puremourning
Copy link
Contributor

same here (though i suspect at least 3 of us work for the same company...)

@unluddite
Copy link

ditto

@ocrespo
Copy link

ocrespo commented Aug 4, 2015

Same here

@dave-knight
Copy link

+1

2 similar comments
@ckyma
Copy link

ckyma commented Aug 5, 2015

+1

@mambu
Copy link

mambu commented Aug 5, 2015

+1

@richiereynolds
Copy link
Author

Even knowing which release this was introduced would be good, maybe I don't have to downgrade all the way back to 2.4.2 and lose other fixes?

@puremourning
Copy link
Contributor

@richiereynolds, I suspect it was this change : 8777cf5

Just a hunch, though

@puremourning
Copy link
Contributor

Hmmm maybe not.

@gschueler
Copy link
Member

@puremourning it might have been. that change was meant to solve the problem that if the remote execution produced only one line of output, but had no newline char at the end, the output would not be shown in the log, so additional calls to .flush() were made.

@richiereynolds if you try 2.5.1 and you don't see the same problem, it likely was that change

@richiereynolds
Copy link
Author

@puremourning I think you're right. It does seem to have appeared in 2.5.2. I compared by running the 2.5.1 vs 2.5.2 launcher app.

I used an inline script to call the script in comment 6 on a node.

Initially I couldn't recreate using the launcher on 2.5.2 but I added a couple of 2 second pauses after the flushes and the problem was visible i.e. it's definitely timing related. Basically it looks like how the logs appear from remote nodes is now non deterministic. If you do get sensible logs back it's only by chance i.e you're outputting a line at a time, maybe with some time between each output.

So now the question is, if I use rpm to downgrade to 2.5.1, what's the impact? Will my database and jobs still work? How screwed am I?

@richiereynolds
Copy link
Author

Here's the script I used to see the difference:

#/bin/csh
# \
cd `dirname $0`
# \
exec tclsh `basename $0` $*
puts -nonewline "This"
flush stdout
after 2000
puts -nonewline " should"
flush stdout
after 2000
puts -nonewline " all be on"
flush stdout
after 2000
puts -nonewline " the same"
puts " line."
puts "And this should be on another line."
puts -nonewline "And this should have no newline at the end"
exit 0

@gschueler
Copy link
Member

@richiereynolds you should be able to downgrade to 2.5.1 without issue, other than reintroducing the list of fixed bugs: https://github.com/rundeck/rundeck/blob/v2.5.2/RELEASE.md

as a workaround, is it possible to remove/reduce the "flush" statements from your script?

@richiereynolds
Copy link
Author

No, afraid it's from another department and used company wide and globally so changing it would be a big deal.
Similarly, if we use other 3rd party software we'd have to expect that lines appear on the RD logs as they come out of the software so I think downgrading is the only option.

Downgrading worked by the way, had no problems with that.

@richiereynolds
Copy link
Author

Can we get this tagged as a bug? There are a couple of fixes in 2.5.2 I was waiting for but this one is a showstopper for me for upgrades beyond 2.5.1.

@gschueler
Copy link
Member

@richiereynolds definitely a bug somewhere, but i haven't had a chance to dig into it

@puremourning
Copy link
Contributor

@gschueler @richiereynolds FYI I have built a version of rundeck equivalent to v2.5.2 but with the suspected change removed and the problem has gone away.

@gschueler
Copy link
Member

@puremourning thanks for verifying. any chance you can add a testcase ? 😁

@bon77
Copy link

bon77 commented Nov 6, 2015

Same problem here. Has this been fixed in a later version?

@jamieps
Copy link
Contributor

jamieps commented Nov 10, 2015

I've retested this on the latest version, 2.6.1, and it's still a problem.

@richiereynolds
Copy link
Author

Can this get tagged as a bug? It's a total show stopper for us upgrading beyond 2.5.1 and there are some other fixes we'd love to get.

@ahonor ahonor added the bug label Nov 11, 2015
@ljohnston
Copy link

+1

This is a huge problem. Any schedule for getting this addressed? I'd be interested in knowing how long we'll need to stay on 2.5.1.

@gschueler gschueler added this to the 2.6.3 milestone Feb 3, 2016
@gschueler gschueler added the p1 label Feb 11, 2016
@gschueler
Copy link
Member

hi all, I am working on a fix for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests