Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Console logs stop updating when non-ascii characters are encountered #3926

Closed
BrewRunner opened this issue Oct 9, 2017 · 24 comments · Fixed by #4044
Closed

Bug: Console logs stop updating when non-ascii characters are encountered #3926

BrewRunner opened this issue Oct 9, 2017 · 24 comments · Fixed by #4044
Milestone

Comments

@BrewRunner
Copy link

BrewRunner commented Oct 9, 2017

Bug Report

Summary

Console logs containing non-ASCII characters are truncated at the point that the non-ASCII characters occur. Only the portion of the console log that is generated prior to the occurrence of the non-ASCII characters is displayed. The log is still generated and view-able once the stage completes via the "raw output" link, however the non-ASCII characters are replaced with this symbol "�".

We discovered this issue when upgrading from version 17.2 to version 17.10 and were forced to back out of the upgrade as a result of the inability to follow the progress of any stages that display non-ASCII characters in the console log. Many of these stages are lengthy, causing the stage to appear stalled since the log is no longer updating and the "raw output" is not available until the stage completes.

The raw output in 17.2 also replaces non-ASCII characters with "�", but does process and display them correctly in the scrolling log.

The specific characters in our logs that may be contributing to the problem are: æ, ä, ö, ê, à, ü, ö, ß, ó, í, ç, á,

Environment

GoCD 17.10

  • non-ASCII characters cause console log to stop rendering "real time"
  • non-ASCII characters are replaced by "�" in the "raw output"

GoCD 17.2

  • non-ASCII characters are replaced by "�" in the "raw output"

Windows Server 2008 R2
Number of pipelines 343
Number of agents 24

AND

Windows 7
Number of pipelines 195
Number of agents 12

Firefox Version 56.0
Google Chrome Version 61.0.3163.100
Microsoft Edge 38.14393.1066.0

Steps to Reproduce

  1. Generate an output containing multiple instances of non-ASCII characters
  2. View the console log real time
  3. View the raw output

Expected Results

Scrolling console log should continue to scroll and correctly display all characters. Raw output should correctly display all characters.

Actual Results

Scrolling console log stops scrolling at the point at which the non-ASCII characters are being processed. Raw output replaces non-ASCII characters with "�".

Possible Fix

Render the characters in the scrolling console log in the same manner that they were rendered in 17.2 and below (I'm not sure exactly where the change took place that caused the problem between versions 17.2 and 17.10).

Log snippets

Log is generated normally until this point: (captured from the raw output since the scrolling log truncates prior to displaying this info). This is just a small portion of the log - there are several occurrences of non-ASCII characters in this log.

image

@BrewRunner BrewRunner changed the title Console logs containing non-ascii characters are truncated Bug: Console logs stop updating when non-ascii characters are encountered Oct 11, 2017
@ankitsri11
Copy link
Contributor

Hi @BrewRunner - I was able to replicate the issue on windows server 16. However, it is not reproducible on MacOS.

One thing which I noticed is that file.encoding value is different in both the case. I suspect this might be causing the problem.

For windows - "-Dfile.encoding\u003dcp1252"
MacOS - "file.encoding": "UTF-8"

Can you check your http://go_server_url/go/api/support and confirm value for encoding.

@BrewRunner
Copy link
Author

I do see that same value on both my production system (running Windows Server 2008 and GoCD 17.2) and on my test system (running Windows 7 and GoCD 17.10):

  "-Dfile.encoding\u003dCp1252",

@BrewRunner
Copy link
Author

Specifying UTF-8 encoding does not change the behavior on my Windows machines. I started the service with the following in wrapper-server.conf: wrapper.java.additional.16=-Dfile.encoding=UTF-8 and verified in api/support that it was started with encoding set to UTF-8 and the problem persisted.

I then forced all encoding values to UTF-8, again with no change in behavior:
wrapper.java.additional.16=-Dfile.encoding=UTF-8
wrapper.java.additional.17=-Dsun.jnu.encoding=UTF-8
wrapper.java.additional.18=-Dsun.io.unicode.encoding=UTF-8

It appears that the problem lies elsewhere.

@ankitsri11
Copy link
Contributor

Upgraded to the latest 17.11 and can see below error logged:

2017-10-30 18:49:16,351 ERROR [qtp1405336328-32] ConsoleLogSocket:79 - Session[/0:0:0:0:0:0:0:1:51354:JobIdentifier[testing_3930, 5, 5, defaultStage, 1, defaultJob, 8]:1617202632] closing session because an error was thrown
java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
	at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
	at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
	at com.thoughtworks.go.domain.ConsoleStreamer.stream(ConsoleStreamer.java:59)
	at com.thoughtworks.go.server.websocket.ConsoleLogSender.sendLogs(ConsoleLogSender.java:145)
	at com.thoughtworks.go.server.websocket.ConsoleLogSender.process(ConsoleLogSender.java:88)
	at com.thoughtworks.go.server.websocket.ConsoleLogSocket.onConnect(ConsoleLogSocket.java:66)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.eclipse.jetty.websocket.common.events.annotated.CallableMethod.call(CallableMethod.java:70)
	at org.eclipse.jetty.websocket.common.events.JettyAnnotatedEventDriver.onConnect(JettyAnnotatedEventDriver.java:148)
	at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.openSession(AbstractEventDriver.java:227)
	at org.eclipse.jetty.websocket.common.WebSocketSession.open(WebSocketSession.java:421)
	at org.eclipse.jetty.websocket.server.WebSocketServerConnection.onOpen(WebSocketServerConnection.java:72)
	at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:185)
	at org.eclipse.jetty.server.HttpConnection.completed(HttpConnection.java:345)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:436)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
	at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:161)
	at java.io.BufferedReader.readLine(BufferedReader.java:324)
	at java.io.BufferedReader.readLine(BufferedReader.java:389)
	at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
	... 23 common frames omitted

@ankitsri11
Copy link
Contributor

I tried on other versions of GoCD (tried with the zip installer on windows box) and could see below behaviour -

17.2 - 17.4 - console log is getting populated without any issue.
17.5 - 17.7 - console log is getting populated but all the special characters are replaced with "�".
17.8 - 17.11 - console log doesn't get populated once the job is finished and displays "waiting for console log..." message. Console is only visible via raw output link or from artifact tab.

@BrewRunner - If you wish to upgrade, you can try upgrading to 17.7 until we fix this issue.

@BrewRunner
Copy link
Author

Any chance of this being addressed in an upcoming release?

@arvindsv
Copy link
Member

Yes, I think we should.

I believe @maheshp looked at this - and has some findings. Right, @maheshp? Or, am I confusing it with something else?

@maheshp
Copy link
Contributor

maheshp commented Nov 28, 2017

@arvindsv I was actually looking at #3930 . Not sure if the changes for it would fix this issue. This issue needs to be looked at though.

@BrewRunner we are close to making a release, it be tough to get it in for this release.

@arvindsv arvindsv added this to the Release 17.13 milestone Nov 28, 2017
@arvindsv
Copy link
Member

Ok, let's consider it for the next one then. I'll move it there.

ketan added a commit to ketan/gocd that referenced this issue Nov 30, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 1, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 4, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 4, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 4, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
@ketan
Copy link
Member

ketan commented Dec 5, 2017

@ankitsri11 and @BrewRunner — I think I have a fix for this issue. While this fix has missed the train for the release due tomorrow, this can be merged in time for 18.1.

Can you download the server and agent from #4044 (look for the installers-PR/dist pipeline which will contain the installer)

Let me know if it fixes the problem.

@ketan ketan modified the milestones: Release 17.13, Release 18.1 Dec 5, 2017
@BrewRunner
Copy link
Author

I am attempting to test this, but a different issue seems to have been introduced that prevents my builds from getting far enough to test the fix.

The GO_P4_CLIENT_[material name] environment variable now contains the server name instead of the agent name. Historically, the format for the variable has been cruise-[agent hostname]-[pipelinename]-[random hash]. Now the format is cruise-[server hostname]-[pipelinename]-[random hash]. The historic format containing the agent name is still used when syncing the material from Perforce, but the variable now instead contains the server name, which causes the perforce client to not be found when it is referenced post sync.

@jyotisingh
Copy link
Contributor

@arvindsv - Logged a separate issue for the issue (#4058) with build-compose workflow that you pointed out here

ketan added a commit to ketan/gocd that referenced this issue Dec 7, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 10, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 11, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
akshaydewan pushed a commit to akshaydewan/gocd that referenced this issue Dec 15, 2017
…side as it contains the hostname of the agent which would be used to access the repo when the agent checks out the material. This got broken when we moved the setting of all Go generated environment variables over to the server side - gocd#3926

* Also fixed a bug where the GO_P4_CLIENT environment variable would mismatch the actual client being used by the agent when a destination directory was specified for the p4 material
ketan added a commit to ketan/gocd that referenced this issue Dec 21, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Dec 21, 2017
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
@rajiesh rajiesh modified the milestones: Release 18.1, Release 18.2 Jan 24, 2018
@BrewRunner
Copy link
Author

I noticed that this did fix did not make it into 18.1. Is there a build available containing this fix that I can evaluate?

@ketan
Copy link
Member

ketan commented Feb 6, 2018

@BrewRunner — sorry for not providing an update earlier. This one is scheduled to be fixed in 18.2, I'll let you know once a build is available that you can test on.

Thank you for volunteering to evaluate.

ketan added a commit to ketan/gocd that referenced this issue Feb 6, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Feb 6, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Feb 6, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Feb 6, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
ketan added a commit to ketan/gocd that referenced this issue Feb 6, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
@ketan
Copy link
Member

ketan commented Feb 6, 2018

@BrewRunner can you try with a build at https://build.gocd.org/go/tab/build/detail/installers-PR/894/dist/1/dist#tab-artifacts? (Click on the guest login button on the login screen when prompted).

This is an experimental build, so we don't recommend using this on a production server. If you can confirm that this works fine for you, I'll merge that code into the mainline for 18.2.

@BrewRunner
Copy link
Author

The console log did not stop updating using this experimental build. The non-ascii characters were replaced with "�", just as they previously have been in the raw output. Is this is the expected behavior for the change that was made?

@ketan
Copy link
Member

ketan commented Feb 6, 2018

The console log did not stop updating using this experimental build.

Thank you for confirming that.

The non-ascii characters were replaced with "�", just as they previously have been in the raw output. Is this is the expected behavior for the change that was made?

Can you confirm if the font you see supports unicode characters? I'm testing this on chrome (running on osx) & safari (running on osx) and IE 11 and IE Edge and things render fine for me.

Screenshots for IE11 and edge:

screen shot 2018-02-07 at 4 21 11 am

screen shot 2018-02-07 at 4 22 43 am

When you select one of the log lines in chrome dev tools, what value do you see for the font, can you also confirm what browser and OS you're on?

screen shot 2018-02-07 at 4 26 52 am

ketan added a commit to ketan/gocd that referenced this issue Feb 7, 2018
As part of commit 7fc200b, the
`Content-Type` header was set to `text/plain; charset=utf-8`. However
the `HttpServletResponse`'s `getWriter()` was returning a writer bound
to the system default character encoding. We now force the encoding of
the writer to `utf-8`.

Fixes gocd#3926, gocd#3442
@ketan ketan reopened this Feb 7, 2018
@ketan
Copy link
Member

ketan commented Feb 7, 2018

A fix has been merged, but reopening for some thorough testing

@BrewRunner
Copy link
Author

BrewRunner commented Feb 7, 2018

I'm using Windows 10 Pro, normally with Firefox Version 58.0.1, but the same behavior is evident using Microsoft Edge 38.14393.1066.0 and Chrome Version 64.0.3282.140.

It looks like Chrome is rendering Consolas for me in both Go Version: 18.2.0 (6025-6c8a9c8407604546b70a8f3e1269ff8a0c2892e4), which is not displaying properly, and Go Server Version: 17.2.0 (4587-442a7d67087844e44a2dcf5824164e1dfe00cc45), which is displaying properly.

18.2:
18 2

17.2:
17 2

@ketan
Copy link
Member

ketan commented Feb 7, 2018

@naveenbhaskar (our in-house expert on all things UX) has been tasked with identifying a font-face that supports the most unicode range and something we can distribute with gocd, this will avoid having to fallback fonts provided by the OS and allow us the flexibility to update the fonts at a cadence faster than the release cadence of most OSes.

@ketan
Copy link
Member

ketan commented Mar 6, 2018

I'm going to close this issue as fixed, since we are consistently handling charset encoding across the board. This is currently set to utf-8 and is configurable.

I'm opening #4487 to track the specific issue about packaging a monospace font that works across the board for multiple languages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants