Force file.encoding to utf-8 #3068

Closed
ketan opened this issue Jan 4, 2017 · 3 comments

@ketan
Member

ketan commented Jan 4, 2017

Issue Type
  • Feature enhancement
Summary

Currently the file encoding depends on a variety of factors, which causes unexpected behavior in server and database functionality. In addition, different encodings on the server and agent processes mean that console logs are sometimes rendered with unexpected glyphs.
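
To illustrate the "unexpected glyphs" symptom, here is a minimal sketch that writes text in one charset and reads it back in another. The class and method names are made up for this example and are not GoCD code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Encode text with the writer's charset, then decode the same bytes
    // with the reader's charset, as happens when two processes disagree.
    public static String reinterpret(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        // The charset this JVM falls back to when none is passed explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());

        // "caf\u00e9" written as UTF-8 but read as ISO-8859-1: the two-byte
        // sequence for the accented letter turns into two separate glyphs.
        System.out.println(reinterpret("caf\u00e9",
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

Passing the charset explicitly on both sides, rather than relying on `Charset.defaultCharset()`, is what removes the dependency on the environment.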

I would like to set the encoding on the server through the startup scripts, which seems fairly straightforward, with an option to override it if necessary. Changing the agent startup scripts would not necessarily help users who are upgrading, so we may need to add code to the launcher that forces the encoding, again with an option to override it.
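
A rough sketch of what "force the encoding in the launcher, with an override" could look like. `AgentJvmArgs`, `build`, and the `overrideEncoding` parameter are hypothetical names for illustration only. The one real piece is the JVM system property `-Dfile.encoding`.

```java
import java.util.ArrayList;
import java.util.List;

public class AgentJvmArgs {
    static final String DEFAULT_ENCODING = "UTF-8";

    // overrideEncoding is a hypothetical user-supplied setting; null or blank
    // means "use the default". userArgs are JVM args the user already configured.
    public static List<String> build(String overrideEncoding, List<String> userArgs) {
        List<String> args = new ArrayList<>();

        // Respect an explicit -Dfile.encoding the user has already set.
        boolean userSetEncoding = userArgs.stream()
                .anyMatch(a -> a.startsWith("-Dfile.encoding="));

        if (!userSetEncoding) {
            boolean hasOverride = overrideEncoding != null
                    && !overrideEncoding.trim().isEmpty();
            String encoding = hasOverride ? overrideEncoding.trim() : DEFAULT_ENCODING;
            args.add("-Dfile.encoding=" + encoding);
        }

        args.addAll(userArgs);
        return args;
    }
}
```

With no override, `build(null, Arrays.asList("-Xmx256m"))` would put `-Dfile.encoding=UTF-8` in front of the user's arguments. How the override reaches the launcher (environment variable, properties file, etc.) is left open here.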

@arvindsv
Member

arvindsv commented Jan 4, 2017

Yes, we should do this. But we need to test it properly; we have never been able to properly reproduce a different encoding.

@ketan
Member Author

ketan commented Jan 10, 2017

Adding my comment from #3079 (comment)

But, as I said, this assumes all console output to be UTF-8, right?

Correct. I was talking about this with @maheshp yesterday and realized that this does not feel like the right approach to the problem. Here's what we discussed:

GoCD "reads" bytes from various streams whose encoding it does not directly control:

  • The password file, which typically contains usernames with multi-byte characters (server)
  • SCM material process execution (git log, svn log, hg log, etc.) (server). SCM updates on the agent can probably be ignored, since GoCD does not read the output of such commands.
  • Build script execution (agent)

Git, SVN and HG support a --encoding argument to override the commit encoding. However, all of them state that the encoding defaults to UTF-8 and encourage users to use that encoding. Git specifically stores the encoding of each commit object.

For git — see this and this.
For hg - see hg(1) (search for encoding)
For svn - see http://svnbook.red-bean.com/en/1.7/svn.ref.svn.html
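
As a sketch of the SCM case: ask git to re-encode log output to UTF-8 and decode the process stream with that same charset, instead of the JVM default. `ScmLogReader` and its methods are illustrative names, not GoCD's material code. `git log --encoding=<encoding>` is the real git option.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScmLogReader {
    // Command line that asks git to emit commit messages as UTF-8.
    public static List<String> gitLogCommand() {
        return Arrays.asList("git", "log", "--encoding=UTF-8", "-n", "5");
    }

    // Decode a byte stream line by line with an explicit charset.
    public static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Run git in workingDir and read its output as UTF-8.
    public static List<String> run(File workingDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(gitLogCommand())
                .directory(workingDir)
                .redirectErrorStream(true)
                .start();
        List<String> output = readLines(process.getInputStream(), StandardCharsets.UTF_8);
        process.waitFor();
        return output;
    }
}
```

The point of the design is that the requested output encoding and the decoding charset are set in the same place, so they cannot drift apart with the environment.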

As far as reading these console outputs is concerned, I'm not sure what the best way to proceed is. JGit can probably help us read the git log output in a better manner, and we might need to look at other libraries for svn, hg and p4.

As far as console output for builds is concerned, we might need to consider not reading the output as UTF-8. However, that would mean we either figure out the encoding (not sure how) or allow the user to specify it (not sure where: on each agent, or per job). More importantly, we would need to save that encoding on the server along with the console log, so that we can render it in the correct encoding when sending it to the browser (on the job detail page).
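
One possible shape for "save the encoding along with the console log", sketched under the assumption that the agent sends raw bytes plus a charset name. `ConsoleLogChunk` is a hypothetical type, not an existing GoCD class.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleLogChunk {
    private final byte[] rawBytes;     // bytes exactly as the build script wrote them
    private final String charsetName;  // encoding reported by (or configured for) the agent

    public ConsoleLogChunk(byte[] rawBytes, String charsetName) {
        this.rawBytes = rawBytes.clone();
        this.charsetName = charsetName;
    }

    // Decode using the stored encoding rather than the server's default.
    // Charset.forName throws if the name is not supported by this JVM.
    public String text() {
        return new String(rawBytes, Charset.forName(charsetName));
    }

    // Re-encode as UTF-8 for the browser, whatever the source encoding was.
    public byte[] toUtf8ForBrowser() {
        return text().getBytes(StandardCharsets.UTF_8);
    }
}
```

This sketch does not address how the encoding would be discovered, or whether it is set per agent or per job; it only shows that once the charset name travels with the bytes, the server can render the log correctly.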

@ketan
Member Author

ketan commented May 30, 2018

Closed via #4044
