Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve SSH command stream channel handling #58

Merged
merged 1 commit into from
Dec 16, 2016

Conversation

rinrinne
Copy link
Contributor

Hi, long time no see :)

I have had a concern about GerritConnection for a long time. Its thread is blocked by readLine(). Due to this, we had to close connection rather than channel. It made a lot of exceptions whenever disconnected.

This might solve issues around SSH connection handling.

SSH command stream is opened on channel. So channel should be
closed at first if disconnecting. But it was impossible because
BufferedReader.readLine() blocks thread.

This patch implements character buffer to use BufferedReader.read()
as non-blocking operation instead readLine().

This patch will decrease SSH related exception in log.
for (int i = 0; i < cb.length(); i++) {
if (cb.charAt(i) == '\n') {
line = getSubSequence(cb, 0, i).toString().trim();
cb.position(i + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you risk landing on a \r here in case the server is Windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, we can ignore the difference of EOL on each OS architecture. Because almost network communication is done by generic rule. e.g. HTTP's basic rule is "EOL is CRLF" defined by RFC2616.
In this case, we can use '\n'(LF) for buffer splitting. After that, '\r'(CR) might be left, but it is removed by String.trim().

I believe this is the same logic as BufferedReader.readLine().
http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/9b8c96f96a0f/src/share/classes/java/io/BufferedReader.java#l313

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm misunderstanding. EOL is either CRLF(Win) or LF(Linux, MacOS10). So buffer splitting is done by LF. CR in duplicated string will be stripped by String.trim(). No operation regarding CR handling on buffer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I seemed to be remembering a time when MacOS or some other had LFCR, but that is probably not the case any more.

@rsandell
Copy link
Collaborator

Glad to see you again! :)

what type of manual tests have you made to verify this?

@rinrinne
Copy link
Contributor Author

I tested with Jenkins + GT and Gerrit.

  • Connect and disconnect from Jenkins UI (several times)
  • Shutdown Gerrit
  • Keep connection (1h)
  • Receive events(100-200)
    Unfortunately I have test environment only, so amount of tests might not be enough...

@rsandell
Copy link
Collaborator

That seems like enough to me :)

@rsandell rsandell merged commit 8e6af6c into sonyxperiadev:master Dec 16, 2016
@hugares
Copy link
Contributor

hugares commented May 5, 2017

We have an issue that might be caused by this change.

Approximately once per day, the connection thread is no longer connected to Gerrit (confirmed from gerrit side that it's not connected) but the connection is not restarted automatically even with the watchdog enabled. A thread dump revealed that connection thread is always at line 420 in GerritConnection ( sleep(SSH_RX_SLEEP_MILLIS);). The only way out is to restart connection.

It looks like the new implementation misses the disconnection and still think it's connected although it's not.

I just reverted to gerrit-events 2.10.0 to test if this change is really the cause.

@rinrinne
Copy link
Contributor Author

rinrinne commented May 5, 2017

Which loop? 406-420 or 377-422? If latter case, INFO message in line 404 would be shown in log repeatedly.

@hugares
Copy link
Contributor

hugares commented May 5, 2017

406-420 and there was nothing in the log (I enabled debug logs)

@rinrinne
Copy link
Contributor Author

rinrinne commented May 5, 2017

Thanks, Maybe InputStream provided by SSH command execution channel have not recognized abnormal channel close as EOF. If so, Reader.read() at line 406 may not return -1.

@rinrinne
Copy link
Contributor Author

rinrinne commented May 8, 2017

@hugares , create new PR #62 based on my assumption. Please check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants