Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-48761] - Restore binary compatibility with agents running old Remoting versions #3212

Merged

Conversation

3 participants
@oleg-nenashev
Copy link
Member

oleg-nenashev commented Jan 2, 2018

See JENKINS-48761.

This change reverts new API usages in #3145 and some other places. I made a mistake with this APIs, and my action item is to create a unit test which verifies that the master is at least able to establish connection with an agent with old Remoting version.

Proposed changelog entries

  • Major bug: Restore binary compatibility of Jenkins core logic with old Remoting versions to allow connecting old agents (regression in 2.98)
  • ...

Submitter checklist

  • JIRA issue is well described
  • Changelog entry appropriate for the audience affected by the change (users or developer, depending on the change). Examples
    * Use the Internal: prefix if the change has no user-visible impact (API, test frameworks, etc.)
  • Appropriate autotests or explanation to why this change has no tests
  • For dependency updates: links to external changelogs and, if possible, full diffs

Desired reviewers

@reviewbybees @jglick

protected static Channel _getOpenChannelOrFail() throws ChannelClosedException {
final Channel ch = _getChannelOrFail();
if (ch.isClosingOrClosed()) { // TODO: Since Remoting 2.33, we still need to explicitly declare minimal Remoting version
throw new ChannelClosedException("The associated channel " + ch + " is closing down or has closed down", ch.getCloseRequestCause());

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 2, 2018

Author Member

This one is 3.11, will need an older API as well

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 2, 2018

The current code will run on Remoting agents 3.0+ at least, likely on even older ones

@jglick
Copy link
Member

jglick left a comment

OK as a hotfix.

The broader issue is that we have no static check for the case that a core or plugin remote Callable is using an API member introduced in a newer version of Remoting than the agent uses. Not sure what to do about that.

@@ -1308,7 +1308,11 @@ public int join() throws InterruptedException, IOException {
Channel taskChannel = null;
try {
// Sync IO will fail automatically if the channel is being closed, no need to use getOpenChannelOrFail()
taskChannel = Channel.currentOrFail();
// TODOL Replace by Channel#currentOrFail() when Remoting version allows

This comment has been minimized.

Copy link
@jglick

jglick Jan 2, 2018

Member

typo

@Override
public void checkRoles(RoleChecker checker) throws SecurityException {
checker.check(this,Roles.SLAVE);
}

private static final long serialVersionUID = 1L;
//TODO: replace by Callable#getChannelOrFail() once Minimal supported Remoting version is 3.15 or above
@Restricted(NoExternalUse.class)

This comment has been minimized.

Copy link
@jglick

jglick Jan 2, 2018

Member

I think you could do something like

@Override
public Channel getChannelOrFail() throws ChannelStateException {
    // copy implementation from Callable
}

which would avoid polluting the API with even a restricted function, while still fixing the problem.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

I was not sure it's 100% safe for defaults and decided not to try due to the urgency. Can try it now since I have a test which should reveal the problem if any

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

Not at all urgent, just an idea which might simplify the patch.

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 2, 2018

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 2, 2018

@jenkinsci/code-reviewers does it deserve an out-of-order release? I'd guess so, but maybe it makes sense to take another day for test automation, etc.

@jglick

This comment has been minimized.

Copy link
Member

jglick commented Jan 2, 2018

playing with createOnlineSlave() for old Remoting versions

Would be tricky since you remoting.jar is picked up from the Maven test classpath, which cannot contain multiple versions. Better would be to use docker-fixtures from jenkinsci/jenkins/test/, which would allow us to connect to an actual agent running in another environment and using a pinned version of Remoting. (For that matter, did none of the existing tests using Dockerized SSH agents in acceptance-test-harness catch this problem?)

does it deserve an out-of-order release?

I am not sure. It only affects people running old agent versions. If they are going to bother updating core to get an out-of-order release, maybe they could just as easily update their agent. No strong opinion.

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

For that matter, did none of the existing tests using Dockerized SSH agents in acceptance-test-harness catch this problem?

Let's see. But we do not run ATH before weeklies now

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

@jglick After some consideration I decided to write a test without Docker containers. It required doing some foundation work for JENKINS-48766, but I think we need to do it anyway.

}

@Test
@Issue("JENKINS-48761")

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

This test fails on Jenkins 2.99/98. ChannelClosedException API compatibility is not tested though

@oleg-nenashev oleg-nenashev requested a review from jglick Jan 3, 2018

pom.xml Outdated
@@ -104,6 +104,10 @@ THE SOFTWARE.

<maven-war-plugin.version>3.0.0</maven-war-plugin.version> <!-- JENKINS-47127 bump when 3.2.0 is out. Cf. MWAR-407 -->

<!-- Minimal Remoting version, which is tested for API compatibility -->

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

Just picked first version with excessive stability improvements and available changelog: https://github.com/jenkinsci/remoting/blob/master/CHANGELOG.md#260 . This version has been released on June 10, 2016 => old enough.

IMHO the version should be bumped to 3.0 once https://issues.jenkins-ci.org/browse/JENKINS-48766 is implemented. It would unblock #3188 from @Vlatombe

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

@reviewbybees Needs re-reviews (mostly tests and Restricted logic in the core for enabling such kind of tests)

@@ -0,0 +1,6 @@
# Remoting version, which is embedded into the core
# This version MAY differ from what is really classloaded (see the "Pluggable Core Components", JENKINS-41196)
remoting.embedded.version=${remoting.version}

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

Or this could be computed from remoting*.jar!/META-INF/MANIFEST.MF#Version using Which.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

I would prefer to keep it as is.

  • At some point we may want to write a "Remoting Compatibility Tester" tool or so
  • Not sure how which is going to behave once we have pluggable core components in place
remoting.embedded.version=${remoting.version}

# Minimal Remoting version on external agents which is supported by the core
remoting.minimal.supported.version=${remoting.minimal.supported.version}

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

s/minimal/minimum/g


// TODO: Make the API public (JENKINS-48766)
/**
* Provides information about Remoting versions used withing the core.

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

typo

/**
* Provides information about Remoting versions used withing the core.
* @author Oleg Nenashev
* @since TODO

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

Restricted, so delete.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

There is a TODO for making it public above

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

So if and when that happens, you will add a @since with that version, as with adding any new API. We should only have @since TODO comments in the code base for those things which we have forgotten to fill in after merging.

static {
Properties props = new Properties();
try (InputStream is = RemotingVersionInfo.class.getResourceAsStream(RESOURCE_NAME)) {
if(is!=null) {

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

No need to check for null. If the file is missing, core is corrupt, so die now.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

The implementation is similar to jenkins.model.Jenkins's appoarch to reading jenkins-version.properties. I agree this should not be a problem, but there may be edge cases (e.g. custom packaging). I see no problem with having a fault-tolerant code since it is a regression fix

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

In the case of custom packaging, it is the responsibility of the packager to not make mistakes. If they do, their WAR will be broken. We should not engineer in robustness for something that is not a user error and is trivially detected.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

Hmm, another potential use-case is calling this class from the agent. I doubt we would be dare enough to send Jenkins core over Remoting. Do we do that in the case of such requests? I hope not

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

Let's keep it as is. I will improve checks when exposing it to public API

try {
return new VersionNumber(prop);
} catch (RuntimeException ex) {
LOGGER.log(Level.WARNING, String.format("Failed to parse version for for property %s in %s. Raw Value: %s",

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

Ditto. We are in full control of this file, so there is no reason to tolerate any errors.

}
}

@CheckForNull

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

@Nonnull since we know these will be defined.

@Test
public void shouldLoadEmbeddedVersionByDefault() {
assertThat("Remoting Embedded version is not defined",
RemotingVersionInfo.getEmbeddedVersion(), notNullValue());

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

Unnecessary if they are just @Nonnull to begin with, since other tests would die if someone screwed this up.

@Before
public void extractAgent() throws Exception {
agentJar = new File(tmpDir.getRoot(), "old-agent.jar");
FileUtils.copyURLToFile(getClass().getResource("/old-remoting/remoting-minimal-supported.jar"), agentJar);

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

BTW as a stylistic matter it is safest to use OldRemotingAgentTest.class in place of getClass(). (Unless the test suite is final in which case they are equivalent.)

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

yup

try {
monitorMethod = AbstractAsyncNodeMonitorDescriptor.class.getDeclaredMethod("monitor", Computer.class);
} catch (NoSuchMethodException ex) {
System.out.println("Cannot invoke monitor " + monitor + ", no monitor(Computer.class) method in the Descriptor. It will be ignored. " + ex.getMessage());

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

This is test code. Just throw it up!

Or, better, make that method @Restricted(NoExternalUse.class) public.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

Right, it should never happen

@jglick jglick dismissed their stale review Jan 3, 2018

stale

@oleg-nenashev oleg-nenashev requested a review from jglick Jan 3, 2018

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

@jglick addressed some comments. One biggest open question is RemotingVersionInfo. You propose to make the checks stricter and to throw exceptions on initialization errors. I kinda agree with that, but I propose to postpone it and to handle it in https://issues.jenkins-ci.org/browse/JENKINS-48766 while making the API public.

Now it's used only for tests, and I would prefer to keep it permissive in order to avoid new regressions.

@jglick

jglick approved these changes Jan 3, 2018

@@ -32,7 +32,6 @@
import hudson.slaves.Cloud;
import jenkins.util.SystemProperties;
import hudson.Util;
import hudson.cli.declarative.CLIMethod;

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

unrelated but OK

<version>${remoting.minimum.supported.version}</version>
<type>jar</type>
<outputDirectory>${project.build.outputDirectory}/old-remoting</outputDirectory>
<destFileName>remoting-minimal-supported.jar</destFileName>

This comment has been minimized.

Copy link
@jglick

jglick Jan 3, 2018

Member

BTW may want to s/minimal/minimum/ here too for consistency.

This comment has been minimized.

Copy link
@oleg-nenashev

oleg-nenashev Jan 3, 2018

Author Member

Will address it in a follow-up

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

@reviewbybees done
@jenkinsci/code-reviewers I kindly ask for approval to got forward with merge and out-of-order release. Community stats are bad due to that regression:

screen shot 2018-01-03 at 19 24 19

@abayer

abayer approved these changes Jan 3, 2018

@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

Merging it, this change will unlikely make the situation worse

@oleg-nenashev oleg-nenashev merged commit 8eb4254 into jenkinsci:master Jan 3, 2018

1 check passed

continuous-integration/jenkins/pr-head This commit looks good
Details
@oleg-nenashev

This comment has been minimized.

Copy link
Member Author

oleg-nenashev commented Jan 3, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.