Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-52165] Calling Controller.watch API #63

Merged
merged 31 commits into from Sep 13, 2018

Conversation

Projects
None yet
5 participants
@jglick
Copy link
Member

jglick commented Feb 9, 2018

JENKINS-52165 Split out of #21. Downstream of jenkinsci/durable-task-plugin#60. Subsumes #73. Should improve performance by switching durable task logging and exit status checking from being pull (initiated by master at intervals) to push (initiated by agent at close intervals).

@reviewbybees

@jglick jglick requested review from abayer , rsandell , oleg-nenashev and svanoort Feb 9, 2018

@jglick jglick changed the title [JENKINS-38381] Calling Controller.watch API [JENKINS-38381] [JENKINS-37575] Calling Controller.watch API Feb 9, 2018

@jglick

This comment has been minimized.

Copy link
Member Author

jglick commented Feb 9, 2018

By the way jenkinsci/support-core-plugin#128 should make it easier to see the impact of this on actual Remoting usage. Have not yet tried it myself.

@oleg-nenashev
Copy link
Member

oleg-nenashev left a comment

Merge conflict, but the code itself LGTM

@oleg-nenashev
Copy link
Member

oleg-nenashev left a comment

merge conflict

@oleg-nenashev

This comment has been minimized.

Copy link
Member

oleg-nenashev commented Apr 2, 2018

Approved once the upstream PRs are integrated && the conflict is fixed

@vivek

This comment has been minimized.

Copy link
Contributor

vivek commented Apr 3, 2018

This looks good to go if merge conflicts fixed? ping @jglick.

@svanoort

This comment has been minimized.

Copy link
Member

svanoort commented Apr 3, 2018

@vivek The upstream isn't really ready-to-go yet, and the whole thing needs some sort of proper and thorough testing in a simulated environment. Oleg's review found some issues already.

We could of course YOLO-merge all this but I make no promises on the fallout if so.

@jglick

This comment has been minimized.

Copy link
Member Author

jglick commented Apr 3, 2018

Well, the upstream “issues” are just things like Javadoc that should be improved, not user-visible problems, but yes this all deserves testing in a realistic environment. Once @svanoort is ready to start that I will polish up all the PRs.

jglick added some commits Jun 8, 2018

@jglick jglick referenced this pull request Jun 11, 2018

Merged

[JEP-210] Integration tests with DurableTaskStep #21

7 of 7 tasks complete

@jglick jglick changed the title [JENKINS-38381] [JENKINS-37575] Calling Controller.watch API [JENKINS-38381] Calling Controller.watch API Jun 25, 2018


private static final long MIN_RECURRENCE_PERIOD = 250; // ¼s
private static final long MAX_RECURRENCE_PERIOD = 15000; // 15s
private static final float RECURRENCE_PERIOD_BACKOFF = 1.2f;
private static final long WATCHING_RECURRENCE_PERIOD = Main.isUnitTest ? /* 5s */5_000: /* 5m */300_000;

This comment has been minimized.

@jglick

jglick Jun 25, 2018

Author Member

Should probably be switched to nonfinal constant initialized via system property and overridden in the relevant test only.

this.remote = ws.getRemote();
setupTimer();
try {

This comment has been minimized.

@jglick

jglick Jun 25, 2018

Author Member

Should introduce a system property to act as a kill switch and fall back to polling.

@jglick jglick changed the title [JENKINS-38381] Calling Controller.watch API [JENKINS-52165] Calling Controller.watch API Jun 25, 2018

@jglick jglick requested a review from dwnusbaum Jun 25, 2018

jglick added some commits Aug 10, 2018

After discussion with @dwnusbaum, attempting to proceed after errors …
…which do not seem to have closed or broken the channel.

@jglick jglick requested review from svanoort and dwnusbaum Aug 10, 2018

/** If set to false, disables {@link Execution#watching} mode. */
@SuppressWarnings("FieldMayBeFinal")
private static boolean USE_WATCHING = !"false".equals(System.getProperty(DurableTaskStep.class.getName() + ".USE_WATCHING"));

/**
* Represents one task that is believed to still be running.

This comment has been minimized.

@jglick

jglick Aug 11, 2018

Author Member

I will flesh this out with an explanation of the expected state changes.

This comment has been minimized.

@jglick

jglick Aug 13, 2018

Author Member

(done)

os = ps;
}
try {
IOUtils.copy(stream, os);

This comment has been minimized.

@dwnusbaum

dwnusbaum Aug 13, 2018

Member

It looks like all PrintStream#write overloads synchronize on this and null-check out before writing (since it is nulled out once the stream is closed). Do we need to do either of those here as well? This usage should always be from the same thread IIUC, but do other threads have access to listener.getLogger()?

This comment has been minimized.

@jglick

jglick Aug 14, 2018

Author Member

Agreed, that sounds safer.

if (ps.getClass() == PrintStream.class) {
// Try to extract the underlying stream, since swallowing exceptions is undesirable and PrintStream.checkError is useless.
OutputStream os = (OutputStream) printStreamDelegate.get(ps);
synchronized (os) { // like PrintStream.write overloads do

This comment has been minimized.

@dwnusbaum

dwnusbaum Aug 14, 2018

Member

Did you mean synchronized (ps)?

This comment has been minimized.

@jglick

jglick Aug 14, 2018

Author Member

Indeed that seems to be the lock, will change.

if (ps.getClass() == PrintStream.class) {
// Try to extract the underlying stream, since swallowing exceptions is undesirable and PrintStream.checkError is useless.
synchronized (ps) { // like PrintStream.write overloads do
IOUtils.copy(stream, (OutputStream) printStreamDelegate.get(ps));

This comment has been minimized.

@dwnusbaum

dwnusbaum Aug 14, 2018

Member

Is it possible for the PrintStream to be closed? If so, we'd get an NPE because printStreamDelegate.get(ps) will be null.

This comment has been minimized.

@jglick

jglick Aug 14, 2018

Author Member

It should not be closed. If it is, we expect this method to throw an exception. I can make it throw a specific exception for better diagnostics.

This comment has been minimized.

@dwnusbaum

dwnusbaum Aug 14, 2018

Member

Ok, I guess it would be clear to anyone investigating where the NPE came from anyway, so it probably doesn't need a specific exception, but 3f8ced4 won't hurt.


// ditto
@Override public void problem(Exception x) {
Functions.printStackTrace(x, listener().getLogger());

This comment has been minimized.

@jglick

jglick Aug 17, 2018

Author Member

Seems to work. Due to an apparent mistake in jenkinsci/pipeline-cloudwatch-logs-plugin#5 affecting agent-side, but not master-side, logging, I saw in the build log:

java.lang.NullPointerException
	at io.jenkins.plugins.pipeline_log_fluentd_cloudwatch.FluentdLogger$FluentdOutputStream.eol(FluentdLogger.java:128)
	at hudson.console.LineTransformationOutputStream.eol(LineTransformationOutputStream.java:60)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:74)
	at java.io.PrintStream.write(PrintStream.java:480)
	at org.jenkinsci.plugins.credentialsbinding.impl.BindingStep$Filter$1.eol(BindingStep.java:183)
	at hudson.console.LineTransformationOutputStream.eol(LineTransformationOutputStream.java:60)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:74)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1793)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
	at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$HandlerImpl.output(DurableTaskStep.java:571)
	at org.jenkinsci.plugins.durabletask.FileMonitoringTask$Watcher.run(FileMonitoringTask.java:477)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

with the build proceeding nonetheless.

jglick added some commits Aug 22, 2018

private static final Field printStreamDelegate;
static {
try {
printStreamDelegate = FilterOutputStream.class.getDeclaredField("out");

This comment has been minimized.

@svanoort

svanoort Sep 6, 2018

Member

Ahh, reflection hacks: the universal panacea!

This comment has been minimized.

@jglick

jglick Sep 7, 2018

Author Member

Unfortunately I could not find a better way to do this.

} catch (Exception x) {
// Try to report it to the master.
try {
execution.problem(x);

This comment has been minimized.

@svanoort

svanoort Sep 6, 2018

Member

I predict we'll be seeing a lot of this particular report when there are network issues. Thankfully it won't be that often due to the between recurrence periods.

@svanoort
Copy link
Member

svanoort left a comment

I guess it's as good as it's going to get for now -- happy to see us doing something with errors besides letting the PrintStream swallow it

@svanoort svanoort merged commit 3bd3498 into jenkinsci:master Sep 13, 2018

2 checks passed

continuous-integration/jenkins/incrementals Deployed to Incrementals.
Details
continuous-integration/jenkins/pr-merge This commit looks good
Details

@jglick jglick deleted the jglick:watch-JENKINS-38381 branch Sep 13, 2018

orgi pushed a commit to orgi/workflow-durable-task-step-plugin that referenced this pull request Nov 13, 2018

Merge pull request jenkinsci#63 from jglick/watch-JENKINS-38381
[JENKINS-52165] Calling Controller.watch API
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.