Introduce Global Time Out #81
@oleg-nenashev thanks for the encouragement of a global build timeout. I'd appreciate your thoughts on this implementation if you get a chance. |
Hi @sghill, thanks for this PR! Could you please fix the sole failing test for the Windows build so that I can do a thorough review? Please also note that some PRs have recently been merged, so please rebase before pushing again if you have not already done so. |
db24213
to
a728515
Compare
Hi @krisstern, thanks for the ping. I rebased the PR and the tests are all passing. |
Thanks @sghill for the rebase! Let me start the reviewing process soon. |
Hi @sghill, my apologies for having to ask you to rebase one more time, as I have just merged a PR to switch Java support from version 8 to 11. Without this change I cannot check out and try your new feature, as I was experiencing some setup issues previously. Many thanks for this! |
Hi @krisstern, I've rebased and updated to Mockito 4, but I'm getting some odd errors from the Windows build. Is it possible there is an issue with the Java install on this agent?
Separately, I wanted to share that I'd be happy with other implementations as well. I tried to make this PR testable and fit well within the plugin by supporting existing strategies, but I only plan to use a fixed 24 hour timeout on our Jenkins controllers. If there is a simpler way to achieve that, I'm all for it. |
Hi @sghill, I think it may have to do with the signature for the animal-sniffer-maven-plugin currently being set to 17, as there is currently no signature available for Java 11: Lines 67 to 85 in 65e8275
So it may have been introduced by me inadvertently while trying for a quick fix that worked previously. So I would recommend you do try and tweak the POM a bit according to mojohaus/animal-sniffer#62 (comment). I think the error we are encountering is similar to the one for Java 8 discussed on Stack Overflow at https://stackoverflow.com/q/45997103/9959070. |
@krisstern The build is green. My read of this comment is that the animal-sniffer-plugin can be removed because the maven-compiler-plugin now covers the same feature set when the release flag is set (it already is):
Unfortunately removing the plugin block from the pom doesn't work, so it seems the plugin is requiring 11 but disallowing any new APIs after 8. It'd be nice to keep the minimum version at 8 if that's the case. |
Yeah, I can understand. On top of the aforementioned comment there is also mojohaus/animal-sniffer#62 (comment), so maybe I could take another look after this PR is reviewed and processed to see what to do with |
Hi @sghill, I am in the process of reviewing the code in this PR. So far so good. I cannot help but notice the following default time limit of 3 minutes or 180 seconds for the I think the timeout settings of the other strategies look okay. But I would like to know if we could increase the default to something slightly longer, like 5 minutes or 300 seconds, for these two strategies only? Or did you have anything particular in mind when setting the default to 3 minutes for both of these strategies? Three minutes seems a bit short to me, but of course it depends on what we are trying to build as well. Say previously in your example you had been using 4 minutes for the |
I have reviewed the code, everything looks okay |
@oleg-nenashev If you have some time, please review this PR so I could process it promptly. |
Hi @krisstern, definitely didn't intend for a 3 minute default. I think that's coming from here? I imagine very few users would use the default, so perhaps it's best to clear it out if possible? I intend to set to 24 hours on our instances. |
Hi @sghill Or alternatively we could overwrite the timeout duration for the strategy say for the Absolute Timeout Strategy to something else here... I noticed the timeout settings of the Deadline Timeout Strategy, the Elastic Timeout Strategy, and also the Likely Stuck Timeout Strategy seem more agreeable. So maybe we could look into tweaking the default timeout settings somewhere like the following two places for the Absolute Timeout Strategy class file? I will need time to look into the impact of modifying the build-timeout-plugin/src/main/java/hudson/plugins/build_timeout/impl/AbsoluteTimeOutStrategy.java Line 32 in 65e8275
build-timeout-plugin/src/main/java/hudson/plugins/build_timeout/impl/AbsoluteTimeOutStrategy.java Line 43 in 65e8275
|
Or we could leave things as they are... And let the user choose a timeout value manually different from the default one. |
@sghill Some conflicts have been introduced by an urgent PR that needed to be closed. Would you like to rebase your PR so I can have it merged as soon as possible? My apologies for the confusion. |
On each build:
1. Find a configured global time out
2. Schedule the configured time out operations on a dedicated ScheduledExecutorService with one thread
3. Once complete, cancel the task
Implemented with a hudson.model.listeners.RunListener so it can be applied to every build. The existing implementation uses a hudson.tasks.BuildWrapper, which requires a checkbox on each build.
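The three steps in this commit message can be sketched without any Jenkins dependency. The following is a minimal illustration only, not the plugin's code: the class name `GlobalTimeoutSketch`, the methods `onStarted` and `onCompleted`, and the string build identifiers are all hypothetical stand-ins for what a RunListener would receive.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the listener; not the plugin's actual class. */
final class GlobalTimeoutSketch {
    // One dedicated thread runs every timeout operation.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    // Pending timeout task for each running build, keyed by a build identifier.
    private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();

    /** Build started: schedule the configured timeout operation. */
    void onStarted(String buildId, Runnable timeoutOperation, long delay, TimeUnit unit) {
        pending.put(buildId, scheduler.schedule(timeoutOperation, delay, unit));
    }

    /**
     * Build completed: cancel its timeout task.
     * Returns true only if a task was cancelled before it fired.
     */
    boolean onCompleted(String buildId) {
        ScheduledFuture<?> task = pending.remove(buildId);
        return task != null && task.cancel(false);
    }

    /** Stops the scheduler thread and drops any tasks that have not fired. */
    void shutdown() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        GlobalTimeoutSketch sketch = new GlobalTimeoutSketch();
        sketch.onStarted("job#1", () -> System.out.println("job#1 timed out"),
                50, TimeUnit.MILLISECONDS);
        sketch.onStarted("job#2", () -> System.out.println("job#2 timed out"),
                1, TimeUnit.HOURS);
        sketch.onCompleted("job#2"); // finished in time, so its timeout is cancelled
        Thread.sleep(200);           // job#1 hangs long enough for its timeout to fire
        sketch.shutdown();
    }
}
```

Keeping the pending task in a map is what makes step 3 possible: a build that finishes normally removes and cancels its own task, so only builds that are still running when the delay elapses get the timeout operations applied.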
Replace deprecated #initMocks with MockitoRule
Hi @krisstern. I have rebased, but noticed a couple issues I'd like your input on before merging.
|
Hi @sghill Thanks for the prompt response! Let me answer the checkbox question first. I think to disable it you can modify Line 7 in c7809e4
I think it is better to get rid of the option |
The build seems to have passed all CI/CD tests on GitHub though, so should be okay. Could be something with your local setup. |
BTW, have you tried running |
Unchecking the Enable Global Timeout box will now unset any configured timeout. This also adds several INFO-level logs about what configuration was loaded, updated, and cleared. Additionally the constructor-injected Jenkins instance has been moved to a setter. It's the same amount of code, but no longer requires suppressing the unused warning on the default constructor.
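The clearing behaviour described in this commit message might look roughly like the sketch below. Everything in it is assumed for illustration: the class `GlobalTimeoutSettingsSketch`, the nested `Form` type, and the field names are hypothetical and do not reflect the plugin's actual configuration class or Jenkins form binding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical settings holder; names are illustrative, not the plugin's API. */
final class GlobalTimeoutSettingsSketch {
    private static final Logger LOG =
            Logger.getLogger(GlobalTimeoutSettingsSketch.class.getName());

    /** Hypothetical stand-in for the submitted form section. */
    static final class Form {
        final int timeoutMinutes;
        final List<String> operations;

        Form(int timeoutMinutes, List<String> operations) {
            this.timeoutMinutes = timeoutMinutes;
            this.operations = operations;
        }
    }

    private Integer timeoutMinutes;  // null means no global timeout is configured
    private List<String> operations; // null means no operations are configured

    /** A null form models the 'Enable' checkbox being unchecked. */
    void configure(Form form) {
        if (form == null) {
            // Explicitly null out the properties so the old values are not kept.
            timeoutMinutes = null;
            operations = null;
            LOG.info("global timeout cleared");
        } else {
            timeoutMinutes = form.timeoutMinutes;
            operations = new ArrayList<>(form.operations);
            LOG.info("global timeout updated to " + timeoutMinutes + " minutes");
        }
        // A real implementation would persist the configuration here.
    }

    boolean isEnabled() {
        return timeoutMinutes != null;
    }
}
```

The point of the explicit null assignments is that skipping the update when the form section is absent would silently keep the previous timeout in place.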
Hi @krisstern,
I'm seeing it disabled by default, are you seeing different behavior? When I remove the I think it should automatically show on page load as checked if configured, but when unchecked the configuration is cleared. I've updated the code to do this -- if the submitted form object is null, we now manually null out the class properties before saving.
I tried New LogsI've added On load:
On update:
On clearing the 'Enable' checkbox:
|
Hi @sghill My apologies I think I may have misunderstood your question regarding the checkbox earlier. So I have tested your latest branch locally on my computer and everything works fine. I have approved the PR and will merge it over the next few days. Thanks for your contribution! |
Thanks @krisstern! Is there a release coming up soon? Also happy to try out a candidate and report back. |
Hi @krisstern, can I help create a new release? Looks like the last one was May 2020. |
Hi @sghill. Yup, I agree it is time for me to cut a new release. Let me study up on the docs and get back to you shortly. I had missed your previous message, hence the late reply. |
Hi @sghill I have been trying to get the plugin code working locally on my new laptop over the last few days. However, I have been running into some problems getting the tests working again. Seems like once I have fixed up the POM file after updating to the latest weekly As a result, I may need more time to get the code running again before I can cut a release. BTW, would you like to help me to debug this issue? I could send you the new |
Hi @krisstern, sure I can take a look. Can you push up another branch with the changes? |
That's great @sghill! The branch with the new pom.xml is at https://github.com/krisstern/build-timeout-plugin/tree/preparing-for-new-release-2022 |
Hi @sghill New release draft version 1.21 is ready for testing: https://github.com/jenkinsci/build-timeout-plugin/releases/tag/untagged-aea95edaabab5f0616d0 |
Nice, thanks @krisstern! We have pulled in v1.21. This should be rolling out over the next couple weeks. I'll let you know if anything comes up. Current plan is to configure like this in a custom initialization plugin:
import hudson.Extension;
import hudson.init.Initializer;
import hudson.plugins.build_timeout.BuildTimeOutOperation;
import hudson.plugins.build_timeout.global.GlobalTimeOutConfiguration;
import hudson.plugins.build_timeout.impl.AbsoluteTimeOutStrategy;
import hudson.plugins.build_timeout.operations.AbortOperation;
import hudson.plugins.build_timeout.operations.WriteDescriptionOperation;
import lombok.extern.slf4j.Slf4j;

import javax.inject.Inject;
import java.time.Duration;
import java.util.LinkedList;
import java.util.List;

import static hudson.init.InitMilestone.SYSTEM_CONFIG_ADAPTED;

@Slf4j
@Extension
public class GlobalBuildTimeoutConfigInit {
    private static final Duration GLOBAL_TIMEOUT = Duration.ofHours(30);

    private GlobalTimeOutConfiguration configuration;

    @Inject
    void setConfiguration(GlobalTimeOutConfiguration configuration) {
        this.configuration = configuration;
    }

    @Initializer(after = SYSTEM_CONFIG_ADAPTED)
    public void init() {
        long hours = GLOBAL_TIMEOUT.toHours();
        String description = String.format("Global timeout of %d hours elapsed", hours);
        configuration.setStrategy(new AbsoluteTimeOutStrategy(String.valueOf(GLOBAL_TIMEOUT.toMinutes())));
        List<BuildTimeOutOperation> operations = new LinkedList<>();
        operations.add(new AbortOperation());
        operations.add(new WriteDescriptionOperation(description));
        configuration.setOperations(operations);
        configuration.save();
        log.info("Builds will abort after {} hours and set description to '{}'", hours, description);
    }
}
|
Following up on this -- plugin has been rolled out and is working. We've timed out 3 runaway builds so far 🎉 Thanks so much @krisstern! |
You are welcome @sghill |
This change introduces a global build time out that re-uses the existing strategies and operations. The current functionality requires users to opt-in and update the configuration of each build.
Goal
I'd like to use a global timeout to ensure builds don't run longer than the set time. Occasionally builds hang, and in large deployments it's easy for this to go unnoticed. A hanging build that goes unnoticed occupies an executor indefinitely, reducing the capacity of the system and potentially allowing the queue for a particular job to build up.
Implementation
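On each build:
1. Find a configured global time out
2. Schedule the configured time out operations on a dedicated ScheduledExecutorService with one thread
3. Once complete, cancel the task
Implemented with a hudson.model.listeners.RunListener so it can be applied to every build. The existing implementation uses a hudson.tasks.BuildWrapper, which requires a checkbox on each build.
Screenshots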
The configuration screen looks like this:
When a build fails due to a global timeout, it is noted in the build's log:
Extensive logging is available on the FINE level: