"Opbeans" stage of release pipeline fails #2728
Comments
@trentm I'd propose to use wait-on and target it at the version currently being released, like this one: https://www.npmjs.com/package/elastic-apm-node/v/3.34.0.
@pazone The package version showing up at Also from above:
Perhaps that is outdated and in practice the wait time is less than that. However, I think it would be unfortunate for the agent release process to fail because there is some issue or slowness with one or more npm package mirrors/CDN nodes. It feels architecturally cleaner to have the responsibility of updating opbeans-node be something that lives with the opbeans-node repo.
We can do If it still takes ~1h we can trigger the opbeans job after 1h timeout.
I'm willing to try that, and thanks very much for offering to implement that. However, if we do find it takes up to 1h for the "Opbeans" stage of the agent release pipeline to complete, then I think we should revisit this and remove the "Opbeans" stage.
What "opbeans job" do you mean here?
I'd personally prefer Opbeans is not a public artifact that is tied to this repository. It should not influence our ability to execute the release of the agent IMO. Moving the opbeans update completely out of band seems appropriate.
Ok, let's consider it as plan A.
@pazone Is this something you or your team will have time to work on soon? If not, please let me know and I can take a stab at it. Given the current opbeans-node Jenkinsfile (https://github.com/elastic/opbeans-node/blob/main/.ci/Jenkinsfile) is just a call to an apm-pipeline-library function, I'm not sure what the preferred approach would be to supporting this.
@trentm Hi Trent. In looking at this team's workload, it would be hard for us to get this in within the next month or so. We're happy to do it of course but if your needs are more urgent, you might want to take a swing at it. Happy to chat more to help get this prioritized correctly though. LMK.
@cachedout Thanks and understood. I'll take a stab at it and get review from y'all. As a sanity check, my plan is to add an optional
Responsibility for updating the elastic-apm-node dep in opbeans-node will move to *opbeans-node*'s CI. Refs: elastic/opbeans-node#164 Fixes: #2728
Hi all, If
By splitting the bump from the main pipeline then there is no need to change the What do you think?
@v1v That sounds fine to me. A question about Jenkinsfile syntax: Can a Jenkinsfile have multiple top-level
#!/usr/bin/env groovy
@Library('apm@current') _
opbeansPipeline()
pipeline {
  agent { label 'linux && immutable' }
  // ... my new pipeline for doing the update and tagging
}
?
@v1v elastic/opbeans-node#163 is my attempt at doing this.
…gent dep (#163) This adds a second "Opbeans Node Bump" Jenkins job that runs weekly on the "main" branch. It checks for an available agent update (a published version newer than what is in the current package-lock file), and if there is one it: bumps to that ver, pushes, and git tags with "v$VERSION". The push and tag will trigger the usual opbeans Jenkins pipeline to publish docker images. Refs: elastic/apm-agent-nodejs#2728 Fixes: #164 Co-authored-by: Victor Martinez <VictorMartinezRubio@gmail.com>
Recently in #2625 we automated releases: when a version tag ("vN.N.N") is pushed, a Jenkins "Release" stage will build and publish the Lambda layer, do a GitHub release, npm publish, and attempt to update opbeans-node.git to use this new APM agent release.
That "Opbeans" stage is flaky (or perhaps fails every time), as discussed here: #2625 (comment)
This issue is about making the release process reliable by doing something about this stage.
The Opbeans stage effectively does this: #2723 (comment)
Options
Option 1: npm publish early and hope
Do the 'npm publish' step earlier in the pipeline and hope that the lambda layer publishing steps take enough time that the Opbeans stage will work then.
I don't love this idea because relying on "hope" means that it may fail sometime, just less frequently, which just means a more subtle bug. Also see the "timeout" discussion below.
Option 2: wait for npm install to work
Add a spin loop at the start of the Opbeans stage process to retry the npm install if it gets an ETARGET, with a timeout to account for being run soon after a publish.
The "ETARGET" is referring to the specific error you get from npm install when this issue happens:
Theoretically this option would be straightforward to implement, but what should that timeout be? Granted, the issue is old (from 2018), but user reports from npm/npm#20574 suggest that the time for all npm servers to update could be an hour or more. That's too long to have as a timeout in a release process.
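For illustration only, here is a rough sketch of what the Option 2 spin loop could look like. It is not the pipeline's actual code: the function names, the 10-minute default timeout, and the 15-second retry interval are all assumptions made up for this example. The only behaviour taken from the discussion above is "retry npm install while the failure is an ETARGET, and give up after a timeout".

```python
import subprocess
import time
from typing import Callable, Tuple


def npm_install(spec: str) -> Tuple[int, str]:
    """Run `npm install <spec>` and return (exit code, combined output)."""
    proc = subprocess.run(
        ["npm", "install", spec],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


def install_with_retry(
    spec: str,
    timeout_s: float = 600.0,   # assumed value; the right timeout is the open question
    interval_s: float = 15.0,   # assumed value
    install: Callable[[str], Tuple[int, str]] = npm_install,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry the install while it fails with ETARGET.

    Returns True once the install succeeds, False if the deadline passes
    while the registry still reports ETARGET. Any other failure is raised
    immediately, since waiting would not help.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        code, output = install(spec)
        if code == 0:
            return True
        if "ETARGET" not in output:
            raise RuntimeError(f"npm install {spec} failed:\n{output}")
        # Stop if another wait would take us past the deadline.
        if time.monotonic() + interval_s > deadline:
            return False
        sleep(interval_s)
```

A caller would use something like `install_with_retry("elastic-apm-node@3.34.0")` and fail the stage on a `False` return. The sketch does not answer what the timeout should be; it only makes the trade-off concrete.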
Option 3: use dependabot to update opbeans
Configure dependabot to look for an agent update daily.
Some issues with this:
Option 4: use a Jenkins pipeline in the opbeans repo
Add a stage to the Jenkinsfile in the opbeans repo(s) on a cron(@daily) to look for a new agent version, then do the update, commit, and tag.
I don't see any issues with this approach other than:
This is my current preferred option.
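As a hedged sketch of the "look for a new agent version" step in Option 4, the check could compare the version pinned in package-lock.json against the latest published one. The helper names below are hypothetical, and the version parser assumes plain "N.N.N" versions with no prerelease tags. `npm view <package> version` is used to ask the registry for the latest version.

```python
import json
import subprocess
from typing import Optional, Tuple

PACKAGE = "elastic-apm-node"


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'N.N.N' version (assumption: no prerelease suffix)."""
    major, minor, patch = version.strip().lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def locked_version(lock: dict, package: str = PACKAGE) -> Optional[str]:
    """Find the installed version in parsed package-lock.json data.

    Lockfile v2/v3 lists it under "packages", v1 under "dependencies".
    """
    entry = lock.get("packages", {}).get(f"node_modules/{package}")
    if entry is None:
        entry = lock.get("dependencies", {}).get(package)
    return entry.get("version") if entry else None


def latest_published(package: str = PACKAGE) -> str:
    """Ask the npm registry for the latest published version."""
    proc = subprocess.run(
        ["npm", "view", package, "version"],
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return proc.stdout.strip()


def bump_target(lock: dict, latest: str, package: str = PACKAGE) -> Optional[str]:
    """Return the version to bump to, or None if there is nothing to do."""
    current = locked_version(lock, package)
    if current is None:
        return None
    if parse_version(latest) > parse_version(current):
        return latest
    return None


if __name__ == "__main__":
    with open("package-lock.json") as f:
        target = bump_target(json.load(f), latest_published())
    print(target or "up to date")
```

When `bump_target` returns a version, the job would then do the update, commit, and tag. Because the check runs on a schedule rather than immediately after a publish, it avoids the registry propagation delay that makes the in-pipeline "Opbeans" stage flaky.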
@elastic/observablt-robots @astorm Thoughts?