
Automatic retry mechanism if task fail to fetch material #6252

Open

ariefrahmansyah opened this issue May 6, 2019 · 12 comments

ariefrahmansyah commented May 6, 2019

Issue Type
  • Feature proposal
Summary

I'm using GitLab as my main Git repository, and sometimes I have connectivity issues that cause the git fetch to fail. Can GoCD have an automatic retry mechanism to re-trigger the material fetch?

Any other info

I'm thinking of something like Concourse's attempts step modifier: https://concourse-ci.org/attempts-step-modifier.html
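
For context, this is roughly what that modifier looks like on a Concourse get step (a minimal sketch based on the linked docs; the resource name and URI are only illustrative):

resources:
- name: source-code
  type: git
  source:
    uri: git@gitlab.com:org/repo.git   # illustrative repository
    branch: main

jobs:
- name: build
  plan:
  - get: source-code
    attempts: 3   # re-run the fetch up to 3 times before failing the step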

arvindsv (Member) commented May 6, 2019

@ariefrahmansyah If it's an auto-polled material, it should retry in around a minute. If not (a forced trigger or a timer, for instance), there's currently no mechanism to retry.

ariefrahmansyah (Author) commented May 7, 2019

@arvindsv Sorry, it's not about auto-polled materials; it's when a running task fails to fetch a material. The task then fails.

(screenshot attached)

If it happens, we need to manually retrigger the task.

@ariefrahmansyah ariefrahmansyah changed the title Automatic Retry Mechanism For Failed Material Fetching Automatic retry mechanism if task fail to fetch material May 7, 2019
@maheshp maheshp added this to the NextUp milestone May 30, 2019
stale bot commented Apr 1, 2020

This issue has been automatically marked as stale because it has not had activity in the last 90 days.
If you can still reproduce this error on the master branch using local development environment or on the latest GoCD Release, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@stale stale bot added the stale label Apr 1, 2020
@stale stale bot closed this as completed Apr 8, 2020
@dominik-ba

I would really love to have this feature!

@chadlwilson chadlwilson added the no stalebot, enhancement and materials labels and removed the stale label Feb 1, 2022
@chadlwilson chadlwilson removed this from the NextUp milestone Feb 1, 2022
@chadlwilson chadlwilson reopened this Feb 1, 2022
@mattgauntseo-sentry

We run into this issue every now and then; it would be great to have a way to set a retry count and/or a wait between retries.
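
For illustration only, if GoCD exposed such knobs in its YAML config-repo format, a material definition might look something like this sketch. The fetch_retries and fetch_retry_wait_seconds attributes are purely hypothetical and do not exist in GoCD today:

materials:
  app-code:
    # illustrative repository; the retry attributes below are hypothetical
    git: git@github.com:org/repo.git
    branch: main
    fetch_retries: 3                  # hypothetical: re-run a failed clone/fetch up to 3 times
    fetch_retry_wait_seconds: 30      # hypothetical: pause between attempts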

@chadlwilson (Member)

@mattgauntseo-sentry Have you determined the root cause of the issue in your case? Is it a git material, some other material type, or a plugin material? Is it a temporary connectivity problem, or a rate limit at the material's source?

@mattgauntseo-sentry

We use GitHub and it presents itself as an ssh connection issue. It's always intermittent. I don't believe this is a rate limit, but I wouldn't rule it out.

I'll share an example log when it happens again.

chadlwilson (Member) commented Sep 26, 2023

Hmm. GitHub definitely has some sort of rate limit on https (although there's mixed info available on that, as they say they don't limit?), and GoCD's polling nature can be a problem for that, but I'm not 100% sure what it looks like for ssh, or whether it's different on enterprise plans.

Regardless, it's useful to know, as there are probably multiple layers at which this could be addressed (specific to a material type, more abstract for all materials - that type of thing).

Is this a standard OOTB GoCD git material or some extra plugin (e.g. for PRs)?

@mattgauntseo-sentry

This is a standard OOTB git material with an ssh url (git@github.com/...). I believe these are repos in an enterprise plan.

@mattgauntseo-sentry

Just ran into this during a test:

[go] Start to prepare <pipeline name>/52/deploy_primary/1/deploy on <elastic agent name>-<elastic agent ID> [/go]
[go] Start to update materials.

[go] Start updating <pipeline group> at revision <commit revision> from git@github.com:<org>/<repo>.git
STDERR: Cloning into '/go/pipelines/<pipeline name>/<pipeline group>'...
[GIT] Fetching changes
STDERR: ssh: connect to host github.com port 22: Connection timed out
STDERR: fatal: Could not read from remote repository.
STDERR: 
STDERR: Please make sure you have the correct access rights
STDERR: and the repository exists.
git fetch failed for [git@github.com:<org>/<repo>.git]

@chadlwilson (Member)

Yeah, that looks like a general networking issue where at best anything done at GoCD level will be a bit of a fudge. But perhaps a necessary one in some setups.

If you see these types of errors exclusively on agents rather than in the server logs, you might want to explore whether anything is different about the agent environment or specific to elastic agents (although the server can typically recover due to its polling nature, assuming you're not using webhooks).

@mattgauntseo-sentry

I've seen this on the server too with config repos hitting this error, but in those cases the polling gets us to an eventual good state.
