
slack: periodic jobs missing URL #3759

Closed
howardjohn opened this issue Jan 11, 2022 · 11 comments
@howardjohn
Member

[screenshot: Slack notifications for the periodic jobs, each missing the job URL]

Since the format is the same for all job types, I am guessing {{.Status.URL}} is not set somehow?
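For anyone debugging this later, the URL the reporter interpolates comes from the ProwJob's status, so a quick check is whether `status.url` is being set at all. A sketch, assuming kubectl access to the Prow service cluster and that the ProwJob CRs live in the default namespace there:

```sh
# List each job name alongside its status.url; an empty second column
# means the reporter has nothing to link to.
kubectl get prowjobs \
  -o jsonpath='{range .items[*]}{.spec.job}{"\t"}{.status.url}{"\n"}{end}'
```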

@howardjohn
Member Author

Actually even prow.istio.io has no link, so I guess it's not really Slack reporting that's broken, but something lower level.

@howardjohn
Member Author

Timing aligns with 8c51a11 @cjwagner

@cjwagner
Member

I think we accidentally nuked the test-pods namespace (and in the process the deployment job itself) due to calling kubectl apply twice with the same pruning resource list. We may need to use server-side apply for everything or stop using this pruning pattern.
For now I've reverted the change and manually redeployed. Jobs seem to be starting now, e.g. https://prow.istio.io/view/gs/istio-prow/logs/ci-test-infra-branchprotector/1481082039586263040
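Roughly the failure mode described above, as a sketch (the directory names and the `app=prow` label are placeholders, not the actual deploy script):

```sh
# First apply labels everything it creates and treats the manifest set
# plus the selector as the source of truth for what should exist.
kubectl apply --prune -l app=prow -f cluster/

# A second apply with the same selector but a narrower manifest set prunes
# everything labeled app=prow that is absent from it -- including, here,
# the test-pods namespace and the deployment job running inside it.
kubectl apply --prune -l app=prow -f cluster/partial/
```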

@cjwagner
Member

I'm not sure why this change didn't cause the ProwJob CRD to be pruned: #3744
As far as I can tell it should have, but it didn't, either before or when I ran make deploy locally just now.
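One plausible explanation (an assumption about the deploy script, not confirmed here): `kubectl apply --prune` only considers a built-in allowlist of resource types by default (Namespace, ConfigMap, Deployment, and so on), which would leave CustomResourceDefinitions untouched while still pruning the namespace. CRD kinds are only pruned if opted in explicitly:

```sh
# CRDs are only pruned if their group/version/kind is allowlisted:
kubectl apply --prune -l app=prow \
  --prune-whitelist=core/v1/Namespace \
  --prune-whitelist=apiextensions.k8s.io/v1/CustomResourceDefinition \
  -f cluster/
```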

I'm kind of thinking this pruning logic may be more hazardous than it is worth.

@howardjohn
Member Author

howardjohn commented Jan 12, 2022 via email

@cjwagner
Member

> Does k8s do anything like this or just plain kubectl apply? If so, does it handle removal of resources at all? I can imagine that needing to remove is pretty uncommon

K8s uses bazel to do a kubectl apply; there is no resource removal. Needing to remove a resource is pretty uncommon, and doing it manually is extremely easy.
The original motivation for the pruning is to help avoid tech debt by ensuring everything in the cluster is checked in. It's a good motivation, but I don't think we really have issues with folks manually applying to the cluster and not checking in the changes any more. I think this pruning mechanism is more often hurting us than helping us in its current form.
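The alternative floated above would be plain (or server-side) apply with no pruning, leaving removals as a rare, explicit step. A sketch (the file name is a placeholder):

```sh
# Create/update only; nothing is ever deleted implicitly.
kubectl apply --server-side -f cluster/

# Removal becomes a deliberate, reviewed one-off instead of a side effect:
kubectl delete -f cluster/obsolete-resource.yaml
```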

@howardjohn
Member Author

howardjohn commented Jan 12, 2022 via email

@ericvn
Contributor

ericvn commented Jan 12, 2022

Reverting the change seems to have put the URL back, but now the job is failing with:

> Test started yesterday at 7:54 PM failed after 15m0s. (more info)
> The pod could not start because it could not mount the volume "oauth": secret "oauth-token" not found
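If the secret was pruned along with the namespace, recreating it should unblock the jobs. A sketch only: the key name `oauth` is inferred from the volume name in the error, and the token path is a placeholder for however istio's prow secrets are actually provisioned:

```sh
# Recreate the pruned secret in the namespace where the jobs run.
kubectl create secret generic oauth-token -n test-pods \
  --from-file=oauth=/path/to/github-oauth-token
```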

@ericvn
Contributor

ericvn commented Jan 12, 2022

The ci-prow-autobump job is also failing with the missing token.

@cjwagner
Member

Everything should be fixed now.
Here is a cover for the pitfall: #3764

/assign
/close

@istio-testing
Collaborator

@cjwagner: Closing this issue.

In response to this:

Everything should be fixed now.
Here is a cover for the pitfall: #3764

/assign
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
