Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to find and verify existing OIDC providers before we try to create a new one #2901

Merged
merged 4 commits into from Oct 19, 2022

Conversation

codablock
Copy link
Contributor

@codablock codablock commented Nov 1, 2021

What type of PR is this?
/kind bug

What this PR does / why we need it:
When moving clusters between management clusters, ControlPlane.Status.OIDCProvider.ARN
is lost. The new management cluster must then pickup the already existing
cluster, as otherwise it tries to create the same provider again and then
fails.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • adds unit tests
  • adds or updates e2e tests

Release note:

Try to find and verify existing OIDC providers before we try to create a new one

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority labels Nov 1, 2021
@k8s-ci-robot
Copy link
Contributor

@codablock: This issue is currently awaiting triage.

If CAPA/CAPI contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 1, 2021
@k8s-ci-robot
Copy link
Contributor

Hi @codablock. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 1, 2021
@randomvariable
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 2, 2021
@richardcase
Copy link
Member

Shame we stored this in the status in the original implementation.

@codablock - do you see any potential issues (i.e. edge cases) with trying to to automatically find the oidc provider?

@codablock
Copy link
Contributor Author

@richardcase I can't imagine any. I was first thinking that I could also try to update an existing OIDC provider's thumbprint ind client IDs, but then decided that I should not touch them as we can't know for sure WHY they would differ from the expected values...if they do, there is probably a good reason (e.g. manually created provider, for whatever reason).

@codablock
Copy link
Contributor Author

@richardcase Any update on how to proceed with this PR?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 14, 2022
@richardcase
Copy link
Member

/lifecycle frozen

This looks good to me. I can't imagine any issues with this myself. @sedefsavas - what do you think?

@k8s-ci-robot
Copy link
Contributor

@richardcase: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

/lifecycle frozen

This looks good to me. I can't imagine any issues with this myself. @sedefsavas - what do you think?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@richardcase
Copy link
Member

@codablock - this looks good to me. Do you think there is anyway we can add a test around this functionality?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 11, 2022
@codablock
Copy link
Contributor Author

Unfortunately I'm a bit overloaded atm and won't find time to add tests.

@richardcase
Copy link
Member

/help

@Skarlso
Copy link
Contributor

Skarlso commented Sep 29, 2022

This is a nightmare to test. :D

After I finally managed to get the HTTPS test server working with proper certs, now I'm in trouble with PatchObject... :D

@Skarlso
Copy link
Contributor

Skarlso commented Sep 29, 2022

Yesss!! Almost there. :D now trying to reconcile the LAST HURDLE. :D This part:

	if err := s.reconcileTrustPolicy(); err != nil {
		return errors.Wrap(err, "failed to reconcile trust policy in workload cluster")
	}

:D

@Skarlso
Copy link
Contributor

Skarlso commented Sep 29, 2022

OH. MY. GODS.

I0929 22:29:41.349199   66964 oidc.go:54] "Reconciling EKS OIDC Provider" cluster-name=0xc000200960
--- PASS: TestOIDCReconcile (0.05s)
    --- PASS: TestOIDCReconcile/cluster_create_with_no_OIDC_provider_present_yet_should_create_one (0.05s)
PASS

Edit: NOPE. My Kind test cluster was running and it connected to that. :(

@Skarlso
Copy link
Contributor

Skarlso commented Sep 29, 2022

Hah! I found a bug! :D :D Testing++++++

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 29, 2022
@Skarlso
Copy link
Contributor

Skarlso commented Sep 29, 2022

@richardcase @codablock This was DIFFICULT not gonna lie. :D But I added a test. :) And I found a bug where this was being done:

		if fmt.Sprintf("https://%s", *provider.Url) != issuerURL.String() {
			continue
		}

But the provider url MUST contain https already anyways so this errored. Now, was there a reason for this being there?
I can add a thing like, check if https is there and if not, append it. But the specification is telling me there should be https on the retrieved provider always. 🤷

@Skarlso
Copy link
Contributor

Skarlso commented Sep 30, 2022

TLS handshake error from 127.0.0.1:36514: remote error: tls: bad certificate

Sh*t. I was afraid of that. :/ The certificate is not installed/recognised by the local environment. So it doesn't like it.

@Skarlso
Copy link
Contributor

Skarlso commented Sep 30, 2022

Dang it. This won't work. :( I have to find a different approach and mock the http client completely.

type ServiceOpts func(s *Service)

// WithIAMClient creates an access spec with a custom http client.
func WithIAMClient(client *http.Client) ServiceOpts {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, I can't create a legit https service because the certificate is not trusted. So I had to fall back to making the client mockable.

If anyone has a better idea, I'm all ears.

Also, I made this an option to minimise the impact on changing NewService.

}

func fetchRootCAThumbprint(issuerURL string, client *http.Client) (string, error) {
// needed to appease noctx.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Funny" how noctx never noticed "http.Get" but noticed client.Get immediately. Tsk, tsk, tsk.

@Skarlso
Copy link
Contributor

Skarlso commented Oct 1, 2022

This is now ready for review. :)

@Skarlso
Copy link
Contributor

Skarlso commented Oct 3, 2022

ping @Ankitasw as Richard is OOO. :))

@Skarlso
Copy link
Contributor

Skarlso commented Oct 12, 2022

/test ?

@k8s-ci-robot
Copy link
Contributor

@Skarlso: The following commands are available to trigger required jobs:

  • /test pull-cluster-api-provider-aws-build
  • /test pull-cluster-api-provider-aws-test
  • /test pull-cluster-api-provider-aws-verify

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-provider-aws-apidiff-main
  • /test pull-cluster-api-provider-aws-e2e
  • /test pull-cluster-api-provider-aws-e2e-blocking
  • /test pull-cluster-api-provider-aws-e2e-clusterclass
  • /test pull-cluster-api-provider-aws-e2e-conformance
  • /test pull-cluster-api-provider-aws-e2e-conformance-with-ci-artifacts
  • /test pull-cluster-api-provider-aws-e2e-eks

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-provider-aws-apidiff-main
  • pull-cluster-api-provider-aws-build
  • pull-cluster-api-provider-aws-test
  • pull-cluster-api-provider-aws-verify

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Skarlso
Copy link
Contributor

Skarlso commented Oct 12, 2022

/test pull-cluster-api-provider-aws-e2e-eks

Hmm, hmm, unsure if this passes oidc part. Probably not. But at least it will test that EKS didn't break. :)

@Skarlso
Copy link
Contributor

Skarlso commented Oct 19, 2022

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 19, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Skarlso

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 19, 2022
@k8s-ci-robot k8s-ci-robot merged commit 95e8972 into kubernetes-sigs:main Oct 19, 2022
@k8s-ci-robot k8s-ci-robot modified the milestones: v1.6.0, v1.x Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants