[TEP-0089] - Spire Package #5039

Merged: 1 commit, Sep 19, 2022

Conversation

@pxp928 pxp928 commented Jun 25, 2022

Signed-off-by: pxp928 parth.psu@gmail.com

Changes

Spire package separated out from #4759, as requested by @afrittoli. It includes the spire interface with a mocked spire for testing.

/kind feature

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Release notes block below has been filled in
    (if there are no user facing changes, use release note "NONE")
Spire package that includes the controller and entrypointer API used for [TEP-0089: Non-falsifiable provenance support](https://github.com/tektoncd/community/blob/main/teps/0089-nonfalsifiable-provenance-support.md). This is just the spire package. The phase 1 PR for the TEP is located here -> https://github.com/tektoncd/pipeline/pull/4759

@tekton-robot tekton-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 25, 2022
@tekton-robot tekton-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 25, 2022
@tekton-robot
Collaborator

Hi @pxp928. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@afrittoli
Member

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 28, 2022
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/spire/controller.go | Do not exist | 0.0% | |
| pkg/spire/entrypointer.go | Do not exist | 0.0% | |
| pkg/spire/sign.go | Do not exist | 17.6% | |
| pkg/spire/spire_mock.go | Do not exist | 85.5% | |
| pkg/spire/verify.go | Do not exist | 17.3% | |

@afrittoli afrittoli added this to the Pipelines v0.38 milestone Jun 28, 2022
@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jul 4, 2022
Comment on lines 91 to 129
func NewSpireControllerAPIClient(c spireconfig.SpireConfig) ControllerAPIClient {
	if c.MockSpire {
		return &MockClient{}
	}
	return &spireControllerAPIClient{
		config: c,
	}
}
Member

I would prefer to have two different factories here, one for the tests and one for the actual usage, instead of embedding a MockSpire attribute in the client struct spireControllerAPIClient. It would be more consistent with all the other clients used in the controllers, like the Tekton API clients, the cloudevents client and the cache client.

For instance, if you look at the https://github.com/tektoncd/pipeline/tree/main/pkg/reconciler/events/cache package, there are two clients defined there, the regular one and the mock one.

The knative/pkg injection machinery makes sure that the correct client (regular or mock) is injected into the context, so that when you get the client from the context you get a different one, depending on whether it's unit tests or actual running code.

Member

@mattmoor may want to have a say on this too 🙏

Member Author

@pxp928 pxp928 Jul 4, 2022

Hmm I see. I might need some help setting this up.

Member

If you don't want to deal with client injection, you could just pass a client rather than the config options in for the reconciler options.

Or since this client creation isn't actually doing too much, test clients could also just override this after the reconciler is created.

Contributor

seems do-able, does this look about right? https://github.com/pxp928/pipeline/pull/26/files

Member

Thanks @lumjjb - that's definitely in the right direction!

The only question I have is whether we could get the config from the context, so that when the clients are set up the config is added directly and we don't need to add it later on. Ideally we should add the client config once, unless we need to support reloading configuration changes at run time. Wdyt?

Contributor

Thanks @afrittoli ! It's in a weird spot with the initialization of the TaskRun reconciler, since some of the options come in as CLI arguments. Do you have any recommendations to handle this?

@afrittoli
Member

Thanks @pxp928 for moving this to a separate PR!

I'm reviewing the PR, but I find the module structure and test strategy not 100% clear.
My understanding of the role of each module is:

  • spire.go: defines two interfaces, ControllerAPIClient and EntrypointerAPIClient. (NIT: maybe we could call this clients?)
  • controller.go: implements some of ControllerAPIClient, including the type, the constructor and some of the required methods, except for VerifyTaskRunResults
  • verify.go: implements the rest of ControllerAPIClient, specifically VerifyTaskRunResults (NIT: maybe call this controller_verify?)
  • entrypointer.go: implements some of EntrypointerAPIClient, including the type, the constructor and some of the required methods, except for Sign
  • sign.go: implements the rest of EntrypointerAPIClient, specifically Sign (NIT: maybe call this entrypointer_sign?)
  • spire_mock.go: defines a mock implementation of both interfaces. This bypasses most of the code in all the actual implementation, but it does invoke some of the actual functions, like verifyManifest and getResultValue
  • spire_mock_tests.go: unit tests for spire_mock.go. These tests verify the logic in the mocks and some private functions in the verify and sign modules, but I find this a bit problematic:
    • it's very opaque what is tested and what not by the unit tests. This will make changing the code in future a rather convoluted task as changing private functions might break mock unit tests that depend on them. We have a coding guideline of testing only exported functions, and this is doing the opposite
    • there is no unit test coverage at all for a lot of the code defined in the actual implementations

The structure I would expect for this is something like a mock client that implements only functions that already exist in the original client, something like dial, close, get resource, list resources, create resource, delete resource. The ControllerAPIClient and EntrypointerAPIClient would only be responsible for implementing higher-abstraction functions like VerifyTaskRunResults and Sign. They would take a client as input, which would be the actual client at runtime, or the mock client at unit test time.
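The layering described in the paragraph above might look roughly like the following. This is only a sketch under assumptions: `entryClient`, `memClient`, `Controller` and `EnsureEntry` are invented for illustration and do not correspond to functions in the PR. The point is that the mock replaces only the low-level operations, while the higher-level logic is the same code in production and in tests.

```go
package main

import (
	"errors"
	"fmt"
)

// entryClient is a hypothetical low-level surface: only operations the
// real client would already offer (get, create, delete, close).
type entryClient interface {
	Get(id string) (string, bool)
	Create(id, value string) error
	Delete(id string) error
	Close() error
}

// memClient is the mock: an in-memory map with no higher-level logic.
type memClient struct {
	entries map[string]string
}

func newMemClient() *memClient {
	return &memClient{entries: map[string]string{}}
}

func (m *memClient) Get(id string) (string, bool) {
	v, ok := m.entries[id]
	return v, ok
}

func (m *memClient) Create(id, value string) error {
	if _, exists := m.entries[id]; exists {
		return errors.New("entry already exists: " + id)
	}
	m.entries[id] = value
	return nil
}

func (m *memClient) Delete(id string) error {
	delete(m.entries, id)
	return nil
}

func (m *memClient) Close() error { return nil }

// Controller holds the higher-level logic and takes a client as input.
type Controller struct {
	client entryClient
}

// EnsureEntry creates the entry only when it is missing and reports
// whether a new entry was created.
func (c *Controller) EnsureEntry(id, value string) (bool, error) {
	if _, ok := c.client.Get(id); ok {
		return false, nil
	}
	if err := c.client.Create(id, value); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	c := &Controller{client: newMemClient()}
	first, _ := c.EnsureEntry("taskrun-a", "placeholder-id")
	second, _ := c.EnsureEntry("taskrun-a", "placeholder-id")
	fmt.Println(first, second)
}
```

Unit tests written against `Controller` then exercise the exported, higher-level functions directly instead of private helpers.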

/cc @tektoncd/core-maintainers thoughts?

Member

@afrittoli afrittoli left a comment

A few comments, I've not completed the review yet, but I would like to clarify the overall code structure first.

@pxp928
Member Author

pxp928 commented Jul 4, 2022

Let's set up a meeting to go through this. @lumjjb and I can talk through the intent for each component of the spire package and determine what changes need to be made.

@pxp928
Member Author

pxp928 commented Jul 4, 2022

/kind feature

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 4, 2022
return nil, err
}
trustPool := x509.NewCertPool()
for _, c := range x509Bundle[0].X509Authorities() {
Member

Should we be looping over all bundles? That would also protect against the case where len(x509Bundle) == 0.

Member Author

@pxp928 pxp928 Jul 11, 2022

A trust bundle is a collection of one or more CA root certificates that the workload should consider trustworthy. So it depends on how the spire server is configured. Hence the looping. Added the len(x509Bundle) check

Member

@wlynch wlynch Jul 12, 2022

I could have phrased that better - I meant should we be looping over x509Bundle (as opposed to just the authorities in first one) i.e.

trustPool := x509.NewCertPool()
for _, bundle := range x509Bundle {
	for _, c := range bundle.X509Authorities() {
		trustPool.AddCert(c)
	}
}
return trustPool, nil

Member Author

Yup. Added.
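The outcome of this thread (iterate over every bundle and guard against an empty list) can be sketched in a self-contained form as follows. The `authoritySource` interface and `staticBundle` type are hypothetical stand-ins that cover only the `X509Authorities()` call seen in the snippet, and the certificates are placeholders rather than real CA certificates.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
)

// authoritySource is a hypothetical interface exposing the one method the
// reviewed snippet relies on.
type authoritySource interface {
	X509Authorities() []*x509.Certificate
}

// staticBundle is a fixed list of certificates used for illustration.
type staticBundle []*x509.Certificate

func (b staticBundle) X509Authorities() []*x509.Certificate { return b }

// collectAuthorities gathers the authorities of every bundle and rejects
// an empty bundle list instead of indexing into it.
func collectAuthorities(bundles []authoritySource) ([]*x509.Certificate, error) {
	if len(bundles) == 0 {
		return nil, errors.New("no X.509 bundles available")
	}
	var out []*x509.Certificate
	for _, b := range bundles {
		out = append(out, b.X509Authorities()...)
	}
	return out, nil
}

// newTrustPool builds a single pool from all bundles.
func newTrustPool(bundles []authoritySource) (*x509.CertPool, error) {
	certs, err := collectAuthorities(bundles)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	for _, c := range certs {
		pool.AddCert(c)
	}
	return pool, nil
}

func main() {
	// Placeholder certificates; real ones would come from the SPIRE agent.
	first := staticBundle{{Raw: []byte("ca-1")}, {Raw: []byte("ca-2")}}
	second := staticBundle{{Raw: []byte("ca-3")}}

	certs, _ := collectAuthorities([]authoritySource{first, second})
	fmt.Println("authorities collected:", len(certs))

	_, err := newTrustPool(nil)
	fmt.Println("empty input:", err)
}
```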

Member

@vdemeester vdemeester left a comment

Agreeing with @wlynch and @afrittoli in terms of code organization, especially for tests.

}
if len(unset) > 0 {
sort.Strings(unset)
return fmt.Errorf("found unset image flags: %s", unset)
Member

I don't think the error is correct here (no image involved)

Member Author

This is part of the phase 1 PR. This PR is missing this context as this is just the breakout for the spire package. The flags are coming from the pipeline controller when its image is initialized.

flag.StringVar(&opts.SpireConfig.TrustDomain, "spire-trust-domain", "example.org", "Experimental: The SPIRE Trust domain to use.")
flag.StringVar(&opts.SpireConfig.SocketPath, "spire-socket-path", "/spiffe-workload-api/spire-agent.sock", "Experimental: The SPIRE agent socket for SPIFFE workload API.")
flag.StringVar(&opts.SpireConfig.ServerAddr, "spire-server-addr", "spire-server.spire.svc.cluster.local:8081", "Experimental: The SPIRE server address for workload/node registration.")
flag.StringVar(&opts.SpireConfig.NodeAliasPrefix, "spire-node-alias-prefix", "/tekton-node/", "Experimental: The SPIRE node alias prefix to use.")

Member

I think it would be great to make this error more descriptive. This error is hit when SpireConfig is missing all four required flags, which in turn means no image was initialized 🤔

Member Author

changed to "found unset spire configuration flags"
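For reference, a validation function with the reworded message could look like the sketch below. It assumes the four flag-backed fields quoted earlier in this thread; the `Validate` method name and the use of `strings.Join` to format the list are choices made for this example and are not taken from the PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// SpireConfig mirrors the four flag-backed fields quoted in this thread.
type SpireConfig struct {
	TrustDomain     string
	SocketPath      string
	ServerAddr      string
	NodeAliasPrefix string
}

// Validate returns an error naming every flag whose value is empty.
// The method name is hypothetical.
func (c SpireConfig) Validate() error {
	flags := map[string]string{
		"spire-trust-domain":      c.TrustDomain,
		"spire-socket-path":       c.SocketPath,
		"spire-server-addr":       c.ServerAddr,
		"spire-node-alias-prefix": c.NodeAliasPrefix,
	}
	var unset []string
	for name, value := range flags {
		if value == "" {
			unset = append(unset, name)
		}
	}
	if len(unset) > 0 {
		// Map iteration order is random, so sort for a stable message.
		sort.Strings(unset)
		return fmt.Errorf("found unset spire configuration flags: %s", strings.Join(unset, ", "))
	}
	return nil
}

func main() {
	cfg := SpireConfig{TrustDomain: "example.org", ServerAddr: "spire-server.spire.svc.cluster.local:8081"}
	fmt.Println(cfg.Validate())
}
```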

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/spire/controller.go | Do not exist | 0.0% | |
| pkg/spire/entrypointer.go | Do not exist | 0.0% | |
| pkg/spire/sign.go | Do not exist | 16.7% | |
| pkg/spire/spire_mock.go | Do not exist | 85.5% | |
| pkg/spire/verify.go | Do not exist | 16.8% | |

@pxp928 pxp928 force-pushed the package-spire branch 2 times, most recently from f58c601 to 5a536d4 Compare July 11, 2022 17:18
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/spire/controller.go | Do not exist | 1.0% | |
| pkg/spire/entrypointer.go | Do not exist | 3.0% | |
| pkg/spire/sign.go | Do not exist | 16.7% | |
| pkg/spire/spire_mock.go | Do not exist | 82.4% | |
| pkg/spire/verify.go | Do not exist | 16.8% | |

@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/spire/controller.go | Do not exist | 37.7% | |
| pkg/spire/entrypointer.go | Do not exist | 88.9% | |
| pkg/spire/sign.go | Do not exist | 81.4% | |
| pkg/spire/spire_mock.go | Do not exist | 80.4% | |
| pkg/spire/verify.go | Do not exist | 82.1% | |

@pxp928
Member Author

pxp928 commented Aug 9, 2022

/retest

1 similar comment
@lumjjb
Contributor

lumjjb commented Aug 9, 2022

/retest

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 11, 2022
@lumjjb
Contributor

lumjjb commented Aug 12, 2022

Hi @pritidesai, we've made the changes you requested! Can you take a look at the PR again? We would like to get this across the line. Thanks!

Signed-off-by: pxp928 <parth.psu@gmail.com>
@tekton-robot tekton-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 31, 2022
Member

@wlynch wlynch left a comment

/lgtm

@pritidesai can you take another look at this for approval? 👀

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 2, 2022
Member

@vdemeester vdemeester left a comment

/hold
@pritidesai @afrittoli one last review before removing the hold 🙏🏼

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 15, 2022
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester, wlynch

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2022
@wlynch
Member

wlynch commented Sep 19, 2022

Since this PR already has an LGTM from me and has been sitting without additional feedback for weeks, I'm going to go ahead and remove the hold to let this submit.

Nothing in this PR is a one-way door decision and is not wired up to the controller yet, so there's not much risk in submitting this even if there is additional feedback. Feel free to continue to leave feedback on this PR - we can follow up with additional PRs if needed!

/remove-hold

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 19, 2022
@tekton-robot tekton-robot merged commit 2272ded into tektoncd:main Sep 19, 2022
@afrittoli
Member

Great to see this moving forward! I'm sorry I wasn't able to re-review before, but I'm glad to see progress.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
8 participants