Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add certificates deduplication feature #184

Merged
merged 1 commit into from
Sep 28, 2023
Merged

Add certificates deduplication feature #184

merged 1 commit into from
Sep 28, 2023

Conversation

arsenalzp
Copy link
Contributor

This PR is related to the issue de-duplicate CA from the trust bundle #155
It add feature of certificates de-duplication, meaning to exclude certificates which were already added to a certificates bundle, thus save memory.

@jetstack-bot jetstack-bot added dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 23, 2023
@jetstack-bot
Copy link
Contributor

Hi @arsenalzp. Thanks for your PR.

I'm waiting for a cert-manager member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 23, 2023
@arsenalzp arsenalzp changed the title add certificates deduplication feature Add certificates deduplication feature Sep 23, 2023
Copy link
Contributor

@erikgb erikgb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenalzp, thanks for this! This is almost LGTM from me. I have just a few comments/questions.

pkg/util/pem.go Outdated
@@ -120,3 +125,19 @@ func DecodeX509CertificateChainBytes(certBytes []byte) ([]*x509.Certificate, err

return certs, nil
}

// Check if the given certificate was added to certificates bundle already
func IsCertificateDuplicate(certsHash map[[32]byte]struct{}, certData []byte) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about creating a type for this function? Maybe it could be named certPool? 😉 With an appendCertFromPEM func, the type could also hold the certificates returned from the outer function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd support this as a new type, ideally in internal if possible so nobody else relies on it!

Copy link
Contributor Author

@arsenalzp arsenalzp Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added CertPool type type with related methods.
It's not possible to put that structure into internal directory, due to limitation of importing packages which are located in internal sub-directories.

pkg/util/pem.go Outdated
// Check if the given certificate was added to certificates bundle already
func IsCertificateDuplicate(certsHash map[[32]byte]struct{}, certData []byte) bool {
// calculate hash sum of the given certificate
hash := sha256.Sum256(certData)
Copy link
Contributor

@erikgb erikgb Sep 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that x509.CertPool uses sha256.Sum224 for deduplicating certs. Any particular reason you went for something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that CertPool uses sha224! I think sha256 is fine here though, it's what I'd have reached for as a "default"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello,
Thank you so much for your feedback, it is very appreciated!
There weren't any reasons to use sha224 against sha256, they have almost equal computation complexity, bus some differences have present:

https://www.rfc-editor.org/rfc/rfc3874.txt

1.1. Usage Considerations

Since SHA-224 is based on SHA-256, roughly the same amount of effort
is consumed to compute a SHA-224 or a SHA-256 digest message digest
value. Even though SHA-224 and SHA-256 have roughly equivalent
computational complexity, SHA-224 is an appropriate choice for a
one-way hash function that provides 112 bits of security. The use of
a different initial value ensures that a truncated SHA-256 message
digest value cannot be mistaken for a SHA-224 message digest value
computed on the same data.

Some usage environments are sensitive to every octet that is
transmitted. In these cases, the smaller (by 4 octets) message
digest value provided by SHA-224 is important.

These observations lead to the following guidance:

  • When selecting a suite of cryptographic algorithms that all offer
    112 bits of security strength, SHA-224 is an appropriate choice
    for one-way hash function.

  • When terseness is not a selection criteria, the use of SHA-256 is
    a preferred alternative to SHA-224.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our purposes we have to choice algorithm which proposes:

  • acceptable computation complexity and
  • minimize hash collisions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with sha256. 😃

)

// CertPool is a set of certificates.
type CertPool struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref. #184 (comment) I think this type should be defined in an internal package. Or as an unexported type in the package it's used. We mustn't tempt anyone to use it. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good!
Let's do it!

Copy link
Contributor Author

@arsenalzp arsenalzp Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CertPool and its methods were made unexported, so certPool is used by util package only

pkg/util/pem.go Outdated
@@ -87,7 +88,10 @@ func ValidateAndSplitPEMBundle(data []byte) ([][]byte, error) {
return nil, fmt.Errorf("invalid PEM block in bundle; invalid PEM certificate: %w", err)
}

certificates = append(certificates, pem.EncodeToMemory(b))
// check the duplicate certificates
if !certPool.IsCertificateDuplicate(b.Bytes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes missing here? I think the idea was to use the new certPool type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sure! I didn't recognize your requirement properly!
I'm going to fix the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes missing here? I think the idea was to use the new certPool type.

If function ValidateAndSplitPEMBundle returns certPool type instead of [][]byte then some tests will fail, due to their dependencies on that function' result, so I leave it as is.
However, I decided to move operation with certPool to ValidateAndSanitizePEMBundle function.
Could you please review it?

Copy link
Contributor

@erikgb erikgb Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you just return certPool.getCertsPEM() from ValidateAndSplitPEMBundle? Am I missing something? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, in that case we can! I thought you wanted to return certPool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you just return certPool.getCertsPEM() from ValidateAndSplitPEMBundle? Am I missing something? 🤔

I fixed a code as you required, all tests were passed.
Could you please review?

Copy link
Contributor

@erikgb erikgb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, thanks for addressing the review comments. I have just a question about some commented code.

/ok-to-test

// // p is nil and the whole of the input is returned in rest.
// if len(PEMdata) != 0 {
// return fmt.Errorf("error of decoding PEM block")
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the out-commented code intentional? I am not very familiar with this PEM decoding, but if this code is not worth having, I think we should delete it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the out-commented code intentional? I am not very familiar with this PEM decoding, but if this code is not worth having, I think we should delete it.

Yes of course, I will fix it.
Thank you so much for your help, hints and patience, it is very appreciated!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the out-commented code intentional? I am not very familiar with this PEM decoding, but if this code is not worth having, I think we should delete it.

Hello,
Comments were trimmed as you requested.

@jetstack-bot jetstack-bot added ok-to-test and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 26, 2023
@erikgb
Copy link
Contributor

erikgb commented Sep 26, 2023

/retest

1 similar comment
@erikgb
Copy link
Contributor

erikgb commented Sep 26, 2023

/retest

Copy link
Contributor

@erikgb erikgb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/cc @SgtCoDFish

Thanks @arsenalzp! With this new certPool type I think this can be improved even further in a follow-up PR.

Maybe you can rebase and squash commits?

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2023
Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

add certPool structure and add related functions

Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

add header to comply with code format

Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

substitute IsCertificateDuplicate func to CertPool method

Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

convert CertPool and its mothods to unexported ones

Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

move certPool functionality to ValidateAndSplitPEMBundle func

Signed-off-by: Oleksandr Krutko <alexander.krutko@gmail.com>

Update cert_pool.go

Trim comments in the certPool code.

Signed-off-by: Oleksandr Krutko <46520164+arsenalzp@users.noreply.github.com>
@jetstack-bot jetstack-bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2023
@arsenalzp
Copy link
Contributor Author

/lgtm /cc @SgtCoDFish

Thanks @arsenalzp! With this new certPool type I think this can be improved even further in a follow-up PR.

Maybe you can rebase and squash commits?

It was done.

@erikgb
Copy link
Contributor

erikgb commented Sep 28, 2023

/retest
/lgtm

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Sep 28, 2023
Copy link
Member

@SgtCoDFish SgtCoDFish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Thanks so much @arsenalzp (and @erikgb for reviewing)! 😁

@jetstack-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: erikgb, SgtCoDFish

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jetstack-bot jetstack-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 28, 2023
@jetstack-bot jetstack-bot merged commit 34b2c0f into cert-manager:main Sep 28, 2023
4 checks passed
@arsenalzp
Copy link
Contributor Author

/lgtm /approve

Thanks so much @arsenalzp (and @erikgb for reviewing)! 😁

Thank you for your assistance!

@arsenalzp arsenalzp deleted the addCertsDedup branch November 6, 2023 18:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. lgtm Indicates that a PR is ready to be merged. ok-to-test size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants