Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encryption efficiency improvements #539

Merged

Conversation

NickMeves
Copy link
Member

@NickMeves NickMeves commented May 10, 2020

To support future session encoding efficiencies, Encrypt/Decrypt methods need to operate
on []byte and not strings (which requires a base64 encoding of the []byte ciphertext to make it a
valid string.

Encryption work related to #523

Description

This moves encryption.Cipher to be an Interface for Encrypt([]byte) ([]byte, error) & Decrypt([]byte) (byte[], error) -- This allows various encryption techniques to be implemented that are ideal for various use cases:

(1) AES CFB + Base64 (current implementation, all code continues to use with with string=>string adjusted to []byte=>[]byte to match the interface)
(2) AES CFB: Raw CFB without Base64. Ideal in the future for any binary data we want encrypted in a cookie (it is older, slower & unauthenticated -- but doesn't have IV + Secret reuse vulnerabilities)
(3) AES GCM. Implemented, unused. Ideal for future use in session overhaul for DB backed stores that encrypt sessions with unique per session secrets (gives advantage of being authenticated encryption that CFB doesn't have)

Motivation and Context

Base64 embedded in the Encrypt/Decrypt method to result in strings that could go in JSON results in session bloat contributing to the jumbo cookie & cookie splitting issues.

Additionally, upon future Redis session refactoring for binary encoded sessions, this can replace the existing reimplementation of crypto in the redis session store:

// Use secret as the Initialization Vector too, because each entry has it's own key

How Has This Been Tested?

Lots of new crypto related unit tests added & tested interactively locally

Checklist:

  • My change requires a change to the documentation or CHANGELOG.
  • I have updated the documentation/CHANGELOG accordingly.
  • I have created a feature (non-master) branch for my PR.

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've added a few comments especially in the tests. I focused on one of the new tests in particular but I think the comments apply across the board so please consider that. Thanks!

pkg/encryption/cipher.go Outdated Show resolved Hide resolved
pkg/encryption/cipher_test.go Outdated Show resolved Hide resolved
pkg/encryption/cipher_test.go Outdated Show resolved Hide resolved
pkg/encryption/cipher_test.go Outdated Show resolved Hide resolved
@loshz loshz removed their request for review May 10, 2020 14:04
@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 0f63bab to 03c397f Compare May 10, 2020 17:56
pkg/encryption/cipher.go Outdated Show resolved Hide resolved
pkg/encryption/cipher_test.go Outdated Show resolved Hide resolved
pkg/encryption/cipher_test.go Outdated Show resolved Hide resolved
Comment on lines 180 to 192
// Test various sizes sessions might be
for _, dataSize := range []int{10, 100, 1000, 5000, 10000} {
data := make([]byte, dataSize)
_, err := io.ReadFull(rand.Reader, data)
assert.Equal(t, nil, err)

encrypted, err := gcm.Encrypt(data)
assert.Equal(t, nil, err)
assert.NotEqual(t, encrypted, data)

decrypted, err := cfb.Decrypt(encrypted)
assert.Equal(t, nil, err)
// Data is mangled
assert.NotEqual(t, data, decrypted)
assert.NotEqual(t, encrypted, decrypted)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this loop repeated several times in this file? Could we extract it to a function maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the error scenario tests are different than the standard Encrypt/Decrypt tests. I merged the Standard vs Base64 wrapped tests into one, so that removed the duplication those two had.

@@ -0,0 +1,88 @@
package encryption
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this now duplicated with the cookies package?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely related to this PR cleaning that organization up: #538

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I see, this content has just been moved from cipher.go to this file, makes sense now

@NickMeves NickMeves mentioned this pull request May 10, 2020
3 tasks
@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from a032b99 to 94b88db Compare May 11, 2020 19:28
t.Run(cName, func(t *testing.T) {
// Test various sizes sessions might be
for _, dataSize := range []int{10, 100, 1000, 5000, 10000} {
t.Run(string(dataSize), func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this nesting is a bit much ... you could cut out a few levels by avoiding a t.Run() for every outer loop, and only use it for the inner-most portion, like:

	for name, initCipher := range cipherInits {
		// Test all 3 valid AES sizes
		for _, secretSize := range []int{16, 24, 32} {
			// ... prep/test secret key formats
			ciphers := map[string]Cipher{
				"Standard": cstd,
				"Base64":   cb64,
			}
			for cName, c := range ciphers {
				for _, dataSize := range []int{10, 100, 1000, 5000, 10000} {
					casename := fmt.Sprintf("%s_%d_%s_%d", name, secretSize, cName, dataSize)
					t.Run(casename, func(t *testing.T) {
						// ... actual test case here

or use some helper functions

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch 2 times, most recently from c8d5676 to 0dc7a61 Compare May 14, 2020 18:48
@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 0dc7a61 to 5d6f553 Compare May 24, 2020 17:50
@NickMeves
Copy link
Member Author

@JoelSpeed Since the primary purpose of this was to set the stage for an efficient refactor of sessions to message pack (and only improvement otherwise to current master would be code organization and more unit tests of crypto functionality):

Do we want to merge this into a feature branch on the repo (not on my fork) and have that as a landing ground for the few PRs related to session store optimizations (so they can all go together in one big bang after being reviewed separately)?

Or is there still merit for the code quality and testing purposes to get this merged into master alone?

@JoelSpeed
Copy link
Member

I think there is still merit to merging this in terms of code quality improvements and the added tests, could you please rebase

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 5d6f553 to 4a1a282 Compare May 25, 2020 15:06
@NickMeves
Copy link
Member Author

All set! 👍

Just an FYI, this commit makes the new version have slightly different behavior for future stability: 5fb4d08

This makes the decrypt operation no longer decrypt in-place into the given encrypted []byte -- but into a new []byte and leaving the original intact.

This was fine in the Base64 Decode + Decrypt merged functionality (because the decoded []byte was already a temp variable within the function). But now this leans toward the function not mutating the input bytes to the function (tests added to ensure this).

Just wanted to bring it up since that would be the one area that isn't a pure refactor on current master functionality with added tests.

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 34dc9e1 to b047ee4 Compare May 28, 2020 17:55
@NickMeves NickMeves mentioned this pull request May 30, 2020
3 tasks
Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickMeves I think this is looking good, but there are conflicts with master and the linting isn't passing, could you fix that up please, and add a changelog entry, then I'll give it another reivew

@NickMeves
Copy link
Member Author

Will do! Had it on my todo list today to rebase this off the session refactor changes that merged in. 👍

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from b047ee4 to f85a22e Compare June 1, 2020 23:16
}
// Backward compatibility with using unencrypted Email or User
// Decryption errors will leave original string
c.DecryptInto(&ss.Email)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoelSpeed So I think this (and all previous implementations) are problematic -- but I'm thinking its not worth fixing because the error is rare and the need to support this will go away once we stop allowing SHA1 signed session cookies:

Since AES-CFB isn't an authenticated encryption, there's likely a subset of values that will successfully Base64 decode & then AES decrypt into a valid string (where it just assumed the first bytes are the IV vs part of the data) -> and this won't raise an error (which would then leave the User/Email as-is assuming they were legacy unencrypted). And instead they will be mangled into garbage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if you could write these as _ := c.DecryptInto to make it more obvious that this can error, but we are deliberately ignoring it (in addition to the comment preferably)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NickMeves I'm not sure this problem is going to be so rare, have you seen the local test environment we added recently? I spun that up (make local-env-up), logged in (v5.1.1), built from master, updated the container, then I'm getting errors because the user is not a valid header value which I've tracked to being down to this function https://github.com/golang/net/blob/627f9648deb96c27737b83199d44bb5c1010cbcf/http/httpguts/httplex.go#L302

@NickMeves
Copy link
Member Author

Something I realized in finishing up this PR that likely is out of scope here and requires a new PR:

We now require a Cipher in our options validation, it can't be nil anymore. But we have TONs of unit tests that operate with a nil cipher still. (SessionStore, SessionState & Oauth2Proxy unit tests being the worst offenders)

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 631877a to 3b85b49 Compare June 1, 2020 23:33
Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had another pass of the updates, added a few comments

@@ -17,10 +17,14 @@ func timePtr(t time.Time) *time.Time {
return &t
}

func NewCipher(secret []byte) (encryption.Cipher, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be private, and given it's for testing purposes, how about newTestCipher so that we know it's meant only for testing?

}
// Backward compatibility with using unencrypted Email or User
// Decryption errors will leave original string
c.DecryptInto(&ss.Email)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if you could write these as _ := c.DecryptInto to make it more obvious that this can error, but we are deliberately ignoring it (in addition to the comment preferably)

}
}
return
type Base64Cipher struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if this and the other Ciphers should be private? Since we have constructors and an interface I don't think they need to be public, WDYT?

Copy link
Member Author

@NickMeves NickMeves Jun 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the upcoming session refactor that builds off of this, different use cases use different encryption ciphers that are best for various use cases:

CFB, GCM that operate on pure []byte and never have a string are used with the MessagePack encoding (this is the secret sauce that allows us not to have a base64 encoding adding 33% bloat to fit an encrypted value into JSON that requires a string).

For Cookie sessions, pure CFB (not base64 wrapped) is used on encrypting sessions. CFB is missing authentication (which GCM has) but isn't weak to IV reuse issues (which prevents us from using GCM comfortably in cookies in case some users never cycle the cookie secret). We get authentication of the cookie via the SHA256 HMAC on the cookie.

For the encryption layer in DB based session stores (currently just Redis), GCM is great because it is better in all aspects than CFB (except IV reuse), and it doesn't have the IV reuse issue since we make a different secret that is unique to each session.

Base64 wrapped versions are left available for backwards compatibility with legacy JSON based sessions that still exist.

Hence, I left all public and accessible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, lol, I think I follow your suggestion now 😄

Treat the above as me over explaining a question you didn't ask.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what's a good naming convention for the CFBCipher and GCMCipher to make the structs themselves private?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just go for cfbCipher and gcmCipher if that's ok with you?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol -- so obvious 🤦


// EncryptInto returns an error since the encrypted data is a []byte that isn't string cast-able
func (c *CFBCipher) EncryptInto(s *string) error {
return fmt.Errorf("CFBCipher is not a string->string compatible cipher")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand why this is the case? Does it not survive a round trip? Are you envisioning the EncryptInto/DecryptInto will go away as part of the session state refactoring?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the Base64 wrapped version are ensured to get a string & return a string that can use the EncryptInto functionality.

The output of Encrypt from CFB/GCM pure versions will result in []byte that doesn't cast to a string to put back into the s *string

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And for the second question -- yes, they won't be needed for encrypting the MessagePack encoding (compressed or not, []byte in both cases). So the pure CFB/GCM []byte -> []byte functions can be used (and we can potentially alter them to encrypt/decrypt in place if we deem that safe -- currently they are coded to treat the input as immutable).

Would only be needed to decode legacy sessions.

@@ -0,0 +1,88 @@
package encryption
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I see, this content has just been moved from cipher.go to this file, makes sense now

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 3b85b49 to 1313841 Compare June 2, 2020 19:19
@NickMeves
Copy link
Member Author

Curious your thoughts on this move: ba1816c

The EncryptInto/DecryptInto are only really valid for the Base64 Cipher, and not for pure ones. And it would only be used in the SessionState --> and in a post session_state MessagePack refactor world, only to support legacy JSON inner field encrypted sessions.

With that in mind, does it make sense to slide those method away from the Cipher interface to just a SessionState helper method?

@JoelSpeed
Copy link
Member

Curious your thoughts on this move: ba1816c

The EncryptInto/DecryptInto are only really valid for the Base64 Cipher, and not for pure ones. And it would only be used in the SessionState --> and in a post session_state MessagePack refactor world, only to support legacy JSON inner field encrypted sessions.

With that in mind, does it make sense to slide those method away from the Cipher interface to just a SessionState helper method?

This makes sense to me yes, happy to keep that in.

Is this waiting for review from me, or is there anything more to be done?

@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from ba1816c to 998dc9a Compare June 8, 2020 02:08
@NickMeves
Copy link
Member Author

Curious your thoughts on this move: ba1816c
The EncryptInto/DecryptInto are only really valid for the Base64 Cipher, and not for pure ones. And it would only be used in the SessionState --> and in a post session_state MessagePack refactor world, only to support legacy JSON inner field encrypted sessions.
With that in mind, does it make sense to slide those method away from the Cipher interface to just a SessionState helper method?

This makes sense to me yes, happy to keep that in.

Is this waiting for review from me, or is there anything more to be done?

Just rebased off latest master. All set to go from my perspective.

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small thing I'd appreciate your thoughts on @NickMeves and @steakunderscore, then I'm happy for this one to get merged

Thanks for your work on it Nick!

Comment on lines 105 to 108
// Backward compatibility with using unencrypted Email or User
// Decryption errors will leave original string
_ = into(&ss.Email, c.Decrypt)
_ = into(&ss.User, c.Decrypt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we'd actually like to do is drop the backward compatibility, we discussed this in #601, basically, if we have an issue decrypting, or if the decrypted result is not valid UTF 8, then we believe that the session was previously unencrypted and decided that we should return an error, to force users to re-authenticate in this case.

So what we could do here, is just move Email and User into the range statement below I think, would have the same sort of effect right?

CC @steakunderscore can you also confirm my logic and this is what we had agreed upon

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that #601 has merged I'll rebase and incorporate those changes here 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aaaaaand rebased. I think I captured the spirit of #601 with how we check the errors via the into methods

These will take in []byte and not automatically
Base64 encode/decode.
Make signedValue & Validate operate on []byte
by default and not assume/cast string. Any casting
will be done from callers.
All Encrypt/Decrypt Cipher implementations will now take
and return []byte to set up usage in future binary compatible
encoding schemes to fix issues with bloat encrypting to strings
(which requires base64ing adding 33% size)
During the upcoming encoded session refactor, AES GCM is ideal
to use as the Redis (and other DB like stores) encryption wrapper
around the session because each session is encrypted with a
distinct secret that is passed by the session ticket.
Have it take in a cipher init function as an argument.
Remove the confusing `newCipher` method that matched legacy behavior
and returns a Base64Cipher(CFBCipher) -- instead explicitly ask for
that in the uses.
This helper method is only applicable for Base64 wrapped
encryption since it operated on string -> string primarily.
It wouldn't be used for pure CFB/GCM ciphers. After a messagePack
session refactor, this method would further only be used for
legacy session compatibility - making its placement in cipher.go
not ideal.
@NickMeves NickMeves force-pushed the encryption-efficiency-improvements branch from 998dc9a to 1979627 Compare June 12, 2020 21:47
Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @NickMeves

@JoelSpeed JoelSpeed merged commit a197a17 into oauth2-proxy:master Jun 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants