Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Compressing a Session #523

Closed
wants to merge 15 commits into from

Conversation

NickMeves
Copy link
Member

@NickMeves NickMeves commented May 6, 2020

Support optional compression strategy on the session store -- using MessagePack encoding, LZ4 compression (for fast decompression behavior compared to alternative compression libraries), and changes to scope of encrypting data.

UPDATE: After discussion, this also now moves to encoding all sessions in msgpack (which can hold encrypted data without base64 encoding bloat) -- even non compressed sessions. In my cookie session tests, just the messagepack/encryption shift without compression results in a cookie size decrease from ~5000 => ~3500 bytes.

The compression flag now just adds a light layer of LZ4 in the middle of the messagePack encoding/decoding process.

Encryption on sessions in redis updated to use AES-GCM (rather than legacy double AES-CFB)

Description

Behind a --session-store-compression flag:

  • [REMOVED - POC IN EARLIER COMMITS] Opportunistically base64 decode the 3 tokens (if they can be)
  • Store the encoded session in MessagePack (instead of JSON since it allows binary data)
  • [ENABLED BY FLAG] LZ4 compress the MessagePack encoded session
  • Encrypt the whole payload now (rather than for each sub field like before with a base64 encoding to be a valid string for JSON).
  • Wrap message in new lightweight SessionEnvelope struct (allows easier detection of changed session cookie types & legacy coming in).
  • Now Base64 the session for the cookie with encryption.SignedValue. This is the only Base64 encoding now.

Some rough numbers after testing various sessions with a generic OIDC provider:
Current _oauth2_proxy cookie size: ~5000 bytes
Compressed: ~2700 bytes
Compressed (without LZ4 compression, just base64 tricks + messagePack): ~3500 bytes

Motivation and Context

Current session cookies are sometimes bloated (hence the cookie splitting issue). With the current implementation, there's the potential for access/id/refresh tokens to have 3 levels of base64 encoding bloat (adding 33% (my math right?) size increase each time) when 1 is potentially needed for the final cookie.

#482
#522

How Has This Been Tested?

Unit tests added for all new code, aimed to only add new functionality if this feature flag is enabled.
Tested interactively locally on MacOS

Checklist:

  • My change requires a change to the documentation or CHANGELOG.
  • I have updated the documentation/CHANGELOG accordingly.
  • I have created a feature (non-master) branch for my PR.

Nick Meves added 5 commits May 5, 2020 17:30
These will take in []byte and not automatically
Base64 encode/decode.
Make signedValue & Validate operate on []byte
by default and not assume/cast string. Any casting
will be done from callers.
This way, future session strategy changes can be more
easily made (or client shifts in session store) without a
hard cut between formats. Use MsgPack so binary data can be
stored (fixing the double Base64 encode issues)
pkg/sessions/utils/envelope/constants.go Outdated Show resolved Hide resolved
pkg/sessions/utils/envelope.go Outdated Show resolved Hide resolved
}

// Encrypt with AES CFB on raw bytes
func (c *Cipher) EncryptCFB(value []byte) ([]byte, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are split out now from the legacy Encrypt and Decrypt to not be forced to do a base64 encoding in all cases.

This would be where we add EncryptGCM support if desired to speed up the encryption/decryption layer.

return []byte{}, err
}

// The Compress:Decompress ratio is 1:Many. LZ4 gives fastest decompress speeds
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LZ4 Compression here is the layer that potentially adds the most CPU & speed concerns. LZ4 chosen for its dominance in decompression throughput compared to other decryption libraries (https://github.com/lz4/lz4#benchmarks)

Without LZ4 (but with other tricks), test sessions reduced from ~5000 bytes to ~3500 bytes

With LZ4, reduced further to about ~2700 bytes. No significant difference in our payloads between level 0 compression vs level 9, so left as 0 (fastest)

@NickMeves
Copy link
Member Author

NickMeves commented May 6, 2020

This is my first time doing significant coding in Golang in a while, so feel free to be vicious in comments where I've made egregious style faux pas 😄

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @NickMeves, thanks for raising this PR. I think the concepts in general are a good idea and I would love to see at least some of this getting merged.

I am wondering now, having read about half of this, whether this could potentially be broken down somehow into smaller chunks and introduced in steps? WDYT?

Introducing msgpack and compression seems good to me, though I'm not a fan of modifying the Access, ID and Refresh tokens that we receive from ID Providers. We support many different providers and there's no guarantee that we would be able to do this in a reversible way so I think I would feel better if we never modified these once we have received them. Does that make the compression not worth it, I'm not sure, how do you feel given that?

@@ -101,6 +101,7 @@ An example [oauth2-proxy.cfg]({{ site.gitweb }}/contrib/oauth2-proxy.cfg.example
| `-resource` | string | The resource that is protected (Azure AD only) | |
| `-reverse-proxy` | bool | are we running behind a reverse proxy, controls whether headers like X-Real-Ip are accepted | false |
| `-scope` | string | OAuth scope specification | |
| `-session-store-compression` | bool | Compress session store with LZ4 (cookie session store only) | false |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should be cookie-compression instead? Why is it only supported by cookie stores? This seems odd

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured this would be as useful option for all stores as well -- I scoped my work to the cookie session as a first step since that is what we are using in our environment.

I didn't have a redis session store handy to test those changes. I can give that a stab in another PR? Or this one (assuming we like the SessionEnvelope msgpack encoding wrapper to give details about the session coming in).

Cipher *encryption.Cipher `cfg:",internal"`
Redis RedisStoreOptions `cfg:",squash"`
Type string `flag:"session-store-type" cfg:"session_store_type" env:"OAUTH2_PROXY_SESSION_STORE_TYPE"`
CompressedSession bool `flag:"session-store-compression" cfg:"session_store_compression" env:"OAUTH2_PROXY_SESSION_STORE_COMPRESSION"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this would be better named CompressSession, wdyt? Also, Is this the right place, if we decide this is cookie compression then it should be in the CookieOptions, TBD

Suggested change
CompressedSession bool `flag:"session-store-compression" cfg:"session_store_compression" env:"OAUTH2_PROXY_SESSION_STORE_COMPRESSION"`
CompressSession bool `flag:"session-store-compression" cfg:"session_store_compression" env:"OAUTH2_PROXY_SESSION_STORE_COMPRESSION"`

Comment on lines 299 to 300
// We assume if a token successfully base64 decodes, we can encode back later
// Otherwise we leave the original token
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of decoding these tokens. I appreciate it can save some date (about 33% per token), but it can see this potentially causing issues. For example, ID Tokens are normally JWTs, which are three Base64 encoded strings separated by periods, which wouldn't decode well.

@NickMeves
Copy link
Member Author

Hey @NickMeves, thanks for raising this PR. I think the concepts in general are a good idea and I would love to see at least some of this getting merged.

I am wondering now, having read about half of this, whether this could potentially be broken down somehow into smaller chunks and introduced in steps? WDYT?

Introducing msgpack and compression seems good to me, though I'm not a fan of modifying the Access, ID and Refresh tokens that we receive from ID Providers. We support many different providers and there's no guarantee that we would be able to do this in a reversible way so I think I would feel better if we never modified these once we have received them. Does that make the compression not worth it, I'm not sure, how do you feel given that?

Ack! Forget about the . breaking up the 3 JWT fields, good point. No wonder I wasn't getting the compression ratio I had hoped with LZ4 (I had intended it to see the similar claim names & values inside of the JWTs with our session user/email/expires). My test was likely only getting compression savings on the JWT headers since that Base64 likely matched.

I tried to make the decode savings attempt only work if the operation succeeded (meaning the original value could be returned from a base64 encode no problem). Any idea the various token type formats you get?

I'd like to try to get a good reliable/scalable way to dig into these -- this is probably the key to massive compression space savings.

Otherwise we should ditch the compression idea and just do msgpack encoding -- that at least saves the middle base64 encode on encrypted tokens in JSON. But that space savings likely isn't enough in alot of cases to prevent cookie splitting.

@JoelSpeed
Copy link
Member

I tried to make the decode savings attempt only work if the operation succeeded (meaning the original value could be returned from a base64 encode no problem). Any idea the various token type formats you get?

I'm not really sure what they are to be honest, we have a lot of providers and they don't all follow specs accurately

Otherwise we should ditch the compression idea and just do msgpack encoding -- that at least saves the middle base64 encode on encrypted tokens in JSON. But that space savings likely isn't enough in alot of cases to prevent cookie splitting.

This seems like a good start at least, also the encrypting the whole thing rather than individually may make some difference I would have thought too? WDYT?

@JoelSpeed
Copy link
Member

I'm also not too concerned about the cookie splitting issue. We are going to deprecate it and try and add more session storage options for people. Hopefully it doesn't affect that many people, as far as I'm aware, it's normally just the Azure users that it affects

@NickMeves
Copy link
Member Author

@JoelSpeed This commit has an idea of splitting the JWTs for better compression (haven't written unit tests yet in case we decide to trash this idea).

ccb59e3

I was underwhelmed with the space savings -- brought me from ~2700 bytes trying to compress raw base64 encoded JWTs down to ~2300 bytes. Very meh.

So maybe we can ditch this potentially unsafe strategy and rely on the slight benefit of LZ4 catching very similar JWT headers that are the same B64 encoded text?

@NickMeves
Copy link
Member Author

I'm also not too concerned about the cookie splitting issue. We are going to deprecate it and try and add more session storage options for people. Hopefully it doesn't affect that many people, as far as I'm aware, it's normally just the Azure users that it affects

At least for us using the generic OIDC provider hooking into OneLogin, our access + id + refresh tokens are taking us over the cookie limit so we are getting split (like I said, about ~5000 chracters).

This potentially gets larger when we add in custom claims to pass down the OneLogin Roles in the IDToken.

Additionally, move to AES-GCM encryption for sessions
in Redis since they each have distinct keys.
@NickMeves
Copy link
Member Author

I ditched the JWT introspection since the savings weren't so great.

I've rolled out the concept of a StoreEnvelope wrapping the session store details (encryption algorithm, store type, if compressed) to both cookie & redis -- I also took a stab at allowing compressed sessions to work in Redis.

I've additionally shifted the underlying encryption in Redis to AES-GCM (definitely safe, since each one has a distinct secret).

// Holdover behavior from Legacy decode
// NOTE: this makes decode NOT a 1:1 reversal of Encode
// TODO: Is this the best place for this logic?
if ss.User == "" {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the situtation with this? I don't like how it makes data != Decode(Encode(data)) in some cases.

Is this a legacy artifact? If not, does it make sense to assign the user to the email if the user is missing outside of the Decode function so it can remain the inverse of Encode?

@@ -105,16 +104,6 @@ func (p *ProviderData) GetLoginURL(redirectURI, state string) string {
return a.String()
}

// CookieForSession serializes a session state for storage in a cookie
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these used at all anymore? Or has their usage been replaced by the equivalent CookieForSession in utils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I believe they have pretty much been replaced

@NickMeves NickMeves changed the title Support Compressing a Cookie Session Support Compressing a Session May 7, 2020
store.CookieOptions,
expires,
now,
)
}

func (store *SessionStore) storeValue(ctx context.Context, value string, expiration time.Duration, requestCookie *http.Cookie) (string, error) {
func (store *SessionStore) storeValue(ctx context.Context, value []byte, expiration time.Duration, requestCookie *http.Cookie) (string, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method SessionStore.storeValue has 5 return statements (exceeds 4 allowed).

return session, nil
}

func (store *SessionStore) legacyV5LoadSessionFromTicket(ctx context.Context, value string) (*sessions.SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method SessionStore.legacyV5LoadSessionFromTicket has 5 return statements (exceeds 4 allowed).

@@ -141,7 +198,7 @@ func legacyDecodeSessionStatePlain(v string) (*SessionState, error) {

// legacyDecodeSessionState attempts to decode the session state string
// generated by v3.1.0 or older
func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
func legacyV3DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function legacyV3DecodeSessionState has 6 return statements (exceeds 4 allowed).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's time to drop this? It's been a while, people should have upgraded by now. I might raise a separate PR for that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a note in this PR: #524 (comment)

The moment we trigger a hard shift to SHA256 signed cookies instead of SHA1, we can ditch all Legacy code, since the old SHA1 cookies won't validate.

Is that a good time to drop all these legacy areas?

@@ -184,12 +241,12 @@ func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, er
}

// DecodeSessionState decodes the session cookie string into a SessionState
func DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
var ssj SessionStateJSON
func LegacyV5DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function LegacyV5DecodeSessionState has 6 return statements (exceeds 4 allowed).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a complicated function, maybe we could simplify this in another PR before we change all of this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code should be identical to existing (with some things changes to be named Legacy). The idea was this is the route to handle any existing JSON sessions that trickle in after the shift to MessagePack encoded versions.

// loadSessionFromString loads the session based on the ticket value
func (store *SessionStore) loadSessionFromString(ctx context.Context, value string) (*sessions.SessionState, error) {
// loadSessionFromTicket loads the session based on the ticket value
func (store *SessionStore) loadSessionFromTicket(ctx context.Context, value []byte) (*sessions.SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method SessionStore.loadSessionFromTicket has 8 return statements (exceeds 4 allowed).

}

// SessionFromCookie deserializes a session from a cookie value
func SessionFromCookie(v string, c *encryption.Cipher) (s *sessions.SessionState, err error) {
return sessions.DecodeSessionState(v, c)
func SessionFromCookie(v []byte, c *encryption.Cipher) (*sessions.SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function SessionFromCookie has 5 return statements (exceeds 4 allowed).

@@ -184,12 +241,12 @@ func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, er
}

// DecodeSessionState decodes the session cookie string into a SessionState
func DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
var ssj SessionStateJSON
func LegacyV5DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function LegacyV5DecodeSessionState has 65 lines of code (exceeds 50 allowed). Consider refactoring.

@@ -184,12 +241,12 @@ func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, er
}

// DecodeSessionState decodes the session cookie string into a SessionState
func DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
var ssj SessionStateJSON
func LegacyV5DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function LegacyV5DecodeSessionState has a Cognitive Complexity of 42 (exceeds 20 allowed). Consider refactoring.


"github.com/oauth2-proxy/oauth2-proxy/pkg/apis/sessions"
"github.com/oauth2-proxy/oauth2-proxy/pkg/encryption"
"github.com/oauth2-proxy/oauth2-proxy/pkg/sessions/envelope"
)

// CookieForSession serializes a session state for storage in a cookie
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this utils area by refactored and just move this logic directly into the cookie_session? That seems to be the only place that uses these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that makes sense

@codeclimate
Copy link

codeclimate bot commented May 8, 2020

Code Climate has analyzed commit 45e7a77 and detected 8 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 8

View more on Code Climate.

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had another quick pass and added some more comments. I'm still finding this quite hard to digest at the moment, for a project that only has 6000 lines of code, a 1200 line PR is pretty substantial.
Do you think there's any logical break points within the changes that could be changed into smaller PRs to make this easier to understand and introduce to the project?

Also, don't worry too much about the code climate thing, we just introduced this and are exploring what sort of things it recommends

Comment on lines 19 to 36
// MessagePack naming aligned to match with internal JWT claims for compression synergy
type SessionState struct {
AccessToken string `json:",omitempty"`
IDToken string `json:",omitempty"`
CreatedAt time.Time `json:"-"`
ExpiresOn time.Time `json:"-"`
RefreshToken string `json:",omitempty"`
Email string `json:",omitempty"`
User string `json:",omitempty"`
PreferredUsername string `json:",omitempty"`
AccessToken string `json:",omitempty" msgpack:"at,omitempty"`
IDToken string `json:",omitempty" msgpack:"it,omitempty"`
CreatedAt time.Time `json:"-" msgpack:"-"`
ExpiresOn time.Time `json:"-" msgpack:"-"`
RefreshToken string `json:",omitempty" msgpack:"rt,omitempty"`
Email string `json:",omitempty" msgpack:"e,omitempty"`
User string `json:",omitempty" msgpack:"u,omitempty"`
PreferredUsername string `json:",omitempty" msgpack:"pu,omitempty"`
}

// SessionStateJSON is used to encode SessionState into JSON without exposing time.Time zero value
type SessionStateJSON struct {
// SessionStateEncoded is used to encode SessionState into JSON/MessagePack
// without exposing time.Time zero value
type SessionStateEncoded struct {
*SessionState
CreatedAt *time.Time `json:",omitempty"`
ExpiresOn *time.Time `json:",omitempty"`
CreatedAt *time.Time `json:",omitempty" msgpack:"ca,omitempty"`
ExpiresOn *time.Time `json:",omitempty" msgpack:"eo,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise you didn't add it, but do you think we could merge these two? ie always have time pointers? Would that still work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gave this a whirl in this commit: 124eb5b

I didn't know why the special time logic -- but I left it intact to not reintroduce whatever bug it might've fixed in the past.

pkg/apis/sessions/session_state.go Outdated Show resolved Hide resolved
@@ -141,7 +198,7 @@ func legacyDecodeSessionStatePlain(v string) (*SessionState, error) {

// legacyDecodeSessionState attempts to decode the session state string
// generated by v3.1.0 or older
func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
func legacyV3DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's time to drop this? It's been a while, people should have upgraded by now. I might raise a separate PR for that

@@ -184,12 +241,12 @@ func legacyDecodeSessionState(v string, c *encryption.Cipher) (*SessionState, er
}

// DecodeSessionState decodes the session cookie string into a SessionState
func DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
var ssj SessionStateJSON
func LegacyV5DecodeSessionState(v string, c *encryption.Cipher) (*SessionState, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a complicated function, maybe we could simplify this in another PR before we change all of this

pkg/sessions/redis/redis_store.go Outdated Show resolved Hide resolved
@@ -128,9 +132,20 @@ func (store *SessionStore) Save(rw http.ResponseWriter, req *http.Request, s *se
return err
}

se := &envelope.StoreEnvelope{
Type: envelope.RedisType,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does knowing the type help?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought process here was a having this only added a few bytes to the MessagePack encoded message -- with the benefit being on decode operations as customers potentially switching between session store types.

Instead of trying to throw errors decoding the raw bytes in the data field when types are mismatched, use the Type header to know exactly how to decode a session that comes in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, SGTM


"github.com/oauth2-proxy/oauth2-proxy/pkg/apis/sessions"
"github.com/oauth2-proxy/oauth2-proxy/pkg/encryption"
"github.com/oauth2-proxy/oauth2-proxy/pkg/sessions/envelope"
)

// CookieForSession serializes a session state for storage in a cookie
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that makes sense

@@ -105,16 +104,6 @@ func (p *ProviderData) GetLoginURL(redirectURI, state string) string {
return a.String()
}

// CookieForSession serializes a session state for storage in a cookie
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I believe they have pretty much been replaced

@NickMeves
Copy link
Member Author

Had another quick pass and added some more comments. I'm still finding this quite hard to digest at the moment, for a project that only has 6000 lines of code, a 1200 line PR is pretty substantial.
Do you think there's any logical break points within the changes that could be changed into smaller PRs to make this easier to understand and introduce to the project?

Also, don't worry too much about the code climate thing, we just introduced this and are exploring what sort of things it recommends

I hear you 😄 -- I think after our conversations, this PR has 3 main changes that are slightly interrelated and benefit from rolling out together to limit needing to support legacy types during shifts:

(1) MessagePack + Optional LZ4 on sessions instead of JSON
(2) StoreEnvelope: wrapping encoded sessions to give context on how to decode the session easily that is future proofed
(3) Streamline the encryption of the session in the Redis store (it previously wasn't using the cipher encrypt functions and had odd IV selection) -- Use AES-GCM encryption since the encryption keys are unique per session which mitigates GCM IV + Secret reuse vulnerabilities.

(3) is the least related -- but it benefits rolling out with (1) & (2) since we can know the new format uses the improved crypto and don't need to juggle an inline AES-CFB that needs refactoring.

Nick Meves added 3 commits May 8, 2020 09:30
Align with legacy where missing cookie ciphers resulted
in even redis not saving tokens.
This object is available in SessionOptions.Cipher
and SessionOptions is already available in the
store.
Move cookieSession functions to cookie session store
& align the double implementation of SecretBytes to be
united and housed under encryption
@NickMeves
Copy link
Member Author

Something I'm weighing: for sets of options that result in a full session with tokens encoded (vs just user + email -- a nuance I missed earlier for when a Cipher is nil) -- with the new msgpack encoding, do we just want to default to these large session encodings with tokens to use LZ4 (and not be a flag)?

And leave minimal sessions that don't include tokens as uncompressed (since trying to compress payloads that small is worthless)?

Nick Meves added 3 commits May 8, 2020 10:48
Additionally, fix the session object modification bug
introduced by minimal encoding. Add unit tests to ensure
`Encode` doesn't mangle the original struct.
CookieSecret is only required to be a valid AES length if
used in encryption. This is now only the case for Cookie Stores.

Redis uses its own dynamic 32 byte secret for each ticket for
AES-GCM encryption in the store. So it never needs to run in minimal
mode and strip the tokens from the session anymore.

In Redis legacy fallback mode, it will only attempt a legacy session
load if the CookieSecret is a valid length -- since then it is required
to decode the legacy session style that had embedded AES-CFB encryption.
@JoelSpeed
Copy link
Member

@NickMeves please review #535 #536 and #537 as I believe we should try and merge these first, and then tackle this problem

@NickMeves
Copy link
Member Author

@JoelSpeed Agreed!

I actually think this PR should be broken up into three components after thinking on it more -- that should help ease the evaluation:

(1) Refactoring sessions to MessagePack (This actually gave the biggest size reduction of ~30%) -- We can work to align ideas here between this and #536 - I think they are very similar endeavors.

(2) Compressing sessions with LZ4 (This gave an additional ~25%)

(3) Switching redis session encryption to GCM (This was most unrelated part of this PR)

@JoelSpeed
Copy link
Member

(1) Refactoring sessions to MessagePack (This actually gave the biggest size reduction of ~30%) -- We can work to align ideas here between this and #536 - I think they are very similar endeavors.

(2) Compressing sessions with LZ4 (This gave an additional ~25%)

(3) Switching redis session encryption to GCM (This was most unrelated part of this PR)

(1) Yep, this makes the most sense to do and IMO should be highest priority. My intention with #536 was mainly to tidy up, as much as possible, what the decoding would look like, so that we have a neater "legacy" decode once we implement the MessagePack. Though, so future archaeologists can see the progression, it made sense to clean up the encode as well, even though it will theoretically not live for very long

(2) Yep, this also makes sense to add, but should be done after all the current work around this area is done

(3) Could/Should this be done in parallel? I don't know how important this part is

@NickMeves
Copy link
Member Author

@JoelSpeed I've split the various ideas in this PR up into these smaller ones:

#538
#539
grnhse#3 (Draft off #539)

Should we close this monolith and let its legacy live on in those smaller PRs?

@JoelSpeed
Copy link
Member

@NickMeves If you've got a PR or Draft PR open for each of the sections, then sure, let's close this one down as it won't be needed anymore right?

@loshz loshz removed their request for review May 12, 2020 14:36
@NickMeves NickMeves closed this May 14, 2020
@NickMeves NickMeves mentioned this pull request May 30, 2020
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants