Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New token generation, support for custom hashing algorithms #1753

Merged
merged 15 commits into from
Jun 13, 2018

Conversation

buger
Copy link
Member

@buger buger commented Jun 2, 2018

Added new scheme of generating tokens, now it is JSON base64 objects, similar to how JWT works, which allows us add any meta information to the tokens.

Example of new token:
eyJvcmciOiIiLCJpZCI6IjAxZGIwN2Q5NWQ0MDRjYjM5ODg3ZjUwN2ZmNTg0OGUzIiwiaCI6Im11cm11cjEyOCJ9. After based64 decoding it turns to: {"org":"","id":"01db07d95d404cb39887f507ff5848e3","h":"murmur128”}.

Current token has 3 fields at the moment:

  • “org” - organization id
  • “id” - token id, as we had before. For regular tokens random uuid, but for example for JWTs or OpenID based on sub.
  • “h” - token hashing algorithm

At the moment following algorithms are supported: murmur32, murmur64, murmur128, sha256.

You can set hashing algorithm to use via new config variable: hash_key_function.

If config value not set, it generates legacy tokens like we had before. We may consider switching it to murmur64 by default in next releases, once feature will stabilize.

You can change hashing function dynamically, and have tokens with multiple hashing algos, including handling of legacy tokens. So if you turn on new key support, you old legacy tokens will continue working. Same if you decided to switch different algorithm or turn this off.

Fix #1694

@buger buger requested review from lonelycode and dencoded June 2, 2018 15:42
@buger
Copy link
Member Author

buger commented Jun 2, 2018

@lonelycode Worth noticing that I forked https://github.com/spaolacci/murmur3 to our org, and update d murmur32 hashing implementation to the version we had vendored. So all other algos like murmur128 got all the latest code and fixes, and murmur32 legacy support. github.com/aviddiviner/go-murmur mentioned in your proposal is older, support only murmur2 (and what we have already murmur3), and do not support 128 bit function.

@lonelycode
Copy link
Member

Awesome! Is there a way we can check the append-three character issue has been solved?

@buger
Copy link
Member Author

buger commented Jun 4, 2018

@lonelycode Yes. Added tests showing both bug and fix itself, in various combinations: 3d25da1#diff-092a80b1b0fbc9a08fef31ca557e51a2R20

api.go Outdated
@@ -235,7 +235,7 @@ func handleAddOrUpdate(keyName string, r *http.Request) (interface{}, int) {
// Only if it's NEW
switch r.Method {
case "POST":
keyName = newSession.OrgID + keyName
keyName = generateToken(newSession.OrgID, strings.TrimPrefix(keyName, newSession.OrgID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about hiding TrimPrefix-logic inside generateToken helper? so calling part won't have to remember to do this

auth_manager.go Outdated
token, err := storage.GenerateToken(orgID, keyID, config.Global().HashKeyFunction)

if err != nil {
log.WithFields(logrus.Fields{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could also do .WithError(err).WithFields so error will appear as special error-field in log entry

// If hashing algorithm is empty, use legacy key generation
func GenerateToken(orgID, keyID, hashAlgorithm string) (string, error) {
if keyID == "" {
keyID = strings.Replace(uuid.NewV4().String(), "-", "", -1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not efficient to get canonical string rep of UUID and then remove dashes right away using string search/replace operation, maybe it is OK as long as we don't issue tons of tokens per sec

}

if hashAlgorithm != "" {
_, err := hashFunction(hashAlgorithm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a little bit of extra work here - we are allocating an instance of hash (which is runtime sys-call and takes extra memory for a bit) just to test if token needs to have a default hash algo, then we ignore allocated space which give some extra work for GC. I am not sure compiler can optimize this.

also I like more readable form if _, err := hashFunction(hashAlgorithm); err != nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we could introduce token as not just string but 1st class citizen in our code base as a struct, so we could put here in its fields hash algo name, Hasher func for later use in case we need to hash it (and do hash only once if it is not laready done), json and and base64 encoded representation of this token as fields as well

hashAlgorithm = defaultHashAlgorithm
}

jsonToken := fmt.Sprintf(`{"org":"%s","id":"%s","h":"%s"}`, orgID, keyID, hashAlgorithm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might worth to use string concatenation instead of Sprintf here


func hashFunction(algorithm string) (hash.Hash, error) {
switch algorithm {
case "sha256":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might worth to have constants for this values and expose them as public from storage package

func HashStr(in string) string {
h := murmur3.New32()
h, _ := hashFunction(TokenHashAlgo(in))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method HashStr is heavily used when key hashing enabled, correct me if I am wrong - we always do base64.StdEncoding.EncodeToString when we generateToken from key, org ID and then have to do base64.StdEncoding.DecodeString right away to get from storage or update from storage. this base64 encoding/decoding seems to be extra work to me

Copy link
Contributor

@dencoded dencoded left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments, mostly concerns about performance

@buger
Copy link
Member Author

buger commented Jun 6, 2018

Frankly, I would not care at all about key creation performance, since it does not need to be fast.
But note about HashStr every time using b64 is true, however, I'm not sure what we can do about it. I already added check which checks for legacy tokens without b64 decoding, but for new keys it probably will be required. Enabling caching for tokens seems like over-engineering here. b64 in general very fast, and we avoid full JSON parsing too. Let me add some benchmark results here, on how it affects basic request flow.

@buger
Copy link
Member Author

buger commented Jun 11, 2018

@dencoded fixed code styling issues. Regarding performance, as mentioned above, I would prefer not spending too much time on key creation process (if it looks good from code point of view), since this is quite rarely used functionality.

@dencoded
Copy link
Contributor

@buger I think key creation is fine. my concern was that we constantly do generateTokenand pack it into base64 encoded json, in many places - i.e. here https://github.com/TykTechnologies/tyk/pull/1753/files#diff-f0cc9b9790a59495abbf821c26cc986dR228 and then we do reverse operation here https://github.com/TykTechnologies/tyk/pull/1753/files#diff-4c166b743d95629a9ab37eb6fe3df22aR305 right away (if key hashing is enabled)

@buger
Copy link
Member Author

buger commented Jun 11, 2018

That's true but frankly, I can't see a more efficient way to encode custom metadata to token itself. I'm open to suggestions.

buger added 12 commits June 11, 2018 20:23
Added new scheme of generating tokens, now it is JSON base64 objects,
similar to how JWT works, which allows us add any meta information to
the tokens.

Example of new token:
`eyJvcmciOiIiLCJpZCI6IjAxZGIwN2Q5NWQ0MDRjYjM5ODg3ZjUwN2ZmNTg0OGUzIiwiaCI
6Im11cm11cjEyOCJ9`. After based64 decoding it turns to:
`{"org":"","id":"01db07d95d404cb39887f507ff5848e3","h":"murmur128”}`.

Current token has 3 fields at the moment:
- “org” - organization id
- “id” - token id, as we had before. For regular tokens random uuid,
but for example for JWTs or OpenID based on sub.
- “h” - token hashing algorithm

At the moment following algorithms are supported: “murmur32”,
“murmur64”, “murmur128”, “sha256”.

You can set hashing algorithm to use via new config variable:
“hash_key_function”.

If config value not set, it generate legacy tokens like we had before.
We may consider switching it to “murmur64” by default in next releases,
once feature will stabilize.

You can change hashing function dynamically, and have tokens with
multiple hashing algos, including handling of legacy tokens. So if you
turn on new hashing algo support, you old tokens will continue working,
same if you decided to switch different algorithm or turn this off.
Copy link
Contributor

@dencoded dencoded left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buger I was thinking that generateToken could return struct with pre-populated (only once) fields but it can be lots of changes in your PR. It is up to you - think current approach works, it just has a little bit extra work here. We can merge and optimize later if needed.

@buger
Copy link
Member Author

buger commented Jun 12, 2018

Ah, I see what you mean now, we can definitely hold token hash somewhere.

@buger
Copy link
Member Author

buger commented Jun 13, 2018

@dencoded change I made in #1770 should resolve the situation with key hashing called multiple times.

@buger buger merged commit 90fab73 into master Jun 13, 2018
@buger buger deleted the key-hashing-v2 branch June 13, 2018 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants