Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PKCE code verifier doesn't meet rfc7636 spec #1501

Closed
1 of 5 tasks
schalk-b opened this issue Apr 15, 2020 · 10 comments
Closed
1 of 5 tasks

PKCE code verifier doesn't meet rfc7636 spec #1501

schalk-b opened this issue Apr 15, 2020 · 10 comments
Assignees
Labels
enhancement Enhancement to an existing feature or behavior. msal-browser Related to msal-browser package

Comments

@schalk-b
Copy link

schalk-b commented Apr 15, 2020

Library

  • msal@1.2.1 or @azure/msal@1.2.1
  • @azure/msal-browser@2.x.x
  • @azure/msal-angular@0.x.x
  • @azure/msal-angular@1.x.x
  • @azure/msal-angularjs@1.x.x

Description

Shouldn't the PKCE code verifier meet the rfc7636 spec?

Specifically in the spec it mentions

code_verifier = high-entropy cryptographic random STRING using the
unreserved characters [A-Z] / [a-z] / [0-9] / "-" / "." / "_" / "~"
from Section 2.3 of [RFC3986], with a minimum length of 43 characters
and a maximum length of 128 characters.

However, reviewing the code #1141 for the PkceGenerator I noticed a few things.

  1. The code verifier is always 32 characters long (instead of the required minimum of 43).
  2. The code verifier characters aren't uniformly random.
/**
  * Generates a character string based on input array.
  * @param buffer 
  */
 private bufferToCVString(buffer: Uint8Array): string {
     const charArr = [];
     for (let i = 0; i < buffer.byteLength; i += 1) {
         const index = buffer[i] % CV_CHARSET.length;
         charArr.push(CV_CHARSET[index]);
     }
     return charArr.join("");
 }

Because we are using modulus buffer[i] % CV_CHARSET.length it means from our character set, all characters before '6' have a 4/256 chance of showing, '6' and after, have a 3/256 change of showing.

I know that the current implementation doesn't pose a 'serious' security risk. But I think it's worth pointing out that there is room for improvement.

@schalk-b schalk-b added the question Customer is asking for a clarification, use case or information. label Apr 15, 2020
@schalk-b
Copy link
Author

I would be happy to take this one on and submit a PR if you want me to?

It's just a matter of implementing a binary encoder for the character set in the spec: [A-Z] / [a-z] / [0-9] / "-" / "." / "_" / "~"

@pkanher617
Copy link
Contributor

pkanher617 commented Apr 16, 2020

@schalk-b Sorry for missing this post. I would be happy to review a PR that you have submitted. Please give me some time today to review the PKCE generator code and I can get back to you with a more informed answer to your original question.

@pkanher617
Copy link
Contributor

pkanher617 commented Apr 16, 2020

@schalk-b Thanks again for opening this issue.

As far as I can see, the lengths of the challenge and verifier that get generated are 43 characters. I used tests in the browser package as well as reviewed the network traces in different browsers.

it.only("generatePkceCode()", async () => {
        sinon.stub(BrowserCrypto.prototype, <any>"getSubtleCryptoDigest").callsFake(async (algorithm: string, data: Uint8Array): Promise<ArrayBuffer> => {
            expect(algorithm).to.be.eq("SHA-256");
            return crypto.createHash("SHA256").update(Buffer.from(data)).digest();
        });

        /**
         * Contains alphanumeric, dash '-', underscore '_', plus '+', or slash '/' with length of 43.
         */
        const regExp = new RegExp("[A-Za-z0-9-_+/]{43}");
        const generatedCodes: PkceCodes = await cryptoObj.generatePkceCodes();
        console.log(generatedCodes);
        console.log(generatedCodes.challenge.length);
        console.log(generatedCodes.verifier.length);
        expect(regExp.test(generatedCodes.challenge)).to.be.true;
        expect(regExp.test(generatedCodes.verifier)).to.be.true;
    });

testOutput

codeChallengeScreenshot

codeverifierScreenshot

Could you let me know what browser and environment you are using?

The randomness is something I am open to reviewing further. Could you explain how you arrived at the values of 4/256 and 3/256?

@schalk-b
Copy link
Author

schalk-b commented Apr 17, 2020

@pkanher617 Thanks for spending your time on this issue.

You are right that when calling generatePkceCodes(); the verifier is actually 43, as I only just noticed it encodes the verifier in base64. However, it does the base64 encoding, after running the random octet sequence through bufferToCVString.

So this is currently what happens:

  1. Random 32 octet sequence
    buffer = [21, 231, ..., 232]
  2. bufferToCVString(buffer)
    verifyString = uzxqicc89jAuURxH4qtzsfJH0YS2Jy.t
  3. base64urlEncode(verifyString)
    dXp4cWljYzg5akF1VVJ4SDRxdHpzZkpIMFlTMkp5LnQ

I believe that the bufferToCVString is unnecessary and basically takes some of the random, out of the random buffer (I'll show a test to demonstrate this at the end of the comment).

We can change:

   /**
     * Generates a random 32 byte buffer and returns the base64
     * encoded string to be used as a PKCE Code Verifier
     */
    private generateCodeVerifier(): string {
        try {
            // Generate random values as utf-8
            const buffer: Uint8Array = new Uint8Array(RANDOM_BYTE_ARR_LENGTH);
            this.cryptoObj.getRandomValues(buffer);
            // verifier as string
            const pkceCodeVerifierString = this.bufferToCVString(buffer);
            // encode verifier as base64
            const pkceCodeVerifierB64: string = this.base64Encode.urlEncode(pkceCodeVerifierString );
            return pkceCodeVerifierB64;
        } catch (e) {
            throw BrowserAuthError.createPkceNotGeneratedError(e);
        }
    }

to

   /**
     * Generates a random 32 byte buffer and returns the base64
     * encoded string to be used as a PKCE Code Verifier
     */
    private generateCodeVerifier(): string {
        try {
            // Generate random values as utf-8
            const buffer: Uint8Array = new Uint8Array(RANDOM_BYTE_ARR_LENGTH);
            this.cryptoObj.getRandomValues(buffer);
            // encode verifier as base64
            const pkceCodeVerifierB64: string = this.base64Encode.urlEncodeArr(buffer);
            return pkceCodeVerifierB64;
        } catch (e) {
            throw BrowserAuthError.createPkceNotGeneratedError(e);
        }
    }

And then we can completely remove the bufferToCVString method.

@schalk-b
Copy link
Author

schalk-b commented Apr 17, 2020

Sorry for this very long an ugly test, but this shows that the current bufferToCVString method that gets called removes some of the random, by not encoding correctly.

import { expect } from "chai";
import { BrowserCrypto } from "../../src/crypto/BrowserCrypto";
import { PkceGenerator } from "../../src/crypto/PkceGenerator";

describe("PkceVerifier.spec.ts Unit Tests", () => {
    it("bufferToCVString() evenly distributed encoding", async () => {
      // This only works with a very large number of tests as we are testing distribution from random numbers

      // Added here because they aren't exported from PkceGenerator
      const CV_CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

      // Arrange
      const browserCrypto = new BrowserCrypto();
      const pkceGenerator = new PkceGenerator(browserCrypto);

      // We want to count the amount of times each character is encoded.
      const characterCounts = {};
      [...CV_CHARSET].forEach(char => characterCounts[char] = 0);
      let totalCharacters = 0;

      // Act
      for (let i = 0; i < 256; i++) {
        const buffer: Uint8Array = new Uint8Array(2);
        buffer[0] = i;

        for (let j = 0; j < 256; j++) {
          buffer[1] = j;
          const pkceCodeVerifierString = pkceGenerator['bufferToCVString'](buffer);
          totalCharacters += pkceCodeVerifierString.length;
          [...pkceCodeVerifierString].forEach(char => characterCounts[char]++);
        }
      }

      // Logging
      const expectedDistribution = Math.round(1 / CV_CHARSET.length * 10000) / 10000;
      console.log(`Expected distribution: ${expectedDistribution}`);
      [...CV_CHARSET].forEach(char => {
        const distribution = Math.round(characterCounts[char] / totalCharacters * 10000) / 10000;
        console.log(`${char}: ${distribution}`);
      });
      
      // Assert
      [...CV_CHARSET].forEach(char => {
        const distribution = Math.round(characterCounts[char] / totalCharacters * 10000) / 10000;
        expect(distribution).to.equal(expectedDistribution);
      });
    });
});

Example output:

Expected distribution: 0.0152
A: 0.0156
B: 0.0156
C: 0.0156
D: 0.0156
E: 0.0156
F: 0.0156
G: 0.0156
H: 0.0156
I: 0.0156
J: 0.0156
K: 0.0156
L: 0.0156
M: 0.0156
N: 0.0156
O: 0.0156
P: 0.0156
Q: 0.0156
R: 0.0156
S: 0.0156
T: 0.0156
U: 0.0156
V: 0.0156
W: 0.0156
X: 0.0156
Y: 0.0156
Z: 0.0156
a: 0.0156
b: 0.0156
c: 0.0156
d: 0.0156
e: 0.0156
f: 0.0156
g: 0.0156
h: 0.0156
i: 0.0156
j: 0.0156
k: 0.0156
l: 0.0156
m: 0.0156
n: 0.0156
o: 0.0156
p: 0.0156
q: 0.0156
r: 0.0156
s: 0.0156
t: 0.0156
u: 0.0156
v: 0.0156
w: 0.0156
x: 0.0156
y: 0.0156
z: 0.0156
0: 0.0156
1: 0.0156
2: 0.0156
3: 0.0156
4: 0.0156
5: 0.0156
6: 0.0117
7: 0.0117
8: 0.0117
9: 0.0117
-: 0.0117
.: 0.0117
_: 0.0117
~: 0.0117

@schalk-b
Copy link
Author

schalk-b commented Apr 17, 2020

You will see the distribution all the way through to 5 is about 0.0156 (4/256) and then 6 onward is 0.0117 (3/256) .

However, we expect the distribution to be 0.0152 for every character (1/66).

But to answer your question directly, I arrived at the values of 3/256 and 4/256 by:

Math.floor(256/66) = 3
Math.ceil(256/66) = 4

66 being the length of the character set.

@github-actions
Copy link
Contributor

This issue has not seen activity in 14 days. It may be closed if it remains stale.

@github-actions github-actions bot added the no-issue-activity Issue author has not responded in 5 days label May 27, 2020
@tnorling tnorling added enhancement Enhancement to an existing feature or behavior. and removed no-issue-activity Issue author has not responded in 5 days question Customer is asking for a clarification, use case or information. labels May 27, 2020
@schalk-b
Copy link
Author

schalk-b commented Jun 4, 2020

Are you open to accepting a PR for this one? Or does it need further review?

@pkanher617
Copy link
Contributor

pkanher617 commented Jun 4, 2020

Hi @schalk-b, please go ahead and submit a PR! would be happy to review it.

@tnorling tnorling added the msal-browser Related to msal-browser package label Jun 24, 2020
@pkanher617
Copy link
Contributor

Merged in #1872

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Enhancement to an existing feature or behavior. msal-browser Related to msal-browser package
Projects
None yet
Development

No branches or pull requests

3 participants