Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SonarQube: Weak Cryptography in generateRandomToken #61

Closed
Uzlopak opened this issue Nov 12, 2021 · 9 comments
Closed

SonarQube: Weak Cryptography in generateRandomToken #61

Uzlopak opened this issue Nov 12, 2021 · 9 comments
Labels
discussion 🗨️ Discussion about a particular topic.

Comments

@Uzlopak
Copy link
Collaborator

Uzlopak commented Nov 12, 2021

I am currently running node-oauth2-server through sonarqube.

I know this part of the code was touched in #38.

Sonarqube shows for generateRandomToken following Security Hotspot
Cryptographic hash algorithms such as MD2, MD4, MD5, MD6, HAVAL-128, HMAC-MD5, DSA (which uses SHA-1), RIPEMD, RIPEMD-128, RIPEMD-160, HMACRIPEMD160 and SHA-1 are no longer considered secure, because it is possible to have collisions (little computational effort is enough to find two or more different inputs that produce the same hash).

We actually just want to create a random token. So this finding is not a security issue anyway.

But when I read the code it makes for me no sense, why the output of randomBytes, which outputs a cryptographical secure random value should be hashed again with sha256. We could replace this by simply doing

  generateRandomToken: function() {
    return Promise.resolve(crypto.randomBytes(32).toString('hex'));
  }

and all tests would pass. SonarQube would also not remark this as potential security issue.

@jankapunkt
Copy link
Member

@Uzlopak which branch did sonarqube actually scan? The master branch was older and still had sha1, we just merged the latest updates into master after we published release 4.1.0

@jankapunkt
Copy link
Member

Can you please rescan targeting development which is always our latest state of code?

@Uzlopak
Copy link
Collaborator Author

Uzlopak commented Nov 13, 2021

I did this in the first place with the development branch. Actually this is no real issue. It is just wasted cycles for the generateRandomToken, as it does not make any technical change, if I use randomBytes or if I use randomBytes and hash it with sha256. random is random and hasing from random is still random.

@jankapunkt jankapunkt added the discussion 🗨️ Discussion about a particular topic. label Nov 15, 2021
@Uzlopak
Copy link
Collaborator Author

Uzlopak commented Nov 15, 2021

Actually, we can do it much simpler...
According to https://datatracker.ietf.org/doc/html/rfc6749#appendix-A.17 refresh_token should be of format VSCHAR.

We could do something like this. By this, we would have some bigger range tokens.

	const result = [];
	while (result.length < length) {
		const tmp = randomBytes(1).toString("binary");
		if (vschar(tmp)) { result.push(tmp) };
	}
	return result.join("");

By this we use the full range of valid characters and not only /0-9a-f/

@Uzlopak
Copy link
Collaborator Author

Uzlopak commented Nov 18, 2021

According to https://datatracker.ietf.org/doc/html/rfc6749#section-10.10

10.10.  Credentials-Guessing Attacks

   The authorization server MUST prevent attackers from guessing access
   tokens, authorization codes, refresh tokens, resource owner
   passwords, and client credentials.

   The probability of an attacker guessing generated tokens (and other
   credentials not intended for handling by end-users) MUST be less than
   or equal to 2^(-128) and SHOULD be less than or equal to 2^(-160).

   The authorization server MUST utilize other means to protect
   credentials intended for end-user usage.

It is now very late, but I think with using the reduced ascii range (256-32=242 characters) and a length of 64 (like sha256) you would have a magnitude higher security.

Current probability to guess the token: 16^64. (= 2 ^ 256, 16 characters with a length of 64)
With this: 242^64 (~ 2^499, 242 characters and with a length of 64)

So we could use still 64 characters but have it practically uncrackable. Even Bitcoin privatekeys are easier to hack (despite they are also practically uncrackable).

To have about 2^256 strength, with this method you would need "just" a new token with a length of 33 characters.

@jankapunkt
Copy link
Member

Question is, if clients / implemenations will be able to handle data different from /0-9a-f/ so this might be breaking, right?

@Uzlopak
Copy link
Collaborator Author

Uzlopak commented Nov 18, 2021

Could be. But usually generateAccessToken and generateRefreshToken is implemented in the model and generateRandomToken is just fallback. I think it would still make sense to implement it in a major version.

A former employer uses oauth2-server and the tokens are alphanumerical and they decided to make the accessTokens suuuuper huge. I think it was 512 characters for accessToken and 2048 characters for the refreshToken. So with every call they send 0,5kb (assuming the header can only handle ascii) for validation. Then they lookup in the mongodb if this token is valid or not etc.. By implementing it this way, and documenting, that this is already very save, an implementor could use it as a reference and just store it a small string instead of thinking, that they have to increase the length of the accessToken to an arbitrary very long length.

@Uzlopak
Copy link
Collaborator Author

Uzlopak commented Nov 21, 2021

Hi,

tested the performance of the above implementation. It is quite slow. About 13.000 valid tokens per second in my fastify implementation. (I do a password grant against my in memory model).

So this is my high performance implementation, which gets about 165k /s valid tokens. I guess there is the fastify overhead, but I guess, if you run this in a benchmark, it will be also 10x faster than the above implementation.

In best case szenario it iterates over the values and checks if it is valid and if we reach the requested length it slices that Buffer and turns it into the necessary string. If we are unlucky we fill the invalid positions with fields at the back of the buffer. And if we are very unlucky, we do the whole procedure again.

It is written in typescript but this shouldnt be an issue.

import { randomBytes } from "crypto";

/**
 * 10.10.  Credentials-Guessing Attacks
 *
 * The authorization server MUST prevent attackers from guessing access
 * tokens, authorization codes, refresh tokens, resource owner
 * passwords, and client credentials.
 *
 * The probability of an attacker guessing generated tokens (and other
 * credentials not intended for handling by end-users) MUST be less than
 * or equal to 2^(-128) and SHOULD be less than or equal to 2^(-160).
 *
 * The authorization server MUST utilize other means to protect
 * credentials intended for end-user usage.
 */
export function generateRandomToken(length = 33): string {
	let tmp: Buffer;
	let valid = false;
	while (!valid) {
		let pos = 0;
		let endPos = (length * 5) - 1;
		tmp = randomBytes(length * 5);
		while (pos < length) {
			if (
				tmp[pos] < 0x21 ||
				tmp[pos] > 0x7E
			) {
				tmp[pos] = tmp[endPos];
				endPos--;
				if (endPos < pos) {
					break;
				}
				continue;
			}

			if (++pos === length) {
				valid = true;
				break;
			}
		}
	}
	return tmp!.slice(0, length).toString("binary");
}

export default generateRandomToken;

And this the corresponding unit test

describe(
	"generateRandomToken",
	() => {
		it(
			"should return a 64 character long token",
			() => {
				expect(generateRandomToken(64)).to.have.lengthOf(64);
			}
		);

		it(
			"should return a valid token",
			() => {
				expect(vschar(generateRandomToken(64))).to.be.equal(true);
			}
		);

		it(
			"check if generateRandomToken creates valid vschar tokens by generating 100000 tokens",
			() => {
				for (let i = 0, il = 100000; i < il; i++) {
					expect(vschar(generateRandomToken(64))).to.be.equal(true);
				}
			}
		)
	}
);

@jankapunkt
Copy link
Member

@Uzlopak would you mind opening a PR for this, so we can leverage GitHub actions for code quality assurance and the review function for better reasoning about the code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion 🗨️ Discussion about a particular topic.
Projects
None yet
Development

No branches or pull requests

2 participants