Add `rb/weak-sensitive-data-hashing` query port #16781

alexrford · 2024-06-18T16:52:26Z

This is a replacement of #12782, porting py/weak-sensitive-data-hashing to Ruby.

As with the previous PR, the python version is a little more sophisticated in terms of its source modeling. Compared to before, we have moved towards a concept of SensitiveDataSources as sources rather than just any SensitiveNode. This accounts for cases like:

def login_with_password(passwd) # the `passwd` parameter here should be marked as the source
  some_unremarkable_method(passwd) # the access to `passwd` is a `SensitiveNode`, but is not a useful source
  Digest::MD5.hexdigest(passwd) # read of `passwd` here is a sink
end

where using SensitiveNodes as sources would result in the read of passwd as an argument to some_unremarkable_method also being marked as a source.

Ideally we would move this out to a shared file to match Python's SensitiveDataSources.qll, but I've kept this as a private module in SensitiveDataHashingCustomization.qll to restrict the impact of this change for now.

github-actions · 2024-06-18T16:53:33Z

QHelp previews:

ruby/ql/src/queries/security/cwe-327/WeakSensitiveDataHashing.qhelp

Use of a broken or weak cryptographic hashing algorithm on sensitive data

Using a broken or weak cryptographic hash function can leave data vulnerable, and should not be used in security related code.

A strong cryptographic hash function should be resistant to:

pre-image attacks: if you know a hash value h(x), you should not be able to easily find the input x.
collision attacks: if you know a hash value h(x), you should not be able to easily find a different input y with the same hash value h(x) = h(y).
In cases with a limited input space, such as for passwords, the hash function also needs to be computationally expensive to be resistant to brute-force attacks. Passwords should also have an unique salt applied before hashing, but that is not considered by this query.

As an example, both MD5 and SHA-1 are known to be vulnerable to collision attacks.

Since it's OK to use a weak cryptographic hash function in a non-security context, this query only alerts when these are used to hash sensitive data (such as passwords, certificates, usernames).

Use of broken or weak cryptographic algorithms that are not hashing algorithms, is handled by the rb/weak-cryptographic-algorithm query.

Recommendation

Ensure that you use a strong, modern cryptographic hash function:

such as Argon2, scrypt, bcrypt, or PBKDF2 for passwords and other data with limited input space.
such as SHA-2, or SHA-3 in other cases.

Example

The following example shows two functions for checking whether the hash of a certificate matches a known value -- to prevent tampering. The first function uses MD5 that is known to be vulnerable to collision attacks. The second function uses SHA-256 that is a strong cryptographic hashing function.

require 'openssl'

def certificate_matches_known_hash_bad(certificate, known_hash)
  hash = OpenSSL::Digest.new('SHA1').digest certificate
  hash == known_hash
end

def certificate_matches_known_hash_good(certificate, known_hash)
  hash = OpenSSL::Digest.new('SHA256').digest certificate
  hash == known_hash
end

Example

The following example shows two functions for hashing passwords. The first function uses SHA-256 to hash passwords. Although SHA-256 is a strong cryptographic hash function, it is not suitable for password hashing since it is not computationally expensive.

require 'openssl'

def get_password_hash(password, salt)
  OpenSSL::Digest.new('SHA256').digest(password + salt) # BAD
end

The second function uses Argon2 (through the argon2 gem), which is a strong password hashing algorithm (and includes a per-password salt by default).

require 'argon2'

def get_initial_hash(password)
  Argon2::Password.create(password)
end

def check_password(password, known_hash)
  Argon2::Password.verify_password(password, known_hash)
end

References

OWASP: Password Storage Cheat Sheet
Common Weakness Enumeration: CWE-327.
Common Weakness Enumeration: CWE-328.
Common Weakness Enumeration: CWE-916.

joefarebrother

Looks good
I've noticed that some sensitive data queries exclude the id classification, but this appears not to. It seems like it's python counterpart doesn't either.
Does it matter very much?

RasmusWL

LGTM 👍

I don't quite understand the need to move SensitiveDataSources away from the SensitiveActions.qll file in f217de9? -- even when rb/weak-sensitive-data-hashing is the only Ruby query to use it, would there be a problem from it existing in the file? (as I understand it, it wouldn't affect existing uses of SensitiveNode)

I've noticed that some sensitive data queries exclude the id classification, but this appears not to. It seems like it's python counterpart doesn't either.
Does it matter very much?

we've done that since the results were simply not interesting for most people. Personally I feel someone doing MD5 of usernames sounds like a potential problem, so for now I think we should keep it 👍

RasmusWL · 2024-07-01T15:47:10Z

In the performance evaluation I see an increase in 'String cache size' for a single project 🤔 I didn't notice any kind of string manipulation being introduced in this PR... do you have an explanation for this? (maybe I overlooked something) -- If this project had 1000 new alerts I might understand it from the additional alert messages, but it doesn't seem like it even has 1 new alert.

alexrford · 2024-07-23T16:45:48Z

I don't quite understand the need to move SensitiveDataSources away from the SensitiveActions.qll file in f217de9? -- even when rb/weak-sensitive-data-hashing is the only Ruby query to use it, would there be a problem from it existing in the file? (as I understand it, it wouldn't affect existing uses of SensitiveNode)

This was just to avoid committing to a new public interface at this point - it shouldn't change anything in terms of functionality of this or any other query. The new sources could potentially be useful for other future queries, at which point it would make sense to undo this change (and maybe tidy up the interface before making it public if necessary).

In the performance evaluation I see an increase in 'String cache size' for a single project 🤔 I didn't notice any kind of string manipulation being introduced in this PR... do you have an explanation for this? (maybe I overlooked something) -- If this project had 1000 new alerts I might understand it from the additional alert messages, but it doesn't seem like it even has 1 new alert.

Honestly, I don't have an explanation for this. The alert message concatenation should be the only new string manipulation in this PR.

alexrford added 9 commits June 17, 2024 15:27

Ruby: minimal port of py/weak-sensitive-data-hashing

d4203d9

Ruby: rb/weak-sensitive-data-hashing qhelp

f017821

Ruby: Add SensitiveDataSource abstract class

6a46fb5

Ruby: Move SensitiveDataSource logic into a private module

f217de9

Ruby: fix some SensitiveDataSource definitions

81ec686

Ruby: add tests for rb/weak-sensitive-data-hashing

d994959

Ruby: changenote for rb/weak-sensitive-data-hashing

7380e29

Ruby: remove unused import

d79a253

Ruby: remove outdated test comment

51f3f15

alexrford added the Ruby label Jun 18, 2024

github-actions bot added the documentation label Jun 18, 2024

alexrford marked this pull request as ready for review June 19, 2024 11:38

alexrford requested a review from a team as a code owner June 19, 2024 11:38

alexrford mentioned this pull request Jun 19, 2024

Ruby: port py/weak-sensitive-data-hashing #12782

Closed

sidshank requested review from aibaars, RasmusWL and joefarebrother June 20, 2024 15:42

joefarebrother reviewed Jun 27, 2024

View reviewed changes

RasmusWL approved these changes Jul 1, 2024

View reviewed changes

alexrford merged commit 9fb657c into github:main Jul 25, 2024
26 checks passed

alexrford deleted the rb/weak-sensitive-data-hashing branch July 25, 2024 13:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `rb/weak-sensitive-data-hashing` query port #16781

Add `rb/weak-sensitive-data-hashing` query port #16781

alexrford commented Jun 18, 2024 •

edited

Loading

github-actions bot commented Jun 18, 2024

Use of a broken or weak cryptographic hashing algorithm on sensitive data

Recommendation

Example

Example

References

joefarebrother left a comment

RasmusWL left a comment

RasmusWL commented Jul 1, 2024

alexrford commented Jul 23, 2024

Add rb/weak-sensitive-data-hashing query port #16781

Add rb/weak-sensitive-data-hashing query port #16781

Conversation

alexrford commented Jun 18, 2024 • edited Loading

github-actions bot commented Jun 18, 2024

Use of a broken or weak cryptographic hashing algorithm on sensitive data

Recommendation

Example

Example

References

joefarebrother left a comment

Choose a reason for hiding this comment

RasmusWL left a comment

Choose a reason for hiding this comment

RasmusWL commented Jul 1, 2024

alexrford commented Jul 23, 2024

Add `rb/weak-sensitive-data-hashing` query port #16781

Add `rb/weak-sensitive-data-hashing` query port #16781

alexrford commented Jun 18, 2024 •

edited

Loading