Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby: Use additional sensitive data heuristics for CleartextSources #16503

Merged
merged 7 commits into from
Jun 18, 2024

Conversation

joefarebrother
Copy link
Contributor

Depends on #16446.

This PR expands CleartextSources.qll to use additional sensitive data heuristics besides passwords.
Additionally, the cleartext storage and cleartext logging queries allow implicit read steps at sinks.
This finds new results in Railsgoat (https://github.com/github/codeql-team/issues/2367)

@joefarebrother joefarebrother changed the title [Draft] Ruby: Use additional sensitive data heuristics for CleartextSources Ruby: Use additional sensitive data heuristics for CleartextSources May 21, 2024
@joefarebrother joefarebrother marked this pull request as ready for review May 21, 2024 08:28
@joefarebrother joefarebrother requested a review from a team as a code owner May 21, 2024 08:28
Copy link
Contributor

@alexrford alexrford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one nitpick, otherwise looks good as long as performance is fine.

HashKeyWriteSensitiveSource() {
exists(DataFlow::CallNode writeNode, SensitiveDataClassification classification |
nameIndicatesSensitiveData(name, classification) and
not classification = SensitiveDataClassification::id() and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I'd prefer an inclusion test to make it more obvious which categories are considered, and maybe make this a less likely to change inadvertently if we add more classifications in the future, e.g.

Suggested change
not classification = SensitiveDataClassification::id() and
classification = [
SensitiveDataClassification::secret(),
SensitiveDataClassification::password(),
SensitiveDataClassification::certificate(),
SensitiveDataClassification::private()
] and

// avoid safe values assigned to presumably unsafe names
not this instanceof NonCleartextSensitive and
nameIndicatesSensitiveData(name, classification) and
not classification = SensitiveDataClassification::id() and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto the above comment, so perhaps we could move this check out into a helper predicate (isRelevantClassification(SensitiveDataClassification classification) or similar).

ParameterSensitiveSource() {
exists(SensitiveDataClassification classification |
nameIndicatesSensitiveData(name, classification) and
not classification = SensitiveDataClassification::id() and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@joefarebrother
Copy link
Contributor Author

@alexrford Thanks, I have factored out these into a helper predicate nameIndicatesRelevantSensitiveData. However, the last DCA run still indicated a performance issue on one DB, which I am investigating.

@joefarebrother
Copy link
Contributor Author

Performance has now been resolved. New DCA: https://github.com/github/codeql-dca-main/issues/21834

@joefarebrother joefarebrother merged commit 3370477 into github:main Jun 18, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants