Skip to content

C++: Restrict sinks in cpp/cleartext-storage-database #12469

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MathiasVP
Copy link
Contributor

This PR reduces the set of sinks considered in the cpp/cleartext-storage-database query. Previously we used "all functions prefixed with sqlite", and after this PR it's confined to a specific set of functions. This fixes a performance issue I was seeing on a database that had a particularly bad naming convention of prefixing their own functions with sqlite 😂.

I've added two additional commits that each should provide a small performance boost.

…abase'. This reduces the number of sinks considered on the 'sysown/proxysql' from > 62000 sinks to ~1000 sinks.
…om the sink definition to the source definition. The dataflow library starts tracking flow from the sources, so it's better to to rule out the entire database in the source definition than in the sink definition.
…ready be considering flow starting at those sources) and out of sinks (since we'll already be alerting on this sink if it's relevant).
@MathiasVP MathiasVP added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Mar 9, 2023
@MathiasVP MathiasVP requested a review from a team as a code owner March 9, 2023 15:20
@github-actions github-actions bot added the C++ label Mar 9, 2023
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Mar 9, 2023
@MathiasVP
Copy link
Contributor Author

@geoffw0 I'd like to hear your input on this one as well. Did I capture all the relevant functions as sinks?

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. There is a risk that we're loosing support for other sqlite interfaces besides the common one, which sqlite% may well have matched by semi-coincidence whereas the more specific modelling will not. On the flip side, we lose false positives like you were seeing this morning. I'm happy with the change.

.getName()
.matches([
"sqlite3_bind_blob", "sqlite3_bind_blob64", "sqlite3_bind_text", "sqlite3_bind_text16",
"sqlite3_bind_text64", "sqlite3_bind_value", "sqlite3_bind_pointer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally preserve modelling of sqlite3_bind_int64 etc as well here, though I'm not sure they have much value and are potentially more likely to lead to false positives???

Copy link
Contributor Author

@MathiasVP MathiasVP Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't really sure about the integer versions. Since the sources for this query is "sensitive data" I can't really imagine why sensitive data would end up in an integer (in a way that reveals any sensitive data).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well in principle you can store a bank account number as a uint64 or something. I hope it's uncommon.

@MathiasVP MathiasVP merged commit d89b8ba into github:mathiasvp/replace-ast-with-ir-use-usedataflow Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ depends on internal PR This PR should only be merged in sync with an internal Semmle PR no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants