Skip to content

C++: Remove field conflation caused by IR field flow #3532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented May 20, 2020

Fixes https://github.com/github/codeql-c-analysis-team/issues/77.

#3118 introduced some unwanted field conflation by adding two dataflow rules to simpleInstructionLocalFlowStep. This PR fixes these conflations by removing the flow from object -> field, leaving us with (hopefully) only field -> object flows.

With this PR we lose 2 results on CPP-differences:

  • The one on git is a false positive
  • The one on php is a true positive.

We also lose a result in the internal repo since we no longer have flow in the object -> field flow. Here's the internal PR that updates that test: https://git.semmle.com/Semmle/code/pull/37015

@MathiasVP MathiasVP added the C++ label May 20, 2020
@MathiasVP MathiasVP requested a review from a team as a code owner May 20, 2020 16:31
@MathiasVP
Copy link
Contributor Author

MathiasVP commented May 20, 2020

CPP-differences (https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1147/) shows no performance problems. The lost result on git looks like a fixed false positive, but I'm not so sure about the lost result on php. I'll take a closer look at this one when I'm back on Tuesday.

Edit: Looks like we're losing a true result on php with the change to DefaultTaintTracking. I'll see if I can come up with a better change to prevent field conflation in taint tracking.

@MathiasVP MathiasVP requested a review from jbj May 20, 2020 18:30
@MathiasVP MathiasVP added the WIP This is a work-in-progress, do not merge yet! label May 26, 2020
@MathiasVP
Copy link
Contributor Author

MathiasVP commented May 26, 2020

I've started a new CPP-differences to test a (hopefully) correct fix: https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1155/

Edit: Same result as the previous CPP-difference. The true positive on php is lost when removing the problematic dataflow from ChiInstruction to LoadInstruction.

@MathiasVP
Copy link
Contributor Author

I think the lost result in the internal repo is an unfortunate, but necessary, consequence of the field conflation removal. We're losing taint in the following situation:

taint_function_without_source(&output);
...
sink(output.f);

Before the field conflation fix there would a be flow step from the ChiInstruction of the BufferMayWriteSideEffect from the call to taint_source_without_code to the load of output.f.
I'll look into how we get this flow back without the field conflation. If the CPP-difference looks good I don't think this lost flow should block this PR, though.

@MathiasVP MathiasVP removed the WIP This is a work-in-progress, do not merge yet! label May 27, 2020
@jbj jbj merged commit c7fa112 into github:master May 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants