Skip to content

Java: Track taint for CharSequence#subSequence #6407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Sep 10, 2021

Conversation

bmuskalla
Copy link
Contributor

I've seen this being used in a few repos and given it's available on String (which does the same as subString)

@bmuskalla bmuskalla added the Java label Aug 3, 2021
@bmuskalla bmuskalla requested a review from a team as a code owner August 3, 2021 14:59
@bmuskalla bmuskalla added the no-change-note-required This PR does not need a change note label Aug 3, 2021
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of surprised these never got CSV-ified, @aschackmull ?

Copy link
Contributor

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be out of scope for this pull request, but would it make sense to cover toString() as well? The documentation mandates that the result are the characters of this CharSequence. (Though this is sometimes violated, but I guess in these cases there is also no call to toString() then.)

@aschackmull
Copy link
Contributor

Kind of surprised these never got CSV-ified, @aschackmull ?

Yeah, might as well do that now. @bmuskalla would you mind?

@bmuskalla
Copy link
Contributor Author

As suggested, CSV-ified the existing string-related models and added subsequence there.

@github-actions
Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,3,371,30,13,,,7,,,10
+    Java Standard Library,``java.*``,3,419,30,13,,,7,,,10
-    Totals,,84,2711,398,13,6,6,107,33,1,66
+    Totals,,84,2759,398,13,6,6,107,33,1,66
  • Changes to framework-coverage-java.csv:
- java.io,3,,24,,3,,,,,,,,,,,,,,,,,24,
+ java.io,3,,27,,3,,,,,,,,,,,,,,,,,27,
- java.lang,,,3,,,,,,,,,,,,,,,,,,,1,2
+ java.lang,,,42,,,,,,,,,,,,,,,,,,,38,4
- java.util,,,332,,,,,,,,,,,,,,,,,,,15,317
+ java.util,,,338,,,,,,,,,,,,,,,,,,,15,323

@github-actions
Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,3,371,30,13,,,7,,,10
+    Java Standard Library,``java.*``,3,419,30,13,,,7,,,10
-    Totals,,84,2711,398,13,6,6,107,33,1,66
+    Totals,,84,2759,398,13,6,6,107,33,1,66
  • Changes to framework-coverage-java.csv:
- java.io,3,,24,,3,,,,,,,,,,,,,,,,,24,
+ java.io,3,,27,,3,,,,,,,,,,,,,,,,,27,
- java.lang,,,3,,,,,,,,,,,,,,,,,,,1,2
+ java.lang,,,42,,,,,,,,,,,,,,,,,,,39,3
- java.util,,,332,,,,,,,,,,,,,,,,,,,15,317
+ java.util,,,338,,,,,,,,,,,,,,,,,,,15,323

@bmuskalla
Copy link
Contributor Author

It would be great to get some feedback on the changes for some of the query results as the CSV models seem to have changed how much we represent certain flows in there.

@owen-mc owen-mc changed the title Track taint for CharSequence#subSequence Java: Track taint for CharSequence#subSequence Aug 17, 2021
@github-actions
Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,3,371,30,13,,,7,,,10
+    Java Standard Library,``java.*``,3,420,30,13,,,7,,,10
-    Totals,,84,2711,398,13,6,6,107,33,1,66
+    Totals,,84,2760,398,13,6,6,107,33,1,66
  • Changes to framework-coverage-java.csv:
- java.io,3,,24,,3,,,,,,,,,,,,,,,,,24,
+ java.io,3,,27,,3,,,,,,,,,,,,,,,,,27,
- java.lang,,,3,,,,,,,,,,,,,,,,,,,1,2
+ java.lang,,,43,,,,,,,,,,,,,,,,,,,40,3
- java.util,,,332,,,,,,,,,,,,,,,,,,,15,317
+ java.util,,,338,,,,,,,,,,,,,,,,,,,15,323

"java.io;StringWriter;true;append;;;Argument[0];ReturnValue;taint",
"java.io;StringWriter;true;write;;;Argument[0];Argument[-1];taint",
"java.lang;AbstractStringBuilder;true;AbstractStringBuilder;(String);;Argument[0];Argument[-1];taint",
"java.lang;AbstractStringBuilder;true;append;;;Argument[0];Argument[-1];taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods return Argument[-1] by value -- this could have interesting effects worth watching out for though, as this will give the dataflow lib two ways to propagate side-effects to a stringbuilder used fluently -- e.g. getsTaint.append("clean").append(taint) can now get taint via the special rules about stringbuilder chains, and via the general rule about chaining taint steps and value steps that also handles otherFluent.fluentOp("clean").fluentOp(taint). We may need to coordinate removing the special-casing for stringbuilder (see private predicate stringBuilderStep(Expr tracked, Expr sink)) along with noting these as value-propagating steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the step in 1b69029 - from local experiments, the model (with a few fixes) handles the existing behaviour

@github-actions
Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,3,371,30,13,,,7,,,10
+    Java Standard Library,``java.*``,3,423,30,13,,,7,,,10
-    Totals,,84,3491,398,13,6,6,107,33,1,66
+    Totals,,84,3543,398,13,6,6,107,33,1,66
  • Changes to framework-coverage-java.csv:
- java.io,3,,24,,3,,,,,,,,,,,,,,,,,24,
+ java.io,3,,27,,3,,,,,,,,,,,,,,,,,27,
- java.lang,,,3,,,,,,,,,,,,,,,,,,,1,2
+ java.lang,,,46,,,,,,,,,,,,,,,,,,,42,4
- java.util,,,332,,,,,,,,,,,,,,,,,,,15,317
+ java.util,,,338,,,,,,,,,,,,,,,,,,,15,323

@bmuskalla
Copy link
Contributor Author

Split the Objects part out into #6555

@github-actions
Copy link
Contributor

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged. The following differences were found:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    Java Standard Library,``java.*``,3,375,30,13,,,7,,,10
+    Java Standard Library,``java.*``,3,421,30,13,,,7,,,10
-    Totals,,84,3495,398,13,6,6,107,33,1,66
+    Totals,,84,3541,398,13,6,6,107,33,1,66
  • Changes to framework-coverage-java.csv:
- java.io,3,,24,,3,,,,,,,,,,,,,,,,,24,
+ java.io,3,,27,,3,,,,,,,,,,,,,,,,,27,
- java.lang,,,3,,,,,,,,,,,,,,,,,,,1,2
+ java.lang,,,46,,,,,,,,,,,,,,,,,,,42,4

@aschackmull
Copy link
Contributor

It would be great to get some feedback on the changes for some of the query results as the CSV models seem to have changed how much we represent certain flows in there.

The bulk of the edges/nodes changes look expected as discussed offline.

@bmuskalla
Copy link
Contributor Author

@aschackmull I think this is ready for review. The last remaining test failure is fixed by an internal PR. I'd like to rewrite the flow tests as inline tests in a separate PR to not start yet another refactoring in this PR.

@bmuskalla
Copy link
Contributor Author

Thanks for catching those mistakes

@bmuskalla
Copy link
Contributor Author

Sorry I missed those 2 value flows in your suggestions. Should be fixed now

@aschackmull aschackmull added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Sep 9, 2021
@aschackmull aschackmull merged commit 3e17fdc into github:main Sep 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
depends on internal PR This PR should only be merged in sync with an internal Semmle PR Java no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants