Skip to content

Conversation

joefarebrother
Copy link
Contributor

No description provided.

"com.google.common.io;BaseEncoding;true;decodingSource;(CharSource);;Argument[0];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;encode;(byte[]);;Argument[0];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;encode;(byte[],int,int);;Argument[0];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;withSeparator;(String,int);;Argument[0];ReturnValue;taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can taint be modelled backwards?
In that case should this also match encodingSink and encodingStream where the return value taints the argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few special cases that model backwards taint steps, however there is no general mechanism for it, as it would likely lead to false positives from "time-travel flow", in cases like the following:

x = new StringWriter();
y = enc.encodingStream(x)
sink(x.toString());
y.write(taint());

If the writes to y are propegated backwards to x, then we get spurious flow to sink.
So, the question of whether to include things like encodingStream (in general, wrappers around an output stream - of which there are several in the standard library that we don't model for simlar reasons) amounts to whether the additional effort to implement a special-case backwards flow step plus the potential for these kinds of FPs is worth the few additional results it could get; which I don't think it is.

Comment on lines +17 to +25
"com.google.common.io;BaseEncoding;true;decode;(CharSequence);;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;decodingStream;(Reader);;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;decodingSource;(CharSource);;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;encode;(byte[]);;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;upperCase;();;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;lowerCase;();;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;withPadChar;(char);;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;omitPadding;();;Argument[-1];ReturnValue;taint",
"com.google.common.io;BaseEncoding;true;encode;(byte[],int,int);;Argument[-1];ReturnValue;taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to add a comment why BaseEncoding can be tainted. I was puzzled at first until I noticed that withSeparator (and only that method) can taint the encoding object.

"com.google.common.io;ByteStreams;false;toByteArray;(InputStream);;Argument[0];ReturnValue;taint",
"com.google.common.io;CharSource;true;asByteSource;(Charset);;Argument[-1];ReturnValue;taint",
"com.google.common.io;CharSource;true;concat;;;Argument[0];ReturnValue;taint",
"com.google.common.io;CharSource;true;copyTo;(Appendable);;Argument[-1];Argument[0];taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also cover copyTo(CharSink)? Someone could write their own CharSink which acts like a StringWriter, though this might be unlikely since it does not fit the intended use case of CharSink very well.

"com.google.common.io;CharStreams;false;copy;(Readable,Appendable);;Argument[0];Argument[1];taint",
"com.google.common.io;CharStreams;false;readLines;(Readable);;Argument[0];ReturnValue;taint",
"com.google.common.io;CharStreams;false;toString;(Readable);;Argument[0];ReturnValue;taint",
"com.google.common.io;Closer;true;register;;;Argument[0];ReturnValue;value",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should exception objects also be tracked for taint? E.g. when an exception message is created with a user-defined value and is then logged, it could allow log injection. Though in general any flow from user-defined data to an exception constructor is bad, so maybe it is not worth trying to check where the tainted exception flows afterwards?

"com.google.common.io;CharStreams;false;readLines;(Readable);;Argument[0];ReturnValue;taint",
"com.google.common.io;CharStreams;false;toString;(Readable);;Argument[0];ReturnValue;taint",
"com.google.common.io;Closer;true;register;;;Argument[0];ReturnValue;value",
"com.google.common.io;Files;false;getFileExtension;(String);;Argument[0];ReturnValue;taint",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should all the methods which read or transfer the file content to in-memory structures (e.g. Appendable) also be considered, when the user can chose the file (though this on its own is already bad)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a query for user controlled data flowing to a File, so additional taint steps from that point are unlikely to be helpful. Additionally, a user controlling a file path doesn't necassarily mean they can control the file contents.

Copy link
Contributor

@Marcono1234 Marcono1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to cover com.google.common.io.Resources methods taking a URL as argument?

@joefarebrother
Copy link
Contributor Author

Would it make sense to cover com.google.common.io.Resources methods taking a URL as argument?

As it is unlikely for a user-controlled resource URL to imply that its contents are user controlled, those methods don't seem like useful flow steps.

@Marcono1234
Copy link
Contributor

Despite Resources discouraging use of HTTP URLs, they actually work, so a user could (potentially) provide an HTTP URL referring to a user controlled site.

@joefarebrother
Copy link
Contributor Author

In fact, there is a query for calling .openStream on an untrusted URL, so these methods could be added as sinks for that query.

@aschackmull
Copy link
Contributor

Otherwise LGTM.

Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
@joefarebrother joefarebrother merged commit 7a4ce83 into github:main Mar 9, 2021
@joefarebrother joefarebrother deleted the guava-io branch March 9, 2021 11:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants