Python: Model sources from stdlib HTTP servers #4797
Just a port of the old tests, except that I learned `cgi.FieldStorage()` _should_ be tainted when no arguments are specified (and I moved the taint test to its own function). Also clarified how imports of all the `.*HTTPRequestHandler` classes work in Python 2.
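To illustrate why the argument-less constructor counts as tainted: called without arguments, `cgi.FieldStorage()` reads the request from the process environment and standard input, both of which the web server fills in from the incoming request. The sketch below is only a stand-in for that behaviour, not the real class (the `cgi` module was removed in Python 3.13, so it is not imported here). The helper name `read_cgi_fields` and the sample environment are hypothetical.

```python
import os
from urllib.parse import parse_qs


def read_cgi_fields(environ=None):
    """Hypothetical stand-in for `cgi.FieldStorage()` called without arguments.

    With no explicit input, the data comes from the process environment,
    which a CGI server populates from the incoming request.
    """
    if environ is None:
        environ = os.environ  # attacker-influenced when run as a CGI script
    return parse_qs(environ.get("QUERY_STRING", ""))


# Simulate what a CGI server sets up before launching the script.
fake_environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=alice&cmd=ls"}
fields = read_cgi_fields(fake_environ)
```

Every value in `fields` originates from the request, which is why an analysis would want to treat the result of the no-argument call as a source.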
 * A source of an instance of `cgi.FieldStorage`.
 *
 * This can include instantiation of the class, return value from function
 * calls, or a special parameter that will be set when functions are call by external
 * library.
 *
 * Use `FieldStorage::instance()` predicate to get references to instances of `cgi.FieldStorage`.
- * A source of an instance of `cgi.FieldStorage`.
- *
- * This can include instantiation of the class, return value from function
- * calls, or a special parameter that will be set when functions are call by external
- * library.
- *
- * Use `FieldStorage::instance()` predicate to get references to instances of `cgi.FieldStorage`.
+ * An instance of `cgi.FieldStorage`, extend this class to model new instances.
+ *
+ * This can include instantiations of the class, return values from function
+ * calls, or a special parameter that will be set when functions are called by an external
+ * library.
+ *
+ * Use the predicate `FieldStorage::instance()` to get references to instances of `cgi.FieldStorage`.
If you agree with changing this, you probably want to update your snippet...
-A source of an instance of `cgi.FieldStorage`.
+An instance of `cgi.FieldStorage`
I don't agree with this part. This pattern of having a source of an instance doesn't make too much sense here, but it really helps out in other places, for example for the MultiDict in werkzeug. I would really like us not to change this pattern, so it's easy to recognize that it's just this one pattern being used all over the place.
The other suggestions look like an improvement. Let's do that in another PR?
Happy to relegate these to a separate PR if we are planning to change multiple instances anyway.
Perhaps "A source of instances of...", then? I wanted to make it clear that if you extend this class, you should extend it with things that can be considered instances, since the code below does so. The "source of" part made this a bit confusing for me.
> A source of instances of...
sounds good to me 👍
yoff
left a comment
Some extra edges that I hope we can avoid.
Apart from this, I wonder whether we do want the instances themselves to be the sources. Since we identify the reads that actually pull out the user-controlled data, should those points not be the sources?
(I guess if someone were to print the instance, that would also expose the data.)
So there isn't flow from *any* instance to *any* access of the methods, but only from the _actual_ instance where the method is accessed.
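As a rough illustration of the distinction (a toy model in Python, not the CodeQL implementation), the sketch below tracks flow through a tiny graph with two instances. Taint only reaches a method access if the access is on an expression that the source instance actually flows to. All node labels and helper names are hypothetical.

```python
from collections import deque

# Hypothetical expression labels for a toy program with two instances:
# one built from the request, one built from trusted data.
LOCAL_FLOW = {
    "form = FieldStorage()": ["form"],
    "safe = FieldStorage(trusted)": ["safe"],
}
# Taint steps: receiver expression -> attribute reads on that receiver.
ATTR_READS = {
    "form": ["form.getvalue('x')"],
    "safe": ["safe.getvalue('y')"],
}
ALL_READS = {read for reads in ATTR_READS.values() for read in reads}


def reachable(source):
    """Nodes reachable from `source` via local flow and attribute-read steps."""
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in LOCAL_FLOW.get(node, []) + ATTR_READS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Precise: only reads on the instance that the source flows to are tainted.
precise = reachable("form = FieldStorage()") & ALL_READS
# Imprecise: any instance taints any read, which also flags the trusted one.
imprecise = set(ALL_READS)
```

In this toy model `precise` contains only the read on `form`, while `imprecise` also includes the read on `safe`, which is the kind of extra edge the comment above hopes to avoid.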
Co-authored-by: yoff <lerchedahl@gmail.com>
Ok, no more details from me. The high-level question still stands: Is there an obvious downside to only marking the actual data extraction points as sinks? It seems it would allow us to skip adding extra taint steps altogether?
yoff
left a comment
LGTM
As discussed verbally, the path from the initial instance to the field extractions can be useful both in path explanations and to facilitate sanitizers.
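A toy sketch of that argument, again in plain Python rather than CodeQL: when the instance is the source, the reported path starts at the instantiation, and a sanitizer placed between the instance and the field extraction can cut the flow. The graph, the node labels, and the `check` step are all hypothetical.

```python
from collections import deque

# Hypothetical flow graph: instance -> validation -> field extraction -> sink.
EDGES = {
    "form = cgi.FieldStorage()": ["validated = check(form)"],
    "validated = check(form)": ["validated.getvalue('cmd')"],
    "validated.getvalue('cmd')": ["eval(...)"],
}


def find_path(source, sink, sanitizers=frozenset()):
    """Breadth-first search for a flow path that avoids sanitizer nodes."""
    if source in sanitizers:
        return None
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in EDGES.get(node, []):
            if nxt not in prev and nxt not in sanitizers:
                prev[nxt] = node
                queue.append(nxt)
    return None


# Instance as source: the explanation shows where the data entered the program.
full_path = find_path("form = cgi.FieldStorage()", "eval(...)")
# Extraction point as source: the path starts later and omits the instance.
short_path = find_path("validated.getvalue('cmd')", "eval(...)")
# Treating the validation step as a sanitizer blocks the flow entirely.
blocked = find_path(
    "form = cgi.FieldStorage()",
    "eval(...)",
    sanitizers={"validated = check(form)"},
)
```

If only the extraction point were a source, a sanitizer sitting before it would have nothing to block, which is one way to read the trade-off described above.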
I manually tested this against a few projects, and it does bring back missing results for Code Injection 🎉