Module boto3 - Taint Propagation Issue #795
Comments
Hi @giusepperaffa,
You are definitely on the right track. The problem is that
Note that there is another problem that you will hit next. Those models:
You are marking the result of a function as a sink (something we call "return sinks"). This will only impact the analysis of the body of
This is not what you want. What you want is to mark
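To illustrate the distinction between a return sink and a parameter sink, here is a minimal sketch of the two model shapes. The choice of delete() and of self as the annotated parameter is an assumption made for illustration. The class and parameter that actually need the annotation depend on which callee the call graph resolves bucket.objects.all().delete() to.

```python
# Return sink: only affects the analysis of the body of delete() itself.
def mypy_boto3_s3.service_resource.ObjectSummary.delete() -> TaintSink[Test]: ...

# Parameter sink (assumed shape): flags tainted data reaching the call through self.
def mypy_boto3_s3.service_resource.ObjectSummary.delete(self: TaintSink[Test]): ...
```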
Hi @arthaud,
Thank you very much for your help. I have now solved the problem, and this issue can be closed. A few details for the future.
Analysis of the call graph
Case 1 - pyre_dump()
Case 2 - pyre_dump_call_graph()
Model for the source
Model for the sink
However, it is the model for
Just for clarification: this must be because this is a somewhat new feature. The latest
Description
I have been trying to use Pysa (Ubuntu 20.04 + virtual environment + Python 3.8) to perform a data flow analysis of the following code (stored in a file called httphandler.py). The expected result is a data flow that has event (defined in the parameter list of the handler onHTTPPostEvent) as source and bucket.objects.all().delete() as sink.
Pysa does not detect any data flow and I am struggling to understand how this can be solved, if at all. Details on used Pysa models, debugging and resolution attempts are provided below.
Pysa models
These are the models that I tried using initially. All of them are considered valid by Pysa. Note that none of the models focuses on the Bucket object, as I was hoping to rely on the automatic propagation of the taint attached to event['body']. Models 2 and 3 below, which were written using the information available here, are alternatives to each other.
def httphandler.onHTTPPostEvent(event: TaintSource[Test]): ...
def mypy_boto3_s3.service_resource.BucketObjectsCollection.all() -> TaintSink[Test]: ...
def mypy_boto3_s3.service_resource.ObjectSummary.delete() -> TaintSink[Test]: ...
Debugging steps
As suggested here, I have instrumented my code with the functions reveal_taint and reveal_type. Results:
- event['body']: forward taint Test / No backward taint ({}). This is what I expected.
- bucket: No forward taint ({}) / No backward taint ({}). As for the type, it is correctly identified as mypy_boto3_s3.service_resource.Bucket
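A minimal sketch of this kind of instrumentation, assuming a simplified handler (the body of onHTTPPostEvent here is hypothetical and omits the boto3 calls). reveal_taint and reveal_type are interpreted by the analyzer and are not defined at runtime, so the no-op shims below are an addition of this sketch, not part of Pysa, and exist only so that the instrumented file can still be imported and run.

```python
import builtins
from typing import Any, Dict

# Pysa/Pyre interpret reveal_taint() and reveal_type() during analysis; they do
# not exist at runtime. These no-op shims (an assumption of this sketch) keep
# the instrumented module runnable outside the analyzer.
for _name in ("reveal_taint", "reveal_type"):
    if not hasattr(builtins, _name):
        setattr(builtins, _name, lambda value: None)


def onHTTPPostEvent(event: Dict[str, Any]) -> str:
    # Hypothetical, simplified handler body used only to show the instrumentation.
    body = event["body"]
    reveal_taint(body)  # the analyzer reports forward/backward taint of `body`
    reveal_type(body)   # the analyzer reports the inferred type of `body`
    return body
```

The calls are meant to be removed again once the investigation is finished.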
The conclusion is that the taint of event['body'] is not propagated to the object bucket, which explains why the expected data flow is not found.
Attempts to propagate the taint
Taking into account the information on TITO, I have attempted the following:
Attempt 1
I have attempted to propagate the taint by modelling the constructor of Bucket as follows:
def mypy_boto3_s3.service_resource.Bucket.__init__(self, name: TaintInTaintOut[LocalReturn]): ...
def mypy_boto3_s3.service_resource.Bucket.__init__(self, *args: TaintInTaintOut[LocalReturn]): ...
Neither of the above models was considered valid by Pysa, which suggested modelling the constructor of the base class (object) instead. I have therefore tried the following models, but none of them was considered valid by Pysa because the name parameter was unexpected.
def object.__init__(self, name: TaintInTaintOut[LocalReturn]): ...
def object.__init__(self, *args: TaintInTaintOut[LocalReturn]): ...
def object.__init__(self, **kwargs: TaintInTaintOut[LocalReturn]): ...
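A possible alternative target for the TITO model, offered as an unverified sketch: in the boto3 type stubs, a bucket is typically obtained by calling Bucket(...) on the service resource object rather than by instantiating the Bucket class directly, so the callee that Pysa sees may be a method on the resource class. The fully qualified name below (S3ServiceResource.Bucket) is an assumption and should be checked against the call graph before use.

```python
# Assumed callee name; confirm it with the call graph for the handler.
# Propagates taint from the name argument to the returned Bucket object.
def mypy_boto3_s3.service_resource.S3ServiceResource.Bucket(self, name: TaintInTaintOut[LocalReturn]): ...
```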
Attempt 2
The boto3 stubs documentation shows here that name is an attribute of Bucket. The following model is, in fact, considered valid by Pysa:
mypy_boto3_s3.service_resource.Bucket.name: TaintSource[Test]
However, the above model does not achieve the propagation of the taint from name to the Bucket object. Consequently, I have tried forcing the propagation with the models below, but neither of them worked because TaintInTaintOut is not supported in models that contain only attributes.
mypy_boto3_s3.service_resource.Bucket.name: TaintInTaintOut[Updates[self]]
mypy_boto3_s3.service_resource.Bucket.name: TaintInTaintOut[Updates[mypy_boto3_s3.service_resource.Bucket]]
Conclusion
The boto3.resource object is rather complex, and relying on automatic taint propagation is not an option. I would be keen to know whether there is a way of detecting the expected data flow in the above case or if this is a Pysa limitation / bug.
Please let me know if you need any additional information.
Thank you very much.