Python: Aiohttp improvements #13731

Merged
merged 14 commits into from
Aug 9, 2023

Conversation

pwntester
Contributor

Few improvements for the aiohttp framework support:

  • New heuristic source for request handlers based on type hints
  • New Path Manipulation sink for FileResponse
  • New SSRF sink for ClientSession.ws_connect
  • Better recognition of the HTTP response's content type

Member

@RasmusWL RasmusWL left a comment

Overall looks good to me 👍

My one major concern with adding remote flow sources based on type annotations is that they will introduce new, shorter paths. In the example below, it is much more interesting to see the whole path starting at handler, with its bad sanitizer, than the path that starts only at the req parameter in foo. I would like to think a little more about this aspect 🤔

@routes.post(...)
async def handler(request: aiohttp.web.Request):
    if request.host == "127.0.0.1":
        foo(request)

...

async def foo(req: aiohttp.web.Request):
    db.cursor().execute(await req.content.read())

Can you please add tests? I expect them in these files:

  • python/ql/test/library-tests/frameworks/aiohttp/client_request.py
  • python/ql/test/library-tests/frameworks/aiohttp/response_test.py
  • python/ql/test/library-tests/frameworks/aiohttp/taint_test.py (for the new remote-flow-source)

@RasmusWL RasmusWL changed the title from "[Python] Aiohttp improvements" to "Python: Aiohttp improvements" Jul 13, 2023
@RasmusWL
Member

Actually, we should also add a test in python/ql/test/library-tests/frameworks/aiohttp/taint_test.py (for the new remote-flow-source)

Alvaro Muñoz and others added 2 commits July 13, 2023 12:23
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@pwntester pwntester requested a review from a team as a code owner July 13, 2023 10:24
@github-actions github-actions bot added the Go label Jul 13, 2023
@pwntester
Contributor Author

Thanks @RasmusWL, I added the tests, but I'm not sure what the inline test expectations should look like. Can you please take a look?

@github-actions github-actions bot removed the Go label Jul 13, 2023
RasmusWL added 4 commits July 13, 2023 13:57
But notice that keyword argument is not handled yet
However, notice that the concepts tests use the HttpResponse location
for the `responseBody` tag, which seems a little odd in this situation,
where they are actually separate. Will fix in next commit.
RasmusWL
RasmusWL previously approved these changes Jul 13, 2023
Member

@RasmusWL RasmusWL left a comment

Great. I fixed up tests a bit (checking if we handle keyword arguments as well), and while I had the code checked out locally, I improved the modeling a little more as well.

Overall this LGTM, but let us think a little more about the potential problem with which paths are shown before merging. (I will also run a performance test, just to be sure.)

@pwntester
Contributor Author

Thanks for fixing the tests @RasmusWL! As for the heuristic sources, in JS they live in a separate file that you can enable to opt in to more results. Perhaps that's a way to implement it until the subflow filtering is implemented.

@RasmusWL
Member

Thanks for fixing the tests @RasmusWL! As for the heuristic sources, in JS they live in a separate file that you can enable to opt in to more results. Perhaps that's a way to implement it until the subflow filtering is implemented.

I think calling it a "heuristic source" does not do justice to what you're doing here. For me, a heuristic source would be something like: "we see that a parameter is called request, so it's probably an incoming HTTP request, and we'll model all attributes as tainted".

So I think we should 100% include the new sources; it's just a question of how to do it.
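
The distinction drawn above can be illustrated with a small runtime analogy. This is only a sketch, not how the CodeQL library is implemented: `Request` is a hypothetical stand-in for `aiohttp.web.Request`, and both helper functions are made up for illustration. The first guesses by parameter name (a true heuristic), the second looks at the type annotation.

```python
import inspect
import typing


class Request:
    """Hypothetical stand-in for aiohttp.web.Request (avoids a third-party import)."""


def params_named_request(func):
    """Name-based guess: any parameter literally called 'request'."""
    return [name for name in inspect.signature(func).parameters if name == "request"]


def params_annotated_as_request(func):
    """Annotation-based lookup: parameters whose type hint is Request."""
    hints = typing.get_type_hints(func)
    return [
        name
        for name in inspect.signature(func).parameters
        if hints.get(name) is Request
    ]


async def foo(req: Request):
    ...


def render(request, template):
    ...
```

Under these assumptions, the name-based guess misses `foo` (its parameter is called `req`) and wrongly picks up `render`, while the annotation-based lookup finds exactly the parameter that is declared to be a request.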

@RasmusWL
Member

Evaluation was fine 👍

I've made a minor naming update to highlight that this is not a heuristic source, and changed the char-pred so it's not overlapping with AiohttpRequestHandlerRequestParam (makes reporting sources a little more accurate)

RasmusWL
RasmusWL previously approved these changes Jul 14, 2023
@RasmusWL RasmusWL removed the request for review from a team July 14, 2023 10:56
@calumgrant calumgrant requested a review from yoff July 17, 2023 09:09
@yoff
Contributor

yoff commented Jul 17, 2023

I am a bit worried about the ability to jump into the middle of a path; not just for display reasons, but also for missing upstream sanitisers. If there is no easy way to fix it now, we should probably merge as is, though...

@pwntester
Contributor Author

I understand the concern, but this is something we already have problems with when the sanitizer is a middleware/filter function

@yoff
Contributor

yoff commented Jul 17, 2023

I understand the concern, but this is something we already have problems with when the sanitizer is a middleware/filter function

But that is mainly because we do not have a model of middleware 🙂 I would love a list of examples of this, though, since we might look into this soon...

Contributor

@yoff yoff left a comment

The requested changes are buried in comments. To be clear, they are

  • adding a test showing that we can now skip sanitisers by jumping past them to a type-annotated node.
    (We will accept this for now and merge the PR once we have a test documenting the behaviour.)
  • simplifying using getSubscript

Comment on lines 573 to 579
exists(DataFlow::Node headers, Dict d |
  headers = this.getArgByName("headers").getALocalSource()
|
  headers.asExpr() = d and
  d.getAKey().(StrConst).getText().toLowerCase() = "content-type" and
  d.getAValue() = result.asExpr()
)
Contributor

Unless the toLowerCase is crucial, you can simply do

Suggested change
- exists(DataFlow::Node headers, Dict d |
-   headers = this.getArgByName("headers").getALocalSource()
- |
-   headers.asExpr() = d and
-   d.getAKey().(StrConst).getText().toLowerCase() = "content-type" and
-   d.getAValue() = result.asExpr()
- )
+ result =
+   this.getKeywordParameter("headers").getSubscript("content-type").getAValueReachingSink()

If toLowerCase is crucial, we should add it either as default or as an option to getSubscript (say getSubscriptCaseInsensitive).

Contributor Author

HTTP headers are case insensitive, so we would need to implement the getSubscriptCaseInsensitive predicate
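
To make the concern concrete, here is a minimal Python sketch of what a case-insensitive lookup has to cover. The function name `content_type_from_headers` is hypothetical and is not part of aiohttp or CodeQL; it only shows that a headers dict may spell the key in any casing.

```python
def content_type_from_headers(headers):
    """Return the Content-Type value from a headers mapping, ignoring key case."""
    for key, value in headers.items():
        if key.lower() == "content-type":
            return value
    return None
```

A model that matches only the literal key "content-type" would miss dicts written with "Content-Type" or "CONTENT-TYPE", which is why the lowercase comparison is needed.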

Contributor

For now you can perhaps do

exists(string key | key.toLowerCase() = "content-type" |
  result =
    this.getKeywordParameter("headers").getSubscript(key).getAValueReachingSink()
)

DataFlow::ParameterNode, RemoteFlowSource::Range
{
AiohttpRequestParamFromTypeAnnotation() {
not this instanceof AiohttpRequestHandlerRequestParam and
Contributor

Once the API graph overhaul has been merged, we should be able to change it like this:

Suggested change
- not this instanceof AiohttpRequestHandlerRequestParam and
+ not this = any(AiohttpRequestHandlerRequestParam p).track() and

This should remove the problem of jumping into the middle of a path.
Until then, it would be good to have a test illustrating the problem (similar to the example given, but with a proper sanitiser so it is actually an FP); then we can see the FP disappear when we make this change.

(And with the overhaul, we should also be able to make these InstanceSources a fair bit nicer, I imagine.)
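
A rough sketch of the shape such a test could take is shown below. It is not the test that was added to the PR. `Request` and `run_query` are hypothetical stand-ins (for `aiohttp.web.Request` and a SQL execution sink), and the handlers are synchronous to keep the sketch runnable with the standard library alone.

```python
from dataclasses import dataclass, field

ALLOWED_TABLES = {"users", "orders"}
executed = []  # records what reaches the stand-in sink


@dataclass
class Request:
    """Hypothetical stand-in for aiohttp.web.Request."""
    query: dict = field(default_factory=dict)


def run_query(sql):
    """Stand-in for a SQL execution sink such as cursor.execute."""
    executed.append(sql)


def handler(request: Request):
    # Proper sanitizer: only allow-listed table names are forwarded.
    if request.query.get("table") in ALLOWED_TABLES:
        return foo(request)
    return "rejected"


def foo(req: Request):
    # A source based only on the type annotation would start a path here,
    # past the allow-list check in handler, and report a false positive.
    run_query("SELECT * FROM " + req.query["table"])
    return "ok"
```

At runtime the sink is reached only with allow-listed values, so an alert on `foo` would be a false positive. That is the behaviour the test is meant to document until sources reachable from a known handler parameter can be excluded.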

@pwntester
Contributor Author

But that is mainly because we do not have a model of middleware 🙂 I would love a list of examples of this, though, since we might look into this soon...

This is an example of a security filter middleware used in Home Assistant and therefore this code is called before dispatching the request to any request handler
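
The general pattern can be sketched as follows. This is not Home Assistant's code: `security_filter`, `serve_file`, and `app_handler` are hypothetical names, and the filter is a plain function wrapper rather than a real aiohttp middleware.

```python
def security_filter(handler):
    """Wrap a handler so requests with suspicious paths never reach it."""
    def wrapped(path):
        if ".." in path:
            return "400 Bad Request"
        return handler(path)
    return wrapped


def serve_file(path):
    # The handler itself does no checking; it relies on the filter.
    return "200 OK: " + path


app_handler = security_filter(serve_file)
```

Looked at in isolation, `serve_file` appears to use its input unchecked. An analysis that does not model the wrapping step will not see the check, which is the same gap as starting a path at a type-annotated parameter.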

@yoff
Contributor

yoff commented Jul 18, 2023

This is an example of a security filter middleware used in Home Assistant and therefore this code is called before dispatching the request to any request handler

Thanks!

Contributor

@yoff yoff left a comment

LGTM, given the tests all pass.

@RasmusWL RasmusWL merged commit 51a0528 into github:main Aug 9, 2023
3 participants