New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added mitmXSS addon #1907

Merged
merged 19 commits into from Feb 27, 2017

Conversation

Projects
None yet
5 participants
@ddworken
Contributor

ddworken commented Dec 30, 2016

mitmXSS is a new addon for mitmproxy that automatically scans every visited page for XSS vulnerabilities. It outputs any found XSS vulnerabilities to the mitmproxy log (viewable by pressing e).

xss_scanner.py is the actual script for mitmproxy
test_xss_scanner.py contains unit tests for xss.py

ddworken added some commits Dec 30, 2016

Added mitmXSS addon
```xss.py``` is a new script for mitmproxy that automatically scans every visited page for XSS vulnerabilities
```xss_testing.py``` contains unit tests for xss.py
```images/``` contains 4 example outputs from xss.py
Automatically send cookies in all requests
Added a getCookies(flow) function that returns cookies in a format readable by requests.
Include those cookies in all requests made by the script
@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 1, 2017

Contributor

I just added another commit so that it automatically includes cookies in all of its requests. This is done using the flow object and pulling the information out of flow.response.cookies.fields and flow.request.cookies.fields and parsing it into a format readable by the requests library.

Contributor

ddworken commented Jan 1, 2017

I just added another commit so that it automatically includes cookies in all of its requests. This is done using the flow object and pulling the information out of flow.response.cookies.fields and flow.request.cookies.fields and parsing it into a format readable by the requests library.

@cortesi

This comment has been minimized.

Show comment
Hide comment
@cortesi

cortesi Jan 22, 2017

Member

Hi there,

Apologies for taking so long to get back to this issue - I've been nightmarishly busy.

  • I really want this type of functionality in mitmproxy. I think the next step for the project is to start building higher-level assessment functionality, and this is precisely the type of thing we need.
  • In its current form, this sends information to the logs when an issue is found. I think we can be much more structured than that. I'm working on a new top-level concept for mitmproxy - a session where addons like this can add annotations and user reports, which can then be viewed and manipulated by the user. I should have something for the project to look at and discuss relatively soon. Getting this in place is not a blocker for this PR, but the addon will need to be refocused pretty quickly afterwards.
  • Now for the major blocker. We've just worked very hard to remove lxml from our distribution - it's a horrible package to distribute and has caused us major issues. I think @mhils in particular is likely to have an aneurism if we just re-introduce it. ;) I'm genuinely torn on this. On the one hand, moving away from it makes our lives much easier. On the other hand, higher-level functionality like this PR and other things that I have planned absolutely requires a high-quality HTML parser. I think lxml is pretty much the only game in town here for the Python world. We need to discuss this, but I'm very reluctantly in favour of reintroducing lxml as a dependency.
  • Finally, there are some structural things in this PR that should be fixed before we pull.
    • Could we please put the tests in the top-level project tests directory.
    • Having a separate license file for a portion of the project is a headache. This code might be moved around and split up within the project. If you don't mind, we'd prefer if you could add the copyright attribution to the top of the file, and just license the contribution under the top-level MIT license. We're super meticulous about recognition and attribution, and you'll be mentioned by name in release notes and the like.

If we resolve the structural issues and convince the other maintainers to re-add lxml, I'm in favor of pulling this in, on the understanding that we'll refactor once sessions are in tree.

Cheers,

Aldo

Member

cortesi commented Jan 22, 2017

Hi there,

Apologies for taking so long to get back to this issue - I've been nightmarishly busy.

  • I really want this type of functionality in mitmproxy. I think the next step for the project is to start building higher-level assessment functionality, and this is precisely the type of thing we need.
  • In its current form, this sends information to the logs when an issue is found. I think we can be much more structured than that. I'm working on a new top-level concept for mitmproxy - a session where addons like this can add annotations and user reports, which can then be viewed and manipulated by the user. I should have something for the project to look at and discuss relatively soon. Getting this in place is not a blocker for this PR, but the addon will need to be refocused pretty quickly afterwards.
  • Now for the major blocker. We've just worked very hard to remove lxml from our distribution - it's a horrible package to distribute and has caused us major issues. I think @mhils in particular is likely to have an aneurism if we just re-introduce it. ;) I'm genuinely torn on this. On the one hand, moving away from it makes our lives much easier. On the other hand, higher-level functionality like this PR and other things that I have planned absolutely requires a high-quality HTML parser. I think lxml is pretty much the only game in town here for the Python world. We need to discuss this, but I'm very reluctantly in favour of reintroducing lxml as a dependency.
  • Finally, there are some structural things in this PR that should be fixed before we pull.
    • Could we please put the tests in the top-level project tests directory.
    • Having a separate license file for a portion of the project is a headache. This code might be moved around and split up within the project. If you don't mind, we'd prefer if you could add the copyright attribution to the top of the file, and just license the contribution under the top-level MIT license. We're super meticulous about recognition and attribution, and you'll be mentioned by name in release notes and the like.

If we resolve the structural issues and convince the other maintainers to re-add lxml, I'm in favor of pulling this in, on the understanding that we'll refactor once sessions are in tree.

Cheers,

Aldo

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 24, 2017

Contributor

Thanks for all the feedback!

On the topic of lxml, I totally get the hesitation towards reintroducing it as a dependency. If you are already building a new feature for addons, would it potentially make sense to build dependency management into that as a feature? That way one doesn't need lxml in order to run mitmproxy but there can still be a nice UI for running this addon.

On the topic of the license, yup that makes sense— just changed that.

As far as moving tests to the top level tests directory, I'm happy to do that but I'm not sure that it makes sense to do so. If the ultimate goal is to have a lot of addons for mitmproxy, I'm not sure it makes sense to include tests for each addon in the tests for the project as a whole. But if you are sure that it is the right decision for the time being, let me know and I'll move them over.

Thanks,
David Dworken

Contributor

ddworken commented Jan 24, 2017

Thanks for all the feedback!

On the topic of lxml, I totally get the hesitation towards reintroducing it as a dependency. If you are already building a new feature for addons, would it potentially make sense to build dependency management into that as a feature? That way one doesn't need lxml in order to run mitmproxy but there can still be a nice UI for running this addon.

On the topic of the license, yup that makes sense— just changed that.

As far as moving tests to the top level tests directory, I'm happy to do that but I'm not sure that it makes sense to do so. If the ultimate goal is to have a lot of addons for mitmproxy, I'm not sure it makes sense to include tests for each addon in the tests for the project as a whole. But if you are sure that it is the right decision for the time being, let me know and I'll move them over.

Thanks,
David Dworken

@mhils

This comment has been minimized.

Show comment
Hide comment
@mhils

mhils Jan 24, 2017

Member

So, lxml is a difficult topic. It caused a lot of headaches for me by being the only dependency which was not working on Windows with Python 3 while we were already on track to being Python 3 only (see #1444). I initially tried to solve this the nice way and spent a significant amount of time and effort building Windows wheels for lxml (mhils/libxml2-win-binaries and a couple of PRs to lxml), which really wasn't fun. Unfortunately, after doing all of this, there were no visible signs that a lxml release with these changes would eventually appear within a reasonable timeframe. We had to come up with alternatives (#1657) and I finally replaced the lxml contentview implementation in #1831. For the record, I do think the lxml maintainer is well-intentioned (and he clearly has no obligations to us), but he seems to have very little time at hand and even less interest in supporting downstream. He did manage to push out a release with Windows wheels by now, but I still think that depending on lxml is risky and likely to put us in a similar situation again. Sorry for being negative on this, but I've been burnt.

Let's get a bit more positive and productive:

If you are already building a new feature for addons, would it potentially make sense to build dependency management into that as a feature?

Right now the major selling point of addons is that they allow us to structure and separate our internal logic. External addons that come with their own dependencies are the next logical step, but if I read @cortesi's comment correctly he thinks that this feature is important enough to be right in the core so we can skip that problem for now :). If we can do that without lxml, I would tend to agree.

Can we use https://docs.python.org/3.6/library/xml.etree.elementtree.html instead of lxml? IIRC we discarded that option for the contentviews as it auto-fixes HTML when prettyprinting and failed for some malformed documents, but those limitations shouldn't be dealbreakers for this feature. This would also allow us to mitigate some DoS attack vectors if we use defusedxml, an option we don't have with lxml.

Member

mhils commented Jan 24, 2017

So, lxml is a difficult topic. It caused a lot of headaches for me by being the only dependency which was not working on Windows with Python 3 while we were already on track to being Python 3 only (see #1444). I initially tried to solve this the nice way and spent a significant amount of time and effort building Windows wheels for lxml (mhils/libxml2-win-binaries and a couple of PRs to lxml), which really wasn't fun. Unfortunately, after doing all of this, there were no visible signs that a lxml release with these changes would eventually appear within a reasonable timeframe. We had to come up with alternatives (#1657) and I finally replaced the lxml contentview implementation in #1831. For the record, I do think the lxml maintainer is well-intentioned (and he clearly has no obligations to us), but he seems to have very little time at hand and even less interest in supporting downstream. He did manage to push out a release with Windows wheels by now, but I still think that depending on lxml is risky and likely to put us in a similar situation again. Sorry for being negative on this, but I've been burnt.

Let's get a bit more positive and productive:

If you are already building a new feature for addons, would it potentially make sense to build dependency management into that as a feature?

Right now the major selling point of addons is that they allow us to structure and separate our internal logic. External addons that come with their own dependencies are the next logical step, but if I read @cortesi's comment correctly he thinks that this feature is important enough to be right in the core so we can skip that problem for now :). If we can do that without lxml, I would tend to agree.

Can we use https://docs.python.org/3.6/library/xml.etree.elementtree.html instead of lxml? IIRC we discarded that option for the contentviews as it auto-fixes HTML when prettyprinting and failed for some malformed documents, but those limitations shouldn't be dealbreakers for this feature. This would also allow us to mitigate some DoS attack vectors if we use defusedxml, an option we don't have with lxml.

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 25, 2017

Contributor

That sounds reasonable, I should have some free time this weekend so I'll try switching it over to see if that works. Depending on the extent to which it auto-fixes the HTML, it may or may not end up being a problem (I don't think it will be problematic though).

Contributor

ddworken commented Jan 25, 2017

That sounds reasonable, I should have some free time this weekend so I'll try switching it over to see if that works. Depending on the extent to which it auto-fixes the HTML, it may or may not end up being a problem (I don't think it will be problematic though).

@cortesi

This comment has been minimized.

Show comment
Hide comment
@cortesi

cortesi Jan 25, 2017

Member

We need to think about this carefully, and there's no rush. A few quick comments:

  • ElementTree is not an HTML parser, and last time I checked failed on lots of very common HTML. There are 3 libraries for "real-world" HTML parsing in Python, as far as I can tell: lxml, beautifulsoup (based on lxml) and html5lib (based on lxml). As far as I can see, for real-world parsing, the options are to a) use lxml or b) fork it, or c) implement our own. I think we agree that all of these are apocalyptic fates worse than death, but I think that's the hand we've been dealt.
  • We believe we need an HTML parser in tree. I think one next step for mitmproxy is to include higher-level functionality, and HTML is the raw material we deal with most often. XSS detection is a good example. Another, which I've been toying with, is auto-crawling websites for end-point discovery. Both of these things would be very useful to me day-to-day, and would let us compete better against software like Burp. I'm excited about this direction, and I'll have more to say about it soon.
  • My judgement is that XSS detection is the sort of thing we'd like in mitmproxy itself. An example of something that should remain external is bdfproxy or intetration with things like beefhook (though I'm willing to be convinced otherwise).
  • Regarding the tests - I can be swayed on test location, but at the moment all our tests are in the /tests directory. Until we make some official change there for addons, I think things we maintain in-tree should follow the project norms.
Member

cortesi commented Jan 25, 2017

We need to think about this carefully, and there's no rush. A few quick comments:

  • ElementTree is not an HTML parser, and last time I checked failed on lots of very common HTML. There are 3 libraries for "real-world" HTML parsing in Python, as far as I can tell: lxml, beautifulsoup (based on lxml) and html5lib (based on lxml). As far as I can see, for real-world parsing, the options are to a) use lxml or b) fork it, or c) implement our own. I think we agree that all of these are apocalyptic fates worse than death, but I think that's the hand we've been dealt.
  • We believe we need an HTML parser in tree. I think one next step for mitmproxy is to include higher-level functionality, and HTML is the raw material we deal with most often. XSS detection is a good example. Another, which I've been toying with, is auto-crawling websites for end-point discovery. Both of these things would be very useful to me day-to-day, and would let us compete better against software like Burp. I'm excited about this direction, and I'll have more to say about it soon.
  • My judgement is that XSS detection is the sort of thing we'd like in mitmproxy itself. An example of something that should remain external is bdfproxy or intetration with things like beefhook (though I'm willing to be convinced otherwise).
  • Regarding the tests - I can be swayed on test location, but at the moment all our tests are in the /tests directory. Until we make some official change there for addons, I think things we maintain in-tree should follow the project norms.
@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 25, 2017

Contributor

I just finished switching it over to html.parser and it seems to work fine with that. Even though it isn't as robust as lxml or beautifulsoup, it should be fine for this given that 100% reliability for it is not essential. I think we can leave it with this and if at any point lxml is pulled in as a dependency, then we can switch back to that.

I'll move the tests over to /tests later once I get the chance to figure out how to get imports working for /tests.

Contributor

ddworken commented Jan 25, 2017

I just finished switching it over to html.parser and it seems to work fine with that. Even though it isn't as robust as lxml or beautifulsoup, it should be fine for this given that 100% reliability for it is not essential. I think we can leave it with this and if at any point lxml is pulled in as a dependency, then we can switch back to that.

I'll move the tests over to /tests later once I get the chance to figure out how to get imports working for /tests.

@mhils

This comment has been minimized.

Show comment
Hide comment
@mhils

mhils Jan 25, 2017

Member

Fantastic, thanks for the quick change! 😃
Imports should work just the same way. We're are using py.test for our testing, so you can simplify your tests with plain assert statements if you want - both ways are fine though, so no worries 😊

Member

mhils commented Jan 25, 2017

Fantastic, thanks for the quick change! 😃
Imports should work just the same way. We're are using py.test for our testing, so you can simplify your tests with plain assert statements if you want - both ways are fine though, so no worries 😊

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 26, 2017

Contributor

Thanks! Just moved the tests to the test directory and fixed the lint problems.

Contributor

ddworken commented Jan 26, 2017

Thanks! Just moved the tests to the test directory and fixed the lint problems.

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 26, 2017

Contributor

I'll update this later tonight with some refactoring to increase code coverage (I just need to shift around some of the requests and log calls).

Contributor

ddworken commented Jan 26, 2017

I'll update this later tonight with some refactoring to increase code coverage (I just need to shift around some of the requests and log calls).

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 27, 2017

Contributor

Ok, I added a bunch of extra tests using mock to deal with requests so it should now pass the code coverage requirements.

Contributor

ddworken commented Jan 27, 2017

Ok, I added a bunch of extra tests using mock to deal with requests so it should now pass the code coverage requirements.

ddworken added some commits Jan 27, 2017

@mhils

This comment has been minimized.

Show comment
Hide comment
@mhils

mhils Jan 28, 2017

Member

Not sure why the tests are failing, but it sure looks like you added dist/mitmproxy-2.0.0-py2.7.egg by accident?

Member

mhils commented Jan 28, 2017

Not sure why the tests are failing, but it sure looks like you added dist/mitmproxy-2.0.0-py2.7.egg by accident?

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 28, 2017

Contributor

Yup, didn't notice that I accidentally committed that. Just removed it.

Contributor

ddworken commented Jan 28, 2017

Yup, didn't notice that I accidentally committed that. Just removed it.

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jan 28, 2017

Contributor

Just added SQLi detection (and some slightly more robust tests).

Contributor

ddworken commented Jan 28, 2017

Just added SQLi detection (and some slightly more robust tests).

Show outdated Hide outdated mitmproxy/addons/mitmXSS/xss.py
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

This comment has been minimized.

@cortesi

cortesi Jan 28, 2017

Member

Would you mind if we remove the full license from here, and just abide with a copyright notice in your name? I'd like to keep the license to the LICENSE file at the top level - I find it cumbersome to include this much verbiage in the source itself.

@cortesi

cortesi Jan 28, 2017

Member

Would you mind if we remove the full license from here, and just abide with a copyright notice in your name? I'd like to keep the license to the LICENSE file at the top level - I find it cumbersome to include this much verbiage in the source itself.

This comment has been minimized.

@ddworken

ddworken Jan 28, 2017

Contributor

Just removed it. 👍

@ddworken

ddworken Jan 28, 2017

Contributor

Just removed it. 👍

@cortesi

This comment has been minimized.

Show comment
Hide comment
@cortesi

cortesi Jan 28, 2017

Member

For what it's worth, I've just poked at html.parse quite extensively, and it's improved hugely since I last looked (a few major releases ago). I think it's quite adequate for this.

Member

cortesi commented Jan 28, 2017

For what it's worth, I've just poked at html.parse quite extensively, and it's improved hugely since I last looked (a few major releases ago). I think it's quite adequate for this.

@cortesi

This comment has been minimized.

Show comment
Hide comment
@cortesi

cortesi Jan 28, 2017

Member

Could we please also remove the images from the patch. They're very nice, but they're very soon going to be out of date, and will be a pain to maintain. I propose that we add this sort of thing to the documentation at the point where this addon becomes a built-in (which will be sooner rather than later).

Member

cortesi commented Jan 28, 2017

Could we please also remove the images from the patch. They're very nice, but they're very soon going to be out of date, and will be a pain to maintain. I propose that we add this sort of thing to the documentation at the point where this addon becomes a built-in (which will be sooner rather than later).

@mhils mhils dismissed their stale review Feb 24, 2017

.

@Kriechi

Not sure what plans @cortesi has for this new addon, but the actual code needs some major refactoring/reformatting before it can be merged.

@mhils

We just discussed this on Slack and decided to move this into examples/ for now. This needs a few changes beforehand, see comments here. For the structure:

  • Move mitmproxy/addons/xss.py to examples/complex/, feel free to keep the name or rename it to xss_checker or whatever you think is best 😃
  • Move test/mitmproxy/addons/test_xss.py to test/mitmproxy/examples/xss.py (examples folder needs to be created)

Thanks again, @ddworken!

Show outdated Hide outdated mitmproxy/addons/xss.py
from urllib.parse import urlparse # Used to modify paths and queries for URLs
import requests # Used to send additional requests when looking for XSSs
import re # Used for pulling out the payload
from html.parser import HTMLParser # used for parsing HTML

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

I generally appreciate code with lots of comments, but you can leave imports uncommented. If I'd be interested to see what gethostbyname is used for, I'd grep the file for gethostbyname and hopefully understand its use at the respective place in the code. :)

@mhils

mhils Feb 24, 2017

Member

I generally appreciate code with lots of comments, but you can leave imports uncommented. If I'd be interested to see what gethostbyname is used for, I'd grep the file for gethostbyname and hopefully understand its use at the respective place in the code. :)

Show outdated Hide outdated mitmproxy/addons/xss.py
# The actual payload is put between a frontWall and a backWall to make it easy
# to locate the payload with regular expressions
frontWall = b"1029zxc"

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

We try to follow PEP-8 for the project where it makes sense.

For the whole file:
Variables and methods should be lowercase, with words separated by underscores as necessary to improve readability (e.g. front_wall). For constants you can also go for e.g. FRONT_WALL, if that does a better job at conveying your intention. Class names should use CapWords convention.

@mhils

mhils Feb 24, 2017

Member

We try to follow PEP-8 for the project where it makes sense.

For the whole file:
Variables and methods should be lowercase, with words separated by underscores as necessary to improve readability (e.g. front_wall). For constants you can also go for e.g. FRONT_WALL, if that does a better job at conveying your intention. Class names should use CapWords convention.

Show outdated Hide outdated mitmproxy/addons/xss.py
""" Return a dict going from cookie names to cookie values
- Note that it includes both the cookies sent in the original request and
the cookies sent by the server
Flow -> Dict """

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

We are Python 3.5+ only, so if you want to annotate types (which would not required), you can use Python 3 type annotations:

from mitmproxy import http
from typing import Dict

def get_cookies(flow: http.HTTPFlow) -> Dict[str, str]:
@mhils

mhils Feb 24, 2017

Member

We are Python 3.5+ only, so if you want to annotate types (which would not required), you can use Python 3 type annotations:

from mitmproxy import http
from typing import Dict

def get_cookies(flow: http.HTTPFlow) -> Dict[str, str]:
Show outdated Hide outdated mitmproxy/addons/xss.py
for name, SC in responseCookies:
cookieDict[name] = SC.value
for name, value in requestCookies:
cookieDict[name] = value

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

If the response sets a new value for cookie "foo", you'll here overwrite it with the previous value from the request.
I'm not sure if you need the response cookies (the initial request worked without them as well), but you can probably simplify this to

# untested
cookies = flow.request.cookies.copy()
cookies.update(flow.response.cookies)
return dict(cookies)
@mhils

mhils Feb 24, 2017

Member

If the response sets a new value for cookie "foo", you'll here overwrite it with the previous value from the request.
I'm not sure if you need the response cookies (the initial request worked without them as well), but you can probably simplify this to

# untested
cookies = flow.request.cookies.copy()
cookies.update(flow.response.cookies)
return dict(cookies)
Show outdated Hide outdated test/mitmproxy/addons/test_xss.py
self.value = value
class MockFlow:

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

Look at the tests in test/mitmproxy/ (test_example.py might be interesting in particular) - you should be able to use regular mitmproxy flows here.

@mhils

mhils Feb 24, 2017

Member

Look at the tests in test/mitmproxy/ (test_example.py might be interesting in particular) - you should be able to use regular mitmproxy flows here.

Show outdated Hide outdated test/mitmproxy/addons/test_xss.py
@@ -0,0 +1,280 @@
import unittest

This comment has been minimized.

@mhils

mhils Feb 24, 2017

Member

We use the fantastic pytest for testing mitmproxy, so no need for unittest here. You should get a good idea at https://github.com/mitmproxy/mitmproxy/blob/master/test/mitmproxy/test_examples.py#L129 about the style.

In a nutshell:

@mhils

mhils Feb 24, 2017

Member

We use the fantastic pytest for testing mitmproxy, so no need for unittest here. You should get a good idea at https://github.com/mitmproxy/mitmproxy/blob/master/test/mitmproxy/test_examples.py#L129 about the style.

In a nutshell:

ddworken added some commits Feb 25, 2017

Switched to pytest, added another detection algorithm, and fixed othe…
…r comments

Everything is now switched completely over to pytest using monkeypatch for mocking.
Added support for detecting XSS via attributes that execute JS
Removed comments on imports
Moved files from addons to examples
pep8 compliant variable names
@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Feb 26, 2017

Contributor

Now everything has been changed over so this can be integrated as an example. I moved the files, switched to pytest, fixed the variable names, and added type hinting.

Let me know if there is anything else to do for this and thanks for all the feedback!

Contributor

ddworken commented Feb 26, 2017

Now everything has been changed over so this can be integrated as an example. I moved the files, switched to pytest, fixed the variable names, and added type hinting.

Let me know if there is anything else to do for this and thanks for all the feedback!

@Kriechi

This is getting into a good shape! 👍
A few minor things are left - but overall a big improvement! Thanks for taking the time and addressing our comments!

Show outdated Hide outdated examples/complex/xss_scanner.py
# - line -> str
class XSSData:

This comment has been minimized.

@Kriechi

Kriechi Feb 26, 2017

Member

This class looks like it could be simplified by using a namedtuple instead.

@Kriechi

Kriechi Feb 26, 2017

Member

This class looks like it could be simplified by using a namedtuple instead.

Show outdated Hide outdated examples/complex/xss_scanner.py
# - dbms -> str
class SQLiData:

This comment has been minimized.

@Kriechi

Kriechi Feb 26, 2017

Member

This class looks like it could be simplified by using a namedtuple instead.

@Kriechi

Kriechi Feb 26, 2017

Member

This class looks like it could be simplified by using a namedtuple instead.

Show outdated Hide outdated examples/complex/xss_scanner.py
return False
return body[index:index + len(substring)] == substring
def is_not_escaped(index):

This comment has been minimized.

@Kriechi

Kriechi Feb 26, 2017

Member

Why are next_part_is_substring and is_not_escaped defined as functions?
The code is only called once. It would be cleaner to just assign it to as bool value:

next_part_is_substring = (
    (index + len(substring) > len(body)) or
    (body[index:index + len(substring)] == substring)
)

(similar for the other one, if I'm reading the conditions the right way...)

And then you can just this variable in you if-block below:
if char == qc and is_not_excaped:...

@Kriechi

Kriechi Feb 26, 2017

Member

Why are next_part_is_substring and is_not_escaped defined as functions?
The code is only called once. It would be cleaner to just assign it to as bool value:

next_part_is_substring = (
    (index + len(substring) > len(body)) or
    (body[index:index + len(substring)] == substring)
)

(similar for the other one, if I'm reading the conditions the right way...)

And then you can just this variable in you if-block below:
if char == qc and is_not_excaped:...

ddworken added some commits Feb 26, 2017

Switched to named tuples and removed two functions
Switched to namedtuples for XSSData and SQLiData. Note that it is using the NamedTuples from typing so that we get type hinting for the named tuples.
Inlined the `next_part_is_substring` and `is_not_escaped` functions
@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Feb 26, 2017

Contributor

@Kriechi I switched it over to NamedTuples (I'm using the NamedTuple from typing rather than from collections so that I still get type hinting throughout) and inlined those functions. Let me know if there is anything else. 👍

Contributor

ddworken commented Feb 26, 2017

@Kriechi I switched it over to NamedTuples (I'm using the NamedTuple from typing rather than from collections so that I still get type hinting throughout) and inlined those functions. Let me know if there is anything else. 👍

@mhils

This looks almost ready now! 😃

Show outdated Hide outdated test/mitmproxy/examples/test_xss_scanner.py
self.value = value
class MockFlow:

This comment has been minimized.

@mhils

mhils Feb 26, 2017

Member

This...

@mhils

mhils Feb 26, 2017

Member

This...

Show outdated Hide outdated test/mitmproxy/examples/test_xss_scanner.py
def __init__(self, value):
self.value = value
class MockFlow:

This comment has been minimized.

@mhils

mhils Feb 26, 2017

Member

...is redefined here...

@mhils

mhils Feb 26, 2017

Member

...is redefined here...

Show outdated Hide outdated test/mitmproxy/examples/test_xss_scanner.py
self.request = MockResponseOrRequest([("cookieName2", "cookieValue2")])
def test_get_cookies(self):
mocked_flow = MockFlow()

This comment has been minimized.

@mhils

mhils Feb 26, 2017

Member

...and then only used here. I think a tflow (where you add cookies) would do as well :)

@mhils

mhils Feb 26, 2017

Member

...and then only used here. I think a tflow (where you add cookies) would do as well :)

import re
from html.parser import HTMLParser
from mitmproxy import http
from typing import Dict, Union, Tuple, Optional, List, NamedTuple

This comment has been minimized.

@mhils

mhils Feb 26, 2017

Member

Now that this is in examples, let's add a brief comment here explaining what it does how to use it! 😃

@mhils

mhils Feb 26, 2017

Member

Now that this is in examples, let's add a brief comment here explaining what it does how to use it! 😃

Switched to tutils and tflow for mocking and added docs
Switched to using tflow with tutils.req and tutils.resp
Added a header to xss_scanner.py with a description of the script and how it works
@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Feb 26, 2017

Contributor

Thanks for the tip on setting cookies for the treq, I hadn't realized that I could directly do that. Now it works using that and I added docs to the header. Thanks again for all the help!

Contributor

ddworken commented Feb 26, 2017

Thanks for the tip on setting cookies for the treq, I hadn't realized that I could directly do that. Now it works using that and I added docs to the header. Thanks again for all the help!

@Kriechi Kriechi merged commit 99b584a into mitmproxy:master Feb 27, 2017

4 checks passed

codecov/patch Coverage not affected when comparing 973406f...f418701
Details
codecov/project 86.33% (+1.08%) compared to 973406f
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@Kriechi

This comment has been minimized.

Show comment
Hide comment
@Kriechi

Kriechi Feb 27, 2017

Member

Thanks! 🎉 👍
Really great work and thanks for addressing all our requests!

Member

Kriechi commented Feb 27, 2017

Thanks! 🎉 👍
Really great work and thanks for addressing all our requests!

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Feb 27, 2017

Contributor

🎉 Thanks again for working with me to get this merged! 🎉

Contributor

ddworken commented Feb 27, 2017

🎉 Thanks again for working with me to get this merged! 🎉

krsoninikhil added a commit to krsoninikhil/mitmproxy that referenced this pull request Mar 3, 2017

@brannondorsey

This comment has been minimized.

Show comment
Hide comment
@brannondorsey

brannondorsey Jul 27, 2017

Not sure if this is the best place for this, so feel free to shoo me away if not, but I'm having trouble parsing out exactly what the XSS payload in this example does. Would someone be able to explain this somewhat cryptic line?

Not sure if this is the best place for this, so feel free to shoo me away if not, but I'm having trouble parsing out exactly what the XSS payload in this example does. Would someone be able to explain this somewhat cryptic line?

@mhils

This comment has been minimized.

Show comment
Hide comment
@mhils

mhils Jul 27, 2017

Member

@brannondorsey: I might be wrong, but I think the point it to just probe which characters are escaped. :)

Member

mhils commented Jul 27, 2017

@brannondorsey: I might be wrong, but I think the point it to just probe which characters are escaped. :)

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jul 27, 2017

Contributor

Yup exactly. The letters between the special characters are used to make it easy to extract whether or not they are being escaped. Let me know if there is anything else I can do to clarify anything. 👍

Contributor

ddworken commented Jul 27, 2017

Yup exactly. The letters between the special characters are used to make it easy to extract whether or not they are being escaped. Let me know if there is anything else I can do to clarify anything. 👍

@brannondorsey

This comment has been minimized.

Show comment
Hide comment
@brannondorsey

brannondorsey Jul 27, 2017

Gotcha. So you insert those various checks and then read them back out from the injection to test if anything was escaped? And then if it wasn't, you've found a potential vulnerability? Also, why those specific values for FRONT_WALL and BACK_WALL? Thanks!

Gotcha. So you insert those various checks and then read them back out from the injection to test if anything was escaped? And then if it wasn't, you've found a potential vulnerability? Also, why those specific values for FRONT_WALL and BACK_WALL? Thanks!

@ddworken

This comment has been minimized.

Show comment
Hide comment
@ddworken

ddworken Jul 28, 2017

Contributor

Yup, exactly.

Not much reason behind those values, just needed a unique string that can be used to pick out the place where our input is reflected into the page. If you look at those strings while looking at a standard US qwerty keyboard there is some amount of a pattern but that is just a factor of how I mashed the keyboard.

Contributor

ddworken commented Jul 28, 2017

Yup, exactly.

Not much reason behind those values, just needed a unique string that can be used to pick out the place where our input is reflected into the page. If you look at those strings while looking at a standard US qwerty keyboard there is some amount of a pattern but that is just a factor of how I mashed the keyboard.

@Kriechi Kriechi referenced this pull request Jan 23, 2018

Closed

Extend mypy checking #2194

3 of 5 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment