Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HAR #1464

Merged
merged 27 commits into from Aug 15, 2016
Merged

HAR #1464

merged 27 commits into from Aug 15, 2016

Conversation

dufferzafar
Copy link
Contributor

@dufferzafar dufferzafar commented Aug 2, 2016

This adds a script that can dump flows as HAR but does not depend on harparser.

Tasks:

  • Tests based on comparison of entire HAR files?
  • Unit tests for formatting functions
  • .zhar Compression
  • SSL and Connect Timings
  • Logging via ctx.log
  • Basic Test
  • Proper Cookies Format
  • Remove harparser dependency in setup.py
  • Append to file on every request, rather than writing at end (forgot that this is json!)
  • Read har file? (Separate PR if at all)

New Fields:

  • text in response content
  • postData for POST, PUT, PATCH
  • serverIPAddress
  • connection (we have no unique ids for connections)
  • pages (mitmproxy has no concept of pages)

Fixes #1320

@mkagenius
Copy link
Contributor

mkagenius commented Aug 2, 2016

The post Data field is missing in case of POST method (was missing in original har_extractor too). Would it possible to add that? Something like this, maybe:

    if flow.request.method == "POST":
        entry["request"]["postData"] = format_post_data(flow.request)

and

def format_params(obj):
    if obj:
        return [{"name": k, "value": unicode(v, errors='ignore')} for k, v in obj.items()]
    return ""

def format_post_data(obj):
    content_type = obj.headers.get("content-type", "")
    post_data = {"mimeType" : content_type}
    is_valid_content_type = "application/x-www-form-urlencoded" in content_type.lower()
    if is_valid_content_type:
        post_data["params"] = format_params(obj.urlencoded_form)

    post_data["text"] = unicode(obj.content, errors='ignore')
    return post_data

@mkagenius
Copy link
Contributor

mkagenius commented Aug 2, 2016

One more thing, regarding cookies (in response), they are coming like (same behavior in har_extractor too):

  "cookies": [
            {
              "name": "__cfduid",
              "value": "SetCookie(value='d4554e2d8e2c1406a252163f6f693300a1470146048', attrs=CookieAttrs[('expires', 'Wed, 02-Aug-17 13:54:08 GMT'), ('path', '/'), ('domain', '.mydomain.co'), ('HttpOnly', None)])"
            }
          ],

While each one should be a separate field as in the specification:

"cookies": [
    {
        "name": "TestCookie",
        "value": "Cookie Value",
        "path": "/",
        "domain": "www.janodvarko.cz",
        "expires": "2009-07-24T19:20:30.123+02:00",
        "httpOnly": false,
        "secure": false,
        "comment": ""
    }
]

@mhils
Copy link
Member

mhils commented Aug 3, 2016

This looks great. So much clearer than what we had before 👍

return dt.replace(tzinfo=pytz.timezone("UTC")).isoformat()


def format_cookies(cookies):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I unit test these functions? examples directory has no __init__.py so har_dump can't directly be imported!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the stuff in test_examples:

m, sc = tscript("har_dump.py", six.moves.shlex_quote(path))
format_datetime = sc.ns["format_datetime"]
# work with format_datetime

Not really ideal, but I feel that's still better than making the examples directory a module.

@dufferzafar
Copy link
Contributor Author

@mkagenius Should the postData field be included in PUT, PATCH requests too?

@mhils This is ready for review!

@mkagenius
Copy link
Contributor

@dufferzafar Good question. I am also not sure as the specification is not too clear about that. But I checked what chrome does when we do copy ALL as HAR from inspect element window and found that in case of PUT / PATCH too it creates the post body.

@dufferzafar
Copy link
Contributor Author

@mkagenius If chrome does this, then I think we should too. It seems like the correct thing to do anyways.


elif 'max-age' in cookie_attrs:
try:
max_age = int(cookie_attrs['Max-Age'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a policy about cookie attribute capitalisation?
Sometimes we use only-lowercase, sometimes mixed...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no policy, i guess because the _kconv function converts these keys to lowercase.

IMO, we should keep using mixed cases just to show that they don't matter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants