ValueError: Unknown operator when using flow file which contains content length information #15

Closed
AndreiUlmeyda opened this issue Jun 24, 2022 · 9 comments

Comments

@AndreiUlmeyda

AndreiUlmeyda commented Jun 24, 2022

Hey there, great tool. I am stoked to try it out.
I've run into a problem, though. Steps to reproduce:

  1. capture one request using mitmweb
  2. save flow file (file name 'flows')
  3. run mitmproxy2swagger -i flows -o schema -p https://redacted.redacted/

result:
No existing swagger file found. Creating new one.
Traceback (most recent call last):
  File "/home/redacted/.local/bin/mitmproxy2swagger", line 8, in <module>
    sys.exit(main())
  File "/home/redacted/.local/lib/python3.10/site-packages/mitmproxy2swagger/mitmproxy2swagger.py", line 121, in main
    for f in caputre_reader.captured_requests():
  File "/home/redacted/.local/lib/python3.10/site-packages/mitmproxy2swagger/har_capture_reader.py", line 87, in captured_requests
    data = json_stream.load(f)
  File "/usr/lib/python3.10/site-packages/json_stream/loader.py", line 8, in load
    return StreamingJSONBase.factory(token, token_stream, persistent)
  File "/usr/lib/python3.10/site-packages/json_stream/base.py", line 28, in factory
    raise ValueError(f"Unknown operator {token}")  # pragma: no cover
ValueError: Unknown operator 9613

interpretation:
The flow file itself uses a human-readable format that includes content-length information in addition to the request/response JSON data. The symbols the JSON parser complains about are the very first few characters of the flow file:

9613:4:type;4:http;7 ... 3880:{"httpstatuscode":"200","statusmessage":"OK" ... 16:certificate_list;1462:1456:-----BEGIN CERTIFICATE----- ...
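To make the length-prefixed layout concrete, here is a minimal sketch that splits one element off the front of such a byte string. It assumes each element has the shape `<decimal length>:<payload><one type character>`, which is what the `4:type;4:http;` fragment above looks like. The helper name `read_element` is made up for illustration and is not part of mitmproxy or mitmproxy2swagger.

```python
def read_element(data: bytes):
    """Split one length-prefixed element off the front of `data`.

    Assumed layout (inferred from the excerpt, not from a spec):
    <decimal length> ":" <payload of that many bytes> <one type character>

    Returns (payload, type_character, remaining_bytes).
    """
    length_text, colon, rest = data.partition(b":")
    if not colon or not length_text.isdigit():
        raise ValueError("input does not start with '<length>:'")
    length = int(length_text)
    payload = rest[:length]
    type_char = rest[length:length + 1]
    if len(payload) != length or not type_char:
        raise ValueError("element is truncated")
    return payload, type_char.decode("ascii"), rest[length + 1:]


sample = b"4:type;4:http;"
key, key_type, remainder = read_element(sample)
value, value_type, remainder = read_element(remainder)
print(key, value)  # b'type' b'http'
```

Read this way, the leading `9613` is simply the byte length of the outermost element, which is why a JSON tokenizer sees a bare number followed by unexpected input.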

question:
What is the reason for the format mismatch? Where did I fuck up?

context:
Arch Linux, up to date
mitmweb --version
Mitmproxy: 8.1.0
Python: 3.10.5
OpenSSL: OpenSSL 1.1.1o 3 May 2022
Platform: Linux-5.18.5-arch1-1-x86_64-with-glibc2.35
mitmproxy2swagger: this seems to happen with both the Arch Linux package (v0.6.1) and the version installed using pip install mitmproxy2swagger

Cheers!

@alufers
Owner

alufers commented Jun 24, 2022

Looks like a mismatched version of mitmproxy. I will have to update the dependency and check it.

@AndreiUlmeyda
Author

Aye! Thanks for investigating. Could I, in the meantime, try to run it with mitmproxy 8.0 and have a chance of it working?

@alufers
Owner

alufers commented Jul 4, 2022

I think so

@pdlloyd

pdlloyd commented Jul 9, 2022

FWIW I have this same issue. Downgrading to 8.0 did not solve the issue.

[user@machine]$ mitmproxy2swagger -i flows -o out-schema -p https://web.site/api
No existing swagger file found. Creating new one.
Traceback (most recent call last):
  File "/usr/bin/mitmproxy2swagger", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.10/site-packages/mitmproxy2swagger/mitmproxy2swagger.py", line 121, in main
    for f in caputre_reader.captured_requests():
  File "/usr/lib/python3.10/site-packages/mitmproxy2swagger/har_capture_reader.py", line 87, in captured_requests
    data = json_stream.load(f)
  File "/usr/lib/python3.10/site-packages/json_stream/loader.py", line 8, in load
    return StreamingJSONBase.factory(token, token_stream, persistent)
  File "/usr/lib/python3.10/site-packages/json_stream/base.py", line 28, in factory
    raise ValueError(f"Unknown operator {token}")  # pragma: no cover
ValueError: Unknown operator 6201
[user@machine]$ mitmproxy --version
Mitmproxy: 8.0.0
Python:    3.10.5
OpenSSL:   OpenSSL 1.1.1l  24 Aug 2021
Platform:  Linux-5.18.7-1-MANJARO-x86_64-with-glibc2.35

@zkxjzmswkwl

Same issue.

@rarestg

rarestg commented Jul 18, 2022

Same issue here:

Traceback (most recent call last):
  File "/Users/rares/miniforge3/envs/rares/bin/mitmproxy2swagger", line 8, in <module>
    sys.exit(main())
  File "/Users/rares/miniforge3/envs/rares/lib/python3.9/site-packages/mitmproxy2swagger/mitmproxy2swagger.py", line 121, in main
    for f in caputre_reader.captured_requests():
  File "/Users/rares/miniforge3/envs/rares/lib/python3.9/site-packages/mitmproxy2swagger/har_capture_reader.py", line 87, in captured_requests
    data = json_stream.load(f)
  File "/Users/rares/miniforge3/envs/rares/lib/python3.9/site-packages/json_stream/loader.py", line 8, in load
    return StreamingJSONBase.factory(token, token_stream, persistent)
  File "/Users/rares/miniforge3/envs/rares/lib/python3.9/site-packages/json_stream/base.py", line 28, in factory
    raise ValueError(f"Unknown operator {token}")  # pragma: no cover
ValueError: Unknown operator 8466

@strangelydim

I just ran into this. Looks like the magical auto-detection of flow vs. har format doesn't work too well and ends up deciding everything is in har format. Note in the originally reported stack trace that despite a flow format file being used, we end up in har_capture_reader.py code...

In my case, if I just use the -f option to force the file to be parsed as a flow file, it works for me. Just add -f flow to your command-line options.
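For anyone curious what content-based detection could look like, here is a rough sketch. It is not mitmproxy2swagger's actual detection code: the function names and the error wording are hypothetical. It only relies on two observations from this thread: a HAR file is JSON, and the flow files shown above start with a decimal length followed by a colon.

```python
import re


def guess_capture_format(head: bytes):
    """Guess 'har' or 'flow' from the first bytes of a capture file.

    Hypothetical helper for illustration only. Returns None when
    neither pattern matches.
    """
    # Skip an optional UTF-8 byte order mark and leading whitespace.
    stripped = head.lstrip(b"\xef\xbb\xbf \t\r\n")
    if stripped.startswith(b"{"):
        return "har"  # HAR is a JSON document, so it opens with '{'
    if re.match(rb"\d+:", stripped):
        return "flow"  # flow dumps in this thread open with '<length>:'
    return None


def require_capture_format(head: bytes) -> str:
    """Like guess_capture_format, but fail with a hint instead of guessing."""
    detected = guess_capture_format(head)
    if detected is None:
        raise ValueError(
            "Could not detect the capture format; "
            "specify it manually, for example with -f flow."
        )
    return detected


print(guess_capture_format(b"9613:4:type;4:http;7"))  # flow
print(guess_capture_format(b'{"log": {"entries": []}}'))  # har
```

Failing loudly when nothing matches avoids silently handing a flow file to the HAR reader, which is what the stack traces in this issue show.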

@alufers
Owner

alufers commented Aug 20, 2022

Good catch! Sorry for taking so long, I forgot about this issue. I will fix it ASAP.

@alufers
Owner

alufers commented Aug 20, 2022

Hi.
I have fine-tuned the detection. Additionally, I have added an error message that suggests specifying the format manually if the detection fails.

Best regards

@alufers alufers closed this as completed Aug 20, 2022

6 participants