Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

failure: "json_processing_error" #911

Closed
anadahz opened this issue Apr 29, 2019 · 12 comments
Closed

failure: "json_processing_error" #911

anadahz opened this issue Apr 29, 2019 · 12 comments
Assignees
Labels
ooni/probe-mobile issues related to OONI Probe mobile priority/low ux wontfix when an issue won't be addressed (add a comment to the issue as to why this is the case)
Projects

Comments

@anadahz
Copy link

anadahz commented Apr 29, 2019

A strange measurement reported by @xhdix in #ooni:
https://explorer.ooni.io/measurement/20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW

Not sure if this is the right place to report this issue.

Specifications

  • Version: 2.0.3
  • Platform: measureme-kit
  • Subsystem: ooniprobe-android
@bassosimone
Copy link
Member

bassosimone commented Apr 30, 2019

Another similar failure:

https://explorer.ooni.io/measurement/20190402T090256Z_AS58224_X2HTqiXeWO1Xb9NJxKsG6ln4v1OfID2zl0BvcdA3QzVXHfn8Lp

Logs:

HTTP Header Field Manipulation Test:

Contacting bouncer: https://bouncer.ooni.io
Using discovered collector: https://c.collector.ooni.io:443
Bouncer discovered helper for http-return-json-headers: http://37.218.247.95:80
Using discovered helper as 'backend'
Your country: IR
Your ISP identifier: AS58224
Your ISP name: Iran Telecommunication Company PJS
Results collector: https://c.collector.ooni.io:443
Opening report; please be patient...
Report ID: 20190402T090256Z_AS58224_X2HTqiXeWO1Xb9NJxKsG6ln4v1OfID2zl0BvcdA3QzVXHfn8Lp
response body not valid JSON
Submitting test results; please be patient...
Results successfully submitted
Closing report; please be patient...
Report successfully closed

@xhdix
Copy link

xhdix commented Apr 30, 2019

Another report from AS197207 sent by the same user:
https://explorer.ooni.io/measurement/20190428T192258Z_AS197207_py2wAHgNm3shTTH8lgGkkqqbt2k2StQKiva5vH96JQ5zrNy49H

I think there are two different json errors on the server side and on the client side.

@xhdix
Copy link

xhdix commented Apr 30, 2019

On the client side, the app mistakenly detects HTML (Probably the content sent by the censorship system) as object and wants to parse it with json. ( I guess)

response: Object
body: "<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8"><TITLE>503 Service Unavailable</TITLE></HEAD><BODY><H1>503 Service Unavailable</H1></BODY></HTML>"
 headers: Object
Content-Length:  "175"
response_line:  "HTTP/1.1 503 Service Unavailable"
code: 503

@bassosimone
Copy link
Member

@xhdix yes, I believe your analysis is right.

@bassosimone
Copy link
Member

To reproduce this issue locally, I did the following:

  1. run this python script:
import socket
import sys
import os
import time

HOST, PORT = '127.0.0.1', 7001
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(10)
conn, addr = s.accept()
conn.recv(65536)
conn.sendall("HTTP/1.0 503 Service Unavailable\r\n")
conn.sendall("Connection: close\r\n\r\n")
conn.sendall("""
  <HTML><HEAD>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>503 Service Unavailable</TITLE>
  </HEAD><BODY><H1>503 Service Unavailable</H1></BODY></HTML>""")
s.close()
  1. run MK v0.10.3 as follows:
measurement_kit -vn http_header_field_manipulation -b http://127.0.0.1:7001
  1. the following is the log:
[D] net_test: running next measurement
[D] net_test: creating entry
[D] net_test: calling setup
[D] net_test: running with input 
[D] resolve_hostname: 127.0.0.1
[D] resolve_hostname: is valid ipv4
[D] connect_base 127.0.0.1:7001
[D] connect() in progress...
[D] net_test: running next measurement
[D] net_test: reached end of input
[D] net_test: running next measurement
[D] net_test: reached end of input
[D] connect time: 0.000785
[D] > GET / HTTP/1.1
[D] > hoST: 5iTz7F7FiPo99SC.com
[D] > UsER-agEnT: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2) Gecko/20100115 Firefox/3.6
[D] > aCCEPT: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
[D] > aCCePt-enCODIng: gzip,deflate,sdch
[D] > accepT-LAnguaGE: en-US,en;q=0.8
[D] > accEPT-CHarSet: ISO-8859-1,utf-8;q=0.7,*;q=0.3
[D] >
[D] http: started reading response
[D] http: passing read data to parser
[D] < HTTP/1.0 503 Service Unavailable
[D] < Connection: close
[D] <
[D] http: continue reading for the response
[D] http: passing read data to parser
[D] http: continue reading for the response
[D] connection: got bufferevent event: ZtfRw
[D] http: received error 1000 on connection
[D] Now passing EOF to parser
[!] response body not valid JSON
  1. the following is the report:
{
    "annotations": {
        "engine_name": "libmeasurement_kit",
        "engine_version": "0.10.3",
        "engine_version_full": "v0.10.3",
        "failure_asn_lookup": "true",
        "failure_cc_lookup": "true",
        "failure_network_name_lookup": "true",
        "platform": "macos"
    },
    "data_format_version": "0.2.0",
    "id": "27b7b4b7-2584-432c-a742-40bb19eff1c4",
    "input": null,
    "input_hashes": [],
    "measurement_start_time": "2019-05-23 08:39:10",
    "options": [],
    "probe_asn": "AS0",
    "probe_cc": "ZZ",
    "probe_city": null,
    "probe_ip": "127.0.0.1",
    "report_id": "",
    "software_name": "measurement_kit",
    "software_version": "0.10.3",
    "test_helpers": {
        "backend": "http://127.0.0.1:7001"
    },
    "test_keys": {
        "agent": "agent",
        "client_resolver": "91.80.37.84",
        "failure": null,
        "requests": [
            {
                "failure": null,
                "request": {
                    "body": "",
                    "headers": {
                        "UsER-agEnT": "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2) Gecko/20100115 Firefox/3.6",
                        "aCCEPT": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                        "aCCePt-enCODIng": "gzip,deflate,sdch",
                        "accEPT-CHarSet": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
                        "accepT-LAnguaGE": "en-US,en;q=0.8",
                        "hoST": "5iTz7F7FiPo99SC.com"
                    },
                    "method": "GET",
                    "tor": {
                        "exit_ip": null,
                        "exit_name": null,
                        "is_tor": false
                    },
                    "url": "http://127.0.0.1:7001/"
                },
                "response": {
                    "body": "\n  <HTML><HEAD>\n    <meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n    <TITLE>503 Service Unavailable</TITLE>\n  </HEAD><BODY><H1>503 Service Unavailable</H1></BODY></HTML>",
                    "code": 503,
                    "headers": {
                        "Connection": "close"
                    },
                    "response_line": "HTTP/1.0 503 Service Unavailable"
                }
            }
        ],
        "socksproxy": null,
        "tampering": {
            "header_field_name": null,
            "header_name_diff": null,
            "request_line_capitalization": true,
            "total": true
        }
    },
    "test_name": "http_header_field_manipulation",
    "test_runtime": 0.006891012191772461,
    "test_start_time": "2019-05-23 08:39:09",
    "test_version": "0.0.1"
}

The relevant part of the JSON seems to me in line with the report that @xhdix shared while the log messages seem in line with what @xhdix reported. I believe the behaviour of MK is correct in this case, modulo the need to check whether the spec says what should be the value of the tampering entry in case the response is not JSON. It may also be that we could do a better job at clearly flagging a result like this as tampering (even though "total": true seems already a clear indication).

On the contrary, the original issue reported by @anadahz looks more problematic since it seems to happen really early, and specifically so early that we don't even know the test name (!?).

@bassosimone
Copy link
Member

Follow-up checks (to become maybe integration tests):

  1. what happens if the body contains random data? it's base64 encoded

  2. what happens if a header contains random data? sometimes http_parser_error and sometimes the random data does not break the HTTP parser and we get a base64 encoded header

  3. what happens if the IP address is included in the returned page or in a header? redacted

  4. what happens if we're in case 3 but the page/header contains binary data? difficult to reproduce for header; IP address redacted in base64 encoded body

@bassosimone
Copy link
Member

bassosimone commented May 23, 2019

Now, regarding the first measurement, here's what I did:

  1. the report ID is 20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW

  2. I used curl to the measurement ID:

curl "https://api.ooni.io/api/v1/measurements?report_id=20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW"

to obtain:

```json
{
    "metadata": {
        "count": 1,
        "current_page": 1,
        "limit": 100,
        "next_url": null,
        "offset": 0,
        "pages": 1,
        "query_time": 0.00869607925415039
    },
    "results": [
        {
            "anomaly": false,
            "confirmed": false,
            "failure": true,
            "input": null,
            "measurement_id": "temp-id-296207424",
            "measurement_start_time": "2019-04-28T19:23:27Z",
            "measurement_url": "https://api.ooni.io/api/v1/measurement/temp-id-296207424",
            "probe_asn": "AS197207",
            "probe_cc": "IR",
            "report_id": "20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW",
            "test_name": "dash"
        }
    ]
}
  1. So, I fetched the measurement:
curl https://api.ooni.io/api/v1/measurement/temp-id-296207424

to obtain:

{
    "annotations": {
        "engine_name": "libmeasurement_kit",
        "engine_version": "0.9.3",
        "engine_version_full": "v0.9.3",
        "flavor": "fdroid",
        "network_type": "mobile",
        "platform": "android"
    },
    "backend_version": null,
    "bucket_date": "2019-04-28",
    "data_format_version": "0.2.0",
    "id": "35730c9e-9554-4aba-96fb-4f960e2ce31f",
    "input": null,
    "input_hashes": [],
    "measurement_start_time": "2019-04-28 19:23:27",
    "options": [],
    "probe_asn": "AS197207",
    "probe_cc": "IR",
    "probe_city": null,
    "probe_ip": "127.0.0.1",
    "report_filename": "2019-04-28/20190428T192323Z-IR-AS197207-dash-20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW-0.2.0-probe.json",
    "report_id": "20190428T192329Z_AS197207_whIbylNGE3kRug3AJEnC9qQTiZJeF8wKjnUqZY3VD5LpaAZWLW",
    "software_name": "ooniprobe-android",
    "software_version": "2.0.3",
    "test_helpers": {},
    "test_keys": {
        "client_resolver": "2.188.21.131",
        "failure": "json_processing_error"
    },
    "test_name": "dash",
    "test_runtime": 0.515253067,
    "test_start_time": "2019-04-28 19:23:23",
    "test_version": "0.7.0"
}
  1. so, this test is dash, and we need to figure out a way in which a middlebox could cause it to receive an invalid JSON message by using a Python script similar to the one above.

  2. Ta-dah:

Negotiating with: 127.0.0.1:7007
[D] resolve_hostname: 127.0.0.1
[D] resolve_hostname: is valid ipv4
[D] connect_base 127.0.0.1:7007
[D] connect() in progress...
[D] net_test: running next measurement
[D] net_test: reached end of input
[D] net_test: running next measurement
[D] net_test: reached end of input
[D] connect time: 0.000531
[D] > POST /negotiate/dash HTTP/1.1
[D] > Content-Type: application/json
[D] > Authorization: 
[D] > Cache-Control: no-cache, no-store, must-revalidate
[D] > Host: 127.0.0.1:7007
[D] > Content-Length: 127
[D] >
[D] http: started reading response
[D] http: passing read data to parser
[D] http: continue reading for the response
[D] http: passing read data to parser
[D] < HTTP/1.0 200 OK
[D] < Connection: close
[D] <
[D] http: continue reading for the response
[D] connection: got bufferevent event: ZtfRw
[D] http: received error 1000 on connection
[D] Now passing EOF to parser
[!] neubot: cannot parse negotiate response: json_processing_error
[!] neubot: negotiate failed: json_processing_error

Where the script is now:

import socket
import sys
import os
import time

HOST, PORT = '127.0.0.1', 7007
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(10)
conn, addr = s.accept()
conn.recv(65536)
conn.sendall("HTTP/1.0 200 OK\r\n")
conn.sendall("Connection: close\r\n\r\n")
conn.sendall("""
  <HTML><HEAD>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>503 Service Unavailable</TITLE>
  </HEAD><BODY><H1>503 Service Unavailable</H1></BODY></HTML>""")
s.close()

and the test keys matches the one in the original report.

@bassosimone
Copy link
Member

To conclude, I believe there may be app related actions to better visualise these failure, but it seems to me that the behaviour we see is consistent with the spirit of these two tests.

@lorenzoPrimi do we have a way to see what the app shows given a specific JSON?

@sarathms same question but for Explorer (maybe the next version not the legacy one)

@bassosimone
Copy link
Member

Meanwhile, I'll close the related MK bug as I don't think there's an MK bug here.

@bassosimone
Copy link
Member

Moving to the icebox because no-one is assigned.

@hellais hellais transferred this issue from ooni/probe-android Jan 2, 2020
@hellais hellais added priority/low ooni/probe-mobile issues related to OONI Probe mobile ux labels Jan 2, 2020
@bassosimone bassosimone self-assigned this Apr 27, 2020
@bassosimone bassosimone added the wontfix when an issue won't be addressed (add a comment to the issue as to why this is the case) label Jun 8, 2020
@bassosimone
Copy link
Member

Added wontfix label because this is an MK bug that we are not going to fix. We'll do better with the new Go engine, and we'll keep MK unchanged, so it will have this bug.

@hellais
Copy link
Member

hellais commented Apr 27, 2021

I guess this issue can be closed.

@hellais hellais closed this as completed Apr 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ooni/probe-mobile issues related to OONI Probe mobile priority/low ux wontfix when an issue won't be addressed (add a comment to the issue as to why this is the case)
Projects
No open projects
OONI-Verse
  
Icebox
Development

No branches or pull requests

5 participants