
Requests seems to either ignore or not notice rare transport errors #4415

Open
szborows opened this issue Dec 4, 2017 · 9 comments
Comments

@szborows commented Dec 4, 2017

So I was playing around with blockade and a small piece of code that used the requests library. I was researching something completely different, but I noticed an anomaly that seems to originate somewhere in the library or the libraries it uses. In essence, 1-2% of requests made while blockade disrupts the network end up with status code 200 and no exceptions, but the content is only partial.

Blockade is a tool for testing various network conditions. It's on GitHub too.

Maybe the bug is in requests, or maybe it lies somewhere lower. Or maybe it's my configuration that leads to such results. I reproduced this bug on several machines (both physical laptops and VMs in the cloud), and a colleague from my floor reproduced it as well. Replacing the requests code with urllib.request.urlopen results in exceptions being raised for these 1-2% of cases.

I'd be glad if someone else could confirm this, or let me know that I did something wrong and should go home.
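For what it's worth, a caller-side guard for this failure mode can be sketched by comparing the declared Content-Length against the bytes actually received (the helper below is hypothetical, not code from my repo):

```python
def body_is_truncated(headers, body):
    """Return True if fewer bytes arrived than Content-Length promised."""
    declared = headers.get("Content-Length")
    if declared is None:
        return False  # chunked or unknown length; this check cannot tell
    return len(body) < int(declared)

# A 1 MB file that arrived short would be flagged:
print(body_is_truncated({"Content-Length": "1048576"}, b"x" * 158290))  # True
```

This only catches the case where the server declared a length; it wouldn't help with chunked transfer encoding.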

Expected Result

An exception is raised.

Actual Result

No exception is raised and the response content is truncated. The following is output from a freshly deployed VM in a private cloud. The hash starting with e4b is the correct one.

INFO | request #33: http://file_server/files/1MB.bin
INFO | request #33: downloaded 1.0MB in 55.45s. code: 200, hsh: e4b7f62da691604a692844e93304de6b
INFO | 172.17.0.1 - - [04/Dec/2017:01:20:16 +0000] "POST /dl HTTP/1.1" 200 0 "-" "curl/7.52.1"
INFO | request #34: http://file_server/files/1MB.bin
INFO | request #34: downloaded 1.0MB in 50.31s. code: 200, hsh: e4b7f62da691604a692844e93304de6b
INFO | 172.17.0.1 - - [04/Dec/2017:01:21:11 +0000] "POST /dl HTTP/1.1" 200 0 "-" "curl/7.52.1"
INFO | request #35: http://file_server/files/1MB.bin
INFO | request #35: downloaded 0.1509552001953125MB in 69.74s. code: 200, hsh: afb64be8e2fe70f73f9d6e37d39478bf
INFO | 172.17.0.1 - - [04/Dec/2017:01:22:25 +0000] "POST /dl HTTP/1.1" 200 0 "-" "curl/7.52.1"
INFO | request #36: http://file_server/files/1MB.bin
INFO | request #36: downloaded 1.0MB in 55.48s. code: 200, hsh: e4b7f62da691604a692844e93304de6b
INFO | 172.17.0.1 - - [04/Dec/2017:01:23:25 +0000] "POST /dl HTTP/1.1" 200 0 "-" "curl/7.52.1"
INFO | request #37: http://file_server/files/1MB.bin
INFO | request #37: downloaded 1.0MB in 62.19s. code: 200, hsh: e4b7f62da691604a692844e93304de6b

Reproduction Steps

  1. Clone my repo: szborows/blockade-example.
  2. Install blockade (I've tested both 0.3.1 and 0.4.0): pip install blockade
  3. Build the docker image (./build) and pull nginx: docker pull nginx:1.11-alpine
  4. Start blockade: blockade up. It should confirm that two containers are up.
  5. I open a separate console to watch the backend container's output in real time: docker logs -f blockadeexample_backend
  6. Customize and run the script (you can disable tcpdump at start, etc.). To start the script, execute ./run
  7. If you don't observe any anomalies in the logs after several runs (each does 100 iterations), try changing the 30 in test_flaky to something bigger. On some cloud machines I needed to increase it to 40 to observe the failure sooner.

System Information

Python version: 3.6-alpine. Requests version: 2.18.4. Blockade version: 0.4.0.

Edit: Adding detailed info from my local container.

{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.6"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.0"
  },
  "platform": {
    "release": "4.8.0-1-amd64",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.18.4"
  },
  "system_ssl": {
    "version": "100020af"
  },
  "urllib3": {
    "version": "1.22"
  },
  "using_pyopenssl": false
}
@sigmavirus24 (Contributor) commented Dec 4, 2017

Those log messages have no context or meaning. Please explain what you're trying to communicate with them. Further, please provide the actual requests code that you're using. As far as I can tell, requests isn't in use from those logs.

@szborows (Author) commented Dec 4, 2017

I think the code speaks for itself:
https://github.com/szborows/blockade-example/blob/master/backend/backend.py#L29

It just downloads a file that is 1 MB in size and checks its md5 hash. AFAIK the hash should be the same every time.
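A stand-in sketch of that check (the real code is in the linked repo; the expected digest is the e4b hash from the logs above):

```python
import hashlib

# Known-good digest of 1MB.bin, taken from the logs in this issue.
EXPECTED = "e4b7f62da691604a692844e93304de6b"

def md5_of(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def looks_intact(data: bytes) -> bool:
    # The file never changes, so any other digest means a corrupted
    # or truncated download.
    return md5_of(data) == EXPECTED
```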

@Lukasa (Member) commented Dec 7, 2017

What exceptions does urlopen throw in the error case?

@szborows (Author) commented Dec 7, 2017

I have observed the following two. I also tested urllib3's connection pool, and it raised urllib3.exceptions.ProtocolError, but I don't have the stacktrace.

ERROR:aiohttp.server:Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_server.py", line 62, in handle_request
    resp = yield from self._handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web.py", line 270, in _handle
    resp = yield from handler(request)
  File "/app/backend.py", line 27, in handle_download
    data = response.read()
  File "/usr/local/lib/python3.6/http/client.py", line 462, in read
    s = self._safe_read(self.length)
  File "/usr/local/lib/python3.6/http/client.py", line 614, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(85432 bytes read, 963144 more expected)
ERROR:aiohttp.server:Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web_server.py", line 62, in handle_request
    resp = yield from self._handler(request)
  File "/usr/local/lib/python3.6/site-packages/aiohttp/web.py", line 270, in _handle
    resp = yield from handler(request)
  File "/app/backend.py", line 26, in handle_download
    response = urllib.request.urlopen(post_data['url'])
  File "/usr/local/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.6/urllib/request.py", line 526, in open
    response = self._open(req, data)
  File "/usr/local/lib/python3.6/urllib/request.py", line 544, in _open
    '_open', req)
  File "/usr/local/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python3.6/urllib/request.py", line 1346, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/local/lib/python3.6/urllib/request.py", line 1321, in do_open
    r = h.getresponse()
  File "/usr/local/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.6/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

@Lukasa (Member) commented Dec 10, 2017

Neither of those errors is swallowed or ignored by Requests, so it's hard to believe that we're ignoring them in this case.
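When these transport errors do surface through Requests, urllib3's ProtocolError is re-raised as a ConnectionError (while sending) or a ChunkedEncodingError (while reading the body), so a caller can guard for both explicitly. A sketch (fetch is a hypothetical helper, not part of any API):

```python
import requests

def fetch(url: str) -> bytes:
    # The transport errors from the tracebacks above surface through
    # requests as ConnectionError (while sending the request) or
    # ChunkedEncodingError (while reading the response body).
    try:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        return resp.content
    except (requests.exceptions.ConnectionError,
            requests.exceptions.ChunkedEncodingError) as exc:
        raise RuntimeError(f"transport failure for {url}: {exc}") from exc
```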

@szborows (Author) commented Dec 10, 2017

@Lukasa, have you tried reproducing this? I believe somebody else needs to confirm it.

@keuko commented Jul 24, 2020

Hi,

I think I found a solution and created a pull request for urllib3 -> urllib3/urllib3#1911.
Once it is merged (I hope), requests should raise a MaxRetryError as it should, because requests sets max_retries = 0 on urllib3's Retry object by default. It should be discussed whether it is better to leave urllib3's default value, which is 3, or to pass 1.
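In the meantime, a session can opt into retries explicitly by mounting an HTTPAdapter configured with urllib3's Retry object (a sketch using the public requests/urllib3 APIs; the retry count and backoff values are illustrative, not recommendations):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# requests defaults to max_retries=0; an explicit urllib3 Retry
# re-enables retries for this session only.
retries = Retry(total=3, backoff_factor=0.5,
                status_forcelist=(500, 502, 503, 504))
adapter = HTTPAdapter(max_retries=retries)
session.mount("http://", adapter)
session.mount("https://", adapter)
```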

@jakermx commented Feb 5, 2021

Requests is an API for making HTTP/HTTPS requests easy. It still depends on urllib3, urllib3 depends on sockets, and sockets depend on the OS...

In order to handle these things, you have the ability to use transport Adapters. :)
