Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch socket timeouts and clear dead connection (refs #18) #111

Merged
merged 6 commits into from
Nov 19, 2018

Conversation

justinfx
Copy link
Contributor

Re #18 , this MR catches a socket timeout in the request and clears the cached connection instance. This allows for the following request to establish a new connection instead of hitting the dead connection.

Not ready for merge yet. While manual reproduction mentioned in #18 now works correctly with this change, the tests are not properly failing when the patch is not applied. I am looking for some feedback to understand how to adjust the test to behave the same way a normal production usage would function with the Http class.

@temoto
Copy link
Member

temoto commented Aug 15, 2018

I'm also trying to get proper test without success so far.

@temoto
Copy link
Member

temoto commented Aug 15, 2018

@justinfx checking second reponse body shows more errors than wanted.

def test_timeout_subsequent():
    print('')
    timeout = 0.5
    sync_delay = 0.1

    def handler(read):
        request = read()
        print('- request raw={}'.format(request.raw))
        time.sleep(timeout + sync_delay)
        response = tests.http_response_bytes(status=500)
        yield response
        request = read()
        print('- request raw={}'.format(request.raw))
        yield tests.http_response_bytes(status=200, body=b"correct")

    http = httplib2.Http(timeout=timeout)
    http.force_exception_to_status_code = True
    with tests.server_yield(handler, request_count=2) as uri:
        print('- test request 1, expect 408 timeout')
        response, _ = http.request(uri)
        print(response, _)
        assert response.status == 408
        # time.sleep(timeout + sync_delay + sync_delay)
        print('- test request 2, expect 200')
        response, content = http.request(uri)
        print(response, content)
        assert response.status == 200
        assert content == b"correct"
# run particular test is faster than full script/test
# loop because it failed/passed randomly
while ./venv-27/bin/pytest -svk timeout_sub tests/test_other.py ; do ; done

update: I'm sorry had error in test code, server_yield works with atomic response yields, still doesn't work overall.

@temoto
Copy link
Member

temoto commented Aug 15, 2018

tcpdump says client closes connection after 0.5s without your patch

@temoto
Copy link
Member

temoto commented Aug 15, 2018

So we've been digging the wrong way, it actually works fine on Python2 and fails on Python3 (as was specified in #18 ), I incorrectly assumed problem should be observed on py2.

It still doesn't explain random pass/fail on py2, though.

@temoto
Copy link
Member

temoto commented Aug 15, 2018

Bare sockets server gives predictable pass on py2 and predictable fail on py3. I tend to blame py2 random fails on "tech debt" lack of socket.close in test server.

So I guess final task is to get test server behave like one below. I'll handle him.

from __future__ import print_function
import socket
import time

slow_delay = 0.6


def fmt_response(status, body):
    return b'HTTP/1.1 {s}\r\ncontent-length: {cl}\r\n\r\n{b}'.format(
        s=status, cl=len(body), b=body)


def per_conn(c):
    while True:
        try:
            r = c.recv(8<<10)
        except socket.error as e:
            print('! recv err:', repr(e))
            c.close()
            return
        print('<', repr(r))
        if not r:
            c.close()
            return
        rs = per_request(r)
        c.sendall(rs)


def per_request(r):
    assert r.endswith(b'\r\n\r\n')
    words = r.split(b' ', 2)
    assert words[0] == b'GET'
    path = words[1].rsplit(b'/', 1)[1]
    print('path: ', repr(path))
    if path == 's':
        time.sleep(slow_delay)
        return fmt_response('200 OK', b'slow')
    elif path == 'f':
        return fmt_response('200 OK', b'fast')
    return fmt_response('404 Wrong Path', b'')


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('127.0.0.1', 9091))
s.listen(10)
while True:
    c, _ = s.accept()
    per_conn(c)

@thehesiod
Copy link

@justinfx you only fixed it for py2, please also fix for py3 as waiting for the response can trigger a timeout and cause the same issue.

thehesiod added a commit to farmersbusinessnetwork/httplib2 that referenced this pull request Oct 5, 2018
@codecov
Copy link

codecov bot commented Oct 6, 2018

Codecov Report

Merging #111 into master will increase coverage by 0.12%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #111      +/-   ##
==========================================
+ Coverage   73.45%   73.57%   +0.12%     
==========================================
  Files           8        8              
  Lines        2535     2547      +12     
==========================================
+ Hits         1862     1874      +12     
  Misses        673      673
Impacted Files Coverage Δ
python2/httplib2/__init__.py 82.4% <100%> (+0.1%) ⬆️
python3/httplib2/__init__.py 83.36% <100%> (+0.1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86c3b4e...fcf52f1. Read the comment docs.

@justinfx
Copy link
Contributor Author

justinfx commented Oct 6, 2018

I've added the fix to the python3 package and tests pass for python 3.6.6.
Surprisingly, the skipped test_timeout_global also passes for me on 3.6.6 so I am not sure why it is being skipped.

@thehesiod
Copy link

Hopefully someone can commit this. We've run into this on our prod box using the Google API client

@temoto
Copy link
Member

temoto commented Oct 7, 2018

@httplib2/maintainers guys help. The problem and fix is "obviously correct". Test in patch does not work (no fail against code without fix), because test server is not surfacing the issue (my bad).

This is against best practices, but I think we should merge fix without test.

@justinfx
Copy link
Contributor Author

@httplib2/maintainers bump

Copy link
Member

@temoto temoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make it uniform snake case, otherwise LGTM

python2/httplib2/__init__.py Outdated Show resolved Hide resolved
python2/httplib2/__init__.py Outdated Show resolved Hide resolved
python2/httplib2/__init__.py Outdated Show resolved Hide resolved
temoto and others added 2 commits November 16, 2018 09:45
snake_case fixes for is_timeout

Co-Authored-By: justinfx <justinisrael@gmail.com>
Copy link
Contributor Author

@justinfx justinfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've approved the suggested changes for snake_case

@thehesiod
Copy link

will there be a release with this fix?

@temoto
Copy link
Member

temoto commented Feb 17, 2019

@thehesiod done, it's 0.12.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants