Getting consistent timeout issues #140

Closed
AtlasPilotPuppy opened this issue Aug 27, 2015 · 9 comments

Comments

@AtlasPilotPuppy

So I just installed Grab using pip install -U grab.
I tried a simple

In [7]: g = Grab(debug=True)

In [8]: g.go('http://google.com')

and consistently get a timeout:

GrabTimeoutError                          Traceback (most recent call last)
<ipython-input-8-692c02a434a5> in <module>()
----> 1 g.go('http://google.com')

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/base.pyc in go(self, url, **kwargs)
    364         """
    365 
--> 366         return self.request(url=url, **kwargs)
    367 
    368     def download(self, url, location, **kwargs):

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/base.pyc in request(self, **kwargs)
    433 
    434         try:
--> 435             self.transport.request()
    436         except error.GrabError:
    437             self.reset_temporary_options()

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/transport/curl.pyc in request(self)
    463             else:
    464                 if ex.args[0] == 28:
--> 465                     raise error.GrabTimeoutError(ex.args[0], ex.args[1])
    466                 elif ex.args[0] == 7:
    467                     raise error.GrabConnectionError(ex.args[0], ex.args[1])

GrabTimeoutError: [Errno 28] Resolving timed out after 3511 milliseconds

I can reach Google via requests and pycurl, so I know it's not a network issue.
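The traceback shows the transport translating libcurl's numeric error codes into exception classes: code 28 becomes a timeout error and code 7 a connection error. A minimal sketch of that kind of mapping, for illustration only. The class names, the ERROR_CLASSES table, and the translate helper are hypothetical and are not Grab's actual code. The point is that code 28 covers any phase that ran out of time, including name resolution, so a "timeout" here does not necessarily mean the remote server was slow.

```python
class FetchError(Exception):
    """Hypothetical base class for the errors below."""


class FetchTimeoutError(FetchError):
    """Stands in for libcurl error 28 (CURLE_OPERATION_TIMEDOUT)."""


class FetchConnectionError(FetchError):
    """Stands in for libcurl error 7 (CURLE_COULDNT_CONNECT)."""


# Hypothetical lookup table mirroring the two branches visible in the traceback.
ERROR_CLASSES = {
    28: FetchTimeoutError,
    7: FetchConnectionError,
}


def translate(code, message):
    """Return an exception instance for a (code, message) pair from libcurl."""
    exc_class = ERROR_CLASSES.get(code, FetchError)
    return exc_class(code, message)


err = translate(28, "Resolving timed out after 3511 milliseconds")
print(type(err).__name__, err.args)
```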

@lorien
Owner

lorien commented Aug 27, 2015

  1. Does g = Grab(); g.go('...') work with other websites?
  2. Does g = Grab(timeout=30) work with google.com?
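Both questions can be answered in one run by trying several URLs and recording what each attempt raises. A minimal sketch, assuming only that the fetch function raises an exception on failure. The probe helper is hypothetical, and the Grab calls in the trailing comment are the ones already used in this thread.

```python
def probe(fetch, urls):
    """Call fetch(url) for each URL and record 'ok' or the error raised."""
    results = {}
    for url in urls:
        try:
            fetch(url)
        except Exception as exc:  # broad on purpose: this is a diagnostic
            results[url] = "%s: %s" % (type(exc).__name__, exc)
        else:
            results[url] = "ok"
    return results


# Usage with Grab (requires Grab to be installed):
#   from grab import Grab
#   print(probe(Grab().go, ["http://google.com", "http://github.com"]))
#   print(probe(Grab(timeout=30).go, ["http://google.com"]))
```

If every URL fails with the same error and the same elapsed time, the problem is more likely in the local setup than in any particular site.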

@AtlasPilotPuppy
Author

@lorien Thanks for the quick response.
I have tried this on Ubuntu 14.04 and 15.04 machines, with both Python 2.7.9 and 3.4.

Neither g = Grab(timeout=30) nor any other website works for me.
I have checked that the sites in question are accessible via requests / urllib.
From what I can tell, it looks like an error in pycurl.
The 3513 milliseconds seems to be significant.
That is the time after which it always fails:

GrabTimeoutError: [Errno 28] Resolving timed out after 3513 milliseconds

even with the timeout=30 parameter
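A failure that arrives after about 3.5 seconds regardless of timeout=30 suggests that the limit being hit is not the overall request timeout but a shorter one that applies to the resolving and connecting phase. Grab is believed to expose a separate connect_timeout option for that phase; treat that name and its default as an assumption and check the documentation for the installed version. A minimal sketch for timing name resolution on its own, outside Grab and pycurl, using only the standard library. The helper name time_resolution is hypothetical, and the system resolver it calls is not necessarily the one the local libcurl build uses.

```python
import socket
import time


def time_resolution(host, port=80, resolver=socket.getaddrinfo):
    """Resolve host and return (elapsed_seconds, addresses_or_error)."""
    start = time.time()
    try:
        infos = resolver(host, port)
    except socket.gaierror as exc:
        return time.time() - start, exc
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    addresses = sorted({info[4][0] for info in infos})
    return time.time() - start, addresses


# Usage (performs a real DNS lookup):
#   elapsed, result = time_resolution("google.com")
#   print("%.3f s" % elapsed, result)
```

If this lookup also takes several seconds, the delay is in the system's DNS setup rather than in Grab. If it returns quickly, the difference between this resolver path and the one libcurl takes becomes the thing to investigate.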

➜  ~  ipython3
Python 3.4.3 (default, Mar 26 2015, 22:03:40) 
Type "copyright", "credits" or "license" for more information.

IPython 2.3.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from grab import Grab

In [2]: g = Grab()

In [3]: g.go('http://google.com')
---------------------------------------------------------------------------
GrabTimeoutError                          Traceback (most recent call last)
<ipython-input-3-692c02a434a5> in <module>()
----> 1 g.go('http://google.com')

/usr/local/lib/python3.4/dist-packages/grab/base.py in go(self, url, **kwargs)
    364         """
    365 
--> 366         return self.request(url=url, **kwargs)
    367 
    368     def download(self, url, location, **kwargs):

/usr/local/lib/python3.4/dist-packages/grab/base.py in request(self, **kwargs)
    433 
    434         try:
--> 435             self.transport.request()
    436         except error.GrabError:
    437             self.reset_temporary_options()

/usr/local/lib/python3.4/dist-packages/grab/transport/curl.py in request(self)
    463             else:
    464                 if ex.args[0] == 28:
--> 465                     raise error.GrabTimeoutError(ex.args[0], ex.args[1])
    466                 elif ex.args[0] == 7:
    467                     raise error.GrabConnectionError(ex.args[0], ex.args[1])

GrabTimeoutError: [Errno 28] Resolving timed out after 3512 milliseconds

In [4]: g = Grab(timeout=30)

In [5]: g.go('http://google.com')
---------------------------------------------------------------------------
GrabTimeoutError                          Traceback (most recent call last)
<ipython-input-5-692c02a434a5> in <module>()
----> 1 g.go('http://google.com')

/usr/local/lib/python3.4/dist-packages/grab/base.py in go(self, url, **kwargs)
    364         """
    365 
--> 366         return self.request(url=url, **kwargs)
    367 
    368     def download(self, url, location, **kwargs):

/usr/local/lib/python3.4/dist-packages/grab/base.py in request(self, **kwargs)
    433 
    434         try:
--> 435             self.transport.request()
    436         except error.GrabError:
    437             self.reset_temporary_options()

/usr/local/lib/python3.4/dist-packages/grab/transport/curl.py in request(self)
    463             else:
    464                 if ex.args[0] == 28:
--> 465                     raise error.GrabTimeoutError(ex.args[0], ex.args[1])
    466                 elif ex.args[0] == 7:
    467                     raise error.GrabConnectionError(ex.args[0], ex.args[1])

GrabTimeoutError: [Errno 28] Resolving timed out after 3513 milliseconds

Similar issue with Python 2.7.9:

➜  ~  ipython
Python 2.7.9 (default, Apr  2 2015, 15:33:21) 
Type "copyright", "credits" or "license" for more information.

IPython 4.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from grab import Grab

In [2]: g = Grab()

In [3]: g.go('http://google.com')
---------------------------------------------------------------------------
GrabTimeoutError                          Traceback (most recent call last)
<ipython-input-3-692c02a434a5> in <module>()
----> 1 g.go('http://google.com')

/usr/local/lib/python2.7/dist-packages/grab/base.pyc in go(self, url, **kwargs)
    364         """
    365 
--> 366         return self.request(url=url, **kwargs)
    367 
    368     def download(self, url, location, **kwargs):

/usr/local/lib/python2.7/dist-packages/grab/base.pyc in request(self, **kwargs)
    433 
    434         try:
--> 435             self.transport.request()
    436         except error.GrabError:
    437             self.reset_temporary_options()

/usr/local/lib/python2.7/dist-packages/grab/transport/curl.pyc in request(self)
    463             else:
    464                 if ex.args[0] == 28:
--> 465                     raise error.GrabTimeoutError(ex.args[0], ex.args[1])
    466                 elif ex.args[0] == 7:
    467                     raise error.GrabConnectionError(ex.args[0], ex.args[1])

GrabTimeoutError: [Errno 28] Resolving timed out after 3513 milliseconds

@oiwn
Contributor

oiwn commented Aug 28, 2015

Could you check if it's the same for GitHub?

g.go('http://github.com') and g.go('https://github.com')

@lorien
Owner

lorien commented Aug 28, 2015

  1. What does dpkg -l | grep curl say?
  2. What does python -c 'import pycurl; print(pycurl.version)' print?
  3. What does python -c 'import pycurl; print(pycurl.__file__)' print?
  4. You said pycurl (without Grab) works for you. Could you show the source code of that script?
  5. What is the output of this script?
from grab import Grab
import logging

logging.basicConfig(level=logging.DEBUG)
g = Grab(verbose_logging=True, debug=True)
g.go('https://google.com')
  6. What is the content of /etc/resolv.conf?
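Questions 2 and 3 above can be answered in one step with a small helper that reports which build of a module the current interpreter actually imports. A minimal sketch using only the standard library. The describe_module name is hypothetical; it reads the module's version attribute (as pycurl.version is used above), falls back to __version__, and reports the import error if the module is missing.

```python
import importlib


def describe_module(name):
    """Return the version and file location of an importable module."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return {"name": name, "error": str(exc)}
    version = getattr(module, "version", None)
    if version is None:
        version = getattr(module, "__version__", None)
    return {
        "name": name,
        "version": version,
        "file": getattr(module, "__file__", None),
    }


print(describe_module("pycurl"))
```

Running it with the same interpreter (and virtualenv, if any) that runs the failing Grab script matters, because a system package and a pip-installed copy can coexist.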

@AtlasPilotPuppy
Author

Hey thanks for the responses.

  1. Output of dpkg -l | grep curl
ii  curl                                                  7.35.0-1ubuntu2.5                                              amd64        command line tool for transferring data with URL syntax
ii  libcurl3:amd64                                        7.35.0-1ubuntu2.5                                              amd64        easy-to-use client-side URL transfer library (OpenSSL flavour)
ii  libcurl3-gnutls:amd64                                 7.35.0-1ubuntu2.5                                              amd64        easy-to-use client-side URL transfer library (GnuTLS flavour)
ii  libcurl4-openssl-dev:amd64                            7.35.0-1ubuntu2.5                                              amd64        development files and documentation for libcurl (OpenSSL flavour)
ii  python-pycurl                                         7.19.3-0ubuntu3                                                amd64        Python bindings to libcurl
ii  python3-pycurl                                        7.19.3-0ubuntu3                                                amd64        Python 3 bindings to libcurl
  2. Output of python -c 'import pycurl; print(pycurl.version)'
PycURL/7.19.5.1 libcurl/7.35.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3
  4. The script used with pycurl directly is
In [1]: import pycurl

In [2]: from StringIO import StringIO

In [3]: buffer = StringIO()

In [4]: c = pycurl.Curl()

In [5]: c.setopt(c.URL, 'https://github.com/lorien/grab/issues/140')

In [6]: c.setopt(c.WRITEDATA, buffer)

In [7]: c.perform()

In [8]: body = buffer.getvalue()

In [9]: body

body contains the content of the page, as expected.

  5. Output of the script
In [1]: from grab import Grab

In [2]: import logging

In [3]: 

In [3]: logging.basicConfig(level=logging.DEBUG)

In [4]: g = Grab(verbose_logging=True, debug=True)

In [5]: g.go('https://google.com')
DEBUG:grab.network:[01] GET https://google.com
DEBUG:grab.transport.curl:i: Rebuilt URL to: https://google.com/
DEBUG:grab.transport.curl:i: Hostname was NOT found in DNS cache
DEBUG:grab.transport.curl:i: Resolving timed out after 3512 milliseconds
DEBUG:grab.transport.curl:i: Closing connection 0
---------------------------------------------------------------------------
GrabTimeoutError                          Traceback (most recent call last)
<ipython-input-5-f0e3d0aa17d6> in <module>()
----> 1 g.go('https://google.com')

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/base.pyc in go(self, url, **kwargs)
    364         """
    365 
--> 366         return self.request(url=url, **kwargs)
    367 
    368     def download(self, url, location, **kwargs):

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/base.pyc in request(self, **kwargs)
    433 
    434         try:
--> 435             self.transport.request()
    436         except error.GrabError:
    437             self.reset_temporary_options()

/home/anant/.virtualenvs/mp_rec/lib/python2.7/site-packages/grab/transport/curl.pyc in request(self)
    463             else:
    464                 if ex.args[0] == 28:
--> 465                     raise error.GrabTimeoutError(ex.args[0], ex.args[1])
    466                 elif ex.args[0] == 7:
    467                     raise error.GrabConnectionError(ex.args[0], ex.args[1])

GrabTimeoutError: [Errno 28] Resolving timed out after 3512 milliseconds
  6. resolv.conf contents (with some private content removed)
nameserver 8.8.8.8

Thanks again. It does seem to be some sort of DNS issue.
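One detail in the answers above is worth a closer look: dpkg lists python-pycurl 7.19.3, while pycurl.version reports PycURL/7.19.5.1. That suggests, without proving it, that the interpreter in the virtualenv imports a different pycurl build than the system package; the output of pycurl.__file__ would confirm which one. A minimal sketch that makes the comparison explicit. The parse_versions helper is hypothetical, and the two strings are copied from the output above.

```python
def parse_versions(version_string):
    """Split 'Name/1.2 Other/3.4' into {'Name': '1.2', 'Other': '3.4'}."""
    components = {}
    for token in version_string.split():
        name, _, version = token.partition("/")
        components[name] = version
    return components


reported = parse_versions(
    "PycURL/7.19.5.1 libcurl/7.35.0 OpenSSL/1.0.1f zlib/1.2.8 "
    "libidn/1.28 librtmp/2.3"
)
packaged = "7.19.3-0ubuntu3"  # python-pycurl version from the dpkg listing

# Compare the upstream part of the package version with what the module reports.
same_build = packaged.split("-")[0] == reported["PycURL"]
print(reported["PycURL"], packaged, same_build)
```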

@lorien
Owner

lorien commented Aug 29, 2015

  1. resolv.conf contents (with some private content removed)

Try keeping just a single nameserver 8.8.8.8 line in resolv.conf.
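Before editing the file, it can help to list exactly which nameserver entries it currently contains, since entries were removed from the excerpt posted above. A minimal sketch, assuming the usual resolv.conf layout of one "nameserver ADDRESS" entry per line, with comment lines starting with '#' or ';'. The parse_nameservers helper is hypothetical.

```python
def parse_nameservers(text):
    """Return nameserver addresses in the order they appear in the text."""
    servers = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Usage against the real file:
#   with open("/etc/resolv.conf") as handle:
#       print(parse_nameservers(handle.read()))
```

If the list has more than one entry, a slow or unreachable server ahead of 8.8.8.8 could plausibly account for a multi-second delay in resolution.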

Yeah, that is a DNS issue, but I do not understand why only Grab does not work :)

  1. Could you also try this?
aptitude remove python-pycurl python3-pycurl
pip install pycurl
# then run the grab script again

@oiwn
Contributor

oiwn commented Sep 9, 2015

Any news?

@AtlasPilotPuppy
Author

Sorry, I haven't tried again yet.

@lorien
Owner

lorien commented Nov 22, 2015

No feedback. Closing ticket.
