Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No results displayed, TopMoviesSpider.parse method is never called. #19

Closed
aquib-sh opened this issue May 5, 2022 · 9 comments
Closed
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@aquib-sh
Copy link
Contributor

aquib-sh commented May 5, 2022

2022-05-05 14:10:28 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2022-05-05 14:10:28 [scrapy.utils.log] INFO: Versions: lxml 4.8.0.0, libxml2 2.9.13, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.10.4 (main, Mar 24 2022, 13:07:27) [GCC 11.2.0], pyOpenSSL 22.0.0 (OpenSSL 3.0.2 15 Mar 2022), cryptography 37.0.1, Platform Linux-5.17.0-1-amd64-x86_64-with-glibc2.33
2022-05-05 14:10:28 [scrapy.crawler] INFO: Overridden settings:
{}
2022-05-05 14:10:28 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2022-05-05 14:10:28 [scrapy.extensions.telnet] INFO: Telnet Password: 91c143582ecd5828
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2022-05-05 14:10:28 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2022-05-05 14:10:28 [scrapy.core.engine] INFO: Spider opened
2022-05-05 14:10:28 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2022-05-05 14:10:28 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://torrentgalaxy.to/> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://torrentgalaxy.to/> (failed 2 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://torrentgalaxy.to/> (failed 3 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.core.scraper] ERROR: Error downloading <GET https://torrentgalaxy.to/>
Traceback (most recent call last):
  File "/home/aquib/.local/lib/python3.10/site-packages/scrapy/core/downloader/middleware.py", line 49, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', '', 'unexpected eof while reading')]>]
2022-05-05 14:10:29 [scrapy.core.engine] INFO: Closing spider (finished)
2022-05-05 14:10:29 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 3,
 'downloader/request_bytes': 648,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'elapsed_time_seconds': 0.881984,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2022, 5, 5, 8, 40, 29, 711889),
 'log_count/DEBUG': 3,
 'log_count/ERROR': 2,
 'log_count/INFO': 10,
 'memusage/max': 62857216,
 'memusage/startup': 62857216,
 'retry/count': 2,
 'retry/max_reached': 1,
 'retry/reason_count/twisted.web._newclient.ResponseNeverReceived': 2,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2022, 5, 5, 8, 40, 28, 829905)}
2022-05-05 14:10:29 [scrapy.core.engine] INFO: Spider closed (finished)

@lkabuci
Copy link
Owner

lkabuci commented May 5, 2022

We should probably add a proxy to bypass countries' restrictions.

@lkabuci lkabuci added bug Something isn't working help wanted Extra attention is needed labels May 5, 2022
@aquib-sh
Copy link
Contributor Author

aquib-sh commented May 6, 2022

In addition to above error, it also gives the following after the latest update of using python3 main.py top

Traceback (most recent call last):
  File "/usr/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/usr/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/usr/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/usr/lib/python3.10/ssl.py", line 512, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.10/ssl.py", line 1070, in _create
    self.do_handshake()
  File "/usr/lib/python3.10/ssl.py", line 1341, in do_handshake
    self._sslobj.do_handshake()
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aquib/src/stream-cli/main.py", line 52, in <module>
    app()
  File "/home/aquib/.local/lib/python3.10/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/aquib/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/aquib/.local/lib/python3.10/site-packages/typer/main.py", line 500, in wrapper
    return callback(**use_params)  # type: ignore
  File "/home/aquib/src/stream-cli/main.py", line 19, in top
    apprun(TopMoviesSpider, True)
  File "/home/aquib/src/stream-cli/src/runner.py", line 36, in apprun
    movies = start_scrawling(scraping_class)
  File "/home/aquib/src/stream-cli/src/runner.py", line 26, in start_scrawling
    response = urllib.request.urlopen("https://www.torrentgalaxy.to/").getcode()
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 104] Connection reset by peer>

@aquib-sh
Copy link
Contributor Author

aquib-sh commented May 6, 2022

I understand this is something related to ISP blocking the requests from reaching

@aquib-sh
Copy link
Contributor Author

aquib-sh commented May 6, 2022

would have to integrate a Tor connectivity so anyone in the world can use

@lkabuci
Copy link
Owner

lkabuci commented May 6, 2022

would have to integrate a Tor connectivity so anyone in the world can use

I tried but it's not working, Tor request some sort of protocol?

$❯ curl http://galaxy3yrfbwlwo72q3v2wlyjinqr2vejgpkxb22ll5pcpuaxlnqjiid.onion/
curl: (6) Could not resolve host: galaxy3yrfbwlwo72q3v2wlyjinqr2vejgpkxb22ll5pcpuaxlnqjiid.onion

@lkabuci
Copy link
Owner

lkabuci commented Aug 6, 2022

if tgx is not supported in your country you can use another torrent host website

@lkabuci lkabuci closed this as completed Aug 6, 2022
@aquib-sh
Copy link
Contributor Author

aquib-sh commented Aug 6, 2022

yes you need to be connected to onion protocol to access those url

@aquib-sh
Copy link
Contributor Author

aquib-sh commented Aug 6, 2022

I will open up a new feature request (for Tor), will send a pull request if finished.
Count me in to add the feature

@lkabuci
Copy link
Owner

lkabuci commented Aug 6, 2022

I will open up a new feature request (for Tor), will send a pull request if finished. Count me in to add the feature

That will be a huge improvement!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants