connection was refused? #1

Closed
lu161513 opened this issue Oct 19, 2018 · 1 comment
@lu161513 commented:

$ scrapy crawl tt
2018-10-19 15:01:15 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: toutiao)
2018-10-19 15:01:15 [scrapy.utils.log] INFO: Versions: lxml 4.2.3.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.7.0, Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 05:52:31) - [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.3, Platform Darwin-18.0.0-x86_64-i386-64bit
2018-10-19 15:01:15 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'toutiao', 'COOKIES_ENABLED': False, 'DOWNLOAD_DELAY': 3, 'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter', 'HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage', 'NEWSPIDER_MODULE': 'toutiao.spiders', 'REDIRECT_ENABLED': False, 'SPIDER_MODULES': ['toutiao.spiders']}
2018-10-19 15:01:15 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.logstats.LogStats']
2018-10-19 15:01:15 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy_splash.SplashCookiesMiddleware',
'scrapy_splash.SplashMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-10-19 15:01:15 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy_splash.SplashDeduplicateArgsMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-10-19 15:01:15 [scrapy.middleware] INFO: Enabled item pipelines:
['toutiao.pipelines.ToutiaoPipeline']
2018-10-19 15:01:15 [scrapy.core.engine] INFO: Spider opened
2018-10-19 15:01:15 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-10-19 15:01:15 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-10-19 15:01:16 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.toutiao.com via http://localhost:8050/render.html> (failed 1 times): Connection was refused by other side: 61: Connection refused.
2018-10-19 15:01:19 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.toutiao.com via http://localhost:8050/render.html> (failed 2 times): Connection was refused by other side: 61: Connection refused.
2018-10-19 15:01:23 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.toutiao.com via http://localhost:8050/render.html> (failed 3 times): Connection was refused by other side: 61: Connection refused.
2018-10-19 15:01:23 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.toutiao.com via http://localhost:8050/render.html>
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 61: Connection refused.
2018-10-19 15:01:23 [scrapy.core.engine] INFO: Closing spider (finished)
2018-10-19 15:01:23 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 3,
'downloader/request_bytes': 1818,
'downloader/request_count': 3,
'downloader/request_method_count/POST': 3,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2018, 10, 19, 7, 1, 23, 667284),
'log_count/DEBUG': 4,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'memusage/max': 58916864,
'memusage/startup': 58912768,
'retry/count': 2,
'retry/max_reached': 1,
'retry/reason_count/twisted.internet.error.ConnectionRefusedError': 2,
'scheduler/dequeued': 4,
'scheduler/dequeued/memory': 4,
'scheduler/enqueued': 4,
'scheduler/enqueued/memory': 4,
'splash/render.html/request_count': 1,
'start_time': datetime.datetime(2018, 10, 19, 7, 1, 15, 822066)}
2018-10-19 15:01:23 [scrapy.core.engine] INFO: Spider closed (finished)

Why is this happening?

@lu161513 (Author) commented:

This was caused by the Splash service not being started properly.

I'd also like to ask: can the article content itself be crawled?
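
For reference, a minimal sketch of how one might confirm that Splash is actually listening on localhost:8050 before running the spider again. The Splash URL and the toutiao.com test URL are taken from the log above; the Docker command and the exact parameters are assumptions based on the standard scrapy-splash setup, not something from this repo:

```python
# Sketch: verify that the Splash HTTP API is reachable before starting the crawl.
# Splash is usually started with Docker first, e.g.:
#   docker run -p 8050:8050 scrapinghub/splash
# If nothing is listening on port 8050, Scrapy fails with the
# "Connection was refused by other side" error seen in the log.

import requests

SPLASH_URL = "http://localhost:8050"  # matches the render.html URL in the log

try:
    # /render.html is the Splash endpoint scrapy-splash uses to render pages.
    resp = requests.get(
        f"{SPLASH_URL}/render.html",
        params={"url": "https://www.toutiao.com", "wait": 2},
        timeout=30,
    )
    print("Splash is up, HTTP status:", resp.status_code)
except requests.ConnectionError:
    print("Connection refused: Splash does not appear to be running on port 8050.")
```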
