
SSL website. twisted.internet.error.ConnectionLost #2916

Closed
russian-developer opened this issue Sep 7, 2017 · 18 comments

@russian-developer

Hi everybody!
I'm hitting this error on both operating systems. This HTTPS site can't be downloaded via Scrapy (Twisted). I searched this issue tracker and couldn't find a solution.

Both: Debian 9 / macOS

$ scrapy shell "https://wwwnet1.state.nj.us/"
2017-09-07 16:23:02 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-09-07 16:23:02 [scrapy.utils.log] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0, 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter'}
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-09-07 16:23:02 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-09-07 16:23:03 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-09-07 16:23:03 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-07 16:23:03 [scrapy.core.engine] INFO: Spider opened
2017-09-07 16:23:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-09-07 16:23:03 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-09-07 16:23:04 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://wwwnet1.state.nj.us/> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
Traceback (most recent call last):
  File "scrapy", line 11, in <module>
    sys.exit(execute())
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 149, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/lib/python3.5/site-packages/scrapy/cmdline.py", line 156, in _run_command
    cmd.run(args, opts)
  File "/lib/python3.5/site-packages/scrapy/commands/shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "/lib/python3.5/site-packages/scrapy/shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "/lib/python3.5/site-packages/scrapy/shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "/lib/python3.5/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "/lib/python3.5/site-packages/twisted/python/failure.py", line 385, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

macOS:

$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.4
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.9.0rc1
Python    : 3.5.1 (default, Jan 22 2016, 08:54:32) - [GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)]
pyOpenSSL : 17.2.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Darwin-16.7.0-x86_64-i386-64bit

Debian 9:

$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.3
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.9.0rc1
Python    : 3.4.2 (default, Oct  8 2014, 10:45:20) - [GCC 4.9.1]
pyOpenSSL : 17.2.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Linux-3.16.0-4-amd64-x86_64-with-debian-8.7

macOS:

$ openssl s_client -connect wwwnet1.state.nj.us:443 -servername wwwnet1.state.nj.us
CONNECTED(00000003)
140736760988680:error:140790E5:SSL routines:ssl23_write:ssl handshake failure:s23_lib.c:177:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 336 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : 0000
    Session-ID: 
    Session-ID-ctx: 
    Master-Key: 
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1504790705
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---

Debian 9:

CONNECTED(00000003)
---
Certificate chain
 0 s:/C=US/ST=New Jersey/L=Trenton/O=New Jersey State Government/OU=E-Gov Services - wwwnet1.state.nj.us/CN=wwwnet1.state.nj.us
   i:/C=US/O=Symantec Corporation/OU=Symantec Trust Network/CN=Symantec Class 3 Secure Server SHA256 SSL CA
---
Server certificate
-----BEGIN CERTIFICATE-----
<cut out>
-----END CERTIFICATE-----
<cut out>
---
No client certificate CA names sent
---
SSL handshake has read 1724 bytes and written 635 bytes
---
New, TLSv1/SSLv3, Cipher is DES-CBC3-SHA
Server public key is 2048 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : DES-CBC3-SHA
    Session-ID: 930F00007F5944DC3C6010F96E95E7FA63656EF5EA35508B055078CEC249DC38
    Session-ID-ctx:
    Master-Key: 27B02D427F006A57B121CCEFEAA7F33B870DE262848BB6F851242F48F051ABB77BA4ED06706766EE8EE55F6643C9FF55
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1504790821
    Timeout   : 300 (sec)
    Verify return code: 21 (unable to verify the first certificate)
---

Thank you for your time.

@redapple (Contributor) commented Sep 7, 2017

This worked for me:

  • force TLS 1.0
  • use cryptography<2 (e.g. 1.9 in my case, built against OpenSSL older than 1.1)
$ scrapy version -v
Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.3
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.5.0
Python    : 3.6.2 (default, Aug 24 2017, 10:48:24) - [GCC 6.3.0 20170406]
pyOpenSSL : 17.2.0 (OpenSSL 1.0.2g  1 Mar 2016)


$ pip freeze
asn1crypto==0.22.0
attrs==17.2.0
Automat==0.6.0
cffi==1.10.0
constantly==15.1.0
cryptography==1.9
cssselect==1.0.1
hyperlink==17.3.1
idna==2.6
incremental==17.5.0
lxml==3.8.0
parsel==1.2.0
pyasn1==0.3.3
pyasn1-modules==0.1.1
pycparser==2.18
PyDispatcher==2.0.5
pyOpenSSL==17.2.0
queuelib==1.4.2
Scrapy==1.4.0
service-identity==17.0.0
six==1.10.0
Twisted==17.5.0
w3lib==1.18.0
zope.interface==4.4.2

$ scrapy shell "https://wwwnet1.state.nj.us/" -s DOWNLOADER_CLIENT_TLS_METHOD=TLSv1.0
2017-09-07 17:45:49 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-09-07 17:45:49 [scrapy.utils.log] INFO: Overridden settings: {'DOWNLOADER_CLIENT_TLS_METHOD': 'TLSv1.0', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0}
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-09-07 17:45:49 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-09-07 17:45:49 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-09-07 17:45:49 [scrapy.core.engine] INFO: Spider opened
2017-09-07 17:45:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wwwnet1.state.nj.us/> (referer: None)
[s] Available Scrapy objects:
[s]   scrapy     scrapy module (contains scrapy.Request, scrapy.Selector, etc)
[s]   crawler    <scrapy.crawler.Crawler object at 0x7f24fb802ac8>
[s]   item       {}
[s]   request    <GET https://wwwnet1.state.nj.us/>
[s]   response   <200 https://wwwnet1.state.nj.us/>
[s]   settings   <scrapy.settings.Settings object at 0x7f24f314d9e8>
[s]   spider     <DefaultSpider 'default' at 0x7f24f24ba7b8>
[s] Useful shortcuts:
[s]   fetch(url[, redirect=True]) Fetch URL and update local objects (by default, redirects are followed)
[s]   fetch(req)                  Fetch a scrapy.Request and update local objects 
[s]   shelp()           Shell help (print this help)
[s]   view(response)    View response in a browser
>>> 

Using OpenSSL 1.1.0f (with cryptography==2.0.3) did not work for me, even when forcing TLS 1.0.
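If it helps others diagnose this, here is a minimal sketch (not from the thread; the host is just the one reported above) that probes which TLS versions the server will complete a handshake for, using pyOpenSSL directly:

# Diagnostic probe: attempt a handshake with each pinned TLS method and
# report which ones the server accepts. All names are standard pyOpenSSL
# APIs; the target host is the one from this issue.
import socket
from OpenSSL import SSL

HOST = "wwwnet1.state.nj.us"

for name, method in [("TLSv1.0", SSL.TLSv1_METHOD),
                     ("TLSv1.1", SSL.TLSv1_1_METHOD),
                     ("TLSv1.2", SSL.TLSv1_2_METHOD)]:
    sock = socket.create_connection((HOST, 443), timeout=10)
    conn = SSL.Connection(SSL.Context(method), sock)
    conn.set_tlsext_host_name(HOST.encode())  # send SNI, like -servername
    conn.set_connect_state()
    try:
        conn.do_handshake()
        print(name, "OK, cipher:", conn.get_cipher_name())
    except SSL.Error as exc:
        print(name, "failed:", exc)
    finally:
        sock.close()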

@russian-developer (Author)

@redapple thank you for your reply.
Yes, this works for me too…
By the way, how did you find out about forcing the TLS version?

@derrickmar commented Dec 3, 2017

Hmm, I also tried pip install --upgrade 'cryptography<2' but I'm still getting an error when running:
scrapy shell "https://wwwnet1.state.nj.us/" -s DOWNLOADER_CLIENT_TLS_METHOD=TLSv1.0

scrapy version -v
Scrapy    : 1.4.0
lxml      : 4.1.1.0
libxml2   : 2.9.7
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.18.0
Twisted   : 17.9.0
Python    : 2.7.10 (default, Sep 23 2015, 04:34:14) - [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.72)]
pyOpenSSL : 17.5.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Darwin-16.7.0-x86_64-i386-64bit
pip freeze
asn1crypto==0.23.0
attrs==17.3.0
Automat==0.6.0
cffi==1.11.2
constantly==15.1.0
cryptography==1.9
cssselect==1.0.1
enum34==1.1.6
hyperlink==17.3.1
idna==2.6
incremental==17.5.0
ipaddress==1.0.18
lxml==4.1.1
parsel==1.2.0
pyasn1==0.4.2
pyasn1-modules==0.2.1
pycparser==2.18
PyDispatcher==2.0.5
pyOpenSSL==17.5.0
queuelib==1.4.2
Scrapy==1.4.0
service-identity==17.0.0
six==1.11.0
Twisted==17.9.0
w3lib==1.18.0
zope.interface==4.4.3

Error

2017-12-02 19:41:37 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-12-02 19:41:37 [scrapy.utils.log] INFO: Overridden settings: {'DOWNLOADER_CLIENT_TLS_METHOD': 'TLSv1.0', 'LOGSTATS_INTERVAL': 0, 'RETRY_TIMES': '0', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter'}
2017-12-02 19:41:37 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2017-12-02 19:41:37 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-12-02 19:41:37 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-12-02 19:41:37 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-12-02 19:41:37 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-12-02 19:41:37 [scrapy.core.engine] INFO: Spider opened
2017-12-02 19:41:37 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www5.apply2jobs.com/jupitermed/ProfExt/index.cfm?fuseaction=mExternal.searchJobs> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
Traceback (most recent call last):
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/cmdline.py", line 149, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/cmdline.py", line 156, in _run_command
    cmd.run(args, opts)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/commands/shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/scrapy/shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "/Users/dmar/.local/share/virtualenvs/pathwise-scrape-7G7iLF5G/lib/python2.7/site-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

@aashmishra

I am also facing the same error

PS D:\fresh\zomatodata> scrapy version -v
Scrapy : 1.4.0
lxml : 4.1.1.0
libxml2 : 2.9.5
cssselect : 1.0.1
parsel : 1.2.0
w3lib : 1.18.0
Twisted : 17.9.0
Python : 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:14:34) [MSC v.1900 32 bit (Intel)]
pyOpenSSL : 17.5.0 (OpenSSL 1.1.0f 25 May 2017)
Platform : Windows-10-10.0.15063-SP0

PS D:\fresh\zomatodata> python -m pip install --upgrade 'cryptography<2'
Collecting cryptography<2
Downloading cryptography-1.9-cp36-cp36m-win32.whl (1.1MB)
100% |████████████████████████████████| 1.1MB 750kB/s
Requirement already up-to-date: six>=1.4.1 in d:\python_installed\lib\site-packages (from cryptography<2)
Requirement already up-to-date: asn1crypto>=0.21.0 in d:\python_installed\lib\site-packages (from cryptography<2)
Requirement already up-to-date: cffi>=1.7 in d:\python_installed\lib\site-packages (from cryptography<2)
Requirement already up-to-date: idna>=2.1 in d:\python_installed\lib\site-packages (from cryptography<2)
Requirement already up-to-date: pycparser in d:\python_installed\lib\site-packages (from cffi>=1.7->cryptography<2)
Installing collected packages: cryptography
Found existing installation: cryptography 2.1.4
Uninstalling cryptography-2.1.4:
Successfully uninstalled cryptography-2.1.4
Successfully installed cryptography-1.9
PS D:\fresh\zomatodata> scrapy shell "https://wwwnet1.state.nj.us/" -s DOWNLOADER_CLIENT_TLS_METHOD=TLSv1.0
2017-12-10 14:35:56 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: zomatodata)
2017-12-10 14:35:56 [scrapy.utils.log] INFO: Overridden settings: {'BOT_NAME': 'zomatodata', 'DOWNLOADER_CLIENT_TLS_METHOD': 'TLSv1.0', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0, 'NEWSPIDER_MODULE': 'zomatodata.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['zomatodata.spiders']}
2017-12-10 14:35:56 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole']
2017-12-10 14:35:56 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-12-10 14:35:56 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-12-10 14:35:56 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-12-10 14:35:56 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-12-10 14:35:56 [scrapy.core.engine] INFO: Spider opened
2017-12-10 14:35:57 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/robots.txt> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:57 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/robots.txt> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:58 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://wwwnet1.state.nj.us/robots.txt> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:58 [scrapy.downloadermiddlewares.robotstxt] ERROR: Error downloading <GET https://wwwnet1.state.nj.us/robots.txt>: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:58 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:59 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://wwwnet1.state.nj.us/> (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2017-12-10 14:35:59 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://wwwnet1.state.nj.us/> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
Traceback (most recent call last):
  File "d:\python_installed\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\python_installed\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\python_installed\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "d:\python_installed\lib\site-packages\scrapy\cmdline.py", line 149, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "d:\python_installed\lib\site-packages\scrapy\cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "d:\python_installed\lib\site-packages\scrapy\cmdline.py", line 156, in _run_command
    cmd.run(args, opts)
  File "d:\python_installed\lib\site-packages\scrapy\commands\shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "d:\python_installed\lib\site-packages\scrapy\shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "d:\python_installed\lib\site-packages\scrapy\shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "d:\python_installed\lib\site-packages\twisted\internet\threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "d:\python_installed\lib\site-packages\twisted\python\failure.py", line 385, in raiseException
    raise self.value.with_traceback(self.tb)
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

@raphapassini (Contributor)

@derrickmar it seems you have a different version of OpenSSL. This is the line that worked for
@redapple: pyOpenSSL : 17.2.0 (OpenSSL 1.0.2g 1 Mar 2016), and this is the line you posted: pyOpenSSL : 17.5.0 (OpenSSL 1.1.0f 25 May 2017). Try changing the OpenSSL version on your system to 1.0.x.

@ejulio (Contributor) commented Dec 17, 2018

I ran into the same issue a couple of weeks ago, and the solution was to change the TLS method.
I changed the setting https://doc.scrapy.org/en/latest/topics/settings.html#downloader-client-tls-method to 'TLS' (which maps to OpenSSL's SSLv23_METHOD).
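For reference, the same change expressed as a settings.py sketch (equivalent to passing -s DOWNLOADER_CLIENT_TLS_METHOD=... on the command line; per the Scrapy docs, 'TLS' maps to OpenSSL's SSLv23_METHOD, which negotiates the highest protocol version both sides support):

# settings.py -- let OpenSSL negotiate the protocol version instead of
# pinning one; 'TLSv1.0', 'TLSv1.1' and 'TLSv1.2' pin a specific version.
DOWNLOADER_CLIENT_TLS_METHOD = "TLS"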

@niquepa commented Feb 28, 2019

I had the same issue; in my case the solution was to set the USER_AGENT in the settings.py file:

USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'

@ejulio (Contributor) commented Feb 28, 2019

This issue seems to be related to lots of different things.
I'm not sure if we should document it somewhere to make it easier for other people to find "solutions".
Maybe Stack Overflow or the Scrapy docs...

@Gallaecio , @raphapassini , @victor-torres ideas here?

@victor-torres (Contributor)

@ejulio, I like to think of this issue as an edge case. Every time something like this happens to me, the first thing I do is copy and paste the exception's core message, which usually leads me to Stack Overflow, a mailing list, or a GitHub issue. In this case, I think users are pretty well covered by the good content in this thread.

@SachitNayak commented Feb 20, 2020

I had the same issue; in my case the solution was to set the USER_AGENT in the settings.py file:

USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'

The aforementioned solution worked.

If you are using scrapy shell:

scrapy shell -s USER_AGENT='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36' 'http://www.expedia.com'

@anapaulagomes

I tried all the suggestions above but still didn't manage to fix this problem.
URL: https://www.diariooficial.feiradesantana.ba.gov.br/

scrapy==2.0.0
Twisted==20.3.0
pyOpenSSL==19.1.0

Any words of wisdom are much appreciated. 🙏

@russian-developer (Author) commented Jun 13, 2020

I tried all the suggestions above but still didn't manage to fix this problem.
URL: https://www.diariooficial.feiradesantana.ba.gov.br/

scrapy==2.0.0
Twisted==20.3.0
pyOpenSSL==19.1.0

Any words of wisdom are much appreciated. 🙏

@anapaulagomes you have to use TLSv1.0 and the RC4-MD5 cipher.
The following command should work in the scraper environment:
curl -v --tlsv1.0 --ciphers RC4-MD5 https://www.diariooficial.feiradesantana.ba.gov.br/
You can get there by compiling OpenSSL with SSLv3 support.
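On Scrapy 1.8 or later, the equivalent can be attempted from settings alone, assuming your OpenSSL build still ships these legacy ciphers (a sketch mirroring the curl flags above, untested against that site):

# settings.py -- mirror `curl --tlsv1.0 --ciphers RC4-MD5`.
# DOWNLOADER_CLIENT_TLS_CIPHERS needs Scrapy >= 1.8, and RC4-MD5 is dropped
# from most modern OpenSSL builds, hence the custom-compiled OpenSSL above.
DOWNLOADER_CLIENT_TLS_METHOD = "TLSv1.0"
DOWNLOADER_CLIENT_TLS_CIPHERS = "RC4-MD5"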

@gbonesso commented Oct 6, 2020

I'm having the same problem with the URL
https://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=88001
In my case I just removed the "s" as a workaround, and I'm able to scrape the site without using SSL. Still trying the suggestions above to support SSL...
http://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=88001
@anapaulagomes, maybe this works for the Feira de Santana site...
http://www.diariooficial.feiradesantana.ba.gov.br/

@anapaulagomes

In my case, the website had changed its protocol, but after talking to them (meaning: complaining in public) they changed it again. Thanks, @gbonesso.
Also, before their latest change, we managed to run a Docker image thanks to @Laerte, using @unk2k's tips. Sharing in case someone is trapped in a problem like this. 👍🏽

@russian-developer (Author)

I had the same issue; in my case the solution was to set the USER_AGENT in the settings.py file:

USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'

This issue is about TLS problems; yours was a post-TLS connection issue.

@russian-developer (Author)

Scrapy is very sensitive to the OpenSSL version. Also keep in mind that Python, pyOpenSSL, and cryptography should all be compiled against your custom OpenSSL version, even when it's not the system-provided one.
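A quick diagnostic for that (a sketch; SSLeay_version and the cryptography backend import path vary between releases, so treat them as illustrative):

# Show which OpenSSL each layer of the stack actually links against; if
# these disagree, the compile mismatch described above is the likely culprit.
import ssl
from OpenSSL import SSL
from cryptography.hazmat.backends.openssl.backend import backend

print("stdlib ssl  :", ssl.OPENSSL_VERSION)
print("pyOpenSSL   :", SSL.SSLeay_version(SSL.SSLEAY_VERSION).decode())
print("cryptography:", backend.openssl_version_text())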

@russian-developer (Author) commented Oct 7, 2020

You can use my Dockerfile to avoid problems with broken TLS connections.

Dockerfile.base.zip

@wRAR (Member) commented Jan 29, 2023

Closing, as there is no single specific problem discussed here, the original issue is no longer reproducible, and we have many workarounds, some of which were mentioned above.

@wRAR closed this as not planned on Jan 29, 2023