TLS handshake failure #2717

Closed
povilasb opened this issue Apr 25, 2017 · 27 comments

@povilasb

I have this simple spider:

import scrapy


class FailingSpider(scrapy.Spider):
    name = 'Failing Spider'
    start_urls = ['https://www.skelbiu.lt/']

    def parse(self, response: scrapy.http.Response) -> None:
        pass

On Debian 9 it fails with:

2017-04-25 19:01:39 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.skelbiu.lt/>
Traceback (most recent call last):
  File "/home/povilas/projects/skelbiu-scraper/pyenv/lib/python3.6/site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/povilas/projects/skelbiu-scraper/pyenv/lib/python3.6/site-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/povilas/projects/skelbiu-scraper/pyenv/lib/python3.6/site-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]

On Debian 8 it works well, and https://www.skelbiu.lt is the only target with which I can reproduce the problem.

Some more context:

$ pyenv/bin/pip freeze
asn1crypto==0.22.0
attrs==16.3.0
Automat==0.5.0
cffi==1.10.0
constantly==15.1.0
cryptography==1.8.1
cssselect==1.0.1
funcsigs==0.4
idna==2.5
incremental==16.10.1
lxml==3.7.3
mock==1.3.0
packaging==16.8
parsel==1.1.0
pbr==3.0.0
py==1.4.33
pyasn1==0.2.3
pyasn1-modules==0.0.8
pycparser==2.17
PyDispatcher==2.0.5
PyHamcrest==1.8.5
pyOpenSSL==17.0.0
pyparsing==2.2.0
pytest==2.7.2
queuelib==1.4.2
Scrapy==1.3.3
service-identity==16.0.0
six==1.10.0
Twisted==17.1.0
w3lib==1.17.0
zope.interface==4.4.0

$ dpkg --get-selections | grep libssl
libssl-dev:amd64                                install
libssl-doc                                      install
libssl1.0.2:amd64                               install
libssl1.1:amd64                                 install
libssl1.1:i386                                  install


$ apt-cache show libssl1.1
Package: libssl1.1
Source: openssl
Version: 1.1.0e-1

Any ideas what I should look for? :)

@povilasb
Author

povilasb commented Apr 25, 2017

My hypothesis is that the server rejects the TLS ClientHello because of the ciphers it offers:

Cipher Suites (28 suites)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 (0xc02c)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (0xc030)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 (0x009f)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 (0xcca9)
    Cipher Suite: TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 (0xcca8)
    Cipher Suite: TLS_DHE_RSA_WITH_CHACHA20_POLY1305_SHA256 (0xccaa)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 (0xc02b)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (0xc02f)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256 (0x009e)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 (0xc024)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 (0xc028)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_256_CBC_SHA256 (0x006b)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 (0xc023)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 (0xc027)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 (0x0067)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA (0xc00a)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (0xc014)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_256_CBC_SHA (0x0039)
    Cipher Suite: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA (0xc009)
    Cipher Suite: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (0xc013)
    Cipher Suite: TLS_DHE_RSA_WITH_AES_128_CBC_SHA (0x0033)
    Cipher Suite: TLS_RSA_WITH_AES_256_GCM_SHA384 (0x009d)
    Cipher Suite: TLS_RSA_WITH_AES_128_GCM_SHA256 (0x009c)
    Cipher Suite: TLS_RSA_WITH_AES_256_CBC_SHA256 (0x003d)
    Cipher Suite: TLS_RSA_WITH_AES_128_CBC_SHA256 (0x003c)
    Cipher Suite: TLS_RSA_WITH_AES_256_CBC_SHA (0x0035)
    Cipher Suite: TLS_RSA_WITH_AES_128_CBC_SHA (0x002f)
    Cipher Suite: TLS_EMPTY_RENEGOTIATION_INFO_SCSV (0x00ff)

Wireshark shows this response from the server:

TLSv1.2 Record Layer: Alert (Level: Fatal, Description: Handshake Failure)
    Content Type: Alert (21)
    Version: TLS 1.2 (0x0303)
    Length: 2
    Alert Message
        Level: Fatal (2)
        Description: Handshake Failure (40)

It comes immediately after the TLS ClientHello message.
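
As a diagnostic, it can help to try the same handshake with another OpenSSL-based client on the same machine, for example the standard-library ssl module; if that also fails, the problem is more likely in the Python/OpenSSL stack than in Scrapy or Twisted. A minimal Python 3 sketch (host taken from this issue):

import socket
import ssl

# Attempt a plain TLS handshake with the stdlib ssl module and report either
# the negotiated protocol and cipher or the handshake error.
ctx = ssl.create_default_context()
try:
    with socket.create_connection(("www.skelbiu.lt", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="www.skelbiu.lt") as tls:
            print(tls.version(), tls.cipher())
except ssl.SSLError as exc:
    print("handshake failed:", exc)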

@kmike
Member

kmike commented Apr 25, 2017

@redapple is the man who knows everything about such issues, but have you tried setting a different DOWNLOADER_CLIENT_TLS_METHOD option value?
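
For reference, that setting goes in the project's settings.py (or can be passed with -s on the command line); a minimal sketch pinning one specific TLS version:

# settings.py (sketch): pin the downloader's TLS version instead of letting it
# negotiate. Accepted values include 'TLS' (the default, negotiate),
# 'TLSv1.0', 'TLSv1.1' and 'TLSv1.2'.
DOWNLOADER_CLIENT_TLS_METHOD = 'TLSv1.2'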

@povilasb
Author

Unfortunately, changing the TLS version does not help.

@redapple
Contributor

redapple commented Apr 25, 2017 via email

@povilasb
Author

How do you make Scrapy/Python choose a specific OpenSSL version?

@redapple
Contributor

I haven't tried it myself yet, but I believe you can use https://cryptography.io/en/latest/installation/#static-wheels

I was planning on using a Debian 9 Sid Docker image.

@redapple
Contributor

Alright, I just tried #2717 (comment)
and I was able to reproduce the issue:

$ scrapy version -v
Scrapy    : 1.3.3
lxml      : 3.7.3.0
libxml2   : 2.9.3
cssselect : 1.0.1
parsel    : 1.1.0
w3lib     : 1.17.0
Twisted   : 17.1.0
Python    : 2.7.12+ (default, Sep 17 2016, 12:08:02) - [GCC 6.2.0 20160914]
pyOpenSSL : 17.0.0 (OpenSSL 1.1.0e  16 Feb 2017)
Platform  : Linux-4.8.0-49-generic-x86_64-with-Ubuntu-16.10-yakkety


$ cat testssl.py
import scrapy


class FailingSpider(scrapy.Spider):
    name = 'Failing Spider'
    start_urls = ['https://www.skelbiu.lt/']

    def parse(self, response):
        pass



$ scrapy runspider testssl.py 
2017-04-26 15:45:18 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: scrapybot)
2017-04-26 15:45:18 [scrapy.utils.log] INFO: Overridden settings: {'SPIDER_LOADER_WARN_ONLY': True}
(...)
2017-04-26 15:45:18 [scrapy.core.engine] INFO: Spider opened
2017-04-26 15:45:19 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-04-26 15:45:19 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-04-26 15:45:19 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.skelbiu.lt/> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-04-26 15:45:19 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.skelbiu.lt/> (failed 2 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-04-26 15:45:19 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.skelbiu.lt/> (failed 3 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-04-26 15:45:19 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.skelbiu.lt/>: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-04-26 15:45:19 [scrapy.core.engine] INFO: Closing spider (finished)
2017-04-26 15:45:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 3,
 'downloader/request_bytes': 636,
 'downloader/request_count': 3,
 'downloader/request_method_count/GET': 3,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 4, 26, 13, 45, 19, 855881),
 'log_count/DEBUG': 4,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'scheduler/dequeued': 3,
 'scheduler/dequeued/memory': 3,
 'scheduler/enqueued': 3,
 'scheduler/enqueued/memory': 3,
 'start_time': datetime.datetime(2017, 4, 26, 13, 45, 19, 1654)}
2017-04-26 15:45:19 [scrapy.core.engine] INFO: Spider closed (finished)

@redapple
Contributor

For the record, I've collected .pcap files and the expanded ClientHello messages for Scrapy and the OpenSSL client, with OpenSSL 1.0.2g and 1.1.0e, in https://github.com/redapple/scrapy-issues/tree/master/2717

I'm leaning towards something to do with Elliptic Curves.
I'll keep you updated.

@redapple
Contributor

Yeah, it looks like an EC thing:

Using Twisted trunk and patching this line with

-_defaultCurveName = u"prime256v1"
+_defaultCurveName = u"secp384r1"

made the connection to https://www.skelbiu.lt/ work for me.

Now, I'll have a look at how to properly configure this with Twisted Agent.
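
For anyone who cannot patch or upgrade Twisted itself, the same change can probably be applied at runtime by monkey-patching the module attribute before the crawl starts. This is an untested sketch; _defaultCurveName is a private name in twisted.internet._sslverify and may not exist in other Twisted versions:

from twisted.internet import _sslverify

# Untested sketch: replace Twisted's hard-coded default curve (prime256v1)
# with secp384r1, mirroring the source patch above. Private, version-specific API.
_sslverify._defaultCurveName = u"secp384r1"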

@redapple
Contributor

redapple commented Apr 26, 2017

From what I see on https://www.ssllabs.com/ssltest/analyze.html?d=www.skelbiu.lt&s=92.62.130.22&hideResults=on, the website indeed requires (at least?) "secp384r1", which I tested in #2717 (comment)

By default, the OpenSSL 1.1.0e client sends:

                Elliptic curves (4 curves)
                    Elliptic curve: ecdh_x25519 (0x001d)
                    Elliptic curve: secp256r1 (0x0017)
                    Elliptic curve: secp521r1 (0x0019)
                    Elliptic curve: secp384r1 (0x0018)

but Scrapy 1.3.3/Twisted 17.1 with OpenSSL 1.1.0e only sends:

                Elliptic curves (1 curve)
                    Elliptic curve: secp256r1 (0x0017)

The code in Twisted that uses _defaultCurveName = u"prime256v1" was apparently added 3 years ago. Maybe OpenSSL now actually uses that setting; I'm not sure.

A couple of (non-exclusive) options:

  • report this to the Twisted team to see what they think, and maybe allow configurable elliptic curves (like it already does for ciphers)
  • work on the SSL context in Scrapy and force EC settings (a rough sketch follows below)
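
A rough, untested sketch of the second option, wiring a custom context factory into Scrapy via DOWNLOADER_CLIENTCONTEXTFACTORY (the module path below is only illustrative). Note that forcing a single curve this way narrows what the ClientHello advertises, so it could break handshakes with servers that only accept other curves:

from OpenSSL import crypto
from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory


# Enable with e.g.:
#   DOWNLOADER_CLIENTCONTEXTFACTORY = 'myproject.contextfactory.ForcedCurveContextFactory'
class ForcedCurveContextFactory(ScrapyClientContextFactory):
    def getContext(self, hostname=None, port=None):
        ctx = super(ForcedCurveContextFactory, self).getContext(hostname, port)
        # Force the ECDH curve this server requires; with OpenSSL 1.1.0 this
        # also limits the curves advertised in the ClientHello to this one.
        ctx.set_tmp_ecdh(crypto.get_elliptic_curve(u'secp384r1'))
        return ctx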

@redapple
Contributor

FYI, I've sent a message to the Twisted Web mailing list: https://twistedmatrix.com/pipermail/twisted-web/2017-April/005293.html

@redapple
Contributor

I just tested with Twisted 17.5.0rc2 and this does NOT look fixed.

@felixonmars
Contributor

For me the issue is https://bugs.python.org/issue29697

That patch postdates all stable Python releases, and I get the same error with urllib2.urlopen on Python 2.7 here. Applying the patch from that issue fixes it for me.
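
A rough repro of that outside Scrapy (Python 2.7; the URL is taken from this issue and is only an example):

import urllib2

# On an affected Python 2.7 / OpenSSL 1.1.0 setup this raises the same
# 'sslv3 alert handshake failure' as the Scrapy log above.
urllib2.urlopen('https://www.skelbiu.lt/')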

@redapple
Contributor

redapple commented Jul 6, 2017

Twisted bug: https://twistedmatrix.com/trac/ticket/9210
(I had not opened it at the time)

@jsakars

jsakars commented Jul 13, 2017

I'm having the same issue with the following versions:

Scrapy    : 1.4.0
lxml      : 3.8.0.0
libxml2   : 2.9.4
cssselect : 1.0.1
parsel    : 1.2.0
w3lib     : 1.17.0
Twisted   : 17.5.0
Python    : 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13) - [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
pyOpenSSL : 17.1.0 (OpenSSL 1.1.0f  25 May 2017)
Platform  : Darwin-16.6.0-x86_64-i386-64bit

Is there a workaround?

@redapple
Contributor

@werdlv, I don't know of any workaround.
Can you say which website is showing this failure? (to check whether it's indeed related to OpenSSL 1.1 with Twisted)

@jsakars

jsakars commented Jul 13, 2017

@redapple sure. At least these are giving the SSL error:

  1. https://www.cvbankas.lt/
  2. https://www.skelbiu.lt/

Here are some that are working without errors:

  1. https://www.cvmarket.lt/
  2. https://www.alio.lt/
  3. https://cvzona.lt/

@redapple
Contributor

Thanks @werdlv.
So it appears that https://www.skelbiu.lt/ and https://www.cvbankas.lt/ are served by the same machines 92.62.130.22 and 92.62.130.23.
https://www.skelbiu.lt/ is the host in this very issue (#2717 (comment))

@tonal
Contributor

tonal commented Oct 8, 2017

The same error also occurs with the site https://www.teplodvor.ru/

@tonal
Contributor

tonal commented Oct 9, 2017

See also #2944.

@redapple
Contributor

redapple commented Oct 9, 2017

Right, @tonal. https://www.teplodvor.ru/ does not look compatible with OpenSSL 1.1 (some weak ciphers were removed).
Downgrading to cryptography<2, which ships with OpenSSL 1.0.2 (at least for me, on Ubuntu), makes it work.

@sulangsss

sulangsss commented Dec 10, 2017

@redapple I have run pip install --upgrade 'cryptography<2', but it does not work.

url: https://www.archdaily.com

Scrapy : 1.4.0
lxml : 4.1.1.0
libxml2 : 2.9.7
cssselect : 1.0.1
parsel : 1.2.0
w3lib : 1.18.0
Twisted : 17.9.0
Python : 3.6.3 (default, Oct 24 2017, 14:48:20) - [GCC 7.2.0]
pyOpenSSL : 17.5.0 (OpenSSL 1.1.0g 2 Nov 2017)
Platform : Linux-4.9.66-1-MANJARO-x86_64-with-arch-Manjaro-Linux

<GET https://www.archdaily.com>
2017-12-10 16:14:21 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.archdaily.com> (failed 1 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-12-10 16:14:26 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.archdaily.com> (failed 2 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-12-10 16:14:27 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <GET https://www.archdaily.com> (failed 3 times): [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]
2017-12-10 16:14:27 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.archdaily.com>: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]>]

@raphapassini
Contributor

@sulangsss it seems that you are still using OpenSSL 1.1.0:
pyOpenSSL : 17.5.0 (OpenSSL 1.1.0g 2 Nov 2017)
Try installing OpenSSL 1.0.x.
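
To see which OpenSSL build the environment is actually linked against (this is where the version string in parentheses above comes from), a quick check with pyOpenSSL:

import OpenSSL

# pyOpenSSL version and the OpenSSL library it is linked against,
# e.g. 17.5.0 and b'OpenSSL 1.1.0g  2 Nov 2017'.
print(OpenSSL.__version__)
print(OpenSSL.SSL.SSLeay_version(OpenSSL.SSL.SSLEAY_VERSION))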

@Laruxo

Laruxo commented Apr 18, 2018

I just installed Twisted==18.4.0rc1 and www.skelbiu.lt seems to work for me.

@Gallaecio
Member

Closing since this has been fixed in Twisted 18.4.0.

@cpatulea

cpatulea commented Sep 9, 2019

I'm experiencing this on Ubuntu 18.04 (Twisted 17.9.0, OpenSSL 1.1.1). I cannot update to newer packages, but I do control my entire application. I've added this workaround to my main file, after the imports:

from twisted.internet import _sslverify

# Make Twisted's CertificateOptions skip its single-curve ECDH setup, so the
# ClientHello keeps OpenSSL's default curve list instead of only prime256v1.
def _raise(_):
    raise NotImplementedError()
_sslverify._OpenSSLECCurve = _raise

This should probably be used only as a last resort if libraries cannot be updated.

@iamarifdev

(quoting @cpatulea's workaround above)

It's working for version 1.4.0.
