Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Use best practices for TLS connections when using Twisted>=14.0 #1794

Merged
merged 10 commits into from Feb 24, 2016

Conversation

@redapple
Copy link
Contributor

@redapple redapple commented Feb 19, 2016

This is another take at the various TLS, SSL, SNI issues, assuming scrapy using Twisted 14.0+

It solved #1227 for me.

It changes the default TLSv1.0-only to SSLv23 (protocol version negotiation, "version flexible" as OpenSSL calls it)

A new setting DOWNLOADER_CLIENT_TLS_METHOD allows setting the TLS/SSL method for the HTTP/1.1 downloader.
Setting values are strings, mapped to OpenSSL.SSL methods, "TLS" being the default, version flexible, method.

@redapple redapple changed the title Use best practices for TLS connections when using Twisted>=14.0 [WIP] Use best practices for TLS connections when using Twisted>=14.0 Feb 19, 2016
redapple added 2 commits Feb 19, 2016
Also remove TLSv1.1 and TLSv1.2 method: these are available only
from pyOpenSSL 0.14
https://github.com/pyca/pyopenssl/releases/tag/v0.14a1
New DOWNLOADER_CLIENT_TLS_METHOD setting to configure TLS method
@redapple
Copy link
Contributor Author

@redapple redapple commented Feb 20, 2016

server.pem from redapple@57990fb was actually copied from https://twistedmatrix.com/trac/browser/tags/releases/twisted-15.5.0/twisted/test when running tests with BrowserLikePolicyForHTTPS-based context factory
then I switched to non-verifying version. I'll see to remove that change

redapple added 2 commits Feb 20, 2016
Revert test server certificate change
@codecov-io
Copy link

@codecov-io codecov-io commented Feb 20, 2016

Current coverage is 83.15%

Merging #1794 into master will increase coverage by +0.05% as of a447a0f

Powered by Codecov. Updated on successful CI builds.

@redapple redapple changed the title [WIP] Use best practices for TLS connections when using Twisted>=14.0 Use best practices for TLS connections when using Twisted>=14.0 Feb 20, 2016
@@ -18,6 +18,7 @@
from scrapy.http import Headers
from scrapy.responsetypes import responsetypes
from scrapy.core.downloader.webclient import _parse
from scrapy.core.downloader.tls import openssl_methods, METHOD_TLS

This comment has been minimized.

@dangra

dangra Feb 23, 2016
Member

METHOD_TLS not needed

This comment has been minimized.

@redapple

redapple Feb 23, 2016
Author Contributor

Yup. good catch.

self._contextFactory = self._contextFactoryClass(method=self._sslMethod)
except TypeError:
# use defaults
self._contextFactory = self._contextFactoryClass()

This comment has been minimized.

@dangra

dangra Feb 23, 2016
Member

can it warn when a method-less factory is used?

This comment has been minimized.

@redapple

redapple Feb 23, 2016
Author Contributor

Good idea. Worth a mention in the settings docs. if you use your own client factory, make sure it accepts a TLS method string argument (that you can discard) or similar

This comment has been minimized.

@redapple

redapple Feb 24, 2016
Author Contributor

@dangra , I added the warning in redapple@c9890d5, but I'm not really happy with how it displays:

2016-02-24 01:12:51 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-02-24 01:12:51 [py.warnings] WARNING: /home/paul/src/scrapy/scrapy/core/downloader/handlers/http11.py:48: UserWarning: 
    You are using a context factory class that does not accept the `method` argument
    (type OpenSSL.SSL method, e.g. OpenSSL.SSL.SSLv23_METHOD).
    Please upgrade your context factory class to handle or ignore it.
  Please upgrade your context factory class to handle or ignore it.""")

I don't understand why Please upgrade your context factory class to handle or ignore it.""") is repeated.
Any ideas? Any suggestions on making it prettier?

This comment has been minimized.

@dangra

dangra Feb 24, 2016
Member

it is the content of the warning line, it only prints the last line
because it is multi line.
You can assign the warning message to a variable and then use it as message
parameter.

On Tue, Feb 23, 2016 at 9:39 PM, Paul Tremberth notifications@github.com
wrote:

In scrapy/core/downloader/handlers/http11.py
#1794 (comment):

     self._contextFactoryClass = load_object(settings['DOWNLOADER_CLIENTCONTEXTFACTORY'])
  •    self._contextFactory = self._contextFactoryClass()
    
  •    # try method-aware context factory
    
  •    try:
    
  •        self._contextFactory = self._contextFactoryClass(method=self._sslMethod)
    
  •    except TypeError:
    
  •        # use defaults
    
  •        self._contextFactory = self._contextFactoryClass()
    

@dangra https://github.com/dangra , I added the warning in redapple@
c9890d5
redapple@c9890d5,
but I'm not really happy with how it displays:

2016-02-24 01:12:51 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-02-24 01:12:51 [py.warnings] WARNING: /home/paul/src/scrapy/scrapy/core/downloader/handlers/http11.py:48: UserWarning:
You are using a context factory class that does not accept the method argument
(type OpenSSL.SSL method, e.g. OpenSSL.SSL.SSLv23_METHOD).
Please upgrade your context factory class to handle or ignore it.
Please upgrade your context factory class to handle or ignore it.""")

I don't understand why Please upgrade your context factory class to
handle or ignore it.""") is repeated.
Any ideas? Any suggestions on making it prettier?


Reply to this email directly or view it on GitHub
https://github.com/scrapy/scrapy/pull/1794/files#r53876725.

This comment has been minimized.

@redapple

redapple Feb 24, 2016
Author Contributor

thx @dangra .
It's a bit nicer now in redapple@0336c25 :

2016-02-24 16:41:46 [py.warnings] WARNING: /home/paul/src/scrapy/scrapy/core/downloader/handlers/http11.py:50: UserWarning: 
 'sslissues.contextfactory.TLSFlexibleContextFactory' does not accept `method` argument (type OpenSSL.SSL method, e.g. OpenSSL.SSL.SSLv23_METHOD). Please upgrade your context factory class to handle it or ignore it.
  warnings.warn(msg)

@dangra
Copy link
Member

@dangra dangra commented Feb 23, 2016

what do you think about backporting it to v1.0 but keeping the default for TLS v1 ?

@redapple
Copy link
Contributor Author

@redapple redapple commented Feb 23, 2016

@dangra , I wasn't planning on another 1.0 release.
Do you think it's needed?
Also, in any case, it'd be good to test it out with users and 1.1rc2, so they can tell us if it's indeed better with this fix

@dangra
Copy link
Member

@dangra dangra commented Feb 23, 2016

@redapple: testing on 1.1rc2 sounds good to me, let's see later if we need to backport it to 1.0 then. thankx

@redapple redapple added this to the v1.1 milestone Feb 24, 2016
METHOD_TLSv12 = 'TLSv1.2'

openssl_methods = {
METHOD_TLS: SSL.SSLv23_METHOD, # protocol negotiation (recommended)

This comment has been minimized.

@kmike

kmike Feb 24, 2016
Member

Where did this name came from? It is a surprising to have METHOD_TLS doing negotiation and falling back to SSL.

This comment has been minimized.

@redapple

redapple Feb 24, 2016
Author Contributor

This is similar to OpenSSL (yet to come) 1.1:
https://www.openssl.org/docs/manmaster/ssl/SSL_CTX_new.html

 #define SSLv23_method           TLS_method
 #define SSLv23_server_method    TLS_server_method
 #define SSLv23_client_method    TLS_client_method

TLS_method(), TLS_server_method(), TLS_client_method()
These are the general-purpose version-flexible SSL/TLS methods. The actual protocol version used will be negotiated to the highest version mutually supported by the client and the server. The supported protocols are SSLv3, TLSv1, TLSv1.1 and TLSv1.2. Most applications should use these method, and avoid the version specific methods described below.

SSLv23_method(), SSLv23_server_method(), SSLv23_client_method()
Use of these functions is deprecated. They have been replaced with the above TLS_method(), TLS_server_method() and TLS_client_method() respectively. New code should use those functions instead.

Also see https://www.openssl.org/news/changelog.txt

Changes between 1.0.2f and 1.1.0  [xx XXX xxxx]
(...)
  *) Version negotiation has been rewritten. In particular SSLv23_method(),
     SSLv23_client_method() and SSLv23_server_method() have been deprecated,
     and turned into macros which simply call the new preferred function names
     TLS_method(), TLS_client_method() and TLS_server_method(). All new code
     should use the new names instead. Also as part of this change the ssl23.h
     header file has been removed.
     [Matt Caswell]

I would not want us to use SSLv23 as a setting default so as no to confuse users that it's SSL only.
But I'm open for other names for the default protocol flexible setting.

This comment has been minimized.

@redapple

redapple Feb 24, 2016
Author Contributor

does it make sense, @kmike ?
or you prefer to use some other name?

@dangra
Copy link
Member

@dangra dangra commented Feb 24, 2016

LGTM.

@dangra dangra changed the title Use best practices for TLS connections when using Twisted>=14.0 [MRG+1] Use best practices for TLS connections when using Twisted>=14.0 Feb 24, 2016
kmike added a commit that referenced this pull request Feb 24, 2016
[MRG+1] Use best practices for TLS connections when using Twisted>=14.0
@kmike kmike merged commit f7a48b0 into scrapy:master Feb 24, 2016
1 of 2 checks passed
1 of 2 checks passed
codecov/patch 89.00% of diff hit (target 100.00%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.