Drop the Authorization header on cross-domain redirect
Gallaecio committed Nov 15, 2023
1 parent 5fccf37 commit 080fecd
Showing 3 changed files with 66 additions and 2 deletions.
27 changes: 27 additions & 0 deletions docs/news.rst
@@ -3,6 +3,20 @@
 Release notes
 =============

+.. _release-2.11.1:
+
+Scrapy 2.11.1 (unreleased)
+--------------------------
+
+**Security bug fix:**
+
+- The ``Authorization`` header is now dropped on redirects to a different
+  domain. Please see the `cw9j-q3vf-hrrv security advisory`_ for more
+  information.
+
+.. _cw9j-q3vf-hrrv security advisory: https://github.com/scrapy/scrapy/security/advisories/GHSA-cw9j-q3vf-hrrv
+
+
 .. _release-2.11.0:

 Scrapy 2.11.0 (2023-09-18)
@@ -2869,6 +2883,19 @@ affect subclasses:

 (:issue:`3884`)

+.. _release-1.8.4:
+
+Scrapy 1.8.4 (unreleased)
+-------------------------
+
+**Security bug fix:**
+
+- The ``Authorization`` header is now dropped on redirects to a different
+  domain. Please see the `cw9j-q3vf-hrrv security advisory`_ for more
+  information.
+
+.. _cw9j-q3vf-hrrv security advisory: https://github.com/scrapy/scrapy/security/advisories/GHSA-cw9j-q3vf-hrrv
+

 .. _release-1.8.3:

10 changes: 8 additions & 2 deletions scrapy/downloadermiddlewares/redirect.py
@@ -17,11 +17,17 @@ def _build_redirect_request(source_request, *, url, **kwargs):
         **kwargs,
         cookies=None,
     )
-    if "Cookie" in redirect_request.headers:
+    has_cookie_header = "Cookie" in redirect_request.headers
+    has_authorization_header = "Authorization" in redirect_request.headers
+    if has_cookie_header or has_authorization_header:
         source_request_netloc = urlparse_cached(source_request).netloc
         redirect_request_netloc = urlparse_cached(redirect_request).netloc
         if source_request_netloc != redirect_request_netloc:
-            del redirect_request.headers["Cookie"]
+            if has_cookie_header:
+                del redirect_request.headers["Cookie"]
+            # https://fetch.spec.whatwg.org/#ref-for-cors-non-wildcard-request-header-name
+            if has_authorization_header:
+                del redirect_request.headers["Authorization"]
     return redirect_request


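Note: the rule this hunk implements can be sketched outside Scrapy with only the standard library. The helper name drop_sensitive_headers and the use of a plain dict are assumptions made for illustration; Scrapy itself operates on Request objects and its own case-insensitive Headers class, so this is a rough model of the behaviour rather than the middleware's code.

```python
from urllib.parse import urlparse

# Headers that carry credentials and should not follow a redirect
# to a different network location.
SENSITIVE_HEADERS = ("Cookie", "Authorization")


def drop_sensitive_headers(source_url, redirect_url, headers):
    """Return a copy of `headers` for the redirected request.

    Hypothetical helper: credentials are removed only when the netloc
    of the redirect target differs from the netloc of the source URL.
    """
    redirected = dict(headers)
    if urlparse(source_url).netloc != urlparse(redirect_url).netloc:
        for name in SENSITIVE_HEADERS:
            # pop() with a default tolerates headers that are absent.
            redirected.pop(name, None)
    return redirected


headers = {"Cookie": "a=b", "Authorization": "a", "A": "B"}
same_site = drop_sensitive_headers(
    "https://example.com", "https://example.com/a", headers
)
cross_site = drop_sensitive_headers(
    "https://example.com", "https://example.org/a", headers
)
```

With these inputs, same_site keeps all three headers and cross_site keeps only the "A" header, which mirrors the two cases exercised by the test added in this commit.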
31 changes: 31 additions & 0 deletions tests/test_downloadermiddleware_redirect.py
@@ -247,6 +247,37 @@ def test_utf8_location(self):
         perc_encoded_utf8_url = "http://scrapytest.org/a%C3%A7%C3%A3o"
         self.assertEqual(perc_encoded_utf8_url, req_result.url)

+    def test_cross_domain_header_dropping(self):
+        safe_headers = {"A": "B"}
+        original_request = Request(
+            "https://example.com",
+            headers={"Cookie": "a=b", "Authorization": "a", **safe_headers},
+        )
+
+        internal_response = Response(
+            "https://example.com",
+            headers={"Location": "https://example.com/a"},
+            status=301,
+        )
+        internal_redirect_request = self.mw.process_response(
+            original_request, internal_response, self.spider
+        )
+        self.assertIsInstance(internal_redirect_request, Request)
+        self.assertEqual(original_request.headers, internal_redirect_request.headers)
+
+        external_response = Response(
+            "https://example.com",
+            headers={"Location": "https://example.org/a"},
+            status=301,
+        )
+        external_redirect_request = self.mw.process_response(
+            original_request, external_response, self.spider
+        )
+        self.assertIsInstance(external_redirect_request, Request)
+        self.assertEqual(
+            safe_headers, external_redirect_request.headers.to_unicode_dict()
+        )
+

 class MetaRefreshMiddlewareTest(unittest.TestCase):
     def setUp(self):
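Note: the test above covers example.com versus example.org, but the check in redirect.py compares netloc values, not registered domains. The sketch below uses the standard library's urlparse to illustrate what that comparison treats as a different destination. The function name is_cross_netloc is hypothetical, and the sketch assumes the URLs reach the comparison unchanged; any normalization Scrapy applies to request URLs is not modelled here.

```python
from urllib.parse import urlparse


def is_cross_netloc(source_url, redirect_url):
    """Hypothetical helper: True when the two URLs differ in netloc."""
    return urlparse(source_url).netloc != urlparse(redirect_url).netloc


source = "https://example.com"

# A different registered domain changes the netloc.
different_domain = is_cross_netloc(source, "https://example.org/a")

# A subdomain is a different netloc as well.
subdomain = is_cross_netloc(source, "https://www.example.com/a")

# An explicit port is part of the netloc, so it also counts as a change.
explicit_port = is_cross_netloc(source, "https://example.com:8443/a")

# The scheme is not part of the netloc, so a scheme-only change
# is not detected by this comparison.
scheme_only = is_cross_netloc(source, "http://example.com/a")
```

Under these assumptions the first three results are True and scheme_only is False, so "different domain" in the release note is best read as "different netloc".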
