Improve cookie handling #5463

Open
wants to merge 22 commits into
base: master

@farsene (Contributor) commented Apr 4, 2022

This PR adds storage of cookies in a local file, which allows cookies to be shared across spiders, and provides interface methods for spiders to retrieve them. A spider automatically loads the saved cookies when it is opened and writes any new cookies to the file when it is closed, so that a newly opened spider can reuse the existing cookies.

Part of #5431
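
To make the load-on-open / save-on-close flow concrete, here is a minimal sketch using only the standard library's `http.cookiejar`. It is not the code in this PR: `FileBackedCookies`, `make_cookie`, and the temporary file path are hypothetical names chosen for illustration, and the PR's actual storage format may differ.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain):
    """Build a minimal session cookie for illustration (hypothetical helper)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


class FileBackedCookies:
    """Hypothetical helper: load cookies on open, write them back on close."""

    def __init__(self, path):
        self.jar = http.cookiejar.LWPCookieJar(path)

    def open(self):
        # Reuse cookies left behind by a previous run, if the file exists.
        if os.path.exists(self.jar.filename):
            self.jar.load(ignore_discard=True, ignore_expires=True)

    def close(self):
        # Persist session cookies too, so the next spider can pick them up.
        self.jar.save(ignore_discard=True, ignore_expires=True)


path = os.path.join(tempfile.mkdtemp(), "cookies.txt")

# First "spider": starts with no file, stores one cookie, saves on close.
first = FileBackedCookies(path)
first.open()
first.jar.set_cookie(make_cookie("session", "abc123", "example.com"))
first.close()

# Second "spider": loads what the first one saved.
second = FileBackedCookies(path)
second.open()
shared = {c.name: c.value for c in second.jar}
```

In this sketch the second instance sees the cookie written by the first only because `ignore_discard=True` is passed on both save and load; without it, session cookies would be dropped.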

@codecov bot commented Apr 4, 2022

Codecov Report

Merging #5463 (3f279e8) into master (2d6042b) will decrease coverage by 0.27%.
The diff coverage is 63.63%.

@@            Coverage Diff             @@
##           master    #5463      +/-   ##
==========================================
- Coverage   88.75%   88.48%   -0.28%     
==========================================
  Files         163      165       +2     
  Lines       10666    10786     +120     
  Branches     1818     1836      +18     
==========================================
+ Hits         9467     9544      +77     
- Misses        923      961      +38     
- Partials      276      281       +5     

Impacted Files                             Coverage Δ
scrapy/http/cookies.py                     83.07% <23.07%> (-5.82%) ⬇️
scrapy/spiders/__init__.py                 85.89% <47.61%> (-14.11%) ⬇️
scrapy/storage/__init__.py                 66.66% <66.66%> (ø)
scrapy/storage/in_memory.py                70.00% <70.00%> (ø)
scrapy/downloadermiddlewares/cookies.py    93.60% <80.00%> (-4.30%) ⬇️
scrapy/settings/default_settings.py        98.77% <100.00%> (+0.02%) ⬆️

Comment on lines +191 to +216
.. method:: set_cookie_jar(cj)

   Initiate the spider's cookie jar from an existing one

.. method:: add_cookie(cookie)

   Add cookie to the spider's cookiejar

.. method:: get_cookies(name, names, return_type)

   :param name: name of cookie to fetch, default = None
   :type name: str

   :param names: names of cookies to fetch, default = None
   :type names: List[str]

   :param return_type: return format if multiple cookies, default = list, options are list, dict
   :type return_type: object

   Get cookie by name or the cookies whose name is in names.
   If names is used, then the return value will be in the format of the return_type

.. method:: clear_cookies()

   Erase spider's cookiejar
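
For discussion, one possible reading of the `get_cookies` parameters documented above can be sketched as follows. This is an assumption-laden illustration, not the PR's implementation: it operates on a plain `dict` rather than a Scrapy cookie jar, takes the jar as an explicit first argument, and the behaviour for a missing name or for a call with neither `name` nor `names` is a guess, since the docs do not specify it.

```python
def get_cookies(jar, name=None, names=None, return_type=list):
    """Hypothetical reading of the documented behaviour, over a plain dict."""
    if name is not None:
        # Single lookup: return the value, or None if absent (assumption).
        return jar.get(name)
    if names is not None:
        # Keep only the requested names that are actually present.
        found = {n: jar[n] for n in names if n in jar}
        if return_type is dict:
            return found
        return list(found.values())
    # Neither argument given: the docs do not say; this sketch returns None.
    return None


jar = {"session": "abc123", "lang": "en", "theme": "dark"}

single = get_cookies(jar, name="session")
as_list = get_cookies(jar, names=["lang", "theme"])
as_dict = get_cookies(jar, names=["lang", "missing"], return_type=dict)
```

The sketch makes one ambiguity in the docs visible: `return_type` only has an effect when `names` is used, so a caller passing `name` together with `return_type=dict` still gets a bare value back.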

Member

This is a big API change, which should probably be discussed in #1878 before working on an implementation.

4 participants