
Update broad-crawls.rst #1264

Closed
rmuellerb wants to merge 1 commit

Conversation

rmuellerb (Contributor)

Added a section on how to handle the memory consumption problems of broad crawls.

kmike (Member) commented May 29, 2015

👍 I like the idea of adding this kind of information to docs, it could be very helpful.

For this particular case, see http://doc.scrapy.org/en/master/topics/leaks.html#too-many-requests: if you use a disk queue and callbacks are spider methods (not lambdas), the memory required to store all requests should be less of an issue.
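
As a concrete illustration of that suggestion, here is a minimal sketch (not part of this patch; the spider name, start URL, and `JOBDIR` path are placeholders): setting `JOBDIR` enables Scrapy's persistent disk queues, and requests whose callbacks are spider methods can be serialized into them, while lambda callbacks cannot and stay in memory.

```python
import scrapy


class BroadSpider(scrapy.Spider):
    """Hypothetical broad-crawl spider keeping pending requests on disk."""

    name = "broad"
    start_urls = ["http://example.com"]

    # JOBDIR enables persistent (disk-based) request queues, so pending
    # requests are stored on disk instead of in memory.
    custom_settings = {"JOBDIR": "crawls/broad-1"}

    def parse(self, response):
        for href in response.xpath("//a/@href").extract():
            # A spider method as callback can be serialized to the disk queue.
            yield scrapy.Request(response.urljoin(href), callback=self.parse_page)
            # A lambda callback cannot be pickled, so such a request would
            # fall back to the in-memory queue:
            # yield scrapy.Request(url, callback=lambda r: None)

    def parse_page(self, response):
        yield {"url": response.url}
```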

3) **Use the profiling and trackref capabilities of Scrapy:** Scrapy provides its own interactive profiling and reference-tracking tool. See `debugging memory leaks`_ for more information.

.. _debugging memory leaks: http://doc.scrapy.org/en/latest/topics/leaks.html
.. _Requests: http://doc.scrapy.org/en/latest/topics/request-response.html#request-objects
Review comment (Member):

Internal cross-references should use proper reST markup instead of fixed links to the latest version.
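
For context, the trackref tool that point 3 of the patch describes lives in `scrapy.utils.trackref`; a minimal sketch of inspecting live references (in the telnet console, the same table is available via the `prefs()` shortcut):

```python
from scrapy.utils.trackref import print_live_refs, get_oldest, iter_all

# Print a table of live tracked objects (Requests, Responses, spiders, ...)
# grouped by class, with the age of the oldest instance of each.
print_live_refs()

# Get the oldest live Request still referenced somewhere, or None.
oldest = get_oldest("Request")
if oldest is not None:
    print(oldest.url)

# Iterate over every live tracked HtmlResponse.
for response in iter_all("HtmlResponse"):
    print(response.url)
```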

@Gallaecio (Member)

Closing due to inactivity, to be continued at #3866.

Gallaecio closed this on Jul 11, 2019