Summary
Add logging of current memory usage to the memusage extension.
Motivation
Scrapy jobs with MEMUSAGE_ENABLED: True and a defined MEMUSAGE_LIMIT_MB (all jobs on Scrapy Cloud) can be stopped early due to excessive RAM usage and receive the memusage_exceeded outcome.
The first step in debugging a RAM memory leak is to identify the pattern of RAM usage:
Is RAM usage continuously increasing at a high rate throughout the run?
Or does RAM usage stay stable for hours or even days and then rapidly exceed the limit in the last several minutes?
Each pattern requires a different approach to debugging the leak.
Debugging this would be much easier if the value of self.get_virtual_size() were written to the log in the _check_limit method of the memusage extension (scrapy/scrapy/extensions/memusage.py, lines 77 to 89 at 6ded3cf).
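As a minimal sketch of the proposed change, the periodic check could log the measured size on every invocation instead of staying silent until a threshold is crossed. The class and method bodies below are stand-ins, not Scrapy's actual implementation; only the names get_virtual_size and _check_limit come from the extension itself:

```python
import logging

logger = logging.getLogger(__name__)


class MemusageCheckSketch:
    """Hypothetical stand-in for the memusage extension's periodic check."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes

    def get_virtual_size(self):
        # Stand-in for the real measurement (resource/psutil based in Scrapy).
        return 512 * 1024 * 1024

    def _check_limit(self):
        current = self.get_virtual_size()
        # The proposed addition: log current usage on every check,
        # so the usage pattern over the whole run is visible in the log.
        logger.info(
            "Current memory usage: %(current)d bytes (limit: %(limit)d bytes)",
            {"current": current, "limit": self.limit},
        )
        return current > self.limit
```

With such a line in place, the job log itself shows whether usage grows steadily or spikes at the end, which is exactly the distinction described above.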
Describe alternatives you've considered
Setting MEMUSAGE_WARNING_MB to ~80-90% of MEMUSAGE_LIMIT_MB: the current implementation of the memusage extension warns only once, so this does not produce enough data.
Manually subclassing the memusage extension with similar changes: like any other option, this requires rescheduling the job, which may not be suitable for jobs with total runtimes of several days or more. For that reason it is preferable for this logging to be implemented in Scrapy itself and enabled by default.
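The subclassing workaround could be sketched roughly as follows. To keep the example self-contained, a placeholder base class is used here instead of the real scrapy.extensions.memusage.MemoryUsage; in an actual project one would subclass the real extension and register the subclass via the EXTENSIONS setting:

```python
import logging

logger = logging.getLogger(__name__)


class MemoryUsage:
    """Placeholder for scrapy.extensions.memusage.MemoryUsage."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes

    def get_virtual_size(self):
        return 256 * 1024 * 1024  # placeholder measurement

    def _check_limit(self):
        return self.get_virtual_size() > self.limit


class LoggingMemoryUsage(MemoryUsage):
    """Hypothetical subclass that logs usage on every periodic check."""

    def _check_limit(self):
        logger.info(
            "memusage: %d bytes used (limit %d)",
            self.get_virtual_size(),
            self.limit,
        )
        return super()._check_limit()
```

Deploying this still means updating project settings and restarting the job, which is the drawback noted above for long-running crawls.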
Additional context
Similar functionality was previously requested in #2173.