Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG +1] LogCounterHandler is never removed from root handlers list, fix that #1259

Merged

Conversation

@chekunkov
Copy link
Contributor

@chekunkov chekunkov commented May 27, 2015

lambda is garbage collected and because receiver is added as weak reference by default - when signals.engine_stopped is fired logging.root.removeHandler is not executed. Fixed that by assigning lambda to a private argument and not by using connect(..., weak=False) because I believe this lambda function should be collected with crawler object and not stay in dispatcher.connections forever.

Tested in ScrapyRT where multiple crawls are running in the same process. Before the fix:

(Pdb) logging.root.handlers
[<scrapy.utils.log.LogCounterHandler object at 0x7f93caafca10>, <scrapy.utils.log.LogCounterHandler object at 0x7f93ca0ee690>, <scrapy.utils.log.LogCounterHandler object at 0x7f93c8048110>, <scrapy.utils.log.LogCounterHandler object at 0x7f93c806ba10>, <logging.FileHandler object at 0x7f93c38cce90>, <scrapy.utils.log.LogCounterHandler object at 0x7f93c38dc310>]

After the fix:

(Pdb) logging.root.handlers
[<logging.FileHandler object at 0x7f603536ce50>, <scrapy.utils.log.LogCounterHandler object at 0x7f60352c1510>]
lambda is garbage collected and because receiver is added as weak reference by default - when signals.engine_stopped is fired logging.root.removeHandler is not executed. Fixed that by assigning lambda to a private argument and not by using connect(..., weak=False) because I belive this lambda function should be collected with crawler object
signals.engine_stopped)
# lambda is assigned to Crawler attribute because this way it is not
# garbage collected after leaving __init__ scope
self.__remove_handler = lambda: logging.root.removeHandler(handler)

This comment has been minimized.

@kmike

kmike May 27, 2015
Member

Sounds reasonable. We can also call it '_on_engine_stopped'. Another option is to create a real '_on_engine_stopped' method and keep the reference to a handler: self.loghandler = LogCounterHandler(..). But your method also looks fine.

This comment has been minimized.

@nramirezuy

nramirezuy May 27, 2015
Contributor

has to be an attribute? can't be just a variable ?
_remove_handler = lambda: logging.root.removeHandler(handler)

This comment has been minimized.

@kmike

kmike May 27, 2015
Member

it can't - the point of this change is to store a reference to lambda

@kmike kmike changed the title LogCounterHandler is never removed from root handlers list, fix that [MRG +1] LogCounterHandler is never removed from root handlers list, fix that May 27, 2015
curita added a commit that referenced this pull request May 27, 2015
…removed

[MRG +1] LogCounterHandler is never removed from root handlers list, fix that
@curita curita merged commit 62a6eff into scrapy:master May 27, 2015
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.