
custom_settings isn't reflected in the "Overridden settings:" log message #1343

Closed
canhduong28 opened this issue Jul 7, 2015 · 12 comments · Fixed by #2894

@canhduong28 (Author)

Hi,

I'd like to report that custom_settings per spider does not work properly. The global settings are not updated when I add custom_settings to a spider, and unit tests for this are missing too.

Regards,
Canh

@kmike (Member) commented Jul 7, 2015

Do you have an example? Which settings don't work?

@canhduong28 (Author)

In my project's settings.py I set DOWNLOAD_TIMEOUT = 60, and in simple_spider.py:

import scrapy
from scrapy import Request


class SimpleSpider(scrapy.Spider):
    name = 'simple'
    custom_settings = {
        'DOWNLOAD_TIMEOUT': 30,
        'CONCURRENT_REQUESTS_PER_DOMAIN': 4,
    }

    def start_requests(self):
        # callback must be a callable, not the string 'parse'
        yield Request('http://www.google.com/', callback=self.parse)

    def parse(self, response):
        pass

When I ran the spider, the Scrapy log said INFO: Overridden settings: {'DOWNLOAD_TIMEOUT': 60}. It looks like DOWNLOAD_TIMEOUT was not updated.

@curita (Member) commented Jul 9, 2015

Hi @nautilus28! That log message gets printed before the settings are updated with custom_settings :/ You can verify this by adding this print to the parse method in your spider:

def parse(self, response):
    ...
    print('DOWNLOAD_TIMEOUT: {}'.format(self.settings.getint('DOWNLOAD_TIMEOUT')))
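For a quick check outside a running crawl, here's a minimal sketch that applies custom_settings the same way the crawler does, via the Spider.update_settings classmethod (the myproject import path is hypothetical):

from scrapy.utils.project import get_project_settings
from myproject.spiders.simple_spider import SimpleSpider  # hypothetical path

settings = get_project_settings()
# merges SimpleSpider.custom_settings into the project settings at 'spider' priority
SimpleSpider.update_settings(settings)
print(settings.getint('DOWNLOAD_TIMEOUT'))  # expected: 30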

@curita (Member) commented Jul 9, 2015

If that's not it, make sure you're using Scrapy 1.0.

@barraponto (Contributor)

Shouldn't Scrapy print that message later, then?

@curita (Member) commented Jul 10, 2015

Yes, I agree. I'll update the issue title to reflect the actual problem.

curita changed the title: custom_settings does not work → Settings in custom_settings aren't reflected in the "Overridden settings:" log message (Jul 10, 2015)
curita changed the title: Settings in custom_settings aren't reflected in the "Overridden settings:" log message → custom_settings isn't reflected in the "Overridden settings:" log message (Jul 10, 2015)
@nramirezuy (Contributor)

@curita I solved this by just using an Extension, maybe it helps 😄
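A minimal sketch of such an extension, connected to the spider_opened signal; the class name is made up, and it would be enabled through the EXTENSIONS setting:

from scrapy import signals

class LogEffectiveSettings(object):
    # hypothetical extension; enable with
    # EXTENSIONS = {'myproject.extensions.LogEffectiveSettings': 500}
    @classmethod
    def from_crawler(cls, crawler):
        ext = cls()
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        return ext

    def spider_opened(self, spider):
        # by the time spider_opened fires, custom_settings have been merged
        spider.logger.info('effective DOWNLOAD_TIMEOUT: %s',
                           spider.settings.getint('DOWNLOAD_TIMEOUT'))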

@canhduong28 (Author)

I put CONCURRENT_REQUESTS_PER_DOMAIN = 2 in custom_settings, but it isn't shown in the log message. How can I make sure that custom_settings really works?

@ngoanhtan

@nautilus28: just check the debug log. There should always be only 1-2 requests at the same time. Or you can override the open_spider/close_spider methods to count them.
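One way to do that counting is with a small downloader middleware; this is a sketch (the ConcurrencyProbe name is made up, and it would be enabled via the DOWNLOADER_MIDDLEWARES setting):

from collections import defaultdict
from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse

class ConcurrencyProbe(object):
    # hypothetical middleware; logs how many requests are in flight per domain
    def __init__(self):
        self.in_flight = defaultdict(int)

    def process_request(self, request, spider):
        domain = urlparse(request.url).netloc
        self.in_flight[domain] += 1
        spider.logger.debug('%s: %d in flight', domain, self.in_flight[domain])

    def process_response(self, request, response, spider):
        self.in_flight[urlparse(request.url).netloc] -= 1
        return response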

@canhduong28 (Author)

@ngoanhtan the CONCURRENT_REQUESTS_PER_DOMAIN setting isn't shown in the debug log, and overriding open_spider/close_spider doesn't help.

@jdemaeyer (Contributor)

This doesn't solve the underlying issue, but you can issue your own log entry from within the spider if you really need to see the settings value:

self.logger.debug('Concurrent requests per domain: %d', self.settings['CONCURRENT_REQUESTS_PER_DOMAIN'])

You could place it in the Spider's closed method if it's sufficient to log it after the crawling process, or in the parse method (though, depending on your spider, it will probably be logged multiple times in that case).
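For instance, a sketch using the spider's closed hook, which Scrapy calls once when the spider finishes:

def closed(self, reason):
    # runs after the crawl; custom_settings were merged long before this point
    self.logger.debug('Concurrent requests per domain: %d',
                      self.settings.getint('CONCURRENT_REQUESTS_PER_DOMAIN'))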
