Logging level won't work #2149
Version: [...]
@RustJason, how are you running your spider?
scrapyd and [...]
We're running into the same problem using scrapyd with [...]. We see loads of junk like: [...]
Hi @dan-blanchard,
I don't understand. Also, Scrapy does not use urllib3 as far as I know, nor does scrapyd.
Sorry, mistyped a couple of things there. We're using [...]. I know the logging messages are not coming from parts of Scrapy, but the root handler that's being created must have its logging level set to [...].
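As an aside (not from this thread), a stdlib-only diagnostic sketch for checking that claim: printing the root logger's handlers and their levels from inside a spider or a standalone script, after Scrapy or scrapyd has configured logging, shows what actually filters records at emit time.

import logging

# Diagnostic sketch: dump the root logger's level and each attached handler's
# level; a handler at NOTSET (0) emits every record that propagates to it.
root = logging.getLogger()
print('root logger level:', logging.getLevelName(root.level))
for handler in root.handlers:
    print(type(handler).__name__, 'level:', logging.getLevelName(handler.level))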
@dan-blanchard, can you detail how you're setting logging levels for your crawls? I'm a total noob regarding Python logging, so I don't think I can help fix this in a reasonable time, but if you can build a reproducible example of this behavior, someone could jump in to help.
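For reference (my addition, not part of the thread): in a regular Scrapy project the level is usually set via the LOG_LEVEL setting or the -L/--loglevel command-line option, for example:

# settings.py (sketch): controls the root handler Scrapy installs for `scrapy crawl`
LOG_LEVEL = 'ERROR'
LOG_FILE = 'scrapy.log'   # optional: write the log to a file instead of stderr

# equivalent one-off override on the command line:
#   scrapy crawl example -L ERROR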
@redapple I apologize for the confusion, this was entirely our fault. What happened was that we were running scrapyd via supervisord, and we had logging enabled via supervisord, which I was not aware of.
Thanks for the heads-up, @dan-blanchard!
The level setting in logging.basicConfig cannot work correctly. No matter what you set, the spider will always write DEBUG-level logs to the file.
@Jack-Kingdom, how are you running your spider? A standalone script, or with [...]? If you're using a standalone script, are you using [...]? With the following script:

# -*- coding: utf-8 -*-
import logging
from twisted.internet import reactor
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
class ExampleSpider(scrapy.Spider):
    name = 'example'
    allowed_domains = ['example.com']
    start_urls = ['http://example.com/']

    def parse(self, response):
        self.logger.debug('... inside parse(%r) ...' % response)
        self.logger.error('some FAKE error!')
        yield {'url': response.url}
# commenting out this next line from the docs example,
# otherwise scrapy.utils.log.DEFAULT_LOGGING is used, with the 'scrapy' logger level set to DEBUG
#configure_logging(install_root_handler=False)
logging.basicConfig(
    filename='log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.ERROR
)
runner = CrawlerRunner()
d = runner.crawl(ExampleSpider)
d.addBoth(lambda _: reactor.stop())
reactor.run()  # the script will block here until the crawling is finished

I believe the example in the docs is wrong about calling [...].
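For completeness (my own sketch based on the documented configure_logging()/Settings API, not something proposed in this comment): in a standalone script one can also skip logging.basicConfig entirely and let Scrapy install a root handler driven by settings, in which case the handler itself is set to the requested level and filters out the DEBUG records.

from scrapy.settings import Settings
from scrapy.utils.log import configure_logging

# Sketch: configure_logging() installs a root handler built from these settings;
# that handler's level (ERROR) filters records even though the 'scrapy' logger
# itself is left at DEBUG by DEFAULT_LOGGING.
configure_logging(Settings({
    'LOG_FILE': 'log.txt',
    'LOG_FORMAT': '%(levelname)s: %(message)s',
    'LOG_LEVEL': 'ERROR',
}))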
I am by no means a logging expert, but I recently read up on logging and believe @redapple is correct. What configure_logging() does is essentially call:

dictConfig(DEFAULT_LOGGING)

where DEFAULT_LOGGING is:

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        'scrapy': {
            'level': 'DEBUG',
        },
        'twisted': {
            'level': 'ERROR',
        },
    }
}
All loggers (such as scrapy.core.scraper, or even just scrapy) are child loggers of the root logger. This means that when Scrapy logs something, the record is passed up to the root logger, since the default level of the child logger 'scrapy' is DEBUG. The root logger's handler has the level NOTSET, so it processes all records and emits them to the log output. You can read more about how this works in this flowchart: https://docs.python.org/3/howto/logging.html#logging-flow

Using both configure_logging() and logging.basicConfig() thus does not make sense. As such, one should preferably either call [...].

The answer is thus that the documentation for Scrapy needs to be improved here: [...]
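Tying this back to the script above: if one wants to keep the configure_logging() call from the docs example, a workaround sketch (mine, following the propagation logic just described) is to raise the 'scrapy' logger's level explicitly, so DEBUG records are dropped where they are created rather than relying on the root handler's level.

import logging
from scrapy.utils.log import configure_logging

configure_logging(install_root_handler=False)  # applies DEFAULT_LOGGING: 'scrapy' stays at DEBUG
logging.basicConfig(
    filename='log.txt',
    format='%(levelname)s: %(message)s',
    level=logging.ERROR,
)
# basicConfig's level is applied to the root logger, but records created by
# 'scrapy.*' loggers (effective level DEBUG) still propagate to the root
# handler, whose own level is NOTSET. Raising the 'scrapy' level filters them.
logging.getLogger('scrapy').setLevel(logging.ERROR)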
Using the code in the doc: [...]

But in the file scrapy.log, I can still see INFO, DEBUG, etc. I did not specify any log level in my settings.py. Which part could be wrong?

Thanks