
Command parse unhandled error :AttributeError: 'NoneType' object has no attribute 'start_requests' #3264

Closed
wangrenlei opened this issue May 18, 2018 · 4 comments · Fixed by #5497


@wangrenlei

wangrenlei commented May 18, 2018

Scrapy version: 1.5.0
When I run the command scrapy parse http://www.baidu.com and no spider matches the URL www.baidu.com, I get this error:

2018-03-11 16:23:35 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: DouTu)
2018-03-11 16:23:35 [scrapy.utils.log] INFO: Versions: lxml 4.2.1.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.9.0, Python 2.7.12 (default, Dec 4 2017, 14:50:18) - [GCC 5.4.0 20160609], pyOpenSSL 17.5.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.2.2, Platform Linux-4.13.0-38-generic-x86_64-with-Ubuntu-16.04-xenial
2018-05-18 16:23:35 [scrapy.commands.parse] ERROR: Unable to find spider for: http://www.baidu.com
Traceback (most recent call last):
  File "/home/wangsir/code/sourceWorkSpace/scrapy/cmdline.py", line 239, in
    execute(['scrapy','parse','http://www.baidu.com'])
  File "/home/wangsir/code/sourceWorkSpace/scrapy/cmdline.py", line 168, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/wangsir/code/sourceWorkSpace/scrapy/cmdline.py", line 98, in _run_print_help
    func(*a, **kw)
  File "/home/wangsir/code/sourceWorkSpace/scrapy/cmdline.py", line 176, in _run_command
    cmd.run(args, opts)
  File "/home/wangsir/code/sourceWorkSpace/scrapy/commands/parse.py", line 250, in run
    self.set_spidercls(url, opts)
  File "/home/wangsir/code/sourceWorkSpace/scrapy/commands/parse.py", line 151, in set_spidercls
    self.spidercls.start_requests = _start_requests
AttributeError: 'NoneType' object has no attribute 'start_requests'.

The failure comes from the following line (scrapy/commands/parse.py, line 151):
self.spidercls.start_requests = _start_requests
Because no spider matches the URL www.baidu.com, self.spidercls is None, so assigning to self.spidercls.start_requests raises the error.
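
The failure mode and one possible guard can be sketched in isolation. This is only a minimal illustration, not the actual Scrapy code or the eventual fix: the function set_spidercls below is a standalone stand-in for the method of the same name, FakeSpider is a hypothetical placeholder class, and the patched start_requests simply yields the URL instead of building a real request.

```python
import logging

logger = logging.getLogger(__name__)


class FakeSpider:
    """Hypothetical stand-in for a spider class; not part of Scrapy."""
    name = "fake"


def set_spidercls(spidercls, url):
    """Patch start_requests onto spidercls only when a spider was found.

    Returns True if the patch was applied, False if spidercls is None.
    """
    if spidercls is None:
        # Without this guard, the assignment below raises
        # AttributeError: 'NoneType' object has no attribute 'start_requests'
        logger.error("Unable to find spider for: %s", url)
        return False

    def _start_requests(self):
        # Simplified: yield the URL itself rather than a Request object.
        yield url

    spidercls.start_requests = _start_requests
    return True


if __name__ == "__main__":
    print(set_spidercls(None, "http://www.baidu.com"))
    print(set_spidercls(FakeSpider, "http://www.baidu.com"))
```

With the guard in place, the unmatched case logs the "Unable to find spider" message and returns early instead of reaching the attribute assignment.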

@Congee

Congee commented Jul 11, 2018

For those who need a quick-and-dirty workaround, specify the spider explicitly with --spider=NAME_OF_MY_SPIDER

@wRAR
Member

wRAR commented Jan 21, 2022

The original PR, #3265, looks correct to me; someone just needs to create a new one, import the commit from that one, and publish it.

@PushanAgrawal

Hello, I think I can help with this issue if someone could give me a little insight.

@wRAR
Member

wRAR commented Feb 10, 2022

Hi @PushanAgrawal, as you can see above your comment, there is already a PR open for it.

andreastziortz added a commit to AngelikiBoura/scrapy that referenced this issue May 6, 2022
Changes
Implementation:
- Check whether the Spider exists or is None; if it is None, skip execution of start_requests() for the nonexistent Spider
Testing:
- Add a test case with an invalid URL inside test_command_parse
  The test proves that a non-matched Spider does not throw an AttributeError
kmike added a commit that referenced this issue May 28, 2022
…_unhandled_error

Issue #3264, fix error handling when spider is not matched