defer.inlineCallbacks in spider ? #4263
Comments
After reading the code, I think I've found the solution. Perhaps documenting it would be beneficial?

# -*- coding: utf-8 -*-
import scrapy
import re
from twisted.internet.defer import inlineCallbacks
from sherlock import utils, items, regex


class PagesSpider(scrapy.spiders.SitemapSpider):
    name = 'pages'
    allowed_domains = ['thing.com']
    sitemap_follow = [r'sitemap_page']

    def __init__(self, site=None, *args, **kwargs):
        super(PagesSpider, self).__init__(*args, **kwargs)

    @inlineCallbacks
    def parse(self, response):
        # things
        request = scrapy.Request("https://google.com")
        response = yield self.crawler.engine.download(request, self)
        # Twisted executes the request and resumes the generator here with the response
        print(response.text)

It's a little verbose, but it works. Correct me if I'm wrong.
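For readers unfamiliar with the "resume the generator" behaviour mentioned in the comment in the snippet above, here is a minimal standard-library sketch of the idea. It is not Twisted's implementation: `FakeDeferred`, `inline_callbacks`, `fake_download` and `pending_downloads` are hypothetical names made up for illustration, and error handling is left out.

```python
import functools


class FakeDeferred:
    """Hypothetical, heavily simplified stand-in for a Twisted Deferred."""

    def __init__(self):
        self.fired = False
        self.result = None
        self._callbacks = []

    def add_callback(self, fn):
        # Run immediately if the result is already known, otherwise queue.
        if self.fired:
            fn(self.result)
        else:
            self._callbacks.append(fn)

    def fire(self, result):
        self.fired = True
        self.result = result
        for fn in self._callbacks:
            fn(result)


def inline_callbacks(gen_func):
    """Toy decorator: drive a generator that yields FakeDeferred objects."""

    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        done = FakeDeferred()

        def step(value):
            try:
                # Resume the generator; `value` becomes the result of `yield`.
                pending = gen.send(value)
            except StopIteration as stop:
                # The generator returned: pass its return value on.
                done.fire(stop.value)
                return
            # Suspend until the yielded FakeDeferred fires, then step again.
            pending.add_callback(step)

        step(None)  # start the generator and run it up to the first yield
        return done

    return wrapper


# Hypothetical "downloader": remembers pending requests instead of doing I/O.
pending_downloads = []


def fake_download(url):
    d = FakeDeferred()
    pending_downloads.append((url, d))
    return d


@inline_callbacks
def parse(log):
    log.append("before yield")
    body = yield fake_download("https://example.com")
    log.append("resumed with " + body)
    return len(body)


log = []
outcome = parse(log)  # runs until the yield, then waits

# Later, the "download" completes and the generator is resumed.
url, deferred = pending_downloads.pop()
deferred.fire("<html>hello</html>")
```

The point of the sketch is that `parse` is suspended at the `yield` and only continues once the yielded object fires, which is why the code after the `yield` in the spider above can use the downloaded response directly.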
Relevant: #542 (comment). Not sure if we want to document these.
If, on the other hand, this issue is motivated by the above one (#3500), the "Scrapy way" of achieving such a result would be "Passing additional data to callback functions".
The current way to do the same is to declare the callback
Hi,
I'm trying to use this feature of Twisted to run the following code:
Is this possible? I'm trying to use this to dispense with the inline-request module.
Thanks