
defer.inlineCallbacks in spider ? #4263

Closed
poupryc opened this issue Jan 1, 2020 · 3 comments

poupryc commented Jan 1, 2020

Hi,

I'm trying to use Twisted's inlineCallbacks feature to run the following code:

# -*- coding: utf-8 -*-
import scrapy
from twisted.internet.defer import inlineCallbacks


class PagesSpider(scrapy.spiders.SitemapSpider):
    name = 'pages'
    allowed_domains = ['thing.com']
    sitemap_follow = [r'sitemap_page']

    def __init__(self, site=None, *args, **kwargs):
        super(PagesSpider, self).__init__(*args, **kwargs)

    @inlineCallbacks
    def parse(self, response):
        # things
        response = yield scrapy.Request("https://google.com")
        # Twisted executes the request and resumes the generator here with the response
        print(response.text)

Is this possible? I'm trying to use this so I can do without the inline-requests module.

Thanks
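For context, the mechanism that inlineCallbacks relies on can be illustrated with a small pure-Python stand-in. This is not Twisted code: run_inline and fake_download are hypothetical names, sketched only to show how a driver resumes the decorated generator by sending each result back into it at the yield point.

```python
# Minimal illustration of the generator-driving idea behind
# twisted.internet.defer.inlineCallbacks. NOT Twisted code:
# run_inline and fake_download are hypothetical stand-ins.

def fake_download(url):
    # Stands in for an asynchronous download that eventually produces a result.
    return f"<html>content of {url}</html>"

def run_inline(gen):
    # Drive the generator: each yielded "request" is resolved and its
    # result is sent back into the generator, resuming it at the yield.
    result = None
    try:
        while True:
            url = gen.send(result)       # resume generator, get next request
            result = fake_download(url)  # "perform" the request
    except StopIteration as stop:
        return stop.value                # generator returned: final value

def parse():
    response = yield "https://google.com"
    # the driver resumes the generator here with the response
    return response

print(run_inline(parse()))
```

In real Twisted, the driver waits on a Deferred between steps instead of resolving the result synchronously, but the resume-with-result flow is the same.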


poupryc commented Jan 2, 2020

After reading the code, I think I've found the solution. Perhaps documenting it would be beneficial?

# -*- coding: utf-8 -*-
import scrapy
from twisted.internet.defer import inlineCallbacks


class PagesSpider(scrapy.spiders.SitemapSpider):
    name = 'pages'
    allowed_domains = ['thing.com']
    sitemap_follow = [r'sitemap_page']

    def __init__(self, site=None, *args, **kwargs):
        super(PagesSpider, self).__init__(*args, **kwargs)

    @inlineCallbacks
    def parse(self, response):
        # things
        request = scrapy.Request("https://google.com")
        response = yield self.crawler.engine.download(request, self)
        # Twisted executes the request and resumes the generator here with the response
        print(response.text)

It's a little verbose, but it works. Correct me if I'm wrong.

elacuesta (Member) commented

Relevant: #542 (comment). Not sure if we want to document these ExecutionEngine methods though. Any thoughts @dangra?

If, on the other hand, this issue is motivated by the above one (#3500), the "Scrapy way" of achieving such a result would be "Passing additional data to callback functions".

wRAR (Member) commented Oct 29, 2023

The current way to do the same is to declare the callback as async def, which is already documented. I don't want us to describe any inlineCallbacks uses where alternatives exist, so I'm closing this.
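For reference, the coroutine shape this refers to looks roughly like the sketch below. It uses a hypothetical stub downloader in place of the real engine call so it runs standalone; in actual Scrapy code (2.6+) the awaited expression would be maybe_deferred_to_future(self.crawler.engine.download(request)) from scrapy.utils.defer.

```python
# Sketch of the "async def callback" shape. In real Scrapy code the
# awaited call would be maybe_deferred_to_future(self.crawler.engine.download(...));
# stub_download is a hypothetical stand-in so the flow runs standalone.
import asyncio

async def stub_download(url):
    # Stands in for the engine download; returns a fake response body.
    return f"body of {url}"

class PagesSpider:
    name = "pages"

    async def parse(self, response):
        # In Scrapy this would be:
        #   request = scrapy.Request("https://google.com")
        #   extra = await maybe_deferred_to_future(self.crawler.engine.download(request))
        extra = await stub_download("https://google.com")
        return extra

print(asyncio.run(PagesSpider().parse(response=None)))
```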

wRAR closed this as completed Oct 29, 2023