
adding spider args as attrs to docs #1719

Closed
wants to merge 1 commit into from
14 changes: 13 additions & 1 deletion docs/topics/spiders.rst
@@ -283,7 +283,19 @@ Spider arguments are passed through the :command:`crawl` command using the

scrapy crawl myspider -a category=electronics

Spiders receive arguments in their constructors::
Spider arguments are exposed as attributes::
Member

Hey @demelziraptor!
This is about documenting the feature of spider arguments being exposed as attributes, right? (this line: https://github.com/scrapy/scrapy/blob/master/scrapy/spiders/__init__.py#L30)

From the example it isn't clear what this is trying to show.

Author

Exactly @eliasdorneles, from the current docs it seems like the only way to access spider input arguments is through the constructor.

Maybe this example is better?

    import scrapy

    class MySpider(scrapy.Spider):
        name = 'myspider'

        def parse(self, response):
            products = response.css('.%s' % self.category)
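For context, the attribute exposure being documented comes from the base ``Spider`` constructor, which copies keyword arguments onto the instance (the line linked above does ``self.__dict__.update(kwargs)``). A minimal plain-Python sketch of that behaviour, using hypothetical class names rather than Scrapy's actual code:

```python
# Hypothetical sketch of how -a arguments become attributes: the base
# constructor copies keyword arguments straight onto the instance.
class BaseSpider:
    def __init__(self, name=None, **kwargs):
        if name is not None:
            self.name = name
        self.__dict__.update(kwargs)

# `scrapy crawl myspider -a category=electronics` ends up roughly as:
spider = BaseSpider(name='myspider', category='electronics')
print(spider.category)  # electronics
```

This is why ``self.category`` is usable in ``parse()`` without any explicit assignment in the spider's own ``__init__``.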

Member

Hm, I think it'll need more than an example.
Looking at the example alone it looks like broken code. :)

Perhaps adding a default as a class attribute, like:

    import scrapy

    class MySpider(scrapy.Spider):
        name = 'myspider'
        category = 'default-category'  # use -a category=another from the cmdline to override this

        def parse(self, response):
            products = response.css('.%s' % self.category)

This may deserve its own section in the docs really. :)
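A hedged plain-Python sketch of why the class-attribute default works (hypothetical class names; the point is that an instance attribute set from a ``-a`` argument shadows the class attribute):

```python
# Hypothetical sketch: a class attribute acts as the default, and a keyword
# argument (as -a would pass on the command line) shadows it per instance.
class BaseSpider:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

class MySpider(BaseSpider):
    category = 'default-category'  # class-level default

default = MySpider()                           # no -a argument given
overridden = MySpider(category='electronics')  # -a category=electronics
print(default.category, overridden.category)
```

With a default in place, the snippet no longer looks like broken code when run without arguments.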

Contributor

I would suggest adding a spider that can be copied, run with runspider, and produce a result; something somewhat interactive. It doesn't need to browse; it can do some comparison and close, maybe using start_requests or adding a signal.


    import scrapy

    class MySpider(scrapy.Spider):
        name = 'myspider'

        def __init__(self, *args, **kwargs):
            super(MySpider, self).__init__(*args, **kwargs)
            self.start_urls = ['http://www.example.com/categories/%s' % self.category]
            # ...
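The constructor pattern above can be sketched in plain Python (hypothetical class names; the base class stands in for Scrapy's ``Spider``, which stores ``-a`` keyword arguments as attributes before the subclass constructor runs):

```python
# Hypothetical sketch of the constructor pattern: the base class stores -a
# keyword arguments as attributes first, and the subclass constructor then
# derives start_urls from them.
class BaseSpider:
    def __init__(self, *args, **kwargs):
        self.__dict__.update(kwargs)

class MySpider(BaseSpider):
    name = 'myspider'

    def __init__(self, *args, **kwargs):
        super(MySpider, self).__init__(*args, **kwargs)
        self.start_urls = ['http://www.example.com/categories/%s' % self.category]

spider = MySpider(category='electronics')
print(spider.start_urls)
```

Note the super() call must come first, so ``self.category`` exists by the time ``start_urls`` is built.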

And arguments are also received in spider constructors::

import scrapy
