Feature Request: Item class field access and item creation using this #2749

Manslow opened this issue May 18, 2017 · 5 comments



Manslow commented May 18, 2017

Items provide a simple way to structure data. However, as items are currently implemented, you can only populate an item by referencing a field with a string that matches the field's name, e.g. item['field_name'] = value. This becomes a problem when you want to rename a field after creation, because every string reference has to be found and updated by hand. It also requires that you remember the names of the fields defined for an item.

I think it would be nice to be able to populate items using syntax like the following:

item = Item()
item[Item.field] = value

In this way, you can use code completion in an IDE to be sure that you are referencing a field the Item class has defined and can leverage refactoring to change the names of fields across your entire project wherever Item.field is referenced.

Perhaps there is a good reason why this option was discarded during development but it isn't obvious to me why.

redapple commented May 26, 2017

Hi @Manslow,
are you suggesting being able to get or set an item value by a scrapy.item.Field index in addition to a string index?
I'm not a user of scrapy.Item myself, but this looks doable by updating its __getitem__ and __setitem__ methods.
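
For illustration, here is a minimal, self-contained sketch of what field-keyed access could look like. It does not use Scrapy: the Field, SketchItem, and Product classes below are hypothetical stand-ins, and the sketch assumes that a field object can learn its own name through Python's __set_name__ hook. How this would map onto scrapy.Item internals is not shown.

```python
class Field:
    """Hypothetical field marker that remembers the name it was declared under."""

    def __set_name__(self, owner, name):
        # Python calls this once, when the owning class body is executed.
        self.name = name


class SketchItem:
    """Hypothetical dict-like container; a stand-in, not scrapy.Item."""

    def __init__(self):
        self._values = {}

    def _resolve(self, key):
        # Accept either a Field object or a plain string key.
        name = key.name if isinstance(key, Field) else key
        declared = getattr(type(self), name, None) if isinstance(name, str) else None
        if not isinstance(declared, Field):
            raise KeyError(f"{type(self).__name__} does not support field: {name!r}")
        return name

    def __setitem__(self, key, value):
        self._values[self._resolve(key)] = value

    def __getitem__(self, key):
        return self._values[self._resolve(key)]


class Product(SketchItem):
    title = Field()
    price = Field()


item = Product()
item[Product.title] = "Widget"  # Field index: visible to IDE completion and refactoring
item["price"] = 9.99            # the string index keeps working
```

Because both key styles resolve to the same stored name, item["title"] and item[Product.title] read the same value, so existing string-based code would not need to change.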

anaisabel7 commented Jun 26, 2017

Hi, I am looking into this, and while it can be done, I was wondering whether a simpler solution would achieve the same aims. I am not familiar with the "use code completion in an IDE" workflow mentioned by @Manslow.
Would it help if the field could be populated and retrieved with dot notation rather than only dict notation? So:

item = Item()
item.field = value
item['another_field'] = another_value

Please, let me know if this would work for your aims.
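
To make the dot-notation idea concrete, here is a small standalone sketch. It is not based on Scrapy's code: AttrItem, its fields tuple, and the Book subclass are hypothetical names, and the sketch assumes the set of allowed field names is known up front so that misspelled attributes can be rejected.

```python
class AttrItem:
    """Hypothetical container that allows both item.field and item['field']."""

    fields = ()  # assumption: subclasses list their allowed field names here

    def __init__(self):
        # Bypass our own __setattr__ for the internal storage dict.
        object.__setattr__(self, "_values", {})

    def __setitem__(self, key, value):
        if key not in self.fields:
            raise KeyError(f"{type(self).__name__} does not support field: {key!r}")
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name in type(self).fields and name in self._values:
            return self._values[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self.fields:
            raise AttributeError(f"{type(self).__name__} has no field {name!r}")
        self[name] = value


class Book(AttrItem):
    fields = ("title", "author")


item = Book()
item.title = "Dune"               # dot notation
item["author"] = "Frank Herbert"  # dict notation
```

Rejecting unknown attribute names in __setattr__ keeps a typo such as item.titel from silently creating a new attribute, which is one of the risks of allowing attribute-style assignment.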

kmike commented Jun 26, 2017

I'm not opposed to adding support to Scrapy so that objects created with the @attr.s decorator can be used as Scrapy items. attrs is already a Scrapy dependency (via twisted -> automat). It solves the validation problem, allows adding data type converters and preprocessors, and requires item.field-style access. It is also possible to attach metadata to fields, like with Scrapy items, though Scrapy currently wants Item.field metadata.

The simplest implementation could be the following: check whether an object has a .todict() method; call it if it is present; use the resulting dict as a Scrapy item. Maybe we can also provide a scrapy.BetterItem based on attrs; the missing parts are object_ref inheritance, support for field metadata, and introspection support (getting a list of fields).
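
The duck-typing check described above could be sketched roughly as follows. This is a standalone illustration, not Scrapy code: the to_item_dict helper and the Quote class are hypothetical, and a plain class stands in for an attrs-decorated one so the example has no third-party dependencies.

```python
def to_item_dict(obj):
    """Return a plain dict for obj, calling obj.todict() when it is available.

    Hypothetical helper illustrating the proposed check; not a Scrapy API.
    """
    if isinstance(obj, dict):
        return dict(obj)
    todict = getattr(obj, "todict", None)
    if callable(todict):
        return todict()
    raise TypeError(f"cannot convert {type(obj).__name__} to an item dict")


class Quote:
    """Stand-in for an attrs-style class with attribute access."""

    def __init__(self, text, author):
        self.text = text
        self.author = author

    def todict(self):
        return {"text": self.text, "author": self.author}


quote = Quote(text="Simple is better than complex.", author="Tim Peters")
quote.author = "Tim Peters (PEP 20)"  # item.field-style access
exported = to_item_dict(quote)
```

With this approach the item class itself stays free of any framework base class; only the conversion step needs to know about the todict() convention.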

anaisabel7 commented Jun 30, 2017

I attempted an implementation in the PR above. With that implementation, certain arguments need to be passed on to the @attr.s decorator. I couldn't find a way to avoid that, but would be happy to learn of one, since it seems inconvenient and leaves the proper usage of Scrapy Items with the decorator up to the user.

kmike mentioned this issue Feb 20, 2018

Gallaecio commented Jun 14, 2020

I believe #3881 has implemented this by supporting dataclasses and attr.s as items.
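
As a rough illustration of why dataclasses address the original request, the sketch below uses only the standard library. The Product class and the "serializer" metadata key are assumptions made for the example; how Scrapy itself consumes such objects is not shown here.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import Optional


@dataclass
class Product:
    name: str
    # Per-field metadata; the "serializer" key is only an illustrative choice.
    price: Optional[float] = field(default=None, metadata={"serializer": str})


product = Product(name="Widget")
product.price = 9.99  # attribute access, so IDE completion and rename refactoring apply

# Introspection: list the declared fields and read their metadata.
field_names = [f.name for f in fields(Product)]
price_meta = dict(fields(Product)[1].metadata)

# Conversion to a plain dict for export.
as_dict = asdict(product)
```

Field names are declared once in the class body, so a rename in an IDE propagates to every product.price reference, and a misspelled keyword argument to Product(...) fails with a TypeError at construction time.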
