Closed #3574

Closed
ghost opened this issue Jan 10, 2019 · 2 comments · Fixed by #4099

Comments


ghost commented Jan 10, 2019

Closed

elacuesta (Member) commented Jan 11, 2019

This happens because ItemLoader.item is an attribute of the loader class (technically a property, as seen in https://github.com/scrapy/scrapy/blob/1.5/scrapy/loader/__init__.py#L46-L51) that is used internally to store data from the add_* methods. Because you override it with a MyItem instance, and that instance is created only once, when the class body is evaluated, every MyLoader instance holds a reference to the same MyItem object:

In [1]: import scrapy 
   ...: from scrapy.loader import ItemLoader 
   ...:  
   ...: class MyItem(scrapy.Item): 
   ...:     a = scrapy.Field() 
   ...:  
   ...: class MyLoader(ItemLoader): 
   ...:     item = MyItem() 
   ...:  
   ...: l1 = MyLoader() 
   ...: l2 = MyLoader() 
   ...: id(l1.item) == id(l2.item)
Out[1]: True
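
The same effect can be reproduced without Scrapy. The following is only a minimal sketch in plain Python: `SketchLoader` and `BrokenLoader` are hypothetical stand-ins, not Scrapy's actual ItemLoader code, and plain dicts take the place of items. It shows how a class-level `item` shadows the inherited property and ends up shared by every instance:

```python
class SketchLoader:
    """Hypothetical stand-in for a loader; NOT Scrapy's ItemLoader."""

    def __init__(self):
        # Each instance gets its own storage.
        self._local_item = {}

    @property
    def item(self):
        return self._local_item


class BrokenLoader(SketchLoader):
    # Shadows the inherited property with a single dict that is created
    # once, when the class body runs, and shared by all instances.
    item = {}


a, b = BrokenLoader(), BrokenLoader()
a.item["a"] = ["b"]          # written through one instance...
shared = a.item is b.item    # ...but both names point at the same object
leaked = b.item              # so the value is visible through the other one

c, d = SketchLoader(), SketchLoader()
separate = c.item is not d.item  # without the override, storage is per instance
```

Here `shared` and `separate` are both `True`, and `leaked` contains the value that was only ever added through `a`.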

I believe what you are looking for is ItemLoader.default_item_class:

In [4]: import scrapy 
   ...: from scrapy.loader import ItemLoader 
   ...:  
   ...: class MyItem(scrapy.Item): 
   ...:     a = scrapy.Field() 
   ...:  
   ...: class MyLoader(ItemLoader): 
   ...:     default_item_class = MyItem 
   ...:  
   ...: l1 = MyLoader() 
   ...: l2 = MyLoader() 
   ...: l1.add_value("a", "b") 
   ...: l2.add_value("a", []) 
   ...: print(l1.load_item()) 
   ...: print(l2.load_item())
{'a': ['b']}
{}
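
The reason a class reference behaves differently from an instance can be sketched as follows. This is an illustration under assumptions, not Scrapy's source: `FactoryLoader` and `ListLoader` are hypothetical names, and built-in containers stand in for item classes. The idea is that the loader calls the class once per loader instance, so each loader starts with a fresh object:

```python
class FactoryLoader:
    """Hypothetical stand-in for a loader; NOT Scrapy's ItemLoader."""

    # A class (a callable that builds items), not an already-built item.
    default_item_class = dict

    def __init__(self, item=None):
        # Build a fresh item for this loader unless the caller supplies one.
        self.item = self.default_item_class() if item is None else item


class ListLoader(FactoryLoader):
    default_item_class = list  # still a class, so nothing is shared


first, second = ListLoader(), ListLoader()
first.item.append("b")       # only the first loader's item changes

preset = {"a": 1}
explicit = FactoryLoader(item=preset)  # caller-provided item is used as-is
```

After this runs, `first.item` holds `"b"` while `second.item` is still empty, which mirrors the two different outputs printed above.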

Gallaecio (Member) commented Oct 24, 2019

@hibpm Do you think the documentation changes proposed in #4099 would have prevented you from redefining ItemLoader.item in the first place? Do you remember what in the existing documentation led you to define item in your item loader subclass?

ghost changed the title from "ItemLoader adds wrong value instead of an empty array" to "Closed" on May 1, 2020