
ItemLoader: zeros as field values #2498

Closed
medse opened this issue Jan 15, 2017 · 4 comments


medse commented Jan 15, 2017

I'm scraping an e-shop. There's an XPath expression for an item's price, which matches nothing if the item is out of stock.
I use ItemLoader and add_xpath():

item.add_xpath('price', './/span[@class="price rub"]/text()')

I want to set the price to 0.0 if it is missing, so I handle the empty case in the price_in declaration inside my ItemLoader subclass:

price_in = Compose(TakeFirst(), lambda v: float(v) if v else 0)

But the value of zero isn't stored in the _values dict because of the following code in _add_value() in scrapy/loader/__init__.py:

    def _add_value(self, field_name, value):
        value = arg_to_iter(value)
        processed_value = self._process_input_value(field_name, value)
        if processed_value:
            self._values[field_name] += arg_to_iter(processed_value)

I don't know why the logic is like this, but it stores neither zeros nor empty strings. Is this intended? I've only been using Scrapy for about a week, so I don't know the usage conventions, but this seems strange to me.
Maybe change the condition to "processed_value is not None"?
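The difference between the two conditions can be sketched outside Scrapy. The helpers below are illustrative stand-ins (not Scrapy's actual code) that mimic the `if processed_value:` check quoted above and the proposed `is not None` fix:

```python
from collections import defaultdict

def add_value_truthy(values, field, processed):
    # Mimics the quoted _add_value(): any falsy result (0, 0.0, '') is dropped.
    if processed:
        values[field] += [processed]

def add_value_not_none(values, field, processed):
    # The proposed fix: only an explicit None is discarded.
    if processed is not None:
        values[field] += [processed]

values_a = defaultdict(list)
values_b = defaultdict(list)

# 0.0 is the price the input processor produced for an out-of-stock item.
add_value_truthy(values_a, 'price', 0.0)
add_value_not_none(values_b, 'price', 0.0)

print(values_a['price'])  # [] -- the zero never reaches the item
print(values_b['price'])  # [0.0] -- the zero is preserved
```

With the truthiness check, the processor's perfectly valid 0.0 is indistinguishable from "no value at all".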

kmike (Member) commented Jan 15, 2017

See also: #741

medse (Author) commented Jan 15, 2017

Thanks, @kmike .
If I understand correctly, it's exactly the same problem (and, incidentally, the same fix). But it hasn't been merged in 2.5 years?

IAlwaysBeCoding (Contributor) commented
I do a lot of scraping from e-commerce stores as well (that is my specialty). I pass the Item instance, after loading it, through this function:

    def default_missing_keys(item, default_value='', except_keys=()):
        # Fields declared on the Item class but absent from the loaded item.
        # Note: except_keys defaults to a tuple, not a mutable list.
        missing_keys = set(item.fields.keys()) - set(item.keys())
        for missing_key in missing_keys:
            if missing_key not in except_keys:
                item[missing_key] = default_value

Essentially, I take the keys declared on the Item class and subtract the keys actually present on the item. That gives me all the missing keys, which I then fill with a default value.

It is an ugly hack, but at the moment Scrapy doesn't have a default for missing keys.

wRAR (Member) commented Oct 28, 2023

scrapy/itemloaders#73

wRAR closed this as completed Oct 28, 2023

5 participants