Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase verbosity for ItemLoader, Compose and MapCompose errors #3836

Closed
mabelvj opened this issue Jun 20, 2019 · 2 comments
Closed

Increase verbosity for ItemLoader, Compose and MapCompose errors #3836

mabelvj opened this issue Jun 20, 2019 · 2 comments

Comments

@mabelvj
Copy link
Contributor

@mabelvj mabelvj commented Jun 20, 2019

ItemLoader already provides error raising for the case of the output processor:

    def get_output_value(self, field_name):
        proc = self.get_output_processor(field_name)
        proc = wrap_loader_context(proc, self.context)
        try:
            return proc(self._values[field_name])
        except Exception as e:
            raise ValueError("Error with output processor: field=%r value=%r error='%s: %s'" % \
                (field_name, self._values[field_name], type(e)._name__, str(e)))

It could be helpful to extend this behaviour could to other ItemLoader methods:

  • _process_input_value for input processors
    def _process_input_value(self, field_name, value):
        proc = self.get_input_processor(field_name)
        proc = wrap_loader_context(proc, self.context)
        return proc(value)
  • get_value for processors that are passed as an argument to add_css, add_xpath or add_value
    def get_value(self, value, *processors, **kw):
        regex = kw.get('re', None)
        if regex:
            value = arg_to_iter(value)
            value = flatten(extract_regex(regex, x) for x in value)

        for proc in processors:
            if value is None:
                break
            proc = wrap_loader_context(proc, self.context)
            value = proc(value)
        return value

Also, Compose and MapCompose could raise errors occurring while processing the values.

mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
@mabelvj
Copy link
Contributor Author

@mabelvj mabelvj commented Jun 25, 2019

Added error messages to ItemLoader (for input processor, and processor as argument) and also for MapCompose and Compose.

For MapCompose and Compose, I'm not sure though if the format is correct, since these errors will be displayed when processing the input/out processors and they will be nested to these ones, example:

ValueError: Error with processor MapCompose value=['change,deep-thoughts,thinking,world'] 
error='ValueError: Error in MapCompose with function _strip 
value=['change,deep-thoughts,thinking,world'] error='TypeError: must be str, not int''

I would appreciate any suggestions here.

mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
mabelvj added a commit to mabelvj/scrapy that referenced this issue Jun 25, 2019
@mabelvj
Copy link
Contributor Author

@mabelvj mabelvj commented Jun 26, 2019

I added some tests in test_loaders.py for this new functionality.

Changes made to previous tests:

Line 485, in ProcessorsTest.test_compose

proc = Compose(str.upper, stop_on_none=False)
proc(None)

would raise:

TypeError: descriptor 'upper' requires a 'str' object but received a 'NoneType'

Now it raises:

ValueError: Error in Compose with <method 'upper' of 'str' objects> 
value=None error='TypeError: descriptor 'upper' requires a 'str' object 
but received a 'NoneType'

So I added checking for ValueError instead, changing self.assertRaises(TypeError, proc, None) for self.assertRaises(ValueError, proc, None)

New tests added:

  • In BasicItemLoaderTest:
    • test_error_input_processor

      def test_error_input_processor(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
               name_in = MapCompose(float)
      
           il = TestItemLoader()
           self.assertRaises(ValueError, il.add_value, 'name',
                             [u'marta', u'other'])

      previously

       il.add_value('name',[u'marta', u'other'])

      would raise:

       *** ValueError: could not convert string to float: 'marta'
      

      now it raises:

       ValueError: Error with input processor MapCompose: field='name' 
       value=['marta', 'other'] error='ValueError: Error in MapCompose 
       with <class 'float'> value=['marta', 'other'] error='ValueError: 
       could not convert string to float: 'marta'
      

      so just checking that ValueError is returned is necessary:

      self.assertRaises(ValueError, il.add_value, 'name', [u'marta', u'other'])

    • test_error_output_processor

      The error raising was already implemented for the output processor, so what I did here
      was to add an specific test for it.

       def test_error_output_processor(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
               name_out = Compose(Join(), float)
      
           il = TestItemLoader()
           il.add_value('name', u'marta')
           with self.assertRaises(ValueError):
               il.load_item()

      where

       il.add_value('name', u'marta')
       il.load_item()

      raises:

       ValueError: Error with output processor: field='name' value=['marta']
       error='ValueError: Error in Compose with <class 'float'> 
       value='marta' error='ValueError: could not convert string to float: 'marta'
      
    • test_error_processor_as_argument

       def test_error_processor_as_argument(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
      
           il = TestItemLoader()
           self.assertRaises(ValueError, il.add_value, 'name',
                             [u'marta', u'other'], Compose(float))

      previously

       il.add_value('name',[u'marta', u'other'], Compose(float))

      would raise:

       TypeError: float() argument must be a string or a number, not 'list'
      

      now it raises:

       ValueError: Error with processor Compose value=['marta', 'other']
       error='ValueError: Error in Compose with <class 'float'> 
       value=['marta', 'other'] error='TypeError: float() argument must be 
       a string or a number, not 'list'``
      
      
      
  • in ProcessorsTest
    • test_compose

       proc = Compose(str.upper, lambda x: x + 1)
       proc('hello')

      would raise:

      *** TypeError: must be str, not int

      now it raises:

       ValueError: Error in Compose with <function ProcessorsTest.test_compose.<locals>.
       <lambda> at 0x110606730> value='HELLO' error='TypeError: must be str, not int
      

      so I added to the test:

       proc = MapCompose(filter_world, lambda x: x + 1)
      self.assertRaises(ValueError, proc, 'hello')
      
    • test_mapcompose

      (1)

       proc = MapCompose(filter_world, six.text_type.upper)
       proc(None)

      would return: [] and it keeps returning it.
      So I added

      self.assertEqual(proc(None), [])

      to check it to check this functionality

      (2)

       proc = MapCompose(filter_world, six.text_type.upper)
       proc([1])

      would return:

      TypeError: descriptor 'upper' requires a 'str' object but received a 'int'

      now it returns:

       ValueError: Error in MapCompose with <method 'upper' of 'str' objects> 
       value=[1] error='TypeError: descriptor 'upper' requires a 'str' object 
       but received a 'int'
      

      so I added:

       proc = MapCompose(filter_world, six.text_type.upper)
       self.assertRaises(ValueError, proc, [1])

      (3)

       proc = MapCompose(filter_world, lambda x: x + 1)
       proc('hello')

      would return:

      *** TypeError: must be str, not int

      now it returns:

       ValueError: Error in MapCompose with 
       <function ProcessorsTest.test_mapcompose.<locals>.<lambda> at 0x10f502620> 
       value='hello' error='TypeError: must be str, not int'
      

      so I added:

       proc = MapCompose(filter_world, lambda x: x + 1)
       self.assertRaises(ValueError, proc, 'hello')
      

@dangra dangra closed this in #3840 Jul 5, 2019
dangra added a commit that referenced this issue Jul 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

2 participants