Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Itemloader errors #3836 #3840

Merged
merged 11 commits into from Jul 5, 2019
Merged

[MRG+1] Itemloader errors #3836 #3840

merged 11 commits into from Jul 5, 2019

Conversation

@mabelvj
Copy link
Contributor

@mabelvj mabelvj commented Jun 25, 2019

Added error messages to ItemLoader (for input processor, and processor as argument) and also for MapCompose and Compose.

For MapCompose and Compose, I'm not sure though if the format is correct, since these errors will be displayed when processing the input/out processors and they will be nested to these ones, example:

ValueError: Error with processor MapCompose value=['change,deep-thoughts,thinking,world'] 
error='ValueError: Error in MapCompose with function _strip 
value=['change,deep-thoughts,thinking,world'] error='TypeError: must be str, not int''

I would appreciate some suggestions here.

Fixes #3836

@mabelvj mabelvj force-pushed the itemloader-errors branch from 15afa62 to 859008a Jun 25, 2019
@mabelvj mabelvj force-pushed the itemloader-errors branch from c0b9687 to e5d17b4 Jun 25, 2019
@mabelvj mabelvj force-pushed the itemloader-errors branch from 8089e73 to f134b1d Jun 25, 2019
@codecov
Copy link

@codecov codecov bot commented Jun 25, 2019

Codecov Report

Merging #3840 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3840      +/-   ##
==========================================
+ Coverage   85.46%   85.48%   +0.02%     
==========================================
  Files         169      169              
  Lines        9664     9680      +16     
  Branches     1440     1440              
==========================================
+ Hits         8259     8275      +16     
  Misses       1157     1157              
  Partials      248      248
Impacted Files Coverage Δ
scrapy/loader/__init__.py 94.87% <100%> (+0.27%) ⬆️
scrapy/loader/processors.py 100% <100%> (ø) ⬆️
scrapy/crawler.py 92.39% <0%> (ø) ⬆️
scrapy/settings/default_settings.py 98.65% <0%> (ø) ⬆️
scrapy/downloadermiddlewares/redirect.py 96.82% <0%> (+0.05%) ⬆️

@mabelvj
Copy link
Contributor Author

@mabelvj mabelvj commented Jun 26, 2019

I added some tests in test_loaders.py for this new functionality.

Changes made to previous tests:

Line 485, in ProcessorsTest.test_compose

proc = Compose(str.upper, stop_on_none=False)
proc(None)

would raise:

TypeError: descriptor 'upper' requires a 'str' object but received a 'NoneType'

Now it raises:

ValueError: Error in Compose with <method 'upper' of 'str' objects> 
value=None error='TypeError: descriptor 'upper' requires a 'str' object 
but received a 'NoneType'

So I added checking for ValueError instead, changing self.assertRaises(TypeError, proc, None) for self.assertRaises(ValueError, proc, None)

New tests added:

  • In BasicItemLoaderTest:
    • test_error_input_processor

      def test_error_input_processor(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
               name_in = MapCompose(float)
      
           il = TestItemLoader()
           self.assertRaises(ValueError, il.add_value, 'name',
                             [u'marta', u'other'])

      previously

       il.add_value('name',[u'marta', u'other'])

      would raise:

       *** ValueError: could not convert string to float: 'marta'
      

      now it raises:

       ValueError: Error with input processor MapCompose: field='name' 
       value=['marta', 'other'] error='ValueError: Error in MapCompose 
       with <class 'float'> value=['marta', 'other'] error='ValueError: 
       could not convert string to float: 'marta'
      

      so just checking that ValueError is returned is necessary:

      self.assertRaises(ValueError, il.add_value, 'name', [u'marta', u'other'])

    • test_error_output_processor

      The error raising was already implemented for the output processor, so what I did here
      was to add an specific test for it.

       def test_error_output_processor(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
               name_out = Compose(Join(), float)
      
           il = TestItemLoader()
           il.add_value('name', u'marta')
           with self.assertRaises(ValueError):
               il.load_item()

      where

       il.add_value('name', u'marta')
       il.load_item()

      raises:

       ValueError: Error with output processor: field='name' value=['marta']
       error='ValueError: Error in Compose with <class 'float'> 
       value='marta' error='ValueError: could not convert string to float: 'marta'
      
    • test_error_processor_as_argument

       def test_error_processor_as_argument(self):
           class TestItem(Item):
               name = Field()
      
           class TestItemLoader(ItemLoader):
               default_item_class = TestItem
      
           il = TestItemLoader()
           self.assertRaises(ValueError, il.add_value, 'name',
                             [u'marta', u'other'], Compose(float))

      previously

       il.add_value('name',[u'marta', u'other'], Compose(float))

      would raise:

       TypeError: float() argument must be a string or a number, not 'list'
      

      now it raises:

       ValueError: Error with processor Compose value=['marta', 'other']
       error='ValueError: Error in Compose with <class 'float'> 
       value=['marta', 'other'] error='TypeError: float() argument must be 
       a string or a number, not 'list'``
      
      
      
  • in ProcessorsTest
    • test_compose

       proc = Compose(str.upper, lambda x: x + 1)
       proc('hello')

      would raise:

      *** TypeError: must be str, not int

      now it raises:

       ValueError: Error in Compose with <function ProcessorsTest.test_compose.<locals>.
       <lambda> at 0x110606730> value='HELLO' error='TypeError: must be str, not int
      

      so I added to the test:

       proc = MapCompose(filter_world, lambda x: x + 1)
      self.assertRaises(ValueError, proc, 'hello')
      
    • test_mapcompose

      (1)

       proc = MapCompose(filter_world, six.text_type.upper)
       proc(None)

      would return: [] and it keeps returning it.
      So I added

      self.assertEqual(proc(None), [])

      to check it to check this functionality

      (2)

       proc = MapCompose(filter_world, six.text_type.upper)
       proc([1])

      would return:

      TypeError: descriptor 'upper' requires a 'str' object but received a 'int'

      now it returns:

       ValueError: Error in MapCompose with <method 'upper' of 'str' objects> 
       value=[1] error='TypeError: descriptor 'upper' requires a 'str' object 
       but received a 'int'
      

      so I added:

       proc = MapCompose(filter_world, six.text_type.upper)
       self.assertRaises(ValueError, proc, [1])

      (3)

       proc = MapCompose(filter_world, lambda x: x + 1)
       proc('hello')

      would return:

      *** TypeError: must be str, not int

      now it returns:

       ValueError: Error in MapCompose with 
       <function ProcessorsTest.test_mapcompose.<locals>.<lambda> at 0x10f502620> 
       value='hello' error='TypeError: must be str, not int'
      

      so I added:

       proc = MapCompose(filter_world, lambda x: x + 1)
       self.assertRaises(ValueError, proc, 'hello')
      

Copy link
Member

@Gallaecio Gallaecio left a comment

It looks good to me.

In Scrapy 2.0, which will drop Python 2 support, we should probably consider taking advantage of exception chaining.

@Gallaecio Gallaecio changed the title Itemloader errors #3836 [MRG+1] Itemloader errors #3836 Jul 5, 2019
@dangra dangra merged commit 3d8f075 into scrapy:master Jul 5, 2019
3 checks passed
@kmike
Copy link
Member

@kmike kmike commented Jul 5, 2019

regarding exception chaining: six.reraise can be used to write compatible code

@kmike kmike added this to the v1.7 milestone Jul 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants