In Scrapy<1.1, users subclassing ImagesPipeline could access upper-case attributes such as IMAGES_RESULT_FIELD, e.g.:
    class CustomImagesPipeline(ImagesPipeline):
        (...)
        def item_completed(self, results, item, info):
            item = super(CustomImagesPipeline, self).item_completed(
                results, item, info)
            # Note: not all items do have an images field.
            if self.IMAGES_RESULT_FIELD not in item.fields:
                return item
This leads to the following exception in 1.1.0(rc4):
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/tmp/unpacked-eggs/__main__.egg/*****/pipelines/screenshots.py", line 539, in item_completed
        results, item, info)
      File "/tmp/unpacked-eggs/__main__.egg/*****/pipelines/screenshots.py", line 242, in item_completed
        images = item.pop(self.IMAGES_RESULT_FIELD, [])
    AttributeError: 'CustomImagesPipeline' object has no attribute 'IMAGES_RESULT_FIELD'
But I must ask: what is the philosophy behind writing Scrapy code, and behind using it?
Because I would expect that self.IMAGES_RESULT_FIELD belongs only to the pipeline's internal code and does not belong to the public API. As such, it could be changed freely without prejudice to the public API (as long as the public API remains untouched).
Also, I would expect that the item_completed method should use only results, item and info. If it can access any attributes from self, there is absolutely no encapsulation.
Plus, I would expect that any attribute intended to be exposed and accessible should have its own getter method.
@djunzu I think your approach is right, but if a change breaks users' code and there is an easy workaround, then we usually prefer to add a backwards compatibility shim with deprecation warnings. It doesn't mean users should write code like that (relying on undocumented attributes), but the userbase is large, and we don't want to break user code without a good reason.
Also, we should have named the attribute self._IMAGES_RESULT_FIELD if we wanted to communicate that it is private.
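For illustration, here is a rough sketch of what a backwards compatibility shim with a deprecation warning could look like. This is not Scrapy's actual fix: `PipelineSketch`, `CustomPipeline`, `has_images_field` and the lower-case `images_result_field` name are hypothetical stand-ins, and the sketch only assumes that the old upper-case name is kept as a read-only alias that warns on access.

```python
import warnings


class PipelineSketch(object):
    """Hypothetical stand-in for a pipeline; not Scrapy's real class."""

    def __init__(self, images_result_field="images"):
        # Assumed new-style, lower-case instance attribute.
        self.images_result_field = images_result_field

    @property
    def IMAGES_RESULT_FIELD(self):
        # Old upper-case name kept as a read-only alias that warns on access.
        warnings.warn(
            "IMAGES_RESULT_FIELD is deprecated, use images_result_field",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.images_result_field


class CustomPipeline(PipelineSketch):
    """Hypothetical user subclass that still relies on the old name."""

    def has_images_field(self, fields):
        # Old user code keeps working, but each access emits a warning.
        return self.IMAGES_RESULT_FIELD in fields
```

With something along these lines, a subclass that reads `self.IMAGES_RESULT_FIELD` would get a `DeprecationWarning` instead of an `AttributeError`, which gives users time to migrate.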
#1891 is not backward compatible.