
Fix #1265 #1267

Merged
merged 1 commit into from Jun 1, 2015

Conversation

@curita
Member

@curita curita commented May 29, 2015

Attempt to solve #1265 by normalizing the paths in build_component_list, a helper used to merge component settings with their base settings. This solution is not bulletproof regarding priority sorting, but I think it's as close as we can get it to be.

Example of a use-case that is still broken:

>>> EXTENSIONS = {'new_path.Obj': 10, 'another_path.Another': 20, 'old_path.Obj': 30}
>>> build_component_list({}, EXTENSIONS)
['another_path.Another', 'new_path.Obj']

The current implementation will load each object once, but it gives preference to the old path's priority. That isn't always what we want: the problem is that we can't know which path was updated/added to the dictionary first (another reason why #1149 is going to be really helpful, though those pairings could still have the same priority). For a concrete example, Scrapy Cloud uses addons with old paths; users won't be able to update them using the new paths, since the old ones take precedence.

I don't see how we can fix that on Scrapy's side; I just hope these cases are really rare (mixing old and new paths in the same dictionary should be uncommon) so we can move forward with the fix.
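To make the ambiguity concrete, here is a minimal sketch of the normalization approach described above. The `DEPRECATIONS` mapping, `normalize_path`, and this `build_component_list` body are all hypothetical stand-ins, not Scrapy's actual code; the point is only to show how the last-inserted key silently wins when an old and a new path collide.

```python
# Hypothetical mapping of deprecated import paths to their new locations.
DEPRECATIONS = {'old_path.Obj': 'new_path.Obj'}

def normalize_path(path):
    """Rewrite a deprecated path to its new location, if known."""
    return DEPRECATIONS.get(path, path)

def build_component_list(base, custom):
    """Merge base and custom component settings, keyed by normalized path."""
    components = {}
    for path, priority in dict(base, **custom).items():
        # If both the old and the new path are present, whichever is
        # iterated last overwrites the other -- the ambiguity discussed above.
        components[normalize_path(path)] = priority
    # Drop disabled (None) components, then sort the rest by priority.
    enabled = {p: prio for p, prio in components.items() if prio is not None}
    return sorted(enabled, key=enabled.get)
```

With the `EXTENSIONS` dict from the example, `old_path.Obj` normalizes to `new_path.Obj` and its priority (30) overwrites the explicitly given 10, which may not be what the user intended.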

@jdemaeyer
Contributor

@jdemaeyer jdemaeyer commented May 30, 2015

If we cannot guarantee that our fix will result in what the user intended, maybe it's less of a pitfall to throw a helpful error message and shut down the crawler? After all, the new paths came with 1.0. I don't know what you decided for backwards compatibility, but maybe an error message with clear instructions on how to fix it is on par for a major release?

@nramirezuy
Contributor

@nramirezuy nramirezuy commented Jun 1, 2015

I agree with @jdemaeyer; this bug came from Scrapy Cloud because they were using old paths. They simply have to start using the new ones.

@curita
Member Author

@curita curita commented Jun 1, 2015

I hate to see a backwards-incompatible change come out of all the module relocations, but it surely makes sense to raise an error if we can't guarantee the same behavior as the last stable release, especially since we could have avoided an actual related issue. Mixing paths in the same settings dict should still be uncommon, so users shouldn't hit this error too often.

I have updated the PR, what do you think? /cc @jdemaeyer @nramirezuy @dangra
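The error-raising behavior agreed on above could look roughly like this. This is a hedged sketch, not the actual patch: `DEPRECATIONS` and `check_for_duplicates` are made-up names, and the real change lives in Scrapy's settings helpers.

```python
from collections import defaultdict

# Hypothetical mapping of deprecated import paths to their new locations.
DEPRECATIONS = {'old_path.Obj': 'new_path.Obj'}

def check_for_duplicates(compdict):
    """Raise if two keys in a component-settings dict normalize to the
    same path, instead of silently guessing which priority was meant."""
    normalized = defaultdict(list)
    for path in compdict:
        normalized[DEPRECATIONS.get(path, path)].append(path)
    duplicates = {new: olds for new, olds in normalized.items() if len(olds) > 1}
    if duplicates:
        raise ValueError(
            'Some paths in your settings refer to the same object after '
            'deprecation rewriting: %s. Please use only the new paths.'
            % duplicates)
```

Shutting down with a clear message, as @jdemaeyer suggested, trades silent misconfiguration for an actionable error.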

    This function tries to load the object based on the given path, hence it'll
    only work if the old path is still supported.
    """
    obj = load_object(path)
Member

@dangra dangra Jun 1, 2015


Calling load_object() is a bad idea, not only because it triggers an import but also because it skips the error-trapping logic of https://github.com/scrapy/scrapy/blob/master/scrapy/middleware.py#L31-L45, with all the side cases that brings.

Member

@dangra dangra Jun 1, 2015


As the list of relocated modules is known, my vote is to add a function to each component (extensions, spidermw, downmw) so each of them rewrites class paths using string-only operations.
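String-only rewriting, as suggested above, could be sketched like this. The prefix pairs below are illustrative (loosely modeled on the 1.0-era `scrapy.contrib` relocations), not Scrapy's actual relocation table, and `update_classpath` is a name chosen for this sketch.

```python
# Illustrative (old_prefix, new_prefix) relocation rules -- not the real table.
RELOCATIONS = [
    ('scrapy.contrib.downloadermiddleware.', 'scrapy.downloadermiddlewares.'),
    ('scrapy.contrib.spidermiddleware.', 'scrapy.spidermiddlewares.'),
]

def update_classpath(path):
    """Return the new location of a class path using pure string operations,
    never importing the object (so no import side effects)."""
    for old_prefix, new_prefix in RELOCATIONS:
        if path.startswith(old_prefix):
            return new_prefix + path[len(old_prefix):]
    return path
```

Because no import happens, the usual component-loading error handling in `scrapy/middleware.py` still runs exactly once, when the component is actually instantiated.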

@kmike kmike added this to the Scrapy 1.0 milestone Jun 1, 2015
@curita curita force-pushed the fix-1265 branch 2 times, most recently from abf23ae to 35fb4bf Jun 1, 2015
@dangra
Member

@dangra dangra commented Jun 1, 2015

LGTM.

@kmike
Member

@kmike kmike commented Jun 1, 2015

Can we have a test which uses one of the deprecation rules?

@curita
Member Author

@curita curita commented Jun 1, 2015

@kmike Sure thing, I've added it and squashed the commits so it's easier to cherry-pick.

dangra added a commit that referenced this issue Jun 1, 2015
@dangra dangra merged commit d52cf8b into scrapy:master Jun 1, 2015
1 check passed