Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Use w3lib.url.canonicalize_url() from w3lib 1.15.0 #2168

Merged
merged 1 commit into from Aug 16, 2016
Merged

[MRG+1] Use w3lib.url.canonicalize_url() from w3lib 1.15.0 #2168

merged 1 commit into from Aug 16, 2016

Conversation

@ashkulz
Copy link
Contributor

@ashkulz ashkulz commented Aug 8, 2016

fixes #2157

@kmike
Copy link
Member

@kmike kmike commented Aug 8, 2016

Thanks!

It seems a few other places needs to be changed:

  • setup.py file needs a version bump for w3lib;
  • it looks like some code in scrapy.utils.url was only required for canonicalize_url - I wonder if we can drop it now.
@codecov-io
Copy link

@codecov-io codecov-io commented Aug 8, 2016

Current coverage is 83.40% (diff: 100%)

Merging #2168 into master will decrease coverage by 0.05%

Powered by Codecov. Last update 9a734e6...bb3b806

@ashkulz
Copy link
Contributor Author

@ashkulz ashkulz commented Aug 8, 2016

Thanks, incorporated your feedback. Did a rebase, as not sure whether an additional commit would be acceptable (might be a good idea to incorporate what is recommended as a part of docs/contributing.rst).

@kmike
Copy link
Member

@kmike kmike commented Aug 9, 2016

@ashkulz: it is good to have a readable commits history, thanks for taking time to do the rebase!

Currently we are not too strict on rebase vs extra commits - sometimes we ask for a rebase when a commit history gets messy, but not always. Probably that's why it is not in contributing.rst - we haven't agreed on a policy, it is just a common sense now.


# scrapy.utils.url was moved to w3lib.url and import * ensures this
# move doesn't break old code
from w3lib.url import *
from w3lib.url import _safe_chars

This comment has been minimized.

@kmike

kmike Aug 9, 2016
Member

I believe this import is needed for backwards compatibility

return urlunparse((scheme, netloc.lower(), path, params, query, fragment))


def _unquotepath(path):

This comment has been minimized.

@kmike

kmike Aug 9, 2016
Member

hmm, this function was here for a long time; I wonder if we should import it from w3lib.url to avoid breaking user code

This comment has been minimized.

@kmike

kmike Aug 9, 2016
Member

other removed functions were here for much less time; I think it is fine to remove them as it is unlikely removal will break anything

@ashkulz
Copy link
Contributor Author

@ashkulz ashkulz commented Aug 16, 2016

@kmike: sorry for the delay. Is there a necessity to keep _safe_chars and _unquotepath, as those are supposed to be internal interfaces for which backwards compatibility may not apply?

Also remove code/imports which are now unused due to this change.

fixes #2157
@redapple
Copy link
Contributor

@redapple redapple commented Aug 16, 2016

@ashkulz , we sometimes break this guideline to be on the safe side with what users import.
We may have to live with this for some time.
Another thing to do (in another PR) could be to add a warning in the release notes that a later version may not expose _safe_chars and _unquotepath.

@ashkulz
Copy link
Contributor Author

@ashkulz ashkulz commented Aug 16, 2016

@redapple: any other feedback? I've already pushed a rebased commit which includes this change, unless you want me to include an entry in docs/news.rst in this PR itself.

@redapple
Copy link
Contributor

@redapple redapple commented Aug 16, 2016

@ashkulz , no other feedback. Looks good to me!
Thanks again.

@redapple redapple changed the title Use w3lib.url.canonicalize_url() from w3lib 1.15.0 [MRG+1] Use w3lib.url.canonicalize_url() from w3lib 1.15.0 Aug 16, 2016
@kmike kmike merged commit 241bd00 into scrapy:master Aug 16, 2016
2 checks passed
2 checks passed
codecov/patch 100% of diff hit (target 100%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kmike
Copy link
Member

@kmike kmike commented Aug 16, 2016

Thanks @ashkulz!

@ashkulz ashkulz deleted the advarisk:w3lib-canonicalize-url branch Aug 17, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants