Right now, canonicalize_url lowercases every letter in the path of the URLs it canonicalizes. We need to preserve the original case of the path during canonicalization. We can change the GURL source code to do this.
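To illustrate the intended behaviour, here is a minimal standard-library sketch; it is not scurl or GURL code. The helper name `lowercase_scheme_and_host_only` is hypothetical, and the sketch assumes the URL carries no userinfo (a `user:password@` prefix would also be lowercased here). Only the scheme and host are lowercased; the path, query and fragment keep their original case.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_scheme_and_host_only(url):
    """Lowercase the scheme and network location, leave the rest untouched.

    Hypothetical helper for illustration only. It assumes there is no
    userinfo in the URL, because netloc.lower() would lowercase that too.
    """
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),   # scheme is case-insensitive
        parts.netloc.lower(),   # host is case-insensitive
        parts.path,             # path is case-sensitive: keep as-is
        parts.query,            # query is kept as-is
        parts.fragment,         # fragment is kept as-is
    ))


print(lowercase_scheme_and_host_only("HTTP://Example.COM/Some/Path?Q=1"))
```

A regression test for this issue could assert the same property against canonicalize_url itself: the path of the output should match the case of the path in the input.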
The goal here is to make it 100% compatible with canonicalize_url from w3lib. This is required for Scrapy integration; otherwise it would be a backwards-incompatible change. Besides making it compatible now, it's important that we find out if compatibility breaks later. We discussed this with @kmike, and it seems this can be achieved in the following way:
- Make w3lib import canonicalize_url from scurl if it's available.
- Add a new test env to w3lib that runs with scurl installed. This makes sure that if canonicalize_url in w3lib changes, the scurl version is updated accordingly.
- Run the w3lib tests in the scurl Travis build. This probably means cloning the w3lib repo and running w3lib's canonicalize_url tests (which will pick up scurl). This ensures that if the scurl version changes and no longer passes the w3lib tests, we know about it.
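The first step above (prefer scurl when it is installed, otherwise use w3lib's own function) could be sketched roughly as follows. The helper `first_available` is hypothetical, and the assumption that scurl exposes `canonicalize_url` at the top level of the `scurl` package is not confirmed here; adjust the module path to wherever scurl actually exports it.

```python
import importlib


def first_available(candidates):
    """Return (attribute, module_name) for the first candidate that imports.

    `candidates` is a list of (module_name, attribute_name) pairs tried in
    order. Returns (None, None) if no candidate can be resolved.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module not installed: try the next candidate
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr, module_name
    return None, None


# Prefer the scurl implementation, fall back to the pure-Python one.
canonicalize_url, source = first_available([
    ("scurl", "canonicalize_url"),      # assumed export location
    ("w3lib.url", "canonicalize_url"),  # existing w3lib implementation
])
print("canonicalize_url resolved from:", source)
```

Because the same test suite then runs against whichever implementation was picked, a w3lib test env with scurl installed exercises the scurl code path with no changes to the tests themselves.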