Skip to content

dominicsayers/url_canonicalize

Repository files navigation

URLCanonicalize

Gem Version Gem downloads Build status Maintainability Coverage Status Dependency Status Security Codacy Badge

URLCanonicalize is a Ruby gem that finds the canonical version of a URL. It provides canonicalize methods for the String, URI::HTTP, URI::HTTPS and Addressable::URI classes.

Installation

Add this line to your application's Gemfile:

gem 'url_canonicalize'

Usage

'http://www.twitter.com'.canonicalize # => 'https://twitter.com/'

URI('http://www.twitter.com').canonicalize # => #<URI::HTTP:0x00000008767908 URL:https://twitter.com/>

Addressable::URI.canonicalize('http://www.twitter.com') # => #<Addressable::URI:0x43c9 URI:https://twitter.com/>

More Information

URLCanonical follows HTTP redirects and also looks for rel="canonical" hints in both the HTTP headers and the <head> section of the response HTML. The URL it returns will be both normalized and canonical. The intention is that whatever variant of a URL is supplied the result will always be the same. The intended use case is for applications that need to dedupe a list of URLs, for instance to check if a new URL is already present in a list. If the list is built from canonicalized URLs then the resulting set will have fewer URLs that point to the same ultimate resource.

About

Finds the canonical version of a URL

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages