Skip to content

slevithan/parseuri

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

45 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

parseUri 3.0.0

parseUri is a mighty but tiny JavaScript URI/URN/URL parser that splits any URI into its parts (all of which are optional). Its combination of accuracy, comprehensiveness, and brevity is unrivaled (1KB min/gzip, with no dependencies).

Compared to the built-in URL

parseUri includes several advantages over URL:

  • parseUri gives you many additional properties (authority, userinfo, subdomain, domain, tld, resource, directory, filename, suffix) that aren’t available from URL.
  • URL throws e.g. if not given a protocol, and in many other cases of valid (but not supported) and invalid URIs. parseUri makes a best case effort even with partial or invalid URIs and is extremely good with edge cases.
  • URL’s rules don’t allow correctly handling many non-web protocols. For example, URL doesn’t throw on any of 'git://localhost:1234', 'ssh://myid@192.168.1.101', or 't2ab:///path/entry', but it also doesn’t get their details correct since it treats everything after <non-web-protocol>: up to ? or # as part of the pathname.
  • parseUri includes a “friendly” parsing mode (in addition to its default mode) that handles human-friendly URLs like 'example.com/file.html' as expected.
  • parseUri supports providing a list of second-level domains that should be treated as part of the top-level domain (ex: co.uk).

Conversely, parseUri is single-purpose and doesn’t apply normalization. You can of course use a URI normalizer separately, or build one on top of parseUri.

parseUri’s demo page allows easily comparing with URL’s results.

Results / URI parts

Returns an object with 20 URI parts as properties plus queryParams, a URLSearchParams object that includes methods get(key), getAll(key), etc.

Here’s an example of what each part contains:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  href                                                    │
├────────────────────────────────────────────────────────────────┬─────────────────────────────────────────┤
│                             origin                             │                resource                 │
├──────────┬─┬───────────────────────────────────────────────────┼──────────────────────┬───────┬──────────┤
│ protocol │ │                     authority                     │       pathname       │ query │ fragment │
│          │ ├─────────────────────┬─────────────────────────────┼───────────┬──────────┤       │          │
│          │ │      userinfo       │            host             │ directory │ filename │       │          │
│          │ ├──────────┬──────────┼──────────────────────┬──────┤           ├─┬────────┤       │          │
│          │ │ username │ password │       hostname       │ port │           │ │ suffix │       │          │
│          │ │          │          ├───────────┬──────────┤      │           │ ├────────┤       │          │
│          │ │          │          │ subdomain │  domain  │      │           │ │        │       │          │
│          │ │          │          │           ├────┬─────┤      │           │ │        │       │          │
│          │ │          │          │           │    │ tld │      │           │ │        │       │          │
"  https   ://   user   :   pass   @ sub1.sub2 . dom.com  : 8080   /p/a/t/h/  a.html    ?  q=1  #   hash   "
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘

parseUri additionally supports IPv4 and IPv6 addresses, URNs, and many edge cases not shown here. See tests.

Parsing modes

parseUri has two parsing modes: default and friendly. The default mode follows official URI rules. Friendly mode doesn’t require '<protocol>:', ':', or '//' to signal the start of an authority, which allows handling human-friendly URLs like 'example.com/file.html' as expected. This change has several effects:

  • It allows starting a URI with an authority (as noted above).
  • It precludes friendly mode from properly handling relative paths (that don’t start from root '/') such as 'dir/file.html'.
  • Since the web protocols http, https, ws, wss, and ftp don’t require '//', this also means that friendly mode extends this behavior to non-web protocols.

Usage examples

let uri = parseUri('https://a.b.example.com:80/@user/a/my.img.jpg?q=x&q=#hash');
uri.protocol // → 'https'
uri.host // → 'a.b.example.com:80'
uri.hostname // → 'a.b.example.com'
uri.subdomain // → 'a.b'
uri.domain // → 'example.com'
uri.port // → '80'
uri.resource // → '/@user/a/my.img.jpg?q=x&q=#hash'
uri.pathname // → '/@user/a/my.img.jpg'
uri.directory // → '/@user/a/'
uri.filename // → 'my.img.jpg'
uri.suffix // → 'jpg'
uri.query // → 'q=x&q='
uri.fragment // → 'hash'
uri.queryParams.get('q') // → 'x'
uri.queryParams.getAll('q') // → ['x', '']
uri.queryParams.get('not-present') // → null
uri.queryParams.getAll('not-present') // → []
// also available: href, origin, authority, userinfo, username, password, tld

// relative path (not starting from root /)
uri = parseUri('dir/file.html?q=x');
uri.hostname // → ''
uri.directory // → 'dir/'
uri.filename // → 'file.html'
uri.query // → 'q=x'

// friendly mode allows starting with an authority
uri = parseUri('example.com/file.html', 'friendly');
uri.hostname // → 'example.com'
uri.directory // → '/'
uri.filename // → 'file.html'

// IPv4 address
uri = parseUri('ssh://myid@192.168.1.101');
uri.protocol // → 'ssh'
uri.username // → 'myid'
uri.hostname // → '192.168.1.101'
uri.domain // → ''

// IPv6 address
uri = parseUri('https://[2001:db8:85a3::7334]:80?q=x');
uri.hostname // → '[2001:db8:85a3::7334]'
uri.port // → '80'
uri.domain // → ''
uri.query // → 'q=x'

// mailto
uri = parseUri('mailto:first@my.com,second@my.com?subject=Hey&body=Sign%20me%20up!');
uri.protocol // → 'mailto'
uri.username // → ''
uri.hostname // → ''
uri.pathname // → 'first@my.com,second@my.com'
uri.query // → 'subject=Hey&body=Sign%20me%20up!'
uri.queryParams.get('body') // → 'Sign me up!'

// mailto in friendly mode
uri = parseUri('mailto:me@my.com?subject=Hey', 'friendly');
uri.protocol // → 'mailto'
uri.username // → 'me'
uri.hostname // → 'my.com'
uri.pathname // → ''

/* Also supports e.g.:
- https://[2001:db8:85a3::7334%en1]/ipv6-with-zone-identifier
- git://localhost:1234
- file:///path/file
- tel:+1-800-555-1212
- urn:uuid:c5542ab6-3d96-403e-8e6b-b8bb52f48d9a?q=x
*/

Use parseUri’s demo page to easily test and compare results.