Expose bound methods (#87)
* Expose bound methods

* Add documentation and simplify #init signature

* const -> var, damned ES2015 habits

* Remove use of Function.bind to maintain IE8 compat

* Update README examples and make custom factory name explicit

* Test destructured bound method
Thomas Parisot committed Feb 13, 2017
1 parent fa633a7 commit 981bbae
Showing 16 changed files with 501 additions and 424 deletions.
115 changes: 68 additions & 47 deletions README.md
@@ -36,9 +36,9 @@ Thanks Mozilla!
## Node.js

```javascript
var tld = require('tldjs');
const { getDomain } = require('tldjs');

tld.getDomain('mail.google.co.uk');
getDomain('mail.google.co.uk');
// -> 'google.co.uk'
```

@@ -65,59 +65,79 @@ An [UMD module](https://github.com/umdjs/umd) will be created as of `tld.js`.

# API

`tldjs` can be used either as a whole or through *destructuring*.

```js
// ES2015 modules syntax
import tldjs from 'tldjs';
import { getDomain } from 'tldjs';

// Node/CommonJS modules syntax
const tldjs = require('tldjs');
const { getDomain } = require('tldjs');
```

## tldExists()

Checks if the TLD is valid for a given host.

```javascript
tld.tldExists('google.com'); // returns `true`
tld.tldExists('google.local'); // returns `false` (not an explicit registered TLD)
tld.tldExists('com'); // returns `true`
tld.tldExists('uk'); // returns `true`
tld.tldExists('co.uk'); // returns `true` (because `uk` is a valid TLD)
tld.tldExists('amazon.fancy.uk'); // returns `true` (still because `uk` is a valid TLD)
tld.tldExists('amazon.co.uk'); // returns `true` (still because `uk` is a valid TLD)
tld.tldExists('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `true`
const { tldExists } = tldjs;

tldExists('google.com'); // returns `true`
tldExists('google.local'); // returns `false` (not an explicit registered TLD)
tldExists('com'); // returns `true`
tldExists('uk'); // returns `true`
tldExists('co.uk'); // returns `true` (because `uk` is a valid TLD)
tldExists('amazon.fancy.uk'); // returns `true` (still because `uk` is a valid TLD)
tldExists('amazon.co.uk'); // returns `true` (still because `uk` is a valid TLD)
tldExists('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `true`
```

## getDomain()

Returns the fully qualified domain from a host string.

```javascript
tld.getDomain('google.com'); // returns `google.com`
tld.getDomain('fr.google.com'); // returns `google.com`
tld.getDomain('fr.google.google'); // returns `google.google`
tld.getDomain('foo.google.co.uk'); // returns `google.co.uk`
tld.getDomain('t.co'); // returns `t.co`
tld.getDomain('fr.t.co'); // returns `t.co`
tld.getDomain('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `example.co.uk`
const { getDomain } = tldjs;

getDomain('google.com'); // returns `google.com`
getDomain('fr.google.com'); // returns `google.com`
getDomain('fr.google.google'); // returns `google.google`
getDomain('foo.google.co.uk'); // returns `google.co.uk`
getDomain('t.co'); // returns `t.co`
getDomain('fr.t.co'); // returns `t.co`
getDomain('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `example.co.uk`
```

## getSubdomain()

Returns the complete subdomain for a given host.

```javascript
tld.getSubdomain('google.com'); // returns ``
tld.getSubdomain('fr.google.com'); // returns `fr`
tld.getSubdomain('google.co.uk'); // returns ``
tld.getSubdomain('foo.google.co.uk'); // returns `foo`
tld.getSubdomain('moar.foo.google.co.uk'); // returns `moar.foo`
tld.getSubdomain('t.co'); // returns ``
tld.getSubdomain('fr.t.co'); // returns `fr`
tld.getSubdomain('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns ``
const { getSubdomain } = tldjs;

getSubdomain('google.com'); // returns ``
getSubdomain('fr.google.com'); // returns `fr`
getSubdomain('google.co.uk'); // returns ``
getSubdomain('foo.google.co.uk'); // returns `foo`
getSubdomain('moar.foo.google.co.uk'); // returns `moar.foo`
getSubdomain('t.co'); // returns ``
getSubdomain('fr.t.co'); // returns `fr`
getSubdomain('https://user:password@secure.example.co.uk:443/some/path?and&query#hash'); // returns `secure`
```

## getPublicSuffix()

Returns the public suffix for a given host.

```javascript
tld.getPublicSuffix('google.com'); // returns `com`
tld.getPublicSuffix('fr.google.com'); // returns `com`
tld.getPublicSuffix('google.co.uk'); // returns `co.uk`
tld.getPublicSuffix('s3.amazonaws.com'); // returns `s3.amazonaws.com`
const { getPublicSuffix } = tldjs;

getPublicSuffix('google.com'); // returns `com`
getPublicSuffix('fr.google.com'); // returns `com`
getPublicSuffix('google.co.uk'); // returns `co.uk`
getPublicSuffix('s3.amazonaws.com'); // returns `s3.amazonaws.com`
```

## isValid()
@@ -126,40 +146,43 @@ Checks if the host string is valid.
It does not check if the *tld* exists.

```javascript
tld.isValid('google.com'); // returns `true`
tld.isValid('.google.com'); // returns `false`
tld.isValid('my.fake.domain'); // returns `true`
tld.isValid('localhost'); // returns `false`
tld.isValid('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `true`
const { isValid } = tldjs;

isValid('google.com'); // returns `true`
isValid('.google.com'); // returns `false`
isValid('my.fake.domain'); // returns `true`
isValid('localhost'); // returns `false`
isValid('https://user:password@example.co.uk:8080/some/path?and&query#hash'); // returns `true`
```

# Troubleshooting

## Retrieving subdomain of `localhost` and custom hostnames

`tld.js` methods `getDomain` and `getSubdomain` are designed to **work only with *valid* TLDs**.
`tld.js` methods `getDomain` and `getSubdomain` are designed to **work only with *known and valid* TLDs**.
This way, you can trust what a domain is.

Unfortunately, `localhost` is a valid hostname but it is not a TLD.
`tld.js` has a concept of `validHosts` you declare
`localhost` is a valid hostname but not a TLD. However, you can instantiate your own flavour of `tld.js` with *additional valid hosts*:

```js
var tld = require('tldjs');
const tldjs = require('tldjs');

tld.getDomain('localhost'); // returns null
tld.getSubdomain('vhost.localhost'); // returns null
tldjs.getDomain('localhost'); // returns null
tldjs.getSubdomain('vhost.localhost'); // returns null

tld.validHosts = ['localhost'];
const myTldjs = tldjs.fromUserSettings({
validHosts: ['localhost']
});

tld.getDomain('localhost'); // returns 'localhost'
tld.getSubdomain('vhost.localhost'); // returns 'vhost'
myTldjs.getDomain('localhost'); // returns 'localhost'
myTldjs.getSubdomain('vhost.localhost'); // returns 'vhost'
```
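
The reason a destructured method keeps working after `fromUserSettings` can be sketched as follows. This is a minimal, self-contained illustration of the closure-based factory pattern, not the library's actual code: `getDomainImpl` and its toy lookup rule are hypothetical stand-ins for the real rule matching.

```js
// Hypothetical stand-in for the real domain lookup (assumption for illustration):
// a host listed in validHosts is treated as its own domain, anything else is unknown.
function getDomainImpl(validHosts, host) {
  return validHosts.indexOf(host) !== -1 ? host : null;
}

function factory(options) {
  options = options || {};
  const validHosts = options.validHosts || [];

  return {
    // The closure captures validHosts, so the method never relies on `this`
    // and no Function.bind call is needed.
    getDomain: function (host) {
      return getDomainImpl(validHosts, host);
    },
    fromUserSettings: factory
  };
}

const defaults = factory();
const custom = defaults.fromUserSettings({ validHosts: ['localhost'] });

// Destructuring detaches the function from its object, yet it still sees its settings.
const { getDomain } = custom;

console.log(defaults.getDomain('localhost')); // null
console.log(getDomain('localhost'));          // 'localhost'
```

Each call to the factory yields an independent instance, so customising one never mutates the default export.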

## Updating the TLDs List

Many libraries offer a list of TLDs. But, are they up-to-date? And how to update them?

`tldjs` bundles a list of known TLDs but this list can become outdated.
`tld.js` bundles a list of known TLDs but this list can become outdated.
This is especially true if the package has not been updated on npm for a while.

Hopefully for you, even if I'm flying over the world, if I've lost my Internet connection or even if
@@ -175,9 +198,7 @@ npm install --tldjs-update-rules
npm install --save tldjs --tldjs-update-rules
```

Open an issue to request an update of the bundled rules.
Or else, fork the project and open a PR after having run `npm version patch`.
Once merged, the `tldjs` package will be published on npmjs.com.
Open an issue to request an update of the bundled TLDs.


# Contributing
47 changes: 44 additions & 3 deletions index.js
@@ -1,6 +1,47 @@
"use strict";

var tld = require('./lib/tld.js').init();
tld.rules = require('./rules.json');
var allRules = require('./rules.json');

module.exports = tld;
var cleanHostValue = require('./lib/clean-host.js');
var escapeRegExp = require('./lib/escape-regexp.js');
var getRulesForTld = require('./lib/tld-rules.js');
var getDomain = require('./lib/domain.js');
var getSubdomain = require('./lib/subdomain.js');
var isValid = require('./lib/is-valid.js');
var getPublicSuffix = require('./lib/public-suffix.js');
var tldExists = require('./lib/tld-exists.js');

/**
* Creates a new instance of tldjs
*
* @param {Object.<rules,validHosts>} options [description]
* @return {tldjs|Object} [description]
*/
function factory(options) {
var rules = options.rules || allRules || {};
var validHosts = options.validHosts || [];

return {
cleanHostValue: cleanHostValue,
escapeRegExp: escapeRegExp,
getRulesForTld: getRulesForTld,
getDomain: function (host) {
return getDomain(rules, validHosts, host);
},
getSubdomain: function (host) {
return getSubdomain(rules, validHosts, host);
},
isValid: function (host) {
return isValid(validHosts, host);
},
getPublicSuffix: function (host) {
return getPublicSuffix(rules, host);
},
tldExists: function (tld) {
return tldExists(rules, tld);
},
fromUserSettings: factory
};
}

module.exports = factory({ validHosts: [], rules: allRules });
48 changes: 48 additions & 0 deletions lib/canditate-rule.js
@@ -0,0 +1,48 @@
var some = require('./polyfills/array-some.js');

/**
* Returns the best rule for a given host based on candidates
*
* @static
* @param host {String} Hostname to check rules against
* @param rules {Array} List of rules used to work on
* @return {Object} Candidate object, with a normal and exception state
*/
module.exports = function getCandidateRule (host, rules, options) {
var rule = {'normal': null, 'exception': null};

options = options || { lazy: false };

some(rules, function (r) {
var pattern;

// sld matching or validHost? escape the loop immediately (except if it's an exception)
if ('.' + host === r.getNormalXld()) {
if (options.lazy || r.exception || r.isHost) {
rule.normal = r;
}

return true;
}

// otherwise check as a complete host
// if it's an exception, we want to loop a bit more to a normal rule
pattern = '.+' + r.getNormalPattern() + '$';

if ((new RegExp(pattern)).test(host)) {
rule[r.exception ? 'exception' : 'normal'] = r;
return !r.exception;
}

return false;
});

// favouring the exception if encountered
// previously we were copy-altering a rule, creating inconsistent results based on rule order
// @see https://github.com/oncletom/tld.js/pull/35
if (rule.normal && rule.exception) {
return rule.exception;
}

return rule.normal;
};
32 changes: 32 additions & 0 deletions lib/clean-host.js
@@ -0,0 +1,32 @@
var URL = require('url');

/**
* Utility to cleanup the base host value. Also removes url fragments.
*
* Works for:
* - hostname
* - //hostname
* - scheme://hostname
* - scheme+scheme://hostname
*
* @param {string} value
* @return {String}
*/

// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
var hasPrefixRE = /^(([a-z][a-z0-9+.-]*)?:)?\/\//;
var invalidHostnameChars = /[^A-Za-z0-9.-]/;

function trim(value) {
return String(value).replace(/(^\s+|\s+$)/g, '');
}

module.exports = function cleanHostValue(value){
value = trim(value).toLowerCase();

var parts = URL.parse(hasPrefixRE.test(value) ? value : '//' + value, null, true);

if (parts.hostname && !invalidHostnameChars.test(parts.hostname)) { return parts.hostname; }
if (!invalidHostnameChars.test(value)) { return value; }
return '';
};
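
To illustrate the prefix check in `lib/clean-host.js`, here is a small sketch that exercises only the `hasPrefixRE` expression copied from the file above. The sample inputs are assumptions chosen for illustration; the surrounding `cleanHostValue` logic is not reproduced.

```js
// Copied from lib/clean-host.js: matches an optional scheme followed by "//".
const hasPrefixRE = /^(([a-z][a-z0-9+.-]*)?:)?\/\//;

// Inputs without a "//" prefix are the ones cleanHostValue prepends "//" to
// before handing the value to the URL parser.
console.log(hasPrefixRE.test('example.com'));           // false
console.log(hasPrefixRE.test('//example.com'));         // true
console.log(hasPrefixRE.test('https://example.com'));   // true
console.log(hasPrefixRE.test('git+ssh://example.com')); // true
```
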
37 changes: 37 additions & 0 deletions lib/domain.js
@@ -0,0 +1,37 @@
var Rule = require('./rule.js');
var isValid = require('./is-valid.js');
var cleanHostValue = require('./clean-host.js');
var extractTldFromHost = require('./from-host.js');
var getCandidateRule = require('./canditate-rule.js');
var getRulesForTld = require('./tld-rules.js');

/**
* Detects the domain based on rules and a host string
*
* @api
* @param {string} host
* @return {String}
*/
module.exports = function getDomain (allRules, validHosts, host) {
var domain = null, hostTld, rules, rule;
var _validHosts = validHosts || [];

if (isValid(_validHosts, host) === false) {
return null;
}

host = cleanHostValue(host);
hostTld = extractTldFromHost(host);
rules = getRulesForTld(allRules, hostTld, new Rule({"firstLevel": hostTld, "isHost": _validHosts.indexOf(hostTld) !== -1}));
rule = getCandidateRule(host, rules);

if (rule === null) {
return null;
}

host.replace(new RegExp(rule.getPattern()), function (m, d) {
domain = d;
});

return domain;
};
11 changes: 11 additions & 0 deletions lib/escape-regexp.js
@@ -0,0 +1,11 @@
/**
* Escapes RegExp specific chars.
*
* @since 1.3.1
* @see https://github.com/oncletom/tld.js/pull/33
* @param {String|Mixed} s
* @returns {string} Escaped string for a safe use in a `new RegExp` expression
*/
module.exports = function escapeRegExp(s) {
return String(s).replace(/([.*+?^=!:${}()|\[\]\/\\])/g, "\\$1");
};
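
A short sketch of why the escaping matters when a suffix is turned into a pattern. The function body is copied from `lib/escape-regexp.js` above; the suffix and test hosts are assumptions for illustration.

```js
// Copied from lib/escape-regexp.js.
function escapeRegExp(s) {
  return String(s).replace(/([.*+?^=!:${}()|\[\]\/\\])/g, '\\$1');
}

const suffix = 'co.uk';

// Unescaped, the dot matches any character, so a bogus host slips through.
console.log(new RegExp(suffix + '$').test('example.coXuk'));               // true

// Escaped, the dot is literal and only the real suffix matches.
console.log(new RegExp(escapeRegExp(suffix) + '$').test('example.coXuk')); // false
console.log(new RegExp(escapeRegExp(suffix) + '$').test('example.co.uk')); // true
```
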
9 changes: 9 additions & 0 deletions lib/from-host.js
@@ -0,0 +1,9 @@
/**
* Utility to extract the TLD from a host string
*
* @param {string} host
* @return {String}
*/
module.exports = function extractTldFromHost(host){
return host.split('.').pop();
};
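
Note that this helper returns only the last label of the host, which is not necessarily the public suffix. A minimal usage sketch, with the function body copied from `lib/from-host.js` above and sample hosts chosen for illustration:

```js
// Copied from lib/from-host.js.
function extractTldFromHost(host) {
  return host.split('.').pop();
}

console.log(extractTldFromHost('google.com'));        // 'com'
console.log(extractTldFromHost('mail.google.co.uk')); // 'uk' (not 'co.uk')
console.log(extractTldFromHost('localhost'));         // 'localhost'
```

In `lib/domain.js` the returned label is only used to select the rules to try; the rules then decide where the domain actually starts.
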
13 changes: 13 additions & 0 deletions lib/is-valid.js
@@ -0,0 +1,13 @@
/**
* Checking if a host string is valid
* It's usually a preliminary check before trying to use getDomain or anything else
*
* Beware: it does not check if the TLD exists.
*
* @api
* @param host {String}
* @return {Boolean}
*/
module.exports = function isValid (validHosts, host) {
return typeof host === 'string' && (validHosts.indexOf(host) !== -1 || (host.indexOf('.') !== -1 && host[0] !== '.'));
};
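
The new `validHosts` parameter changes what counts as valid. The sketch below copies the function from `lib/is-valid.js` above, reformatted across lines, and calls it directly; the sample hosts are assumptions for illustration.

```js
// Copied from lib/is-valid.js (line breaks added for readability).
function isValid(validHosts, host) {
  return typeof host === 'string' &&
    (validHosts.indexOf(host) !== -1 ||
      (host.indexOf('.') !== -1 && host[0] !== '.'));
}

console.log(isValid([], 'google.com'));            // true
console.log(isValid([], '.google.com'));           // false (leading dot)
console.log(isValid([], 'localhost'));             // false (no dot, not whitelisted)
console.log(isValid(['localhost'], 'localhost'));  // true (explicitly declared)
console.log(isValid([], 42));                      // false (not a string)
```
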