This repository has been archived by the owner on Nov 27, 2019. It is now read-only.

A utility that scans URL contents and returns a unicode-range value!


malchata/unicode-ranger


unicode-ranger

Get a unicode range from the contents of a URL.

This is a Node module that gathers the content of one or more URLs. It uses request to fetch each URL and cheerio to read the response. It then reduces all of this content down to its unique characters, finds their decimal Unicode values with charCodeAt, sorts the list, and collapses consecutive values into ranges. Finally, it converts that list into a value suitable for the CSS unicode-range descriptor, like so:

U+20,U+26-29,U+2C-32,U+35-36,U+38,U+3A,U+3F,U+41-4A,U+4C-57,U+59,U+61-79,U+A9,U+BB,U+2019
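The reduce, sort, and range-finding steps described above can be sketched roughly as follows. This is not the module's actual source, just a minimal illustration of the same idea on a plain string; the names `toUnicodeRange` and `formatRange` are made up for this example.

```javascript
// Format one run of code values as a unicode-range token.
// A single value becomes "U+41"; a run becomes "U+41-4A".
function formatRange(start, end) {
  const hex = (n) => n.toString(16).toUpperCase();
  return start === end ? `U+${hex(start)}` : `U+${hex(start)}-${hex(end)}`;
}

// Hypothetical helper: turn a string into a unicode-range value.
function toUnicodeRange(text) {
  // Collect the unique values reported by charCodeAt.
  const codes = new Set();
  for (let i = 0; i < text.length; i++) {
    codes.add(text.charCodeAt(i));
  }

  // Sort numerically; the default sort would compare as strings.
  const sorted = [...codes].sort((a, b) => a - b);

  // Walk the sorted list and collapse consecutive values into runs.
  const ranges = [];
  let start = null;
  let prev = null;
  for (const code of sorted) {
    if (start === null) {
      start = prev = code;
    } else if (code === prev + 1) {
      prev = code;
    } else {
      ranges.push(formatRange(start, prev));
      start = prev = code;
    }
  }
  if (start !== null) {
    ranges.push(formatRange(start, prev));
  }

  return ranges.join(",");
}

console.log(toUnicodeRange("abc z")); // U+20,U+61-63,U+7A
```

Note that charCodeAt returns UTF-16 code units, so characters outside the Basic Multilingual Plane would show up as surrogate values in a sketch like this one; codePointAt is the usual alternative when that matters.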

Usage

If you want maximum convenience, use the CLI version, aptly named unicode-ranger-cli. It's much easier than noodling with the module directly. That said, the module isn't too difficult to use either. Just grab it from npm and use it like so:

const unicodeRanger = require("unicode-ranger");
unicodeRanger("https://example.com").then((data) => console.log(data));

Or pass multiple URLs separated by semicolons:

const unicodeRanger = require("unicode-ranger");
unicodeRanger("https://example.com;https://en.wikipedia.org").then((data) => console.log(data));

The CLI version has an option for specifying multiple URLs via a text file.

Options

The second argument for the module is for user options:

excludeElements: CSS selectors for contents that you want excluded from the analysis. This value is fed into cheerio's remove method.
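As a rough illustration of passing options, the sketch below builds an options object and hands it to the module as the second argument. The selector list and the `collectRange` wrapper are hypothetical, and it assumes `excludeElements` accepts a single comma-separated selector string; check the module source if you need a different shape.

```javascript
// Assumption: excludeElements is given as one comma-separated selector string.
// The selectors below are illustrative, not defaults of the module.
const options = {
  excludeElements: ["script", "style", "noscript"].join(",")
};

// Hypothetical wrapper. The module is only loaded when this is called,
// so `npm install unicode-ranger` must have been run first.
function collectRange(url) {
  const unicodeRanger = require("unicode-ranger");
  return unicodeRanger(url, options);
}

// Example call (needs network access):
// collectRange("https://example.com").then((data) => console.log(data));
```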

Contributing/whatever

Do whatever you want with this module and its code. If you do incorporate it somewhere, I'd appreciate a mention. If you have questions about it, bug me on Twitter, or better yet, log an issue. This module is not perfect, so if you have some ideas for how to make it better or want to contribute, just fork the code and submit a PR for me to review.

Special thanks

Thanks to both Ben Briggs and Ray Nicholus for their help with some snags I hit. Check them out on Twitter!
