text2token is a Node.js module that breaks down a corpus of text into lines and tokens.
$ npm install text2token
The module has one method, text2token, which takes the path to a text file and returns an object containing a list of each line in the file (lines) and a list of all unique tokens (tokens).
$ node
> var lib = require('text2token');
> var converted = lib.text2token('./src/bigtext.txt');
> converted.tokens
[ '©',
'2015',
'GitHub,',
'Inc.',
'Terms',
'Privacy',
'Security',
..........
> converted.lines
[ '© 2015 GitHub, Inc. Terms Privacy Security Contact Help',
'Status API Training Shop Blog About Pricing',
'The quick brown fox jumped over the lazy dog'
.......
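
The module's internals are not shown here. As a rough sketch only, the following shows one way the two lists could be built, assuming tokens are split on whitespace (consistent with 'GitHub,' keeping its comma in the output above) and assuming blank lines are dropped. The names splitText and sketchText2token are hypothetical and are not part of the module's API.

```javascript
const fs = require('fs');

// Hypothetical helper: split raw text into non-empty lines and unique tokens.
// Assumption: tokens are whitespace-delimited and punctuation is kept attached.
function splitText(text) {
  const lines = text
    .split(/\r?\n/)                 // handle both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // A Set keeps only the first occurrence of each token, in insertion order.
  const seen = new Set();
  for (const line of lines) {
    for (const token of line.split(/\s+/)) {
      seen.add(token);
    }
  }

  return { lines: lines, tokens: Array.from(seen) };
}

// Hypothetical wrapper that reads a file synchronously, mirroring the
// path-based call shown in the session above.
function sketchText2token(filePath) {
  return splitText(fs.readFileSync(filePath, 'utf8'));
}
```

Keeping the file read separate from the splitting step makes the splitting logic easy to try on an in-memory string, for example splitText('a b a\nc b').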
MIT License 2015-2016 © Andy Craze & Contributors