teragrep/blf_01

BLF_01

Tokenizer that splits extremely large inputs into major and minor tokens using preset delimiters (splitters).

Features

  • Fast tokenization of large inputs

  • Tokenization splits input into major and minor tokens

  • Permutations are generated from major tokens

  • Configurable delimiters for major and minor tokens (character or pattern)
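
To make the major/minor distinction concrete, here is a minimal sketch of two-level splitting in plain Java. It does not use or reproduce the BLF_01 API: the class name `TwoLevelSplitSketch`, the method names, and both delimiter sets are hypothetical, chosen only for illustration. The sketch assumes ASCII input and discards the delimiters, which may differ from what the library does, and it does not attempt permutation generation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TwoLevelSplitSketch {

    // Hypothetical delimiter sets, not taken from BLF_01.
    public static final String MAJOR_DELIMITERS = " \t\n";
    public static final String MINOR_DELIMITERS = ".:=-";

    // Reads the stream byte by byte and cuts a token at every delimiter.
    // Assumes ASCII input; delimiters themselves are dropped.
    public static List<String> split(InputStream in, String delimiters) throws IOException {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (delimiters.indexOf(b) >= 0) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append((char) b);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Splits one major token further, using the minor delimiters.
    public static List<String> minorTokens(String major) throws IOException {
        byte[] bytes = major.getBytes(StandardCharsets.US_ASCII);
        return split(new ByteArrayInputStream(bytes), MINOR_DELIMITERS);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "user=alice host=10.0.0.1".getBytes(StandardCharsets.US_ASCII);
        InputStream input = new ByteArrayInputStream(bytes);
        for (String major : split(input, MAJOR_DELIMITERS)) {
            System.out.println(major + " -> " + minorTokens(major));
        }
        // prints:
        // user=alice -> [user, alice]
        // host=10.0.0.1 -> [host, 10, 0, 0, 1]
    }
}
```

Reading from the stream one byte at a time keeps memory use independent of input size, which is the property that matters for very large inputs.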

Documentation

See the official documentation on docs.teragrep.com.

Limitations

Requires Java 1.8; other versions might not work correctly.

Expects InputStream as input for tokenization.
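
Because the input must be an InputStream, data held in a String or a file has to be wrapped first. The following sketch shows two ways to do that with the standard library only, using calls available in Java 1.8. The class name `InputStreamSources` and its method names are hypothetical, and the step of passing the stream to the tokenizer is left out because this README does not show that API.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class InputStreamSources {

    // Wraps an in-memory String; handy for tests and small inputs.
    public static InputStream fromString(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Streams a file from disk without loading it fully into memory.
    public static InputStream fromFile(Path path) throws IOException {
        return new BufferedInputStream(Files.newInputStream(path));
    }

    public static void main(String[] args) throws IOException {
        // Temporary file created only for this demonstration.
        Path tmp = Files.createTempFile("blf-sketch", ".txt");
        Files.write(tmp, "hello world".getBytes(StandardCharsets.UTF_8));

        try (InputStream fromMemory = fromString("hello world");
             InputStream fromDisk = fromFile(tmp)) {
            // Both streams yield the same first byte, 'h'.
            System.out.println((char) fromMemory.read());
            System.out.println((char) fromDisk.read());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

For extremely large inputs the file-backed variant is usually the better fit, since the in-memory variant needs the whole input as a byte array.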

How to use

See tests for how to implement.

Import the Tokenizer class.

Contributing

You can involve yourself with our project by opening an issue or submitting a pull request.

Contribution requirements:

  1. All changes must be accompanied by a new or changed test. If you think testing is not required in your pull request, include a sufficient explanation as to why you think so.

  2. Security checks must pass.

  3. Pull requests must align with the principles and values of extreme programming.

  4. Pull requests must follow the principles of Object Thinking and Elegant Objects (EO).

Read more in our Contributing Guideline.

Contributor License Agreement

Contributors must sign the Teragrep Contributor License Agreement before a pull request is accepted to the organization’s repositories.

You need to submit the CLA only once. After submitting the CLA, you can contribute to all of Teragrep’s repositories.