The library containing the functionality of vextractor-cli

SaadiSave/vextractor

vextractor

vextractor is a simple library for extracting the vocabulary of a text file.

About

vextractor works with any language written in any script supported by Unicode, as long as the language separates words with the Unicode space character ' ' (U+0020).
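
To make the space-separation requirement concrete, here is a minimal sketch of vocabulary extraction that treats only U+0020 as a word boundary. It is not vextractor's implementation: `naive_vocab` is a hypothetical helper written for illustration, and its punctuation trimming and lowercasing are assumptions, not documented behaviour of the library.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper, NOT part of vextractor: collects the distinct
/// words of `text`, treating only U+0020 as a word separator.
fn naive_vocab(text: &str) -> BTreeSet<String> {
    text.split(' ')
        // Assumption: strip ASCII punctuation and stray whitespace from word edges.
        .map(|w| w.trim_matches(|c: char| c.is_ascii_punctuation() || c.is_whitespace()))
        .filter(|w| !w.is_empty())
        // Assumption: fold case so "Cat" and "cat" count as one word.
        .map(|w| w.to_lowercase())
        .collect()
}

fn main() {
    // Works for any script, provided words are separated by U+0020.
    let vocab = naive_vocab("the cat saw the Cat. Собака видит кошку");
    for word in &vocab {
        println!("{word}");
    }

    // A text without U+0020 between words comes back as a single "word".
    let unsplit = naive_vocab("日本語のテキスト");
    println!("{} entry", unsplit.len());
}
```

The second call shows why the restriction matters: a script that does not put spaces between words cannot be split by this approach at all.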

Quick Example

use vextractor::vex::Vextract;

fn main() {
    let x = Vextract::new(
        "somepath/somefile.txt", // file containing the text to be processed
        vec!["EU", "etc.", "i.e.", "e.g."], // acronyms and abbreviations
        vec!["Germany", "France", "Belgium", "Italy"], // proper nouns
    );
    println!("{}", x.get_pretty_vocab()); // prints the vocabulary
    println!("{}", x.get_sorted_pretty_vocab()); // sorts, then prints
    // Writes the vocabulary to a text file; a separate output path avoids overwriting the input.
    x.write_to_file("somepath/vocab.txt");
}

Licence

vextractor is licensed under the GNU Affero General Public License, version 3. Please read the LICENSE.md file for more information.
