Kawaboongawa/Spellchecker
################################################################################
################################################################################
  _____            _ _  _____ _               _
 / ____|          | | |/ ____| |             | |
| (___  _ __   ___| | | |    | |__   ___  ___| | _____ _ __
 \___ \| '_ \ / _ \ | | |    | '_ \ / _ \/ __| |/ / _ \ '__|
 ____) | |_) |  __/ | | |____| | | |  __/ (__|   <  __/ |
|_____/| .__/ \___|_|_|\_____|_| |_|\___|\___|_|\_\___|_|
       | |
       |_|
################################################################################
################################################################################

VERSION :
    Last update done the 31/08/2017
    Spellchecker v1.0

AUTHORS :
    LUGAND Jérémy   lugand_j@epita.fr
    CETRE Cyril     cyril.cetre@epita.fr

PREVIEW :
    This is a C language spell checker. It first builds a dictionary written
    to disk, and then uses it to return every word that matches a request
    within a given Damerau-Levenshtein distance.
    This project was written as an EPITA school project.

REQUIREMENT :
    This project is available on both macOS and Linux.
    The only requirement is a version of gcc (gcc-7 on macOS).

BUILD :
    To build the project, run the make command in the root directory of the
    project :

        42sh$ make

    This should generate two different binaries : TextMiningCompiler and
    TextMiningApp.
    To ensure that everything is working properly, you can run our test case,
    which compares the output against the reference given by the teacher :

        42sh$ make test

    To clean the binaries and trash files generated by the project, simply
    type :

        42sh$ make clean

THE COMPILER :
    usage :

        42sh$ ./ref/osx/TextMiningCompiler /path/to/word/freq.txt /path/to/output/dict.bin

    The binary takes a text file as first argument and generates a dictionary
    named after the second argument.
    The text file must follow a strict syntax : each line holds a word,
    followed by at least one space, followed by its frequency and a linefeed.
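The line syntax described above can be checked with a few lines of C. This is only a minimal sketch and not the project's actual parser: the function name `parse_entry` and its signature are hypothetical, and it assumes that the separator is made of plain spaces only and that the frequency is a non-negative decimal integer.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from the project): splits one
 * "word<spaces>freq\n" line. Returns 1 on success, 0 if the line does
 * not match the expected syntax. */
int parse_entry(const char *line, char *word, size_t word_cap,
                unsigned long *freq)
{
    /* The word is everything up to the first blank or linefeed. */
    size_t len = strcspn(line, " \t\n");
    if (len == 0 || len >= word_cap || line[len] != ' ')
        return 0;
    memcpy(word, line, len);
    word[len] = '\0';

    /* Skip the separator: one space is guaranteed, more are allowed. */
    const char *p = line + len;
    while (*p == ' ')
        p++;

    /* The frequency must start with a decimal digit. */
    if (!isdigit((unsigned char)*p))
        return 0;
    char *end;
    *freq = strtoul(p, &end, 10);

    /* Accept an optional trailing linefeed and nothing else. */
    return *end == '\0' || (*end == '\n' && end[1] == '\0');
}
```

Calling `parse_entry("this 705\n", word, sizeof word, &freq)` stores `this` in `word` and `705` in `freq`. Overflow of the frequency is not handled in this sketch.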
    This is an example of input :

        42sh$ cat -e example_word.txt
        this 705$
        was 695$
        a 2014$
        cool 758$
        project 810$
        to 69619$
        do 5349$

THE REQUEST APPLICATION :
    usage :

        42sh$ echo "approx 0 example" | ./TextMiningApp /path/to/compiled/dict.bin
        [{"word":"example","freq":984528,"distance":0}]
        42sh$ echo "approx 0 anotherone" | ./TextMiningApp /path/to/compiled/dict.bin
        [{"word":"anotherone","freq":933,"distance":0}]
        42sh$ ./TextMiningApp /path/to/compiled/dict.bin < /path/to/file_with_request.txt
        ...

    This binary takes the dictionary produced by the compiler as first
    argument and reads requests from stdin. Each request must follow the
    format shown above : the keyword approx, the maximal distance, then the
    word. A distance of 0 means that only the exact word is looked up.
    Note that the greater the distance, the longer the request takes to
    process.
    The result is printed in JSON format.
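To make the distance used in requests more concrete, here is a minimal sketch of a Damerau-Levenshtein computation between two words. It is an assumption-laden illustration, not the project's code: it implements the restricted variant (optimal string alignment) with a full dynamic-programming table, works on bytes rather than on multi-byte characters, and the name `osa_distance` is hypothetical. The project itself may use another variant and may compute the distance while walking its dictionary instead of comparing word pairs.

```c
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Restricted Damerau-Levenshtein (optimal string alignment) distance.
 * Hypothetical helper; returns (size_t)-1 if the allocation fails. */
size_t osa_distance(const char *a, const char *b)
{
    size_t n = strlen(a), m = strlen(b);
    size_t cols = m + 1;
    size_t *d = malloc((n + 1) * cols * sizeof *d);
    if (!d)
        return (size_t)-1;

    for (size_t i = 0; i <= n; i++)
        d[i * cols] = i;                /* delete the first i characters */
    for (size_t j = 0; j <= m; j++)
        d[j] = j;                       /* insert the first j characters */

    for (size_t i = 1; i <= n; i++) {
        for (size_t j = 1; j <= m; j++) {
            size_t cost = a[i - 1] == b[j - 1] ? 0 : 1;
            size_t best = min3(d[(i - 1) * cols + j] + 1,         /* deletion */
                               d[i * cols + (j - 1)] + 1,         /* insertion */
                               d[(i - 1) * cols + (j - 1)] + cost); /* substitution */
            /* Swap of two adjacent characters counts as one edit. */
            if (i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                size_t t = d[(i - 2) * cols + (j - 2)] + 1;
                if (t < best)
                    best = t;
            }
            d[i * cols + j] = best;
        }
    }

    size_t result = d[n * cols + m];
    free(d);
    return result;
}
```

With this sketch, `osa_distance("exmaple", "example")` is 1 because a single swap of adjacent letters is enough, so a request with a maximal distance of 1 would accept that pair.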
About
C language spell checker using a patricia trie and Damerau-Levenshtein distance.