A text analyzer implemented in Java for a Data Structures and Algorithms class, using red-black trees, hash maps, AVL trees, sets, and graphs.
To compile and run the project, you need to have Maven installed.
- Install Maven (with dnf):
dnf install maven
- Clone the repository:
git clone https://github.com/julianrosas11032002/text-analyzer.git
cd text-analyzer
- Install the .jar (using Maven):
mvn install
For a successful installation, you must be in the text-analyzer directory (where the pom.xml file is) when you run the command above.
After a successful installation, run the .jar from the text-analyzer directory:
java -jar target/proyecto3.jar [A] -o [B]
Where:
[A] - Files to analyse.
[B] - Name of the directory where the text analysis will be stored; it must follow the -o flag.
NOTE: The -o flag can appear anywhere among the proyecto3.jar arguments, but the directory where the analysis will be stored must be specified; otherwise, the program will terminate.
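The argument rule above can be illustrated with a minimal sketch. It is not the project's actual parser: the class name ArgsSketch, its fields, and the use of IllegalArgumentException to signal a missing directory are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the project's source code.
class ArgsSketch {
    final List<String> inputFiles = new ArrayList<>();
    String outputDir;

    // Splits the command-line arguments into input files ([A]) and the
    // output directory ([B]); -o may appear at any position.
    static ArgsSketch parse(String[] args) {
        ArgsSketch parsed = new ArgsSketch();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-o")) {
                // The directory name is taken from the argument after the flag.
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("-o requires a directory name");
                }
                parsed.outputDir = args[++i];
            } else {
                parsed.inputFiles.add(args[i]);
            }
        }
        // Assumption: a missing -o is reported here as an exception;
        // the real program is only documented as terminating.
        if (parsed.outputDir == null) {
            throw new IllegalArgumentException("missing -o <directory>");
        }
        return parsed;
    }
}
```

With this sketch, the arguments a.txt -o out b.txt would yield the input files a.txt and b.txt and the output directory out.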
After execution, change to the output directory ([B]) and open the index.html file in a browser to view the analysis.
The index.html file contains a graph that connects two analysed files if and only if they have 7 words of at least 7 characters each in common. You can also navigate to each analysed file by clicking its name.
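The edge rule for the graph could be checked roughly as follows. This is a sketch under stated assumptions, not the project's implementation: the class EdgeRuleSketch and its method are hypothetical, each file is assumed to be reduced to a set of distinct words, and "7 words" is read here as "at least 7 distinct shared words".

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not taken from the project's source code.
class EdgeRuleSketch {
    static final int MIN_SHARED_WORDS = 7;
    static final int MIN_WORD_LENGTH = 7;

    // Returns true when the two files should be joined by an edge.
    static boolean connected(Set<String> wordsA, Set<String> wordsB) {
        // Intersect the two word sets without modifying the inputs.
        Set<String> shared = new HashSet<>(wordsA);
        shared.retainAll(wordsB);
        // Keep only shared words that are long enough, then count them.
        long longShared = shared.stream()
                .filter(word -> word.length() >= MIN_WORD_LENGTH)
                .count();
        return longShared >= MIN_SHARED_WORDS;
    }
}
```

Using sets keeps the intersection independent of how often a word repeats in either file, which matches a rule stated in terms of words in common rather than occurrences.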
Each analysed file (given by [A]) has:
- An SVG-generated red-black tree of the 15 most repeated words in the file.
- An SVG-generated AVL tree of the same information.
- An SVG-generated bar chart of the 20 most repeated words and their respective percentages of the total word count.
- An SVG-generated pie chart with the same information.
- The total number of repetitions of each word in the file.
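The statistics behind these charts can be sketched as below. This is an illustrative assumption about how such counts might be computed, not the project's code: the class WordStatsSketch, its methods, the definition of a word as a run of letters, and the case-insensitive comparison are all hypothetical choices.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not taken from the project's source code.
class WordStatsSketch {
    // Counts how many times each word occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Assumption: words are runs of letters, compared case-insensitively.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns up to n entries, most repeated first (e.g. n = 15 or n = 20).
    static List<Map.Entry<String, Integer>> topWords(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken alphabetically for a stable order.
        entries.sort((x, y) -> {
            int byCount = Integer.compare(y.getValue(), x.getValue());
            return byCount != 0 ? byCount : x.getKey().compareTo(y.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size()));
    }

    // Share of one word among all words in the file, as a percentage.
    static double percentage(int count, int totalWords) {
        return 100.0 * count / totalWords;
    }
}
```

In this sketch, topWords(counts, 15) would supply the words for the two trees, and topWords(counts, 20) together with percentage would supply the data for the bar and pie charts.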
Julián Rosas Scull - julian.rosas@ciencias.unam.mx
Project Link: https://github.com/julianrosas11032002/text-analyzer