Java Text File Analyzer
This is a simple command-line application built in Java that analyzes a given text file to provide various statistics, including word count, character count, line count, and a list of the most frequent words. The application is designed to be a portfolio piece showcasing core Java skills in file I/O, data structures, and object-oriented programming.
Features
File Analysis: Counts lines, words, and characters in a specified file.
Word Frequency: Identifies the top 10 most frequent words in the document.
File Type Support: Natively handles plain text (.txt) files. With the inclusion of the Apache POI library, it can also process Microsoft Word (.docx) files.
Command-Line Interface: Provides a simple, interactive interface for users to input a file path and view the results.
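The counting and word-frequency features listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual FileAnalyzer: the class name TextStatsSketch, its method names, the rule that a word is a run of letters, digits, or apostrophes, and the alphabetical tie-break are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and tokenization rules are assumptions, not the project's code.
class TextStatsSketch {

    // Splits one line into lowercase words. A "word" here is a run of
    // letters, digits, or apostrophes (an assumption for this sketch).
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}']+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Total number of words across all lines.
    static long countWords(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            total += tokenize(line).size();
        }
        return total;
    }

    // Total number of characters, excluding line terminators.
    static long countCharacters(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            total += line.length();
        }
        return total;
    }

    // Returns up to `limit` entries, most frequent first; ties are broken alphabetically.
    static List<Map.Entry<String, Integer>> topWords(List<String> lines, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : tokenize(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(limit, entries.size())));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("The cat and the hat.", "The end");
        System.out.println("Lines: " + lines.size());
        System.out.println("Words: " + countWords(lines));
        System.out.println("Characters: " + countCharacters(lines));
        for (Map.Entry<String, Integer> entry : topWords(lines, 10)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

For a .txt file, the list of lines could come from Files.readAllLines, and the line count is then simply the size of that list.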
Prerequisites
To run this application, you need to have the following installed:
Java Development Kit (JDK) 8 or later
For .docx file analysis, you also need the Apache POI library and its dependencies. You can download the necessary JAR files from the official Apache POI website.
How to Run
Save the Files: Ensure you have the three Java files (Main.java, FileAnalyzer.java, and FileAnalyzerResults.java) in the same directory.
Setup for .docx files: If you plan to analyze .docx files, create a folder (e.g., libs) and place all the Apache POI JARs inside it.
Compile the Code: Open your terminal or command prompt, navigate to the project directory, and compile the code.
For .txt files only:
javac *.java
For .txt and .docx files (using Apache POI):
On Windows:
javac -cp "libs\*;." *.java
On macOS or Linux:
javac -cp "libs/*:." *.java
Run the Application: Execute the Main class.
For .txt files only:
java Main
For .txt and .docx files:
On Windows:
java -cp "libs\*;." Main
On macOS or Linux:
java -cp "libs/*:." Main
The application will then prompt you to enter the path to the file you want to analyze.
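The prompt-and-validate step might look something like the sketch below. It is only an illustration of the idea: the class name PromptSketch, the validate helper, and the exact messages are hypothetical and are not taken from the project's Main class.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;

// Hypothetical sketch of the interactive prompt; not the project's actual Main class.
class PromptSketch {

    // Returns null when the path is usable, otherwise a short message describing the problem.
    static String validate(String input) {
        if (input == null || input.trim().isEmpty()) {
            return "No file path entered.";
        }
        Path path;
        try {
            path = Paths.get(input.trim());
        } catch (InvalidPathException e) {
            return "Not a valid path: " + input;
        }
        if (!Files.isRegularFile(path)) {
            return "File not found: " + path;
        }
        if (!Files.isReadable(path)) {
            return "File is not readable: " + path;
        }
        return null;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.print("Enter the path to the file to analyze: ");
        // Treat end-of-input the same as an empty answer.
        String input = scanner.hasNextLine() ? scanner.nextLine() : "";
        String problem = validate(input);
        System.out.println(problem == null ? "Analyzing " + input.trim() : problem);
    }
}
```

Keeping the validation in a separate method that returns a message, rather than printing directly, makes the check easy to exercise without typing into a console.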
Code Structure
Main.java: The entry point of the application. It handles user input, file path validation, and displaying the analysis results.
FileAnalyzer.java: The core logic of the program. This class reads the file content, performs the various counts, and determines the most frequent words. It is designed to handle both .txt and .docx files.
FileAnalyzerResults.java: A simple data container (POJO) to store the results of the analysis, making it easy to pass the data between classes.
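As an illustration of the data-container idea, a results class could be written as an immutable object along these lines. The class name ResultsSketch, its fields, and its toString format are assumptions for this sketch; the real FileAnalyzerResults.java may store and expose its data differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a results container; field names are assumptions.
final class ResultsSketch {
    private final long lineCount;
    private final long wordCount;
    private final long charCount;
    private final List<String> topWords;

    ResultsSketch(long lineCount, long wordCount, long charCount, List<String> topWords) {
        this.lineCount = lineCount;
        this.wordCount = wordCount;
        this.charCount = charCount;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.topWords = Collections.unmodifiableList(new ArrayList<>(topWords));
    }

    long getLineCount() { return lineCount; }

    long getWordCount() { return wordCount; }

    long getCharCount() { return charCount; }

    // The returned list is read-only.
    List<String> getTopWords() { return topWords; }

    @Override
    public String toString() {
        return "Lines: " + lineCount
                + ", Words: " + wordCount
                + ", Characters: " + charCount
                + ", Top words: " + topWords;
    }
}
```

Making the fields final and copying the list means the analyzer can hand the object to Main without either side being able to alter the results afterwards.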