-
Algorithm Theory:
- Algorithm used to compress data without losing information (i.e., lossless compression).
- Begin by counting the frequency of characters in the input data.
- Build a binary tree by repeatedly merging the two lowest-frequency nodes. Each leaf node represents a character, and the path from the root to the leaf corresponds to the binary code of the character, so frequent characters end up with shorter codes.
- Once the tree is completed, traverse the tree and assign a binary code to each character. Traversing left is '0', and right is '1'.
- Encode input data by replacing each character with its corresponding Huffman code.
- In the compressed output, the Huffman tree is stored first, followed by the encoded data, so that the decompressor can rebuild the codes.
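The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the code in SchubsH.java: the class name `HuffmanSketch` and its methods are hypothetical, it uses `java.util.PriorityQueue` in place of the project's MinPQ, and it returns the code as a string of '0'/'1' characters instead of writing the tree and packed bits to a file. When two nodes have the same frequency the queue may pick either one, so the exact codes can differ while the total length stays the same.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch class; not part of the project sources.
final class HuffmanSketch {

    // Leaves carry a character; internal nodes carry two children.
    static final class Node implements Comparable<Node> {
        final char ch;
        final int freq;
        final Node left, right;

        Node(char ch, int freq, Node left, Node right) {
            this.ch = ch;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            return Integer.compare(freq, other.freq);
        }
    }

    // Step 1: count how often each character occurs.
    static Map<Character, Integer> countFrequencies(String input) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (char c : input.toCharArray()) {
            freq.merge(c, 1, Integer::sum);
        }
        return freq;
    }

    // Step 2: merge the two least frequent nodes until one root remains.
    // Returns null for empty input.
    static Node buildTree(Map<Character, Integer> freq) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            pq.add(new Node(e.getKey(), e.getValue(), null, null));
        }
        while (pq.size() > 1) {
            Node a = pq.poll();
            Node b = pq.poll();
            pq.add(new Node('\0', a.freq + b.freq, a, b));
        }
        return pq.poll();
    }

    // Step 3: walk the tree; going left appends '0', going right appends '1'.
    static void assignCodes(Node node, String prefix, Map<Character, String> codes) {
        if (node == null) {
            return;
        }
        if (node.isLeaf()) {
            // Assumption: an input with one distinct character gets the code "0".
            codes.put(node.ch, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assignCodes(node.left, prefix + '0', codes);
        assignCodes(node.right, prefix + '1', codes);
    }

    // Step 4: replace each character with its code.
    static String encode(String input) {
        Map<Character, String> codes = new HashMap<>();
        assignCodes(buildTree(countFrequencies(input)), "", codes);
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            out.append(codes.get(c));
        }
        return out.toString();
    }
}
```

For example, `HuffmanSketch.encode("aaaabbc")` produces 10 bits (a 1-bit code for 'a' and 2-bit codes for 'b' and 'c'), compared with 56 bits for seven 8-bit characters.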
-
Trade-offs:
- Compression process can be computationally intensive for large datasets.
- Requires knowledge of the entire input data before constructing the tree.
-
Algorithm Theory:
- Algorithm used to compress data without losing information (i.e., lossless compression).
- Build a dictionary of strings encountered in the input data and replace recurring strings with shorter codes.
- The dictionary begins with single-character data for all possible characters (i.e., all ASCII characters).
- Scan the input data from left to right, building substrings, and checking if they are already present in the dictionary.
- During the process, encoded output is generated by replacing substrings with their corresponding codes from the dictionary.
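The scan described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code in SchubsL.java: the class name `LzwSketch`, the 256-entry starting alphabet, and the 4096-entry dictionary limit are all assumptions, a `HashMap` stands in for the project's TST, and the codes are returned as a list of integers instead of being written as fixed-width bit fields.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch class; not part of the project sources.
final class LzwSketch {
    static final int ALPHABET = 256;   // assumed: one starting entry per 8-bit character
    static final int MAX_CODES = 4096; // assumed: dictionary capped at 12-bit codes

    static List<Integer> compress(String input) {
        // The dictionary starts with every single-character string.
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < ALPHABET; i++) {
            dict.put(String.valueOf((char) i), i);
        }
        int nextCode = ALPHABET;

        List<Integer> output = new ArrayList<>();
        String current = "";
        for (char c : input.toCharArray()) {
            if (c >= ALPHABET) {
                throw new IllegalArgumentException("character outside assumed alphabet: " + (int) c);
            }
            String extended = current + c;
            if (dict.containsKey(extended)) {
                // Keep growing the substring while it is already known.
                current = extended;
            } else {
                // Emit the code for the longest known substring,
                // then record the new, longer substring if there is room.
                output.add(dict.get(current));
                if (nextCode < MAX_CODES) {
                    dict.put(extended, nextCode++);
                }
                current = String.valueOf(c);
            }
        }
        if (!current.isEmpty()) {
            output.add(dict.get(current));
        }
        return output;
    }
}
```

With these assumptions, `LzwSketch.compress("ABABABA")` yields the codes 65, 66, 256, 258: the two single characters, then the learned entries for "AB" and "ABA".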
-
Trade-off:
- Compression effectiveness depends on the dictionary size. A larger dictionary can capture more repeated patterns but uses more memory.
-
Algorithm Theory:
- Bundles multiple files into a single archive file.
- The archive stores information about each bundled file in a structured format.
- For each bundled file, the archive contains the following information in order:
- A 4-byte integer holding the length of the filename. For example, a 19-character filename is stored as a value ending in the bits 10011 (decimal 19).
- A separator character "11111111".
- The filename, a string.
- Another separator character "11111111".
- A 64-bit number (Java long) holding the length of the file. For example, a file of length 12 is stored as a value ending in the bits 00001100 (decimal 12).
- Another separator character "11111111".
- Contents of the file.
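The entry layout listed above can be sketched in Java as follows. This is an illustration, not the code in SchubsArc.java: the class name `ArchiveEntrySketch` is hypothetical, `java.io.DataOutputStream` is swapped in for the project's BinaryOut, the separator is assumed to be a single byte with all eight bits set (0xFF), and the filename is assumed to be ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; not part of the project sources.
final class ArchiveEntrySketch {
    static final int SEPARATOR = 0xFF; // assumed: the "11111111" separator is one byte

    // Serializes one file as: name length, separator, name, separator,
    // file length, separator, contents.
    static byte[] writeEntry(String filename, byte[] contents) {
        byte[] name = filename.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(name.length);      // 4-byte filename length
            out.writeByte(SEPARATOR);
            out.write(name);                // the filename itself
            out.writeByte(SEPARATOR);
            out.writeLong(contents.length); // 64-bit file length
            out.writeByte(SEPARATOR);
            out.write(contents);            // file contents
        } catch (IOException e) {
            // Not expected for an in-memory buffer; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

For a 19-character filename and 12 bytes of contents, the entry is 46 bytes long: 4 + 1 + 19 + 1 + 8 + 1 + 12. Storing the lengths before the data lets a reader know exactly how many bytes to consume for each field, so the archive can hold several entries back to back.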
- Clone Repository:
git clone <repository_url>
- Navigate into project directory:
cd Project_File_Compression_yxk19a
- Ensure you are in the root project directory:
pwd
- "pwd" should result in something like
/Users/username/Desktop/Project_File_Compression
- Ensure the following Java programs and classes are in the same directory as the main files:
SchubsL.java
SchubsH.java
SchubsArc.java
Deschubs.java
BinaryOut.java
BinaryIn.java
BinaryStdIn.java
BinaryStdOut.java
StdIn.java
StdOut.java
Queue.java
MinPQ.java
TST.java
- Ensure you are in the same level as the "pom.xml" file.
- Run:
mvn compile // or: mvn test
- Ensure you are in the root project directory.
- Ensure all Java programs are compiled:
javac example.java
- Run Programs:
- Huffman Compression:
java SchubsH <filename>
- LZW Compression:
java SchubsL <filename>
- Archive using Tar:
java SchubsArc archive-name <file1name> <file2name> ...
- Decompress files:
java Deschubs <filename.ll|hh|hz>
- Huffman Compression: