This project implements a text file compressor and decompressor using Huffman coding in C.
Clone the Repository
    git clone https://github.com/sourinPal2003/textFile-Compressor-using-c.git
    cd textFile-Compressor-using-c
Compile the Programs
    gcc compressor.c -o compressor.exe
    gcc decompressor.c -o decompress.exe
Compress a File
- Place your input file as inp.txt in the project directory.
- Run the compressor:
    ./compressor.exe
- This will generate a compressed file: inp_compressed.bin.
Decompress the File
- Run the decompressor:
    ./decompress.exe
- This will generate inp_decompressed.txt, which should be identical to your original inp.txt.
Algorithm: Huffman coding for lossless data compression.
Data Structures:
- Min-Heap (Priority Queue) for building the Huffman tree.
- Binary Tree for representing the Huffman tree.
- Arrays for storing frequencies, codes, and file data.
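As a rough illustration of how these structures fit together, here is a minimal sketch that builds a Huffman tree from a 256-entry frequency array using an array-backed min-heap. The names (`Node`, `MinHeap`, `build_huffman_tree`, `code_length`) are made up for this example and are not taken from `compressor.c`; the project's actual code may be organized differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_SYMBOLS 256  /* one possible leaf per byte value */

/* Hypothetical tree node: leaves carry a symbol, internal nodes only a weight. */
typedef struct Node {
    unsigned char symbol;
    unsigned long freq;
    struct Node *left, *right;
} Node;

/* Hypothetical min-heap ordered by freq, stored in a fixed array. */
typedef struct {
    Node *items[MAX_SYMBOLS];
    int size;
} MinHeap;

static Node *new_node(unsigned char symbol, unsigned long freq, Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    if (!n) { perror("malloc"); exit(EXIT_FAILURE); }
    n->symbol = symbol; n->freq = freq; n->left = l; n->right = r;
    return n;
}

static void heap_push(MinHeap *h, Node *n) {
    assert(h->size < MAX_SYMBOLS);
    int i = h->size++;
    /* Sift up: move larger parents down until n fits. */
    while (i > 0 && h->items[(i - 1) / 2]->freq > n->freq) {
        h->items[i] = h->items[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->items[i] = n;
}

static Node *heap_pop(MinHeap *h) {
    assert(h->size > 0);
    Node *top = h->items[0];
    Node *last = h->items[--h->size];
    int i = 0;
    /* Sift down: pull the smaller child up until last fits. */
    for (;;) {
        int child = 2 * i + 1;
        if (child >= h->size) break;
        if (child + 1 < h->size && h->items[child + 1]->freq < h->items[child]->freq)
            child++;
        if (last->freq <= h->items[child]->freq) break;
        h->items[i] = h->items[child];
        i = child;
    }
    h->items[i] = last;
    return top;
}

/* Repeatedly merges the two lightest nodes. Returns NULL for empty input. */
Node *build_huffman_tree(const unsigned long freq[MAX_SYMBOLS]) {
    MinHeap heap = { .size = 0 };
    for (int c = 0; c < MAX_SYMBOLS; c++)
        if (freq[c] > 0)
            heap_push(&heap, new_node((unsigned char)c, freq[c], NULL, NULL));
    if (heap.size == 0) return NULL;
    while (heap.size > 1) {
        Node *a = heap_pop(&heap);
        Node *b = heap_pop(&heap);
        heap_push(&heap, new_node(0, a->freq + b->freq, a, b));
    }
    return heap_pop(&heap);
}

/* Depth of the leaf holding symbol (its code length), or -1 if absent. */
int code_length(const Node *n, unsigned char symbol, int depth) {
    if (!n) return -1;
    if (!n->left && !n->right) return n->symbol == symbol ? depth : -1;
    int d = code_length(n->left, symbol, depth + 1);
    return d >= 0 ? d : code_length(n->right, symbol, depth + 1);
}
```

With frequencies a=5, b=2, c=1, the two rarest symbols are merged first, so `a` ends up with a 1-bit code and `b` and `c` with 2-bit codes. Note that a file containing a single distinct byte yields a tree that is just one leaf (code length 0), which a real compressor has to handle as a special case. The sketch does not free the tree.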
Dynamic ArrayList for Large Files: Initially, fixed-size arrays were used, which limited the size of files that could be processed. To handle large text files, a dynamic array list was implemented so that the array grows automatically as more data is read.
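The growable array idea can be sketched roughly as follows. This is an assumption about the general technique (doubling the capacity with `realloc`), not a copy of the project's code; `ByteList` and its functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer. */
typedef struct {
    unsigned char *data;
    size_t size;      /* bytes in use */
    size_t capacity;  /* bytes allocated */
} ByteList;

void bytelist_init(ByteList *list) {
    assert(list != NULL);
    list->data = NULL;
    list->size = 0;
    list->capacity = 0;
}

/* Appends one byte, doubling the allocation when full.
   Returns 0 on success, -1 if memory runs out. */
int bytelist_push(ByteList *list, unsigned char byte) {
    assert(list != NULL);
    if (list->size == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        unsigned char *p = realloc(list->data, new_cap);
        if (!p) return -1;  /* the old buffer is still valid here */
        list->data = p;
        list->capacity = new_cap;
    }
    list->data[list->size++] = byte;
    return 0;
}

void bytelist_free(ByteList *list) {
    assert(list != NULL);
    free(list->data);
    bytelist_init(list);
}
```

Doubling keeps the number of reallocations logarithmic in the file size, so appending stays cheap on average even for large inputs. The starting capacity of 16 is an arbitrary choice for the example.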
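Binary File Output: At first, the compressed output was written to an ordinary .txt file. A binary file (.bin) turned out to be necessary for correct compression and decompression: the packed Huffman bits form arbitrary byte values, and a file handled as text cannot be relied on to store them unchanged (on Windows, for example, text-mode streams translate newline bytes).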
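To make the binary-output point concrete, here is a small sketch that packs a string of '0'/'1' characters into bytes and writes them with `fopen` in "wb" mode. The function names and the choice of most-significant-bit-first packing are assumptions for illustration; the project may lay out its bits differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Packs a string of '0'/'1' characters into bytes, most significant bit
   first. The last byte is padded with zero bits. Returns the byte count. */
size_t pack_bits(const char *bits, unsigned char *out, size_t out_cap) {
    size_t nbits = strlen(bits);
    size_t nbytes = (nbits + 7) / 8;
    assert(nbytes <= out_cap);
    memset(out, 0, nbytes);
    for (size_t i = 0; i < nbits; i++)
        if (bits[i] == '1')
            out[i / 8] |= (unsigned char)(0x80u >> (i % 8));
    return nbytes;
}

/* Writes raw bytes in binary mode so no byte is translated on the way out.
   Returns 0 on success, -1 on failure. */
int write_binary(const char *path, const unsigned char *buf, size_t n) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(buf, 1, n, f);
    if (fclose(f) != 0 || written != n) return -1;
    return 0;
}
```

For example, the bit string "0100000101" packs into the two bytes 0x41 0x40. A bit string such as "00001010" packs into the single byte 0x0A, which is exactly the kind of value a text-mode stream on Windows would rewrite. Because the final byte is padded, a decoder also needs some way to know how many bits are valid; how this project records that is not described here.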
Note: This project is for educational purposes and demonstrates the basic principles of Huffman coding.