This C code implements a simple Huffman compression algorithm. Huffman coding is a widely used method for lossless data compression. The code creates a Huffman tree based on character frequencies in a given input file, generates binary codes for each character, and writes the compressed data to an output file. Below is a detailed explanation of the different sections and functions in the code.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_TREE_NODES 256
#define MAX_CODE_LENGTH 256
- The code includes standard libraries for input/output (`stdio.h`), memory management (`stdlib.h`), and string manipulation (`string.h`).
- It defines constants for the maximum number of tree nodes (256, the number of distinct values an 8-bit character can take) and the maximum length of a code (256 bits).
typedef struct Node {
    char character;
    int frequency;
    struct Node *left, *right;
} Node;
- `Node` represents each node in the Huffman tree. Each node contains:
  - `character`: the character it represents.
  - `frequency`: how often that character appears in the input data.
  - `left` and `right`: pointers to the left and right children in the tree.
typedef struct PriorityQueue {
    Node *nodes[MAX_TREE_NODES];
    int size;
} PriorityQueue;
- `PriorityQueue` is a simple array-based priority queue that orders the nodes by frequency. It holds an array of `Node` pointers and the current size of the queue.
- Creating Nodes
Node* createNode(char character, int frequency) {
    Node *newNode = (Node *)malloc(sizeof(Node));
    newNode->character = character;
    newNode->frequency = frequency;
    newNode->left = newNode->right = NULL;
    return newNode;
}
- Allocates and initializes a new node with a given character and frequency.
- Inserting Nodes into Priority Queue
void insert(PriorityQueue *pq, Node *node) {
    pq->nodes[pq->size++] = node;
    // Sift up to maintain heap property
    ...
}
- Inserts a node into the priority queue and maintains the heap property (min-heap) by sifting up.
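The sift-up step is elided in the listing, so the following is only a minimal sketch of how it might look. It assumes a binary min-heap stored in the array, with the parent of index `i` at `(i - 1) / 2`. The name `insertSketch` and the capacity check are illustrative and do not come from the original code.

```c
#include <assert.h>

#define MAX_TREE_NODES 256

typedef struct Node {
    char character;
    int frequency;
    struct Node *left, *right;
} Node;

typedef struct PriorityQueue {
    Node *nodes[MAX_TREE_NODES];
    int size;
} PriorityQueue;

/* Hypothetical helper: append the node, then sift it up the min-heap. */
void insertSketch(PriorityQueue *pq, Node *node) {
    /* Assumption: the caller never inserts into a full queue. */
    assert(pq->size < MAX_TREE_NODES);

    int i = pq->size++;
    pq->nodes[i] = node;

    /* Swap with the parent while the parent has a larger frequency. */
    while (i > 0) {
        int parent = (i - 1) / 2;
        if (pq->nodes[parent]->frequency <= pq->nodes[i]->frequency) {
            break;
        }
        Node *tmp = pq->nodes[parent];
        pq->nodes[parent] = pq->nodes[i];
        pq->nodes[i] = tmp;
        i = parent;
    }
}
```

After any sequence of such inserts, `pq->nodes[0]` holds a node with the smallest frequency, which is what the tree-building step relies on.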
- Removing the Minimum Node
Node* removeMin(PriorityQueue *pq) {
    ...
}
- Removes and returns the node with the smallest frequency from the priority queue while maintaining the heap property by sifting down.
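The body of `removeMin` is not shown, so the sketch below is one plausible way to implement the sift-down it describes, not the author's code. It assumes the same array-backed min-heap layout (children of index `i` at `2*i + 1` and `2*i + 2`). The name `removeMinSketch` and the non-empty precondition are assumptions.

```c
#include <assert.h>

#define MAX_TREE_NODES 256

typedef struct Node {
    char character;
    int frequency;
    struct Node *left, *right;
} Node;

typedef struct PriorityQueue {
    Node *nodes[MAX_TREE_NODES];
    int size;
} PriorityQueue;

/* Hypothetical helper: pop the root, move the last element to the top,
 * then sift it down until both children are no smaller. */
Node *removeMinSketch(PriorityQueue *pq) {
    /* Assumption: the caller checks that the queue is not empty. */
    assert(pq->size > 0);

    Node *min = pq->nodes[0];
    pq->nodes[0] = pq->nodes[--pq->size];

    int i = 0;
    for (;;) {
        int left = 2 * i + 1;
        int right = 2 * i + 2;
        int smallest = i;

        if (left < pq->size &&
            pq->nodes[left]->frequency < pq->nodes[smallest]->frequency) {
            smallest = left;
        }
        if (right < pq->size &&
            pq->nodes[right]->frequency < pq->nodes[smallest]->frequency) {
            smallest = right;
        }
        if (smallest == i) {
            break; /* heap property restored */
        }

        Node *tmp = pq->nodes[i];
        pq->nodes[i] = pq->nodes[smallest];
        pq->nodes[smallest] = tmp;
        i = smallest;
    }
    return min;
}
```

Calling this repeatedly on a valid heap yields the nodes in non-decreasing order of frequency.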
- Building the Huffman Tree
void buildHuffmanTree(PriorityQueue *pq) {
    while (pq->size > 1) {
        Node *left = removeMin(pq);
        Node *right = removeMin(pq);
        Node *combined = createNode('\0', left->frequency + right->frequency);
        ...
    }
}
- Repeatedly combines the two nodes with the smallest frequencies into a new internal node until only one node remains, which becomes the root of the Huffman tree.
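To make the merge loop concrete, here is a self-contained sketch that can be compiled on its own. The lines elided in the listing are assumed to link the two removed nodes as children of `combined` and push `combined` back into the queue. The bodies of `insert` and `removeMin` below, and the helper `swapNodes`, are this sketch's own stand-ins rather than the original implementations.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TREE_NODES 256

typedef struct Node {
    char character;
    int frequency;
    struct Node *left, *right;
} Node;

typedef struct PriorityQueue {
    Node *nodes[MAX_TREE_NODES];
    int size;
} PriorityQueue;

Node *createNode(char character, int frequency) {
    Node *newNode = (Node *)malloc(sizeof(Node));
    assert(newNode != NULL); /* sketch only: real code should handle this failure */
    newNode->character = character;
    newNode->frequency = frequency;
    newNode->left = newNode->right = NULL;
    return newNode;
}

/* Hypothetical helper: swap two slots of the heap array. */
void swapNodes(PriorityQueue *pq, int a, int b) {
    Node *tmp = pq->nodes[a];
    pq->nodes[a] = pq->nodes[b];
    pq->nodes[b] = tmp;
}

void insert(PriorityQueue *pq, Node *node) {
    assert(pq->size < MAX_TREE_NODES);
    int i = pq->size++;
    pq->nodes[i] = node;
    /* Sift up while the parent has a larger frequency. */
    while (i > 0 && pq->nodes[(i - 1) / 2]->frequency > pq->nodes[i]->frequency) {
        swapNodes(pq, i, (i - 1) / 2);
        i = (i - 1) / 2;
    }
}

Node *removeMin(PriorityQueue *pq) {
    assert(pq->size > 0);
    Node *min = pq->nodes[0];
    pq->nodes[0] = pq->nodes[--pq->size];
    /* Sift down until neither child is smaller. */
    for (int i = 0;;) {
        int smallest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < pq->size && pq->nodes[l]->frequency < pq->nodes[smallest]->frequency) smallest = l;
        if (r < pq->size && pq->nodes[r]->frequency < pq->nodes[smallest]->frequency) smallest = r;
        if (smallest == i) break;
        swapNodes(pq, i, smallest);
        i = smallest;
    }
    return min;
}

void buildHuffmanTree(PriorityQueue *pq) {
    while (pq->size > 1) {
        Node *left = removeMin(pq);
        Node *right = removeMin(pq);
        Node *combined = createNode('\0', left->frequency + right->frequency);
        combined->left = left;   /* assumed: the elided lines link the children */
        combined->right = right;
        insert(pq, combined);    /* assumed: the merged node re-enters the queue */
    }
    /* If the queue was non-empty, pq->nodes[0] is now the root. */
}
```

With frequencies 5, 9, 12, 13, 16 and 45, the loop merges 5+9, 12+13, 14+16, 25+30 and finally 45+55, so the root carries the total count of 100 and the most frequent symbol sits directly under it.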
- Generating Huffman Codes
void generateCodes(Node *root, char *code, int depth, char codes[MAX_TREE_NODES][MAX_CODE_LENGTH]) {
    ...
}
- Recursively traverses the Huffman tree to generate binary codes for each character, storing them in the `codes` array.
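Since the traversal itself is elided, the sketch below shows one common way to do it. It assumes that codes are stored as NUL-terminated strings of `'0'` and `'1'`, that a left edge appends `'0'` and a right edge appends `'1'`, and that the table is indexed by the character cast to `unsigned char`. None of these conventions is confirmed by the original listing, and `generateCodesSketch` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TREE_NODES 256
#define MAX_CODE_LENGTH 256

typedef struct Node {
    char character;
    int frequency;
    struct Node *left, *right;
} Node;

/* Hypothetical helper: depth-first walk that records the path to each leaf. */
void generateCodesSketch(const Node *root, char *code, int depth,
                         char codes[MAX_TREE_NODES][MAX_CODE_LENGTH]) {
    if (root == NULL) {
        return;
    }

    if (root->left == NULL && root->right == NULL) {
        /* Leaf: a tree with a single symbol still needs a one-bit code. */
        if (depth == 0) {
            code[depth++] = '0';
        }
        code[depth] = '\0';
        /* Cast avoids a negative index when char is signed. */
        strcpy(codes[(unsigned char)root->character], code);
        return;
    }

    /* Leave room for one more bit plus the terminator. */
    assert(depth < MAX_CODE_LENGTH - 1);

    code[depth] = '0'; /* assumed convention: left edge is 0 */
    generateCodesSketch(root->left, code, depth + 1, codes);
    code[depth] = '1'; /* assumed convention: right edge is 1 */
    generateCodesSketch(root->right, code, depth + 1, codes);
}
```

The caller would pass a scratch buffer of `MAX_CODE_LENGTH` bytes as `code` and a starting `depth` of 0. Because every symbol sits at a leaf, no generated code is a prefix of another, which is what lets a decoder read the bit stream unambiguously.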
- Compressing the File
void compressFile(const char *inputFile, const char *outputFile) {
    ...
}
- Reads the input file, calculates character frequencies, builds the Huffman tree, generates codes, and writes the compressed binary data to the output file. It handles bit-level operations to pack bits into bytes.
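The bit-packing step is the least obvious part of `compressFile`, and its code is not shown. The sketch below illustrates the general idea on an in-memory buffer instead of a file. It assumes that codes arrive as strings of `'0'` and `'1'`, that bits are packed most significant bit first, and that the final byte is padded with zero bits. `packBits` is a hypothetical helper, not a function from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack a string of '0'/'1' characters into bytes,
 * most significant bit first. Returns the number of bytes written. */
size_t packBits(const char *bits, unsigned char *out, size_t outCapacity) {
    unsigned char current = 0; /* byte being assembled */
    int filled = 0;            /* number of bits already in current */
    size_t written = 0;

    for (const char *p = bits; *p != '\0'; p++) {
        assert(*p == '0' || *p == '1');
        current = (unsigned char)((current << 1) | (*p == '1'));
        if (++filled == 8) {
            assert(written < outCapacity);
            out[written++] = current;
            current = 0;
            filled = 0;
        }
    }

    /* Flush a partial last byte, padding the low bits with zeros. */
    if (filled > 0) {
        assert(written < outCapacity);
        out[written++] = (unsigned char)(current << (8 - filled));
    }
    return written;
}
```

In a file-based version, each completed byte could be written with `fputc` instead of being stored in `out`. A decoder would also need to know how many bits of the last byte are valid (or how many symbols were encoded) in order to ignore the padding; the listing does not say how the original code handles this.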
int main() {
    compressFile("input.txt", "output.bin");
    return 0;
}
- The `main` function calls `compressFile`, specifying `input.txt` as the input file and `output.bin` as the output file where the compressed data will be stored.
- The code efficiently compresses a text file using Huffman coding by creating a tree based on character frequencies, generating unique binary codes for each character, and writing the compressed data as a binary file.