## Implementation of the LZ algorithm

**I have implemented the LZ algorithm in Python. My program consists of three modules:**

- **Compression**
- **Decompression**
- **Comparison**

***The Orchestrator runs the three modules in sequence, one after another.***

## 1. Compressor

**This program compresses the input text using the LZ78 algorithm.**

#### 1. Imports

**I have imported the `os` library for file operations.**

In [None]:
import os

#### 2. Define functions

**This function reads a text file and returns its content, with leading and trailing whitespace stripped.**

In [None]:
def read_text_file(file_path):
    with open(file_path, 'r') as file:
        return file.read().strip()

**This function writes binary data to a file.**

In [None]:
def write_binary_file(file_path, binary_data):
    with open(file_path, 'wb') as file:
        file.write(binary_data)

**This function compresses the input text using the LZ78 algorithm.**

**It splits the text into phrases and encodes each phrase as a pair: the dictionary index of its longest known prefix, followed by the next character.**

In [None]:
def lempel_ziv_compression(input_text):
    phrase_dictionary = {}  # maps each known phrase to its index
    next_index = 1          # index 0 is reserved for the empty prefix
    current_phrase = ""
    compressed_text = []

    for char in input_text:
        new_phrase = current_phrase + char
        if new_phrase not in phrase_dictionary:
            # New phrase: emit (index of its known prefix, new character)
            if current_phrase:
                compressed_text.append((phrase_dictionary[current_phrase],
                                        char))
            else:
                compressed_text.append((0, char))
            phrase_dictionary[new_phrase] = next_index
            next_index += 1
            current_phrase = ""
        else:
            # Known phrase: keep extending it with the next character
            current_phrase = new_phrase

    # Input ended inside a known phrase: emit its index with no character
    if current_phrase:
        compressed_text.append((phrase_dictionary[current_phrase], ''))

    return compressed_text
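
**To make the parsing concrete, here is a minimal sketch that traces the same dictionary-building loop on a short string. The helper name `trace_lz78` and the sample input `"ABABABA"` are assumptions made for illustration; they are not part of the program above.**

```python
def trace_lz78(text):
    """Return (phrase, dictionary index, emitted pair) for each parsing step."""
    dictionary = {}  # phrase -> index, starting at 1 (0 means "empty prefix")
    current = ""
    rows = []
    for char in text:
        candidate = current + char
        if candidate in dictionary:
            current = candidate  # keep extending the known phrase
            continue
        prefix_index = dictionary.get(current, 0)
        dictionary[candidate] = len(dictionary) + 1
        rows.append((candidate, dictionary[candidate], (prefix_index, char)))
        current = ""
    if current:
        # Input ended inside a known phrase: emit its index with no character.
        rows.append((current, dictionary[current], (dictionary[current], "")))
    return rows


for phrase, index, pair in trace_lz78("ABABABA"):
    print(f"phrase {index}: {phrase!r} -> {pair}")
```

**For `"ABABABA"` the emitted pairs are `(0, 'A')`, `(0, 'B')`, `(1, 'B')` and `(3, 'A')`: the third phrase `AB` reuses phrase 1 (`A`), and the fourth phrase `ABA` reuses phrase 3 (`AB`).**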

**This function converts the compressed pairs into a string of binary digits.**

- 8 bits for the dictionary index
- 8 bits for the character that follows (its ASCII code)

In [None]:
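def compressed_phrase_to_binary(compressed_data):
    # Each pair becomes an 8-bit index followed by an 8-bit character code.
    # A final pair with an empty character contributes the index only.
    binary_chunks = []
    for index, char in compressed_data:
        # Values above 255 would overflow the fixed 8-bit fields
        if index > 255 or (char and ord(char) > 255):
            raise ValueError("index or character does not fit in 8 bits")
        binary_chunks.append(f"{index:08b}")
        if char:
            binary_chunks.append(f"{ord(char):08b}")
    return "".join(binary_chunks)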
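
**As a worked example of the layout described above, the sketch below encodes four sample pairs by hand. The pairs are assumed for illustration (they are what the LZ78 parsing yields for the string `"ABABABA"`).**

```python
# Sample pairs, assumed for illustration
pairs = [(0, "A"), (0, "B"), (1, "B"), (3, "A")]

fields = []
for index, char in pairs:
    fields.append(f"{index:08b}")          # 8-bit dictionary index
    if char:
        fields.append(f"{ord(char):08b}")  # 8-bit character code

bit_string = "".join(fields)
print(" ".join(fields))
print(len(bit_string), "bits")
```

**Four pairs of 16 bits each give 64 bits, while the 7-character input takes only 56 bits at 8 bits per character. On very short inputs this encoding therefore expands the text; savings can only appear once phrases get long enough that one 16-bit pair stands for more than two characters.**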

#### 3. Main function and code execution

**The main function ties everything together: it calls the functions defined above in order and prints the input size and the compressed size.**

In [None]:
def main():
    # Input and output file paths
    input_file = "./Dumps/Input_Text_File.txt"
    compressed_file = "./Dumps/Compressed_version.bin"

    # Step 1: Reading the input text file and finding its size in bits
    input_text = read_text_file(input_file)
    print("\033[93m\nInput text:\033[0m", input_text)
    print("\033[96m\n    Size of the input text:\033[0m",
          len(input_text) * 8, "Bits")

    # Step 2: Compressing the input text using the LZ78 algorithm
    compressed_data = lempel_ziv_compression(input_text)

    # Step 3: Converting the compressed phrases into binary format
    binary_data = compressed_phrase_to_binary(compressed_data)

    # Step 4: Writing the bit string to the output file. Each '0' or '1' is
    # stored as one text byte, so the file on disk is 8x the size printed below.
    write_binary_file(compressed_file, binary_data.encode('utf-8'))
    print("\033[92m\n    Size of the compressed binary output:\033[0m",
          len(binary_data), "Bits")

if __name__ == "__main__":
    main()
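
**Because `binary_data.encode('utf-8')` stores every `'0'` and `'1'` as a full byte, the file written above is eight times larger than the bit count that is printed. One possible alternative is to pack the bit string into real bytes before writing. The sketch below shows the idea; `pack_bits` and `unpack_bits` are hypothetical helpers, not part of the program above, and the decompressor would need the matching change to read such a file.**

```python
def pack_bits(bit_string):
    """Pack a string of '0'/'1' characters into bytes, 8 bits per byte."""
    if len(bit_string) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bit_string[i:i + 8], 2)
                 for i in range(0, len(bit_string), 8))


def unpack_bits(data):
    """Inverse of pack_bits: expand bytes back into a '0'/'1' string."""
    return "".join(f"{byte:08b}" for byte in data)


# Two sample pairs, (0, 'A') and (1, 'B'), in the 8-bit + 8-bit layout
bit_string = "0000000001000001" "0000000101000010"
packed = pack_bits(bit_string)

print(len(bit_string.encode("utf-8")), "bytes when written as text")
print(len(packed), "bytes when packed")
```

**Every field in the layout is exactly 8 bits wide, so the bit string is always a whole number of bytes and no padding is needed.**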