# Packages Installation and Importing Libraries
#### Package Descriptions:
- `bitstring`: This package is useful for creating, reading, and manipulating binary data. It's essential when working with binary file formats or low-level data processing.
- `wave`: The `wave` module reads and writes WAV audio files, which is crucial for audio signal processing tasks. It is part of Python's standard library, so it does not need to be installed with pip.
- `pandas`: Pandas is a powerful data manipulation and analysis library. It provides data structures like DataFrames, which are essential for handling and analyzing structured data efficiently.


In [2]:
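# Install required packages
!pip install bitstring
# Note: `wave` ships with the Python standard library, so the next line is not
# needed; it installs a separate PyPI project named "Wave".
!pip install wave
!pip install pandas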

Collecting bitstring
  Downloading bitstring-4.2.3-py3-none-any.whl (71 kB)
Collecting bitarray<3.0.0,>=2.9.0 (from bitstring)
  Downloading bitarray-2.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB)
Installing collected packages: bitarray, bitstring
Successfully installed bitarray-2.9.2 bitstring-4.2.3
Collecting wave
  Downloading Wave-0.0.2.zip (38 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wave
  Building wheel for wave (setup.py) ... done
  Created wheel for wave: filename=Wave-0.0.2-py3-none-any.whl size=1220 sha256=75070b95184a1fd284a46236303f640d100be03329b24c99c66265002ddd5e7d
  Stored in directory: /root/.cache/pip/wheels/f8/24/4d/1b01c0

In [3]:
# Import necessary libraries
import wave
from bitstring import BitStream, BitArray
import pandas as pd

# Encoding and Decoding
### Code Explanation

The following code snippet includes functions for encoding and decoding audio data using the Rice encoding algorithm. This approach is often used for lossless data compression. The code adapts the concepts from a GitHub repository (https://github.com/RoninSanta/AudioProcessing_with_Python-Audio_Compression_Testing).

#### Function Descriptions:

- **unary(t)**:
  - This function performs unary encoding of an integer `t`. Unary encoding represents the number `t` as `t` ones followed by a zero.
  - For example, `unary(3)` returns `"1110"`.

- **rice_encode(S, K)**:
  - This function encodes an integer `S` using the Rice encoding algorithm with parameter `K`.
  - The algorithm splits `S` into a quotient `q` and a remainder `r` based on `M = 2^K`. The quotient is unary encoded and the remainder is binary encoded, then concatenated to form the codeword.

- **rice_decode(codeword, K)**:
  - This function decodes a Rice encoded string back to its original integer value.
  - It finds the unary encoded part to get the quotient and then reads the binary part to get the remainder, combining them to reconstruct the original integer.

This code compresses audio data using Rice encoding, which is most efficient when small values occur far more often than large ones (an approximately geometric distribution).
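
As a concrete illustration of the quotient/remainder split described above, the sketch below encodes one value by hand. The helper name `rice_split` and the sample value 19 are illustrative assumptions, not part of the adapted code. It uses bit operations, which are equivalent to dividing by `M = 2^K` for non-negative integers.

```python
def rice_split(S, K):
    """Return (quotient, remainder) of S with respect to M = 2**K."""
    # For non-negative S, shifting right by K equals S // 2**K,
    # and masking with 2**K - 1 equals S % 2**K.
    return S >> K, S & ((1 << K) - 1)

S, K = 19, 2
q, r = rice_split(S, K)                 # 19 = 4 * 4 + 3
unary_part = "1" * q + "0"              # q ones followed by a stop bit
binary_part = format(r, "b").zfill(K)   # remainder in exactly K bits
codeword = unary_part + binary_part
print(q, r, codeword)                   # 4 3 1111011

# Reconstructing the value reverses the split
assert (q << K) + r == S
```

The codeword has `q + 1 + K` bits (7 here), so larger values of `S` pay for a longer unary part.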

In [4]:
# The code in this cell is adapted from https://github.com/RoninSanta/AudioProcessing_with_Python-Audio_Compression_Testing
# Unary function for encoding
def unary(t):
    """
    Unary encoding of an integer.

    Parameters:
    t (int): The integer to be encoded.

    Returns:
    str: The unary encoded string.
    """
    return '1' * t + '0'

# Rice encoding algorithm
def rice_encode(S, K):
    """
    Rice encoding of an integer.

    Parameters:
    S (int): The non-negative integer to be encoded.
    K (int): The Rice parameter, i.e. the number of remainder bits (K >= 1).

    Returns:
    str: The Rice encoded string.
    """
    M = 2 ** K
    q = S // M # Quotient
    r = S % M # Remainder
    q_un = unary(q) # Unary encoding of quotient
    r_bin = format(r, 'b').zfill(K) # Binary encoding of remainder
    codeword = q_un + r_bin # Concatenating unary and binary parts
    return codeword

# Rice decoding algorithm
def rice_decode(codeword, K):
    """
    Rice decoding of a codeword.

    Parameters:
    codeword (str): The Rice encoded string.
    K (int): The parameter for the Rice encoding.

    Returns:
    int: The decoded integer.
    """
    q = codeword.find('0') # Finding the position of the first '0' (end of unary part)
    r = codeword[q+1:q+1+K] # Extracting the binary part
    r_int = int(r, 2) # Converting binary part to integer
    M = 2 ** K
    S = q * M + r_int # Calculating the original integer
    return S
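
Because every codeword ends its unary part with a `0` and then carries exactly `K` remainder bits, codewords can also be decoded from one continuous bit string without separators. The sketch below illustrates this. `rice_decode_stream` is a hypothetical helper, not part of the adapted repository, and a compact encoder is repeated so the snippet runs on its own. It assumes `K >= 1`.

```python
def rice_encode(S, K):
    """Same scheme as above: unary quotient followed by a K-bit remainder."""
    q, r = divmod(S, 2 ** K)
    return "1" * q + "0" + format(r, "b").zfill(K)

def rice_decode_stream(bits, K):
    """Decode a concatenation of Rice codewords (hypothetical helper)."""
    values = []
    pos = 0
    while pos < len(bits):
        end = bits.index("0", pos)          # first '0' terminates the unary part
        q = end - pos                       # number of leading ones
        r_bits = bits[end + 1:end + 1 + K]  # the next K bits are the remainder
        if len(r_bits) != K:
            raise ValueError("truncated codeword")
        values.append(q * 2 ** K + int(r_bits, 2))
        pos = end + 1 + K                   # jump to the start of the next codeword
    return values

samples = [0, 5, 19, 255]
stream = "".join(rice_encode(s, 4) for s in samples)
assert rice_decode_stream(stream, 4) == samples
```

Dropping the separators matters for size: a space between codewords costs a full character per value when the output is stored as text.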

### Code Explanation

The following code snippet includes functions to encode a WAV file with Rice encoding, decode the encoded data back to its original form, analyze the compression efficiency, and compare the original and decoded audio files. Together they form a complete workflow for audio compression and validation.

#### Function Descriptions:

- **encode_wav(file_path, K, output_path)**:
  - This function reads the raw frame bytes from a WAV file and encodes each byte using Rice encoding.
  - The codewords are written to the specified output path as text: strings of `0` and `1` characters separated by spaces.

- **decode_wav(encoded_file_path, K, output_path)**:
  - This function reads a Rice encoded file, decodes it, and writes the audio data to a WAV file.
  - It splits the file contents into codewords, decodes each codeword using the Rice decoding algorithm, and writes the resulting bytes as a WAV file with a fixed header (1 channel, 1 byte per sample, 44100 Hz).

- **analyze_compression(file_path, K)**:
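  - This function analyzes the efficiency of Rice encoding by comparing the size of the original and encoded audio data.
  - It reads the original audio data, encodes it using the Rice encoding algorithm, and computes the encoded size as the total number of codeword bits divided by 8. The compression percentage is `(1 - encoded_size / original_size) * 100`, so a negative value means the encoded data is larger than the original.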

- **compare_wav_files(original_file_path, decoded_file_path)**:
  - This function compares the audio data of the original and decoded WAV files to verify that they are identical.
  - It reads the frames of both files and checks that the bytes match exactly. Header parameters such as channel count and sample rate are not compared.

These functions complete the Rice encoding workflow by enabling decoding, compression analysis, and validation of the compressed audio data.
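
`encode_wav` stores each bit as a text character, while `analyze_compression` counts eight bits per byte. For the file on disk to match that count, the bits would have to be packed into bytes. A minimal sketch, assuming the hypothetical helpers `pack_bits` and `unpack_bits` and a simple layout that records the bit count in a 4-byte header so the padding can be removed again:

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes (hypothetical helper).

    The first 4 bytes hold the number of valid bits (big-endian), so the
    zero padding added to reach a byte boundary can be dropped on unpacking.
    """
    n = len(bits)
    padded = bits + "0" * (-n % 8)  # pad up to a multiple of 8 bits
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return n.to_bytes(4, "big") + body

def unpack_bits(blob):
    """Inverse of pack_bits: recover the original bit string."""
    n = int.from_bytes(blob[:4], "big")
    bits = "".join(format(b, "08b") for b in blob[4:])
    return bits[:n]                 # drop the padding bits

bits = "1111011" + "00101"          # two codewords, 12 bits in total
blob = pack_bits(bits)
assert unpack_bits(blob) == bits
print(len(bits), len(blob))         # 12 6
```

Packed this way, 12 bits occupy 2 bytes plus the 4-byte header, whereas the text form would take 12 characters plus separators.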

In [5]:

# Read and encode a WAV file
def encode_wav(file_path, K, output_path):
    """
    Read a WAV file and encode its audio data using Rice encoding.

    Parameters:
    file_path (str): Path to the input WAV file.
    K (int): The parameter for the Rice encoding.
    output_path (str): Path to save the encoded output.
    """
    with wave.open(file_path, 'rb') as w:
        byte_data = w.readframes(w.getnframes()) # Reading frames from WAV file

    bits = [rice_encode(byte, K) for byte in byte_data] # Encoding each byte using Rice encoding

    with open(output_path, "w") as f:
        f.write(' '.join(bits)) # Writing encoded bits to output file

# Decode and write a WAV file
def decode_wav(encoded_file_path, K, output_path):
    """
    Decode a Rice encoded file and write the audio data to a WAV file.

    Parameters:
    encoded_file_path (str): Path to the Rice encoded file.
    K (int): The parameter for the Rice encoding.
    output_path (str): Path to save the decoded WAV file.
    """
    # Read the encoded file and split it into codewords on whitespace
    with open(encoded_file_path, "r") as fd:
        codewords = fd.read().split()
    # Decode each codeword back to a byte value (0-255)
    decoded_values = [rice_decode(block, K) for block in codewords]
    decoded_bytes = bytes(decoded_values)
    # Write the decoded bytes to a WAV file.
    # Note: the header values below are hard-coded and are not read from the
    # original file, so they may differ from the source WAV's parameters.
    with wave.open(output_path, 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(1)
        w.setframerate(44100)
        w.writeframes(decoded_bytes)

# Function to analyze compression
def analyze_compression(file_path, K):
    """
    Analyze the compression efficiency of Rice encoding on a WAV file.

    Parameters:
    file_path (str): Path to the input WAV file.
    K (int): The parameter for the Rice encoding.

    Returns:
    tuple: Original size, encoded size, and compression percentage.
    """
    # Read the WAV file and extract byte data
    with wave.open(file_path, 'rb') as w:
        byte_data = w.readframes(w.getnframes())

    original_size = len(byte_data)
    # Encode each byte using Rice encoding
    encoded_bits = [rice_encode(byte, K) for byte in byte_data]
    encoded_size = sum(len(bits) for bits in encoded_bits) / 8  # Convert bits to bytes
    compression_percentage = (1 - encoded_size / original_size) * 100

    return original_size, encoded_size, compression_percentage

# Function to compare original and decoded WAV files
def compare_wav_files(original_file_path, decoded_file_path):
    """
    Compare the original WAV file with the decoded WAV file.

    Parameters:
    original_file_path (str): Path to the original WAV file.
    decoded_file_path (str): Path to the decoded WAV file.

    Returns:
    bool: True if files are identical, False otherwise.
    """
    # Read the original WAV file
    with wave.open(original_file_path, 'rb') as w:
        original_data = w.readframes(w.getnframes())
    # Read the decoded WAV file
    with wave.open(decoded_file_path, 'rb') as w:
        decoded_data = w.readframes(w.getnframes())

    # Compare the original and decoded data
    return original_data == decoded_data
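
`compare_wav_files` checks only the frame bytes. Since `decode_wav` writes fixed header values (1 channel, 1-byte samples, 44100 Hz), two files can pass that check while differing in their format parameters. The sketch below, under the hypothetical name `compare_wav_params`, also compares the header fields reported by the standard `wave` module.

```python
import wave

def compare_wav_params(path_a, path_b):
    """Return True if channel count, sample width, and frame rate all match."""
    with wave.open(path_a, "rb") as a, wave.open(path_b, "rb") as b:
        pa, pb = a.getparams(), b.getparams()
    return (pa.nchannels, pa.sampwidth, pa.framerate) == (
        pb.nchannels, pb.sampwidth, pb.framerate)
```

Running it alongside `compare_wav_files` would show whether a decoded file matches its source in format as well as in sample data.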

In [7]:

# Encode Sound1.wav
encode_wav('/content/Sound1.wav', 4, '/content/Sound1_Enc.ex2')

# Decode Sound1_Enc.ex2
decode_wav('/content/Sound1_Enc.ex2', 4, '/content/Sound1_Enc_Dec.wav')

# Compare original and decoded WAV files
are_files_identical = compare_wav_files('/content/Sound1.wav', '/content/Sound1_Enc_Dec.wav')
print(f"Are the original and decoded Sound1.wav files identical? {'Yes' if are_files_identical else 'No'}")

# Analyze compression of Sound1.wav for different K values
original_size1, encoded_size1_4, compression1_4 = analyze_compression('/content/Sound1.wav', 4)
original_size1, encoded_size1_2, compression1_2 = analyze_compression('/content/Sound1.wav', 2)
# Analyze compression of Sound2.wav for different K values
original_size2, encoded_size2_4, compression2_4 = analyze_compression('/content/Sound2.wav', 4)
original_size2, encoded_size2_2, compression2_2 = analyze_compression('/content/Sound2.wav', 2)

# Display the results
data = {
    'File': ['Sound1.wav', 'Sound2.wav'],
    'Original Size (bytes)': [original_size1, original_size2],
    'Encoded Size (K=4 bits) (bytes)': [encoded_size1_4, encoded_size2_4],
    'Encoded Size (K=2 bits) (bytes)': [encoded_size1_2, encoded_size2_2],
    'Compression % (K=4 bits)': [compression1_4, compression2_4],
    'Compression % (K=2 bits)': [compression1_2, compression2_2]
}

df = pd.DataFrame(data)


Are the original and decoded Sound1.wav files identical? Yes


In [8]:
df

| | File | Original Size (bytes) | Encoded Size (K=4 bits) (bytes) | Encoded Size (K=2 bits) (bytes) | Compression % (K=4 bits) | Compression % (K=2 bits) |
|---|---|---|---|---|---|---|
| 0 | Sound1.wav | 1002044 | 1516220.75 | 4115633.5 | -51.312792 | -310.72383 |
| 1 | Sound2.wav | 1008000 | 1575301.375 | 4348504.75 | -56.279898 | -331.399281 |
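
The negative percentages mean the encoded data is larger than the original for both values of `K`. One way to see why: the codeword for a value `S` takes `S // 2^K + 1 + K` bits, so a byte only shrinks when that total is below 8. The sketch below computes this cost per value and totals it for a byte sequence. The function names are illustrative, and the sample data is made up for demonstration, not taken from the Sound files.

```python
def rice_length(S, K):
    """Bits in the Rice codeword for S: unary quotient, stop bit, K remainder bits."""
    return (S >> K) + 1 + K

def total_bits_per_k(data, ks=range(1, 9)):
    """Map each K to the total encoded size in bits for the byte sequence `data`."""
    return {K: sum(rice_length(b, K) for b in data) for K in ks}

# Largest byte value that still costs fewer than 8 bits
for K in (2, 4):
    limit = max(b for b in range(256) if rice_length(b, K) < 8)
    print(K, limit)  # prints "2 19" then "4 47"

# Made-up data: small values compress, large values expand
small = bytes([0, 1, 2, 3, 1, 0, 2, 1])
large = bytes([200, 180, 255, 220, 190, 240, 210, 230])
print(total_bits_per_k(small)[2], "bits vs", 8 * len(small))
print(total_bits_per_k(large)[2], "bits vs", 8 * len(large))
```

With `K = 4` only byte values up to 47 get shorter, and with `K = 2` only values up to 19, so raw bytes spread across the full 0 to 255 range expand on average. This is consistent with the table, although the byte distribution of the Sound files is not shown here.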
