Skip to content

[FEATURE REQUEST] Proposal to add BWT (Burrows-Wheeler Transform) and MTF (Move-to-Front) #6925

@Microindole

Description

@Microindole

What would you like to Propose?

I would like to propose adding two fundamental data transformation algorithms, commonly used in compression pipelines, to the compression category:

  1. BWT (Burrows-Wheeler Transform)
  2. MTF (Move-to-Front)

These implementations will serve as data transformation tools, enhancing the repository's collection of algorithms related to lossless compression.

Issue details

Algorithm 1: BWT (Burrows-Wheeler Transform)

  • Problem Statement: The BWT is a reversible data transformation, not a direct compression algorithm. It rearranges the input data by sorting, which groups similar characters together. This improves the effectiveness of subsequent compression algorithms (like RLE or entropy coding). The well-known bzip2 compression format uses BWT.
  • Example:
    • Input: "banana$" (where $ is a special end-of-string marker)
    • Output: ("annb$aa", 4)
      • "annb$aa" is the transformed string (the last column of the sorted matrix).
      • 4 is the 0-based index of the original string within the sorted matrix.
  • File Path: src/main/java/com/thealgorithms/compression/BurrowsWheelerTransform.java

Algorithm 2: MTF (Move-to-Front)

  • Problem Statement: MTF (Move-to-Front) is also a data transformation, often used after BWT. It converts a data sequence into a sequence of integers. It maintains a dynamic list of characters; when a character occurs, it outputs its index in the list and then moves that character to the front. This results in frequently recurring characters being encoded as small integers, which is beneficial for subsequent entropy encoding.
  • Example:
    • Assuming the output from the BWT step is "annb$aa".
    • Assuming the initial character list (alphabetical) is ['$', 'a', 'b', 'n'].
    • Input: "annb$aa"
    • Output: [1, 3, 0, 3, 3, 3, 0]
  • File Path: src/main/java/com/thealgorithms/compression/MoveToFront.java

Additional Information

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions