-
Couldn't load subscription status.
- Fork 20.7k
Closed
Labels
Description
What would you like to Propose?
I would like to propose adding implementations for the following fundamental lossless compression algorithms to the compression category:
- LZ77 (Lempel–Ziv 77)
- LZ78 (Lempel–Ziv 78)
Adding these algorithms would significantly enhance the collection of classic compression techniques, complementing the existing LZW and Run-Length Encoding algorithms within the package. LZ77 and LZ78 represent key variations in dictionary-based compression and are foundational to many modern compression schemes (like DEFLATE used in gzip and PNG).
Issue details
Algorithm 1: LZ77 (Lempel–Ziv 77)
- Problem Statement: LZ77 is a lossless data compression algorithm that achieves compression by replacing repeated occurrences of data with references to a single copy of that data existing earlier in the uncompressed data stream (using a sliding window). A match is encoded by a pair of numbers: a distance (offset back in the window) and a length.
- Example (Conceptual):
- Input:
"ababcabc" - Output: A sequence representing encoded data, often as tuples like
(offset, length, next_character). The exact output format can vary, but conceptually for this input, it might look something like:(0,0,'a'), (0,0,'b'), (2,2,'c'), (3,3,'<End Marker>')(representing 'a', 'b', then 'ab' found 2 back, followed by 'c', then 'abc' found 3 back).
- Input:
- Proposed File Path:
src/main/java/com/thealgorithms/compression/LZ77.java
Algorithm 2: LZ78 (Lempel–Ziv 78)
- Problem Statement: LZ78 is a dictionary-based lossless data compression algorithm. It processes input data sequentially, building a dictionary of phrases encountered so far. It outputs pairs, typically
(dictionary_index, next_character), where the index refers to the longest matching phrase already in the dictionary. - Example (Conceptual):
- Input:
"ABAABABAABAB" - Output: A sequence representing encoded data, often as pairs like
(index, character). For this input, a possible output sequence could be:(0,'A'), (0,'B'), (1,'A'), (2,'B'), (4,'A'), (5,'B'). (This implicitly builds a dictionary like: 1:A, 2:B, 3:AA, 4:BA, 5:ABA, 6:BAB).
- Input:
- Proposed File Path:
src/main/java/com/thealgorithms/compression/LZ78.java
Additional Information
No response