Skip to content
Data Compression Service
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
.stack-work
app
docs
files
src
test
.gitignore
Dockerfile
LICENSE
PiousPiper.cabal
README.md
Setup.hs
compress.sh
stack.yaml

README.md

Pious Piper Data Compression Service

Pious Piper is a multi-processor-capable lossless data compression tool. It's exciting [non-]proprietary multi-threaded compression algorithm is derived from an LZW-variant theorized by Schmuel Klein and Yair Wiseman in the Journal of Discrete Applied Mathematics.

Explanation of Implementation

LZW compression is an on-the-fly compression algorithm. It starts with a dictionary that maps ASCII characters to their standard 8-bit representations. As groups of characters are encountered, these characters are grouped together and given a new code beyond that of the ASCII range (0 - 127). A benefit of this method of determining encoding patterns is that the mapping of patterns to codes, which must be conserved for the sake of compression, is derived based on the frequency of patterns in the text.

LZW decompression operates via the same manner in reverse. As ASCII characters are encountered in the compressed code, their groupings are used to create new mappings of characters to codes.

The multi-threading component of this algorithm is implemented by segmenting the encoded text along a pre-determined pattern and then running the decompression algorithm in parallel with read-heads operating along the separate segments of the encoding.

A drawback that limits the effectiveness of standard LZW compression is the increasing number of bits that are needed per character encoding as the mapping dictionary grows larger and larger. The parallelized implementation of LZW compression lessens the impact of this drawback by forming isolated mapping dictionaries that are customized to the unique encoded text segments.

Requirements

Use and Instructions

To compress files:

$ stack exec compress file.txt

To decompress file (will search for "compressed-originalFileName" in ./files/outputs/compressed/):

$ stack exec decompress originalFileName.txt

To run tests:

$ stack test

Results

TaleOfTwoCities.txt before compression:

initialFileSizeTaleOfTwoCities

TaleOfTwoCities.txt after Pious Piper compression:

compressedFileSizeTaleOfTwoCities

Space Reduction

spaceSavings

Technologies Used

You can’t perform that action at this time.