Skip to content

Commit

Permalink
Merge pull request #3872 from TwoTau/TwoTau-lzw-readme
Browse files Browse the repository at this point in the history
Add documentation for LZW compression
  • Loading branch information
AdiChat committed Oct 30, 2018
2 parents b7256f9 + 2b1f5b2 commit 7ab6997
Show file tree
Hide file tree
Showing 2 changed files with 60 additions and 3 deletions.
Original file line number Diff line number Diff line change
@@ -1,4 +1,46 @@
# cosmos
Your personal library of every algorithm and data structure code that you will ever encounter
# Lempel-Ziv-Welch compression

Collaborative effort by [OpenGenus](https://github.com/opengenus)
The **Lempel-Ziv-Welch** (LZW) algorithm is a dictionary-based lossless compressor. The algorithm uses a dictionary to map a sequence of symbols into codes which take up less space. LZW performs especially well on files with frequent repetitive sequences.

## Encoding algorithm

1. Start with a dictionary of size 256 or 4096 (where the key is an ASCII character and the values are consecutive)
2. Find the longest entry W in the dictionary that matches the subsequent values.
3. Output the index of W and remove W from the input.
4. Add (W + (the next value)) to the dictionary as a key with value (dictionary size + 1)
5. Repeat steps 2 to 4 until the input is empty

**Example:** the input "ToBeOrNotToBeABanana" (length of 20) would be encoded in 17.
```
[T][o][B][e][O][r][N][o][t][To][Be][A][B][a][n][an][a]
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
```
The brackets separate the values (e.g. `[To]` is encoded as one value). The LZW algorithm replaces the second occurences of `To`, `Be`, and `an` with one value, eliminating data redundancy.

Note that there is **no need to store the dictionary** because the LZW uncompression process reconstructs the dictionary as it goes along.

### Encoding example
![Table showing the LZW compression process by character](https://slideplayer.com/slide/5049613/16/images/11/LZW+%E2%80%93+Encoding+Example+T%3Dababcbababaaaaaaa.jpg)

> Image credits: https://slideplayer.com/slide/5049613/
## Decoding algorithm

1. Start with a dictionary of size 256 or 4096 (must be the same as what the encoding process used in step 1)
2. Variable PREV = ""
3. Is the next code V in the dictionary?
- Yes:
1. Variable T = dictionary key whose value = V
2. Output T
3. Add (PREV + (first character of T)) to the dictionary
- No:
1. Variable T = PREV + (first character of PREV)
2. Output T
3. Add T to the dictionary
4. Repeat step 3 until the end of the input is reached

---
<p align="center">
A massive collaborative effort by <a href="https://github.com/OpenGenus/cosmos">OpenGenus Foundation</a>
</p>
---
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
/* Part of Cosmos by OpenGenus Foundation */

const primes = [];
let n = 2;

while (primes.length < 10001) {
// if this number is not divisible by any prime currently in the array
if (primes.reduce((isPrime, prime) => isPrime && (n % prime !== 0), true)) {
primes.push(n);
}

n++;
}

console.log(primes[10000]);

0 comments on commit 7ab6997

Please sign in to comment.