# Algorithm demonstrations

This notebook demonstrates all of the algorithms written for the FTR.

## Setup

For the code to run, ensure that the bitstring module is installed. For the contextual algorithm to work, also add the following function to the vl_codes module:


In [1]:
# initial imports
import itertools
import numpy as np
from math import log2

import adaptive_arithmetic as adarith
import context_arithmetic as conarith
import fgk
import vitter

def probability_dict(x):
    """
    Produces the probability dictionary used to compress a file, based on
    the normalised frequencies of each symbol.

    Parameters:
    -----------
    x: str
    file data (any sequence of symbols)

    Returns:
    --------
    p: dict
    Alphabet and corresponding probability
    frequencies: dict
    Alphabet and corresponding frequencies in file data
    """

    frequencies = dict([(key, len(list(group))) for key, group in itertools.groupby(sorted(x))])
    n = sum([frequencies[a] for a in frequencies])
    p = dict([(a, frequencies[a]/n) for a in frequencies])
    return p, frequencies
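
As a quick illustration of what the function returns, the sketch below rebuilds the same two dictionaries with `collections.Counter` on a short string. The helper name `probability_dict_counter` is hypothetical and not part of vl_codes; it assumes the input is a string or another sequence of hashable symbols.

```python
from collections import Counter

def probability_dict_counter(x):
    # Hypothetical helper (not part of vl_codes): returns the same two
    # dictionaries as probability_dict, built with Counter instead of
    # sorted + groupby.
    frequencies = dict(Counter(x))
    n = sum(frequencies.values())
    p = {a: count / n for a, count in frequencies.items()}
    return p, frequencies

p_demo, freq_demo = probability_dict_counter("aab")
print(freq_demo)  # symbol -> count
print(p_demo)     # symbol -> normalised frequency
```

The probabilities sum to one by construction, which is what the entropy calculation in the next section relies on.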

## File properties

The following cell calculates and displays the properties of a given file:

In [2]:
file_name = "hamlet.txt"

with open(file_name) as file:
    data = file.read()
    
H_stat = lambda pr: -sum([pr[a]*log2(pr[a]) for a in pr]) # i.i.d entropy

p, freq = probability_dict(data)
transition = conarith.transition_matrix(data)

H = 0                                                     # Markov chain entropy
for char in p:
    pxy = transition[ord(char)]                           # row of the transition matrix for char
    for i in pxy:
        if i == 0:                                        # skip zero entries (log2(1/0) is undefined)
            continue
        H += p[char]*i*log2(1/i)
        
print("***Properties of file {}: ***".format(file_name))
print("File size:      {} bytes".format(len(data)))
print("Static Entropy: {} bits".format(H_stat(p)))
print("Markov Entropy: {} bits".format(H))

***Properties of file hamlet.txt: ***
File size:      207039 bytes
Static Entropy: 4.449863631694343 bits
Markov Entropy: 3.352987113263871 bits
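
The two figures above can be reproduced on a toy string without the project modules. The sketch below is a minimal, self-contained version under stated assumptions: the function names `iid_entropy` and `markov_entropy` are hypothetical, and the Markov entropy is estimated directly from bigram counts rather than from `conarith.transition_matrix`, so the context weights are taken over all symbols except the last one.

```python
from collections import Counter
from math import log2

def iid_entropy(text):
    # Entropy in bits/symbol when symbols are treated as independent draws.
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * log2(c / n) for c in counts.values())

def markov_entropy(text):
    # First-order conditional entropy H(next | previous), estimated from
    # bigram counts (an assumption; the notebook uses a transition matrix).
    pair_counts = Counter(zip(text, text[1:]))
    context_counts = Counter(text[:-1])
    n_pairs = len(text) - 1
    h = 0.0
    for (prev, _nxt), c in pair_counts.items():
        p_pair = c / n_pairs              # joint probability of the bigram
        p_cond = c / context_counts[prev]  # P(next | previous)
        h -= p_pair * log2(p_cond)
    return h

sample = "abababab"
print(iid_entropy(sample))     # 1 bit/symbol: 'a' and 'b' are equally likely
print(markov_entropy(sample))  # zero: the next symbol is fully determined
```

The alternating string is an extreme case: each symbol looks random in isolation but is fully predictable from its predecessor, which is the same effect that separates the static and Markov entropies of hamlet.txt.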


## Adaptive Huffman

### FGK

In [3]:
y = fgk.encode(data)
x = fgk.decode(y)
print("FGK Compression rate for {}: {} bits/symbol".format(file_name, len(y)/len(data)))
print(''.join(x[:200]))

FGK Compression rate for hamlet.txt: 4.56374885891064 bits/symbol
        HAMLET


        DRAMATIS PERSONAE


CLAUDIUS        king of Denmark. (KING CLAUDIUS:)

HAMLET  son to the late, and nephew to the present king.

POLONIUS        lord chamberlain. (LORD POLONI


### Vitter

In [4]:
N = int(np.ceil(len(data)*0.01))
alpha = 0.5
remove = True

y = vitter.vitter_encode(data, N=N, alpha=alpha, remove=remove)
x = vitter.vitter_decode(y, N=N, alpha=alpha, remove=remove)

print("Vitter Compression rate for {} with N={} and alpha={}: {} bits/symbol".format(file_name, N, alpha, len(y)/len(data)))
print(''.join(x[:200]))

Vitter Compression rate for hamlet.txt with N=2071 and alpha=0.5: 4.480247682803723 bits/symbol
        HAMLET


        DRAMATIS PERSONAE


CLAUDIUS        king of Denmark. (KING CLAUDIUS:)

HAMLET  son to the late, and nephew to the present king.

POLONIUS        lord chamberlain. (LORD POLONI


## Arithmetic Coding

### Adaptive

In [5]:
N = int(np.ceil(len(data)*0.01))
alpha = 0.5

y = adarith.encode(data, N=N, alpha=alpha)
x = adarith.decode(y, N=N, alpha=alpha)

print("Adaptive Arithmetic Compression rate for {} with N={} and alpha={}: {} bits/symbol".format(file_name, N, alpha, len(y)/len(data)))
print(''.join(x[:200]))

Adaptive Arithmetic Compression rate for hamlet.txt with N=2071 and alpha=0.5: 4.474809093938823 bits/symbol
        HAMLET


        DRAMATIS PERSONAE


CLAUDIUS        king of Denmark. (KING CLAUDIUS:)

HAMLET  son to the late, and nephew to the present king.

POLONIUS        lord chamberlain. (LORD POLONI


### Contextual



In [6]:
y, transition, p0 = conarith.encode(data)
x = conarith.decode(y, transition, p0)

print("Contextual Arithmetic Compression rate for {}: {} bits/symbol".format(file_name, len(y)/len(data)))
print(''.join(x[:200]))

Contextual Arithmetic Compression rate for hamlet.txt: 3.3531556856437676 bits/symbol
        HAMLET


        DRAMATIS PERSONAE


CLAUDIUS        king of Denmark. (KING CLAUDIUS:)

HAMLET  son to the late, and nephew to the present king.

POLONIUS        lord chamberlain. (LORD POLONI
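
To put the four rates side by side, the sketch below subtracts an entropy baseline from each reported rate. The numbers are copied from the outputs above. Pairing the three adaptive coders with the static entropy and the contextual coder with the Markov entropy is an assumption about which baseline is appropriate. The rates count only `len(y)`, so any cost of passing `transition` and `p0` to the contextual decoder is not included.

```python
# Entropies reported in the "File properties" cell (bits/symbol).
static_entropy = 4.449863631694343
markov_entropy = 3.352987113263871

# (reported rate, assumed baseline) for each coder, in bits/symbol.
results = {
    "FGK": (4.56374885891064, static_entropy),
    "Vitter": (4.480247682803723, static_entropy),
    "Adaptive arithmetic": (4.474809093938823, static_entropy),
    "Contextual arithmetic": (3.3531556856437676, markov_entropy),
}

# Overhead = rate minus the entropy baseline chosen for that coder.
overhead = {name: rate - baseline for name, (rate, baseline) in results.items()}

for name, gap in overhead.items():
    print("{:<22} {:.4f} bits/symbol above its baseline".format(name, gap))
```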
