Python use guide

Martsim edited this page Jan 26, 2018 · 3 revisions

Requirements

The code was tested with Python 2.7.

Using Huffman coding

File: huffman.py

Functions:

  • hc_encode(text) - encodes the text and returns the codeword.
  • demo_dickens() - encodes the first 200 characters of the text file 'dickens', which must be in the same directory as the code file. Prints the resulting code and the number of bytes it takes.
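
For readers new to the technique, here is a minimal sketch of Huffman coding. It is not the code in huffman.py: the names huffman_code_table and huffman_encode are hypothetical, and returning the codeword as a string of '0' and '1' characters is an assumption, since hc_encode may represent its codeword differently.

```python
import heapq
from collections import Counter


def huffman_code_table(text):
    """Build a symbol -> bit-string table for `text` (hypothetical helper)."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(text):
    """Return (bit string, code table) for `text`."""
    table = huffman_code_table(text)
    return "".join(table[s] for s in text), table
```

The code table has to travel with the bit string, because the decoder needs it to map codes back to symbols.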

Run:

Either import the functions listed above into your own code, or uncomment one of the tests in the file and run it from the command line (or terminal) with the command python huffman.py

Using range ANS (rANS) coding

File: rANS.py

Note: Decoding might fail once more than 50 symbols have been encoded, which makes this implementation unreliable. Using the Java implementation is recommended.

Functions:

Lower level:

  • encode(symbol, x) - encodes the symbol to the given x and returns the new x.
  • decode(x) - decodes a symbol and returns the new x.
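
To illustrate what one encode and one decode step do, here is a minimal sketch of the textbook rANS state update without renormalization. It is not the code in rANS.py: the names rans_encode_step and rans_decode_step, the example frequency table FREQ, and the choice to return the decoded symbol together with the new x are all assumptions made for illustration.

```python
# Assumed example model: symbol -> integer frequency.
FREQ = {"a": 5, "b": 2, "c": 1}
TOTAL = sum(FREQ.values())

# START[s] is the first slot owned by symbol s in the range [0, TOTAL).
START = {}
_acc = 0
for _s in sorted(FREQ):
    START[_s] = _acc
    _acc += FREQ[_s]


def rans_encode_step(symbol, x):
    """Push `symbol` onto state x and return the new state."""
    f, c = FREQ[symbol], START[symbol]
    return (x // f) * TOTAL + c + (x % f)


def rans_decode_step(x):
    """Pop the most recently encoded symbol; return (symbol, previous state)."""
    slot = x % TOTAL
    symbol = next(s for s in FREQ if START[s] <= slot < START[s] + FREQ[s])
    f, c = FREQ[symbol], START[symbol]
    return symbol, f * (x // TOTAL) + slot - c
```

Decoding is the exact inverse of encoding, so symbols come back in reverse order (last encoded, first decoded).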

Higher level:

  • encode_text_with_dict(text, optional starting x) - calculates the symbol probabilities of the given text and encodes the text with them. The starting x defaults to 0. Returns the value of x into which the text is encoded.
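
As a rough illustration of this kind of function, the sketch below counts symbol frequencies in the text and folds every symbol into a single integer x. It is not the code in rANS.py: encode_text_sketch, decode_text_sketch and cumulative are hypothetical names, and the sketch leans on Python's arbitrary-precision integers instead of renormalizing the state.

```python
from collections import Counter


def cumulative(freq):
    """Return (start, total): each symbol's slot offset and the frequency sum."""
    start, total = {}, 0
    for s in sorted(freq):
        start[s] = total
        total += freq[s]
    return start, total


def encode_text_sketch(text, x=0):
    """Encode `text` into one integer, using the text's own symbol counts."""
    freq = dict(Counter(text))
    start, total = cumulative(freq)
    for s in text:
        f = freq[s]
        x = (x // f) * total + start[s] + (x % f)
    return x, freq


def decode_text_sketch(x, freq, length):
    """Recover `length` symbols from x; return (text, starting x)."""
    start, total = cumulative(freq)
    symbols = []
    for _ in range(length):
        slot = x % total
        s = next(k for k in freq if start[k] <= slot < start[k] + freq[k])
        x = freq[s] * (x // total) + slot - start[s]
        symbols.append(s)
    symbols.reverse()  # rANS yields the last encoded symbol first
    return "".join(symbols), x
```

In this sketch the decoder needs the frequency table and the symbol count in addition to x. The storage size of x in bytes can be estimated as (x.bit_length() + 7) // 8.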

Example:

  • demo_dickens() - requires the text file 'dickens' to exist in the same directory. Encodes the first 200 characters of the file and presents the resulting code and the number of bytes needed to store it.

Run:

Either import the functions listed above into your own code, or uncomment one of the tests in the file and run it from the command line (or terminal) with the command python rANS.py

Comparing Huffman coding and rANS

File: compare.py

Function:

  • run_tests() - runs the comparison tests. Optional parameters: the block size for the tests and the file name of the text source.
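
A block-based comparison of this kind could be approximated as in the sketch below, which splits a text into fixed-size blocks and computes a reference size for each block. It does not reproduce compare.py: split_blocks and entropy_bits are hypothetical names, and using the Shannon entropy of each block as the yardstick is an assumption, chosen because no symbol-by-symbol coder that relies only on the block's own symbol counts can do better on average.

```python
import math
from collections import Counter


def split_blocks(text, block_size):
    """Cut `text` into consecutive blocks of at most `block_size` characters."""
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


def entropy_bits(block):
    """Shannon bound, in bits, for coding `block` with its own symbol counts."""
    n = float(len(block))
    return -sum(c * math.log(c / n, 2) for c in Counter(block).values())
```

The encoded size that each coder produces for a block can then be set against entropy_bits(block) to see how close it gets to the bound.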

Run:

Run from the command line (or terminal) with the command python compare.py. Requires the text file 'dickens' in the working directory. The code can be modified to use other files.