1 parent 159c6dc commit 167577e506d2a915bb9b046f8f4f025ad41f9aa2 uberj committed
Showing with 27 additions and 0 deletions.
  1. +27 −0 Commpression/Arithmetic/README.mkd
27 Commpression/Arithmetic/README.mkd
@@ -0,0 +1,27 @@
+flags: -c Compress
+ -x De-compress
+./run -c <file_to_compress>
+./run -x <file_you_compressed>.t
+./run -c chr6.fa
+(creates a binary chr6.fa.t file)
+./run -x chr6.fa.t
+(prints file as it is decompressed)
+As of right now, decompressed output is written to stdout. I stopped working on this because it was non-trivial to parallelize, and at the time other goals seemed more reasonable.
+Beware of compressing very large files, as this will take a while.
+Due to bad planning and not really knowing how an arithmetic coder worked, all the code is in model.c and is not very elegant. Interesting edge cases come up when you try to encode very large files with many symbols using a low bit count in your encoding buffer. I'd be curious to see how real-world applications handle these cases. Do they partition the file to be compressed?
+Compression formats like DEFLATE (used in gzip) use a hybrid of compression algorithms (DEFLATE combines LZ77 and Huffman coding). Arithmetic coding is very much like Huffman coding and is probably similarly capable of being part of a hybrid compression solution. It is worth noting that arithmetic coding is generally more efficient than Huffman coding, because it is not restricted to a whole number of bits per symbol.
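
The edge case the README mentions (many symbols, few bits in the encoding buffer) can be illustrated with a small sketch. This is not code from model.c: `narrow_interval` and `rescale_counts` are hypothetical helpers written for illustration, and the closed interval `[low, high]` with cumulative counts is an assumed layout. The sketch shows that integer division can leave a symbol with an empty sub-interval once the total count exceeds the interval width. One textbook remedy is to halve the counts so the total stays below the smallest width the coder allows; it is shown here only as one possibility, not as what any particular tool does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from model.c: narrows the coding interval
 * [*low, *high] to the sub-interval of a symbol whose cumulative counts span
 * [cum_lo, cum_hi) out of `total`. Returns 1 on success, or 0 (leaving the
 * interval untouched) if the symbol's sub-interval would be empty. */
static int narrow_interval(uint32_t *low, uint32_t *high,
                           uint32_t cum_lo, uint32_t cum_hi, uint32_t total)
{
    assert(total > 0 && cum_lo < cum_hi && cum_hi <= total && *low <= *high);

    uint64_t range = (uint64_t)*high - *low + 1;
    uint64_t upper = range * cum_hi / total;   /* exclusive upper offset */
    uint64_t lower = range * cum_lo / total;   /* inclusive lower offset */

    if (upper == lower)
        return 0;   /* integer division squeezed the symbol out */

    /* Compute the new high first, since it depends on the old low. */
    *high = (uint32_t)(*low + upper - 1);
    *low  = (uint32_t)(*low + lower);
    return 1;
}

/* Hypothetical remedy: halve every count (a nonzero count never drops
 * below 1) until the total is at most max_total. Returns the new total.
 * Requires n <= max_total so that the loop is guaranteed to terminate. */
static uint32_t rescale_counts(uint32_t *counts, size_t n, uint32_t max_total)
{
    assert(n <= max_total);

    for (;;) {
        uint64_t total = 0;
        for (size_t i = 0; i < n; i++)
            total += counts[i];
        if (total <= max_total)
            return (uint32_t)total;
        for (size_t i = 0; i < n; i++)
            counts[i] = (counts[i] + 1) / 2;
    }
}
```

With a 16-value interval (`low = 0`, `high = 15`) and a total count of 32, a symbol of count 1 cannot be encoded: `narrow_interval(&low, &high, 0, 1, 32)` returns 0. Once the total is no larger than the interval width, every symbol with a nonzero count keeps at least one value, which is why capping the total with something like `rescale_counts` avoids the problem at the cost of a slightly less accurate model.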
