Import to version control, fix layout, fix misc bits.

commit 435ad3d36b3f15c36c17d7c7e76093e58d76a6ba 0 parents
@CloCkWeRX CloCkWeRX authored
374 Text/Huffman.php
@@ -0,0 +1,374 @@
+<?php
+
+// {{{ license
+
+/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */
+//
+// +----------------------------------------------------------------------+
+// | PHP Version 5 |
+// +----------------------------------------------------------------------+
+// | Copyright (c) 1997-2002 The PHP Group |
+// +----------------------------------------------------------------------+
+// | This source file is subject to version 2.0 of the PHP license, |
+// | that is bundled with this package in the file LICENSE, and is |
+// | available at through the world-wide-web at |
+// | http://www.php.net/license/2_02.txt. |
+// | If you did not receive a copy of the PHP license and are unable to |
+// | obtain it through the world-wide-web, please send a note to |
+// | license@php.net so we can mail you a copy immediately. |
+// +----------------------------------------------------------------------+
+// | Authors: Markus Nix <mnix@docuverse.de> |
+// | David Holmes <exaton@free.fr> (original version) |
+// +----------------------------------------------------------------------+
+//
+
+// }}}
+
+
+/**
+ * This class is intended to perform static Huffman
+ * compression on files with a PHP script.
+ *
+ * Such compression is essentially useful for reducing
+ * the size of texts by about 43% ; it is at its best
+ * when working with data containing strong redundancies
+ * at the character level -- that is, the opposite of a
+ * binary file in which the characters would be spread
+ * over the whole ASCII alphabet.
+ *
+ * It is questionable whether anyone would want to do
+ * such an operation with PHP, when C implementations
+ * of much stronger and more versatile algorithms are
+ * readily available as PHP functions. The main drawback
+ * of this class is that it is slow and CPU-intensive
+ * (7 to 8 seconds to compress a 300 KB text, about
+ * 25 seconds to expand it back).
+ *
+ * USE AND FUNCTION REFERENCE :
+ *
+ * With the Text/ directory on your include_path, the only files you have to
+ * include are Text/HuffmanCompress.php and/or Text/HuffmanExpand.php, according
+ * to your needs.
+ *
+ * -----------------
+ * -- Compression --
+ * -----------------
+ *
+ * Once a Text_HuffmanCompress object has been constructed, the following
+ * functions are available:
+ *
+ * + setFiles('path/to/source/file', 'path/to/destination/file'):
+ *
+ * This step is mandatory, as you give the paths to the file you want to
+ * compress, and the file you want the compressed output written to. These
+ * paths will be passed to the PHP fopen() function, see its reference for
+ * details. Note that the paths, if local, should be relative to the location
+ * of _your_ script, i.e. the one that has included this compression class.
+ *
+ * + setTimeLimit(int seconds):
+ *
+ * This step is optional and raises the timeout limit of the PHP script,
+ * should the job take too long. Note that no setTimeLimit() method is
+ * defined in the classes of this package; call the PHP set_time_limit()
+ * function directly instead, as the bundled examples do.
+ *
+ * + compress():
+ *
+ * This is the function that actually executes the job. It receives no
+ * parameters, and is of course obligatory.
+ *
+ * ---------------
+ * -- Expansion --
+ * ---------------
+ *
+ * Once a Text_HuffmanExpand object has been constructed, the following
+ * functions are available:
+ *
+ * + setFiles('path/to/source/file', 'path/to/destination/file'):
+ *
+ * This step is mandatory, as you give the paths to the file containing the
+ * compressed data, and the file you want the expanded output written to. These
+ * paths will be passed to the PHP fopen() function, see its reference for
+ * details. Note that the paths, if local, should be relative to the location
+ * of _your_ script, i.e. the one that has included this compression class.
+ *
+ * + setTimeLimit(int seconds):
+ *
+ * This step is optional and raises the timeout limit of the PHP script,
+ * should the job take too long. Note that no setTimeLimit() method is
+ * defined in the classes of this package; call the PHP set_time_limit()
+ * function directly instead, as the bundled examples do.
+ *
+ * + expand():
+ *
+ * This is the function that actually executes the job. It receives no
+ * parameters, and is of course obligatory.
+ *
+ *
+ * EXTRA NOTICE:
+ *
+ * Please also note that details outside the core static Huffman algorithm
+ * (such as the header layout) probably do not follow any standard in this
+ * class. That means that any other compressed file, even if you have reason
+ * to be certain that it was produced using the static Huffman algorithm,
+ * would in all probability not be usable as a source file for expansion
+ * with this class.
+ * In short, this class can very probably only restore what it itself
+ * compressed.
+ *
+ * Anyway, thanks for using this class! All feedback is welcome. Feel free
+ * to tell me how you came in contact with this class, why you're using
+ * it (if at liberty to do so), and to suggest any enhancements, or of
+ * course to point out any serious bugs.
+ *
+ * @package Text
+ */
+
+class Text_Huffman
+{
+ // {{{ properties
+ /**
+ * Carrier window for reading from input
+ * @access protected
+ */
+ protected $_icarrier;
+
+ /**
+ * Length of the input carrier at any given time
+ * @access protected
+ */
+ protected $_icarlen;
+
+ /**
+ * Carrier window for writing to output
+ * @access protected
+ */
+ protected $_ocarrier;
+
+ /**
+ * Length of the output carrier at any given time
+ * @access protected
+ */
+ protected $_ocarlen;
+
+ /**
+ * Boolean to check files have been passed
+ * @access protected
+ */
+ protected $_havefiles;
+
+ /**
+ * Character representing a Branch Node in Tree transmission
+ * @access protected
+ */
+ protected $_nodeChar;
+
+ /**
+ * The same, character version as opposed to binary string
+ * @access protected
+ */
+ protected $_nodeCharC;
+
+ /**
+ * Path to the input file
+ * @access protected
+ */
+ protected $_ifile;
+
+ /**
+ * Resource handle of the input file
+ * @access protected
+ */
+ protected $_ifhand;
+
+ /**
+ * Path to the output file
+ * @access protected
+ */
+ protected $_ofile;
+
+ /**
+ * Resource handle of the output file
+ * @access protected
+ */
+ protected $_ofhand;
+
+ /**
+ * Data eventually written to the output file
+ * @access protected
+ */
+ protected $_odata;
+
+ /**
+ * Array of Node objects
+ * @access protected
+ */
+ protected $_nodes;
+ // }}}
+
+
+ // {{{ constructor
+ /**
+ * Constructor
+ *
+ * @access public
+ */
+ public function __construct()
+ {
+ $this->_havefiles = false;
+ $this->_nodeChar = '00000111';
+ $this->_nodeCharC = chr(7);
+ $this->_odata = '';
+ $this->_nodes = array();
+ }
+ // }}}
+
+
+ /**
+ * setFiles() is called to specify the paths to the input and output files.
+ * Having set the relevant variables, it gets resource pointers to the files
+ * themselves.
+ *
+ * @throws Exception
+ * @access public
+ */
+ public function setFiles($ifile = '', $ofile = '')
+ {
+ if (trim($ifile) == '') {
+ throw new Exception('No input file provided.');
+ } else {
+ $this->_ifile = $ifile;
+ }
+
+ if (trim($ofile) == '') {
+ throw new Exception('No output file provided.');
+ } else {
+ $this->_ofile = $ofile;
+ }
+
+ // Getting resource handles to the input and output files
+
+ if (!($this->_ifhand = @fopen($this->_ifile, 'rb'))) {
+ throw new Exception('Unable to open input file.');
+ }
+
+ if (!($this->_ofhand = @fopen($this->_ofile, 'wb'))) {
+ throw new Exception('Unable to open output file.');
+ }
+
+ // Stating that files have been gotten
+ $this->_havefiles = true;
+
+ return true;
+ }
+
+
+ // protected methods
+
+ /**
+ * Bit-writing with a carrier: output every 8 bits
+ *
+ * @access protected
+ */
+ final protected function _bitWrite($str, $len)
+ {
+ // $carrier is the sequence of bits, in a string
+ $this->_ocarrier .= $str;
+ $this->_ocarlen += $len;
+
+ while ($this->_ocarlen >= 8)
+ {
+ $this->_odata .= chr(bindec(substr($this->_ocarrier, 0, 8)));
+ $this->_ocarrier = substr($this->_ocarrier, 8);
+ $this->_ocarlen -= 8;
+ }
+ }
+
+ /**
+ * Finalizing bit-writing, writing the data.
+ *
+ * @access protected
+ */
+ final protected function _bitWriteEnd()
+ {
+ // If carrier is not finished, complete it to 8 bits with 0's and write it out
+ // Adding n zeros is like multiplying by 2^n
+
+ if ($this->_ocarlen) {
+ $this->_odata .= chr(bindec($this->_ocarrier) * pow(2, 8 - $this->_ocarlen));
+ }
+
+ // Writing the whole output data to file.
+ fwrite($this->_ofhand, $this->_odata);
+ }
+
+ /**
+ * Bit-reading with a carrier: input 8 bits at a time.
+ *
+ * @access protected
+ */
+ final protected function _bitRead($len)
+ {
+ // Fill carrier 8 bits (1 char) at a time until we have at least $len bits
+
+ // Determining the number n of chars that we are going to have to read
+ // This might be zero, if the icarrier is presently long enough
+
+ $n = max(0, (int) ceil(($len - $this->_icarlen) / 8)); // never negative, even if icarrier already holds more than $len bits
+
+ // Reading those chars, adding each one as 8 binary digits to icarrier
+
+ for ($i = 0; $i < $n; $i++) {
+ $this->_icarrier .= $this->_decBinDig(ord(fgetc($this->_ifhand)), 8);
+ }
+
+ // Getting the portion of icarrier we want to return
+ // Then diminishing the icarrier of the returned digits
+
+ $ret = substr($this->_icarrier, 0, $len);
+ $this->_icarrier = substr($this->_icarrier, $len);
+
+ // Adding the adequate value to icarlen, taking all operations into account
+
+ $this->_icarlen += 8 * $n - $len;
+
+ return $ret;
+ }
+
+ /**
+ * Read 1 bit.
+ *
+ * @access protected
+ */
+ final protected function _bitRead1()
+ {
+ // Faster reading of just 1 bit
+ // WARNING : requires icarrier to be originally empty !
+ // NO keeping track of carrier length
+
+ if ($this->_icarrier == '') {
+ $this->_icarrier = $this->_decBinDig(ord(fgetc($this->_ifhand)), 8);
+ }
+
+ $ret = substr($this->_icarrier, 0, 1);
+ $this->_icarrier = substr($this->_icarrier, 1);
+
+ return $ret;
+ }
+
+ /**
+ * Returns the binary representation of $x as a string, over $n digits, with
+ * as many initial zeros as necessary to cover that.
+ *
+ * Note: $n must be at least the number of digits in the binary
+ * representation of $x, i.e. 0 <= $x < 2^$n.
+ *
+ * @access protected
+ */
+ final protected function _decBinDig($x, $n)
+ {
+ return substr(decbin(pow(2, $n) + $x), 1);
+ }
+}
+
+?>
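Note on Text/Huffman.php: the bit-carrier logic of _decBinDig(), _bitWrite() and _bitWriteEnd() above can be tried in isolation with a minimal sketch like the one below. The function names dec_bin_dig() and pack_bits() are hypothetical stand-ins (the real methods are protected and work on object state), so treat this as an illustration of the technique rather than the class's own code.

```php
<?php
// Hypothetical standalone helpers mirroring _decBinDig() and
// _bitWrite()/_bitWriteEnd(); they are not part of the package.

// Binary representation of $x as a string over exactly $n digits.
// Assumes 0 <= $x < 2^$n, otherwise the result is not $n digits long.
function dec_bin_dig($x, $n)
{
    // Adding 2^$n forces a leading 1, which substr() then drops.
    return substr(decbin(pow(2, $n) + $x), 1);
}

// Pack a list of bit strings into bytes, 8 bits at a time.
// The last byte is padded on the right with zeros.
function pack_bits(array $chunks)
{
    $carrier = '';
    $out = '';

    foreach ($chunks as $bits) {
        $carrier .= $bits;

        // Flush every complete group of 8 bits as one byte.
        while (strlen($carrier) >= 8) {
            $out .= chr(bindec(substr($carrier, 0, 8)));
            $carrier = substr($carrier, 8);
        }
    }

    // Leftover bits: pad with zeros up to one full byte.
    if (strlen($carrier) > 0) {
        $out .= chr(bindec(str_pad($carrier, 8, '0', STR_PAD_RIGHT)));
    }

    return $out;
}

echo dec_bin_dig(5, 8), "\n";                              // 00000101
echo bin2hex(pack_bits(array('101', '11001', '1'))), "\n"; // b980
```

In the second call, '101' and '11001' together fill one byte (10111001, hex b9), and the single remaining bit '1' is padded to 10000000 (hex 80).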
402 Text/HuffmanCompress.php
@@ -0,0 +1,402 @@
+<?php
+
+// {{{ license
+
+/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */
+//
+// +----------------------------------------------------------------------+
+// | PHP Version 5 |
+// +----------------------------------------------------------------------+
+// | Copyright (c) 1997-2002 The PHP Group |
+// +----------------------------------------------------------------------+
+// | This source file is subject to version 2.0 of the PHP license, |
+// | that is bundled with this package in the file LICENSE, and is |
+// | available at through the world-wide-web at |
+// | http://www.php.net/license/2_02.txt. |
+// | If you did not receive a copy of the PHP license and are unable to |
+// | obtain it through the world-wide-web, please send a note to |
+// | license@php.net so we can mail you a copy immediately. |
+// +----------------------------------------------------------------------+
+// | Authors: Markus Nix <mnix@docuverse.de> |
+// | David Holmes <exaton@free.fr> (original version) |
+// +----------------------------------------------------------------------+
+//
+
+// }}}
+
+
+require_once 'Text/Huffman.php';
+
+
+/**
+ * Huffman Compression Class
+ *
+ * @package Text
+ */
+
+class Text_HuffmanCompress extends Text_Huffman
+{
+ // {{{ properties
+ /**
+ * Size of the input file, in bytes
+ * @access protected
+ */
+ protected $_ifsize;
+
+ /**
+ * Array of letter occurrences
+ * @access protected
+ */
+ protected $_occ;
+
+ /**
+ * Index of the root of the Huffman tree
+ * @access protected
+ */
+ protected $_hroot;
+
+ /**
+ * Array of character codes
+ * @access protected
+ */
+ protected $_codes;
+
+ /**
+ * Array of character code lengths
+ * @access protected
+ */
+ protected $_codelens;
+ // }}}
+
+
+ // {{{ constructor
+ /**
+ * Constructor
+ *
+ * @access public
+ */
+ public function __construct()
+ {
+ parent::__construct();
+
+ // Initializing compression-specific variables
+ $this->_ocarrier = '';
+ $this->_ocarlen = 0;
+ }
+ // }}}
+
+
+ /**
+ * Perform compression.
+ *
+ * @access public
+ */
+ public function compress()
+ {
+ if (!$this->_havefiles) {
+ throw new Exception('Files not provided.');
+ }
+
+ // Counting letter occurrences in input file
+ $this->_countOccurrences();
+
+ // Converting occurrences into basic nodes
+ // The nodes array has been initialized, as it will be filled with dynamic incrementation
+ $this->_occurrencesToNodes();
+
+ // Construction of the Huffman tree
+ $this->_makeHuffmanTree();
+
+ // Constructing character codes
+ $this->_makeCharCodes();
+
+ // !! No need for 8 bits of nb of chars in alphabet ?? still use $this->nbchars ? NO
+ // !! No need for 8+5+codelen bits of chars & codes ?? still use $this->_codelens array ? YES
+
+ // Header : passing the Huffman tree with an automatically stopping algorithm
+ $this->_transmitTree();
+
+ // End of header : number of chars actually encoded, over 3 bytes
+ $this->_bitWrite($this->_decBinDig($this->_ifsize, 24), 24);
+
+ // Contents: compressed data
+ rewind($this->_ifhand);
+
+ while (($char = fgetc($this->_ifhand)) !== false) {
+ $this->_bitWrite($this->_codes[$char], $this->_codelens[$char]);
+ }
+
+ // Finalising output, closing file handles
+ $this->_bitWriteEnd();
+
+ fclose($this->_ofhand);
+ fclose($this->_ifhand);
+ }
+
+ /**
+ * setFiles() is called to specify the paths to the input and output files.
+ * It calls a parent function for its role, then sets some compression-
+ * specific variables concerning files.
+ *
+ * @access public
+ */
+ public function setFiles($ifile = '', $ofile = '')
+ {
+ // Calling the parent function for this role
+ parent::setFiles($ifile, $ofile);
+
+ // Setting compression-specific variables concerning files
+ $this->_ifsize = filesize($this->_ifile);
+ }
+
+ /**
+ * Show info on characters codes created from the Huffman tree.
+ *
+ * @access public
+ */
+ public function getSCodes()
+ {
+ // Sorting characters by descending number of occurrences
+ arsort($this->_occ);
+
+ // Preparing informative $scodes array
+ $scodes = array();
+ foreach ($this->_occ as $char => $nbocc)
+ {
+ $tmp = '';
+
+ if (ord($char) >= 32) {
+ $schar = $char;
+ } else {
+ $schar = 'µ';
+ $tmp = ' (ASCII : ' . ord($char) . ')';
+ }
+
+ $nboccprefix = '';
+
+ for ($i = 0; $i < 6 - strlen($nbocc); $i++) {
+ $nboccprefix .= '0';
+ }
+
+ $occpercent = round($nbocc / $this->_ifsize * 100, 2);
+ $scodes[$schar] = '(' . $nboccprefix . $nbocc . ' occurrences, or ' . $occpercent . '%) ' . $this->_codes[$char] . ' (code on ' . $this->_codelens[$char] . ' bits)' . $tmp;
+ }
+
+ return $scodes;
+ }
+
+ /**
+ * Calculate compression ratio (estimated output size as a percentage of input size).
+ *
+ * @access public
+ */
+ public function getCompressionRatio()
+ {
+ // Simulating output file size
+ $csize = 0;
+
+ foreach ($this->_occ as $char => $nbocc) {
+ $csize += $nbocc * $this->_codelens[$char];
+ }
+
+ $nbchars = count($this->_occ);
+
+ $csize += 16 * ($nbchars - 1); // For Huffman tree in header
+ $csize += 24; // For nb. chars to read
+
+ $csize = ceil($csize / 8);
+ $cratio = round($csize / $this->_ifsize * 100, 2);
+
+ return $cratio;
+ }
+
+
+ // private methods
+
+ /**
+ * Count character occurrences in the file, to identify information
+ * quantities and later construct the Huffman tree.
+ *
+ * @access private
+ */
+ private function _countOccurrences()
+ {
+ while (($char = fgetc($this->_ifhand)) !== false) {
+ if (!isset($this->_occ[$char])) {
+ $this->_occ[$char] = 1;
+ } else {
+ $this->_occ[$char]++;
+ }
+ }
+ }
+
+ /**
+ * Convert the character occurrences to basic Nodes of according weight.
+ *
+ * @access private
+ */
+ private function _occurrencesToNodes()
+ {
+ foreach ($this->_occ as $char => $nboccs)
+ {
+ $node = array(
+ '_char' => $char,
+ '_w' => $nboccs,
+ '_par' => -1,
+ '_child0' => -1,
+ '_child1' => -1,
+ '_lndone' => false
+ );
+
+ $this->_nodes[] = $node;
+
+ }
+ }
+
+ /**
+ * Get the index of the first node of lightest weight in the nodes array.
+ *
+ * @access private
+ */
+ private function _findLightestNode()
+ {
+ $minw_nodenum = -1;
+ $minw = -1;
+
+ foreach ($this->_nodes as $nodenum => $node)
+ {
+ if (!$node['_lndone'] && ($minw == -1 || $node['_w'] < $minw)) {
+ $minw = $node['_w'];
+ $minw_nodenum = $nodenum;
+ }
+ }
+
+ return $minw_nodenum;
+ }
+
+ /**
+ * Create the Huffman tree, after the following algorithm :
+ * - Find the two nodes of least weight (least info value)
+ * - Set each one's parent to the index of a new node which has a weight equal to the sum of weights of the two
+ * - At the same time, specify the new node's children as being the two lightest nodes
+ * - Eliminate the two lightest nodes from further searches for lightest nodes
+ *
+ * This carries on until there is only one node difference between nodes
+ * constructed and nodes done : the root of the tree.
+ *
+ * By following the tree from root down to leaf, by successive children 0 or
+ * 1, we can thereafter establish the code for the character.
+ *
+ * @access private
+ */
+ private function _makeHuffmanTree()
+ {
+ $nbnodes = count($this->_nodes);
+ $nbnodesdone = 0;
+
+ while ($nbnodesdone < $nbnodes - 1) {
+ // Find two lightest nodes and consider them done
+ for ($i = 0; $i < 2; $i++) {
+ $ln[$i] = $this->_findLightestNode();
+ $this->_nodes[$ln[$i]]['_lndone'] = true;
+ }
+
+ $nbnodesdone += 2;
+
+ // Link them with a parent node of sum weight
+ // (whose parent is as yet unknown ; in the case of root, it will stay with -1)
+ $node = array(
+ '_char' => '',
+ '_w' => $this->_nodes[$ln[0]]['_w'] + $this->_nodes[$ln[1]]['_w'],
+ '_par' => -1,
+ '_child0' => $ln[0],
+ '_child1' => $ln[1],
+ '_lndone' => false
+ );
+
+ $this->_nodes[] = $node;
+
+ $this->_nodes[$ln[0]]['_par'] = $nbnodes; // The number of nodes before incrementation is the index
+ $this->_nodes[$ln[1]]['_par'] = $nbnodes; // of the node which has just been created
+
+ $nbnodes++;
+ }
+
+ // Note that the last node is the root of the tree
+ $this->_hroot = $nbnodes - 1;
+ }
+
+ /**
+ * Read the Huffman tree to determine character codes.
+ *
+ * @access private
+ */
+ private function _makeCharCodes()
+ {
+ // Note : original alphabet is the keys of $occ
+ $i = 0;
+
+ foreach ($this->_occ as $char => $nbocc) {
+ $code = '';
+ $codelen = 0;
+
+ // Following tree back up to root
+ // (therefore prepending each new bit to the code)
+ // $this->_nodes[$i] is the original Node of $char
+ $curnode = $i;
+
+ do {
+ $parnode = $this->_nodes[$curnode]['_par'];
+ $code = (($this->_nodes[$parnode]['_child0'] == $curnode)? '0' : '1') . $code;
+ $codelen++;
+ $curnode = $parnode;
+ } while ($curnode != $this->_hroot);
+
+ $this->_codes[$char] = $code;
+ $this->_codelens[$char] = $codelen;
+
+ $i++;
+ }
+ }
+
+ /**
+ * Transmit Huffman tree.
+ *
+ * @access private
+ */
+ private function _transmitTree()
+ {
+ // Launching the business, specifying that we are starting at root
+ $this->_transmitTreePart($this->_hroot, true);
+ }
+
+ /**
+ * Transmit the Huffman tree in header.
+ *
+ * @access private
+ */
+ private function _transmitTreePart($nodenum, $isroot)
+ {
+ // Transmitting current node representation, if we are not working with root (that's only the first time).
+ // Then looking at children if appropriate (gee that sounds bad).
+ $curnode = $this->_nodes[$nodenum];
+ $char = $curnode['_char'];
+
+ if ($char === '') {
+ // Branch Node
+ // Being root can only be in this case
+
+ if (!$isroot) {
+ $this->_bitWrite($this->_nodeChar, 8);
+ }
+
+ // Looking at children
+ $this->_transmitTreePart($curnode['_child0'], false);
+ $this->_transmitTreePart($curnode['_child1'], false);
+ } else {
+ // Leaf Node
+ // Just transmitting the char
+ $this->_bitWrite($this->_decBinDig(ord($char), 8), 8);
+ }
+ }
+}
+
+?>
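Note on Text/HuffmanCompress.php: the tree construction in _makeHuffmanTree() and the leaf-to-root walk in _makeCharCodes() above can be followed on a tiny input with the sketch below. huffman_codes() is a hypothetical standalone function that works on a string instead of a file; giving a lone distinct character the code '0' is an assumption of this sketch, since the class does not appear to special-case that input.

```php
<?php
// Hypothetical standalone sketch of the "merge the two lightest nodes" rule.
function huffman_codes($text)
{
    if ($text === '') {
        return array();
    }

    // Count occurrences in order of first appearance.
    $occ = array();
    foreach (str_split($text) as $char) {
        $occ[$char] = isset($occ[$char]) ? $occ[$char] + 1 : 1;
    }

    // One leaf per distinct character, in the same order as $occ.
    $nodes = array();
    foreach ($occ as $w) {
        $nodes[] = array('w' => $w, 'par' => -1, 'child0' => -1, 'done' => false);
    }

    // Each merge replaces two unmerged nodes by one parent.
    for ($remaining = count($nodes); $remaining > 1; $remaining--) {
        $pair = array();
        for ($i = 0; $i < 2; $i++) {
            // First unmerged node of lightest weight.
            $best = -1;
            foreach ($nodes as $idx => $node) {
                if (!$node['done'] && ($best === -1 || $node['w'] < $nodes[$best]['w'])) {
                    $best = $idx;
                }
            }
            $nodes[$best]['done'] = true;
            $pair[$i] = $best;
        }

        $parent = count($nodes);
        $nodes[] = array(
            'w'      => $nodes[$pair[0]]['w'] + $nodes[$pair[1]]['w'],
            'par'    => -1,
            'child0' => $pair[0],
            'done'   => false
        );
        $nodes[$pair[0]]['par'] = $parent;
        $nodes[$pair[1]]['par'] = $parent;
    }

    // The last node created is the root.
    $root = count($nodes) - 1;

    // Walk from each leaf up to the root, prepending one bit per step.
    $codes = array();
    $i = 0;
    foreach ($occ as $char => $w) {
        $code = '';
        for ($cur = $i; $cur !== $root; $cur = $nodes[$cur]['par']) {
            $par = $nodes[$cur]['par'];
            $code = ($nodes[$par]['child0'] === $cur ? '0' : '1') . $code;
        }
        // Assumption of this sketch: a lone symbol gets the one-bit code '0'.
        $codes[$char] = ($code === '') ? '0' : $code;
        $i++;
    }

    return $codes;
}

echo json_encode(huffman_codes('aaaabbc')), "\n"; // {"a":"1","b":"01","c":"00"}
```

For 'aaaabbc' the payload is 4*1 + 2*2 + 1*2 = 10 bits. With the estimate used in getCompressionRatio(), the header adds 16 * (3 - 1) = 32 bits for the tree and 24 bits for the character count, giving 66 bits, or 9 bytes once rounded up. A 7-byte input therefore grows, which is expected for very short texts.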
206 Text/HuffmanExpand.php
@@ -0,0 +1,206 @@
+<?php
+
+// {{{ license
+
+/* vim: set expandtab tabstop=4 shiftwidth=4 softtabstop=4 foldmethod=marker: */
+//
+// +----------------------------------------------------------------------+
+// | PHP Version 5 |
+// +----------------------------------------------------------------------+
+// | Copyright (c) 1997-2002 The PHP Group |
+// +----------------------------------------------------------------------+
+// | This source file is subject to version 2.0 of the PHP license, |
+// | that is bundled with this package in the file LICENSE, and is |
+// | available at through the world-wide-web at |
+// | http://www.php.net/license/2_02.txt. |
+// | If you did not receive a copy of the PHP license and are unable to |
+// | obtain it through the world-wide-web, please send a note to |
+// | license@php.net so we can mail you a copy immediately. |
+// +----------------------------------------------------------------------+
+// | Authors: Markus Nix <mnix@docuverse.de> |
+// | David Holmes <exaton@free.fr> (original version) |
+// +----------------------------------------------------------------------+
+//
+
+// }}}
+
+
+require_once 'Text/Huffman.php';
+
+
+/**
+ * Huffman Expansion Class
+ *
+ * @package Text
+ */
+
+class Text_HuffmanExpand extends Text_Huffman
+{
+ // {{{ properties
+ /**
+ * Size of the output file, in bytes
+ * @access protected
+ */
+ protected $_ofsize;
+
+ /**
+ * For use in Huffman Tree reconstruction
+ * @access protected
+ */
+ protected $_ttlnodes;
+ // }}}
+
+
+ // {{{ constructor
+ /**
+ * Constructor
+ *
+ * @access public
+ */
+ public function __construct()
+ {
+ parent::__construct();
+
+ // Initializing expansion-specific variables
+ $this->_icarrier = '';
+ $this->_icarlen = 0;
+ }
+ // }}}
+
+
+ /**
+ * Perform expansion.
+ *
+ * @access public
+ */
+ public function expand()
+ {
+ if (!$this->_havefiles) {
+ throw new Exception('Files not provided.');
+ }
+
+ // From header: reading Huffman tree (with no weights, mind you)
+ $this->_reconstructTree();
+
+ // From header: number of characters to read (i.e. size of output file)
+ $this->_ofsize = bindec($this->_bitRead(24));
+
+ // Reading bit-by-bit and generating output
+ $this->_readToMakeOutput();
+
+ // Writing the output and closing resource handles
+ fwrite($this->_ofhand, $this->_odata);
+
+ fclose($this->_ofhand);
+ fclose($this->_ifhand);
+ }
+
+
+ // private methods
+
+ /**
+ * Create one child node from a character read in the header and attach it
+ * to its parent; recurse if the child is itself a branch node.
+ *
+ * @access private
+ */
+ private function _readTPForChild($par, $child, $childid, $charin)
+ {
+ // Creating child, setting right parent and right child for parent
+ $this->_nodes[$par][$child] = $childid;
+
+ $char = ($charin == $this->_nodeCharC)? '' : $charin;
+
+ $node = array(
+ '_char' => $char,
+ '_w' => 0,
+ '_par' => $par,
+ '_child0' => -1,
+ '_child1' => -1,
+ '_lndone' => false
+ );
+
+ $this->_nodes[$childid] = $node;
+
+ // Special business if we have a Branch Node
+ // Doing all of this for the child!
+ if ($char === '') {
+ $this->_readTreePart($childid);
+ }
+ }
+
+ /**
+ * Read the two children of the given branch node from the header.
+ *
+ * @access private
+ */
+ private function _readTreePart($nodenum)
+ {
+ // Reading from the header, creating a child
+ $charin = fgetc($this->_ifhand);
+ $this->_readTPForChild($nodenum, '_child0', ++$this->_ttlnodes, $charin);
+
+ $charin = fgetc($this->_ifhand);
+ $this->_readTPForChild($nodenum, '_child1', ++$this->_ttlnodes, $charin);
+ }
+
+ /**
+ * Reconstruct the Huffman tree transmitted in the header.
+ *
+ * @access private
+ */
+ private function _reconstructTree()
+ {
+ // Creating Root Node. Here root is indexed 0.
+ // Its parent is -1, its children are as yet unknown.
+ // NOTE : weights no longer have the slightest importance here
+
+ $node = array(
+ '_char' => '',
+ '_w' => 0,
+ '_par' => -1,
+ '_child0' => -1,
+ '_child1' => -1,
+ '_lndone' => false
+ );
+
+ $this->_nodes[0] = $node;
+
+ // Launching the business
+ $this->_ttlnodes = 0; // Init value
+ $this->_readTreePart(0);
+ }
+
+ /**
+ * We follow the tree down from the given node with the successive bits read.
+ * We know we have found the character as soon as we hit a leaf node.
+ *
+ * @access private
+ */
+ private function _readUntilLeaf($curnode)
+ {
+ if ($curnode['_char'] !== '') {
+ return $curnode['_char'];
+ }
+
+ if ($this->_bitRead1()) {
+ return $this->_readUntilLeaf($this->_nodes[$curnode['_child1']]);
+ }
+
+ return $this->_readUntilLeaf($this->_nodes[$curnode['_child0']]);
+ }
+
+ /**
+ * Reading the compressed data bit-by-bit and generating the output.
+ *
+ * Huffman codes have the unique-prefix property, so as soon as
+ * we recognise a code, we can emit the corresponding char.
+ * By decoding $_ofsize chars from the file, we should get
+ * to the end of it.
+ *
+ * @access private
+ */
+ private function _readToMakeOutput()
+ {
+ for ($i = 0; $i < $this->_ofsize; $i++) {
+ $this->_odata .= $this->_readUntilLeaf($this->_nodes[0]);
+ }
+ }
+}
+
+?>
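Note on Text/HuffmanExpand.php: the bit-by-bit tree walk of _readUntilLeaf() and the counted loop of _readToMakeOutput() above can be illustrated with the sketch below. decode_bits() is a hypothetical standalone function: it takes the bits as a string of '0' and '1' and rebuilds a small trie from a code table instead of reading the tree from a file header. It assumes a complete prefix code and a well-formed bit string.

```php
<?php
// Hypothetical standalone sketch of prefix-code decoding by tree walk.
function decode_bits($bits, array $codes, $count)
{
    // Build a binary trie: each branch has children keyed 0 and 1,
    // each leaf carries its character under the 'char' key.
    $root = array();
    foreach ($codes as $char => $code) {
        $node = &$root;
        foreach (str_split($code) as $bit) {
            if (!isset($node[$bit])) {
                $node[$bit] = array();
            }
            $node = &$node[$bit];
        }
        $node['char'] = (string) $char;
        unset($node); // break the reference before the next code
    }

    // Decode exactly $count characters; any bits left over are padding.
    $out = '';
    $pos = 0;
    for ($i = 0; $i < $count; $i++) {
        $node = $root;
        while (!isset($node['char'])) {
            $node = $node[$bits[$pos++]];
        }
        $out .= $node['char'];
    }

    return $out;
}

$codes = array('a' => '1', 'b' => '01', 'c' => '00');

// 10 payload bits followed by 6 zero bits of padding.
echo decode_bits('1111010100000000', $codes, 7), "\n"; // aaaabbc
```

The explicit count is what makes the padding harmless: without it, the six trailing zero bits would decode as three extra 'c' characters, which is presumably why the header stores the number of characters over 3 bytes.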
19 docs/example_compress.php
@@ -0,0 +1,19 @@
+<?php
+
+set_time_limit( 60 );
+
+require 'Text/HuffmanCompress.php';
+
+
+$hc = new Text_HuffmanCompress();
+
+try {
+ $hc->setFiles("sample_orig.txt", "sample_compressed.txt");
+ $hc->compress();
+
+ echo "Done.";
+} catch (Exception $e) {
+ echo $e->getMessage();
+}
+
+?>
19 docs/example_expand.php
@@ -0,0 +1,19 @@
+<?php
+
+set_time_limit( 60 );
+
+require 'Text/HuffmanExpand.php';
+
+
+$he = new Text_HuffmanExpand();
+
+try {
+ $he->setFiles("sample_compressed.txt", "sample_expanded.txt");
+ $he->expand();
+
+ echo "Done.";
+} catch (Exception $e) {
+ echo $e->getMessage();
+}
+
+?>
10 docs/sample_orig.txt
@@ -0,0 +1,10 @@
+The year 1866 was marked by a bizarre development, an unexplained and downright inexplicable phenomenon that surely no one has forgotten. Without getting into those rumors that upset civilians in the seaports and deranged the public mind even far inland, it must be said that professional seamen were especially alarmed. Traders, shipowners, captains of vessels, skippers, and master mariners from Europe and America, naval officers from every country, and at their heels the various national governments on these two continents, were all extremely disturbed by the business.
+In essence, over a period of time several ships had encountered "an enormous thing" at sea, a long spindle-shaped object, sometimes giving off a phosphorescent glow, infinitely bigger and faster than any whale.
+The relevant data on this apparition, as recorded in various logbooks, agreed pretty closely as to the structure of the object or creature in question, its unprecedented speed of movement, its startling locomotive power, and the unique vitality with which it seemed to be gifted. If it was a cetacean, it exceeded in bulk any whale previously classified by science. No naturalist, neither Cuvier nor Lacépède, neither Professor Dumeril nor Professor de Quatrefages, would have accepted the existence of such a monster sight unseen -- specifically, unseen by their own scientific eyes.
+Striking an average of observations taken at different times -- rejecting those timid estimates that gave the object a length of 200 feet, and ignoring those exaggerated views that saw it as a mile wide and three long--you could still assert that this phenomenal creature greatly exceeded the dimensions of anything then known to ichthyologists, if it existed at all.
+Now then, it did exist, this was an undeniable fact; and since the human mind dotes on objects of wonder, you can understand the worldwide excitement caused by this unearthly apparition. As for relegating it to the realm of fiction, that charge had to be dropped.
+In essence, on July 20, 1866, the steamer Governor Higginson, from the Calcutta & Burnach Steam Navigation Co., encountered this moving mass five miles off the eastern shores of Australia. Captain Baker at first thought he was in the presence of an unknown reef; he was even about to fix its exact position when two waterspouts shot out of this inexplicable object and sprang hissing into the air some 150 feet. So, unless this reef was subject to the intermittent eruptions of a geyser, the Governor Higginson had fair and honest dealings with some aquatic mammal, until then unknown, that could spurt from its blowholes waterspouts mixed with air and steam.
+Similar events were likewise observed in Pacific seas, on July 23 of the same year, by the Christopher Columbus from the West India & Pacific Steam Navigation Co. Consequently, this extraordinary cetacean could transfer itself from one locality to another with startling swiftness, since within an interval of just three days, the Governor Higginson and the Christopher Columbus had observed it at two positions on the charts separated by a distance of more than 700 nautical leagues.
+Fifteen days later and 2,000 leagues farther, the Helvetia from the Compagnie Nationale and the Shannon from the Royal Mail line, running on opposite tacks in that part of the Atlantic lying between the United States and Europe, respectively signaled each other that the monster had been sighted in latitude 42 degrees 15' north and longitude 60 degrees 35' west of the meridian of Greenwich. From their simultaneous observations, they were able to estimate the mammal's minimum length at more than 350 English feet; this was because both the Shannon and the Helvetia were of smaller dimensions, although each measured 100 meters stem to stern. Now then, the biggest whales, those rorqual whales that frequent the waterways of the Aleutian Islands, have never exceeded a length of 56 meters--if they reach even that.
+One after another, reports arrived that would profoundly affect public opinion: new observations taken by the transatlantic liner Pereire, the Inman line's Etna running afoul of the monster, an official report drawn up by officers on the French frigate Normandy, dead-earnest reckonings obtained by the general staff of Commodore Fitz-James aboard the Lord Clyde. In lighthearted countries, people joked about this phenomenon, but such serious, practical countries as England, America, and Germany were deeply concerned.
+In every big city the monster was the latest rage; they sang about it in the coffee houses, they ridiculed it in the newspapers, they dramatized it in the theaters. The tabloids found it a fine opportunity for hatching all sorts of hoaxes. In those newspapers short of copy, you saw the reappearance of every gigantic imaginary creature, from "Moby Dick," that dreadful white whale from the High Arctic regions, to the stupendous kraken whose tentacles could entwine a 500-ton craft and drag it into the ocean depths. They even reprinted reports from ancient times: the views of Aristotle and Pliny accepting the existence of such monsters, then the Norwegian stories of Bishop Pontoppidan, the narratives of Paul Egede, and finally the reports of Captain Harrington -- whose good faith is above suspicion--in which he claims he saw, while aboard the Castilian in 1857, one of those enormous serpents that, until then, had frequented only the seas of France's old extremist newspaper, The Constitutionalist.
42 package.xml
@@ -0,0 +1,42 @@
+<?xml version="1.0" encoding="ISO-8859-1" ?>
+<!DOCTYPE package SYSTEM "http://pear.php.net/dtd/package-1.0">
+<package version="1.0">
+ <name>Text_Huffman</name>
+ <summary>Huffman compression</summary>
+ <description>Huffman compression is a lossless compression algorithm that is ideal for compressing textual data.</description>
+ <maintainers>
+ <maintainer>
+ <user>mnx</user>
+ <name>Markus Nix</name>
+ <email>mnix@docuverse.de</email>
+ <role>lead</role>
+ </maintainer>
+ </maintainers>
+ <release>
+ <version>0.2.0</version>
+ <date>2004-07-03</date>
+ <license>LGPL</license>
+ <state>beta</state>
+ <notes>Initial release.</notes>
+ <deps>
+ <dep type="php" rel="ge" version="5.0.0RC1" optional="no"/>
+ </deps>
+ <filelist>
+ <file role="php" baseinstalldir="Text" name="Huffman.php"/>
+ <file role="php" baseinstalldir="Text" name="HuffmanCompress.php"/>
+ <file role="php" baseinstalldir="Text" name="HuffmanExpand.php"/>
+ <file role="test" baseinstalldir="Text" name="test/example_compress.php"/>
+ <file role="test" baseinstalldir="Text" name="test/example_expand.php"/>
+ <file role="test" baseinstalldir="Text" name="test/sample_orig.txt"/>
+ </filelist>
+ </release>
+ <changelog>
+ <release>
+ <version>0.2.0b1</version>
+ <date>2004-08-08</date>
+ <state>beta</state>
+ <notes>Initial release.
+</notes>
+ </release>
+ </changelog>
+</package>