Bug in Huffman algorithm #10864
Comments
comment:1
This happens when Sage is fed a dictionary where it expects a string.
This would have taken half a second to notice if Sage had raised an error instead of silently working on a dictionary as though it were a string. This patch should avoid the problem in the future. Nathann |
Author: Nathann Cohen |
comment:5
Oh.. Thanks ! :-) |
comment:6
Minor: the doctest says " Feeding a dictionary instead of a string:: ", but then feeds it a list of strings, not a dictionary. |
comment:7
OK, I see that I used the class in the wrong way :-) Wouldn't it be more user-friendly to change the Huffman constructor to accept only one unnamed argument, and then treat it according to its type, e.g.:
(Because of deprecation policy, I guess we would need to support the current interface in some way, which would make the code a bit messy, but that will only be for a year) |
comment:8
Hello ! Replying to @sagetrac-dsm:
Right. I just thought : "Let's feed it anything but a string"
I don't know what a Unicode string is, and I will try to find out immediately. How do you think we should filter it, then? Nathann |
comment:9
Replying to @nathanncohen:
I think the usual idiom in Python 2 is "isinstance(s, basestring)" to allow both. |
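The two suggestions above (a single argument treated according to its type, and the `basestring` check) might be combined roughly as in the sketch below. This is only an illustration: `frequency_table` and `string_types` are hypothetical names, not part of Sage's Huffman class, and the fallback to `str` is an assumption added so the sketch also runs on Python 3, where `basestring` no longer exists.

```python
from collections import Counter

try:
    string_types = basestring  # Python 2: covers both str and unicode
except NameError:
    string_types = str  # Python 3: str is already a Unicode string


def frequency_table(source):
    """Build a symbol -> weight dictionary from a string or a dictionary.

    Hypothetical helper for illustration; not Sage's actual constructor.
    """
    if isinstance(source, string_types):
        # A string: count how often each character occurs.
        return dict(Counter(source))
    if isinstance(source, dict):
        # A dictionary: assume it already maps symbols to weights.
        return dict(source)
    # Anything else is rejected loudly instead of being processed silently.
    raise ValueError("expected a string or a dictionary of frequencies")
```

With this shape, passing a list of strings raises an error immediately rather than producing a misleading encoding table.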
comment:10
Hello !! This is an updated patch in which string has been replaced by basestring, and a single argument is expected by the constructor. This class being pretty new and not really advertised, perhaps we can do without backward compatibility for once ?.. I mean, this class is perhaps only useful to illustrate what the Huffman algorithm is (slow encoding in Python..), and updating any possibly incompatible code (if it exists somewhere in the world) should not take more than a few seconds. Nathann |
Attachment: trac_10864.patch.gz |
comment:11
Nice patch. I agree with sidestepping the deprecation policy :-) Johan |
comment:12
Yep, works fine for me on 4.6.2, and code looks fine as well. Green light. |
comment:13
Replying to @johanrosenkilde:
I object to the positive review. Note that the documentation for the class |
comment:14
Argggggg !!! Sorry about that !!! I had been looking for occurrences of Nathann |
Attachment: trac_10864-2.patch.gz |
comment:16
Woops, sorry about that. Good thing you're awake Minh :-) I reread the whole thing with the new patch, and retested, rebuilt and redoc'ed. It seems alright now. |
comment:17
Replying to @johanrosenkilde:
Minh is always awake. Minh does not rest, and sees everything. Minh is like a better version of Chuck Norris invented by Chuck Norris. Nathann |
Reviewer: Johan Sebastian Rosenkilde Nielsen |
Merged: sage-4.7.alpha3 |
There seems to be a bug in the Huffman build algorithm when given a frequency dictionary.
The following example results in a wrong encoding table: Let there be 10 symbols numbered 1,..,10 where number i occurs with probability i/55.
The Huffman table I end up with when running the algorithm by hand is something like the following:
which has expected length 173/55.
The current Huffman algorithm returns
which has expected length 175/55.
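The optimal expected length for this example can be checked independently of Sage. The sketch below is a plain `heapq`-based Huffman merge written for this purpose (the function name `huffman_expected_length` is hypothetical, and this is not the code from the patch). It relies on the fact that the weighted code length equals the sum of the weights of all merged nodes.

```python
import heapq
from fractions import Fraction


def huffman_expected_length(weights):
    """Expected codeword length of an optimal binary prefix code.

    `weights` are unnormalised symbol weights; the result is divided by
    their sum, so weights 1..10 correspond to probabilities i/55.
    """
    heap = list(weights)
    total = sum(heap)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        # Merge the two lightest nodes; each merge adds one bit to every
        # symbol beneath the new node, i.e. adds the merged weight to the cost.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        cost += merged
        heapq.heappush(heap, merged)
    return Fraction(cost, total)


print(huffman_expected_length(range(1, 11)))  # 173/55
```

For the weights 1,..,10 this gives 173/55, matching the hand-computed table, so a result of 175/55 cannot come from a correct Huffman construction.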
Apply :
CC: @nathanncohen @sagetrac-mvngu
Component: coding theory
Keywords: huffman
Author: Nathann Cohen
Reviewer: Johan Sebastian Rosenkilde Nielsen
Merged: sage-4.7.alpha3
Issue created by migration from https://trac.sagemath.org/ticket/10864