Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Style hashes incomplete, causing loss of styles. #46

Closed
wants to merge 1 commit into from

Conversation

bazzisoft
Copy link

Style hashes are incomplete, not accounting for quite a few style attributes. This prevents usage of many style cominations as they are thrown out during style alignment.

@kevmo314
Copy link
Collaborator

I don't think this actually fixes anything. Certain items were left out of the hash function by design as an optimization, as python's dict first checks __hash__ then __eq__. This allows effectively a cheap pre-check for the hash table. This is not in violation of the definition of the hash function either.

Can you provide an example where style alignment fails?

@bazzisoft
Copy link
Author

Looking through the code again, I see what you're saying. I think the definition of the __eq__ functions are incorrect when Utility.YOLO = True, in Alignment, Borders, Font and Style.

What is the reason for having a different definition of __eq__ when YOLO = True?

Here is a short script that demonstrates the issue, and also shows some failing equality comparisons:

from pyexcelerate import Workbook, Style, Font, Utility

style1 = Style(font=Font(size=18, bold=False))
style2 = Style(font=Font(size=12, bold=True))
style3 = Style(font=Font(size=12, bold=False))

wb = Workbook()
ws = wb.new_sheet("Sheet1")
ws.set_cell_value(1, 1, 'This should be size 18 regular')
ws.set_cell_style(1, 1, style1)
ws.set_cell_value(2, 1, 'This should be size 12 bold')
ws.set_cell_style(2, 1, style2)
ws.set_cell_value(3, 1, 'This should be size 12 regular')
ws.set_cell_style(3, 1, style3)
wb.save("output.xlsx")


# Compare some styles
def compare(x, y):
    Utility.YOLO = False
    print('{} == {}: {}'.format(x, y, x == y))
    Utility.YOLO = True
    print('{} == {}: {}\n'.format(x, y, x == y))

compare(Style(font=Font(size=12)), Style(font=Font(size=18)))
compare(Style(font=Font(bold=False)), Style(font=Font(bold=True)))
compare(Font(size=12), Font(size=18))
compare(Font(bold=False), Font(bold=True))

@kevmo314
Copy link
Collaborator

Your example doesn't check the hash code though, which is half the point. The YOLO switch is meant to reduce the number of comparisons required by offloading half the comparisons to the hash function and the other half to __eq__ like a bloom filter. I wrote a blog post about this actually here. The reason there's a switch to enable this is because when YOLO is enabled, it violates python's definition of __eq__, which is not desirable. It conveniently exploits some duplicate execution behavior in python to improve performance.

I did notice that self._wrap_text was missing from both __eq__ and __hash__ so that's actually a bug, but the general principle of only checking half the properties is intentional by design.

@bazzisoft
Copy link
Author

I'm starting to appreciate the cleverness here, however my sample code above still produces a pretty significant bug. Given your explanation I can't quite figure why it's happening though.

If you run the code sample above and then open output.xlsx, you'll see that the text in Row 3 is actually 18pt instead of 12pt font size.

Do you have any ideas how this could be happening?

@kevmo314
Copy link
Collaborator

After some debugging I figured out why this is occurring, it's due to the hash function in Style only checking the hash for Font and Fill, so the equality constraints are never checked since Style is a parent class for these. That is, Font and other type classes needs to be checked for hash and equality, but the primitive properties can still satisfy the half hash half equality behavior.

I think a reasonable fix is to hash all the non-primitive children in Style and all other parent-containing classes as this retains the performance benefit. I'll get a fix for this in a bit, thanks for finding it, this is quite an obscure bug. :)

@kevmo314
Copy link
Collaborator

Pushed the fix to dev, I tested your code and it now works. I believe I covered all the cases where this could occur, but feel free to file another issue if you find a case that I missed.

@kevmo314 kevmo314 closed this Oct 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants