New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support more styles for xlsxwriter #16149

Merged
merged 16 commits into from Oct 31, 2017

Conversation

Projects
None yet
3 participants
@jnothman
Contributor

jnothman commented Apr 26, 2017

I was surprised to find that despite the interchangeable representation of Excel styles, xlsxwriter did not have good style support.

I've not added direct tests for this functionality, but test some of it through test_styler_to_excel.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master --name-only -- '*.py' | flake8 --diff
  • whatsnew entry
@jnothman

This comment has been minimized.

Contributor

jnothman commented Apr 27, 2017

There is a consistent test failure, but not one I've managed to replicate locally.

@codecov

This comment has been minimized.

codecov bot commented Apr 27, 2017

Codecov Report

Merging #16149 into master will decrease coverage by <.01%.
The diff coverage is 89.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16149      +/-   ##
==========================================
- Coverage   90.83%   90.83%   -0.01%     
==========================================
  Files         159      159              
  Lines       50796    50809      +13     
==========================================
+ Hits        46143    46153      +10     
- Misses       4653     4656       +3
Flag Coverage Δ
#multiple 88.61% <89.65%> (-0.01%) ⬇️
#single 40.3% <3.44%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/io/excel.py 80.55% <89.65%> (-0.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b80ed3...c935f5d. Read the comment docs.

@codecov

This comment has been minimized.

codecov bot commented Apr 27, 2017

Codecov Report

Merging #16149 into master will decrease coverage by 0.02%.
The diff coverage is 92.1%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16149      +/-   ##
==========================================
- Coverage   91.24%   91.21%   -0.03%     
==========================================
  Files         163      163              
  Lines       50091    50106      +15     
==========================================
+ Hits        45704    45706       +2     
- Misses       4387     4400      +13
Flag Coverage Δ
#multiple 89.02% <92.1%> (-0.01%) ⬇️
#single 40.23% <10.52%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/excel.py 80.39% <92.1%> (-0.01%) ⬇️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.41% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5959ee3...80ed56a. Read the comment docs.

@jnothman

This comment has been minimized.

Contributor

jnothman commented Apr 27, 2017

Finally passing tests after identifying a recently fixed openpyxl bug that got in the way

@jreback jreback added the IO Excel label Apr 27, 2017

if num_format_str is not None:
xl_format.set_num_format(num_format_str)
props['num_format'] = num_format_str

This comment has been minimized.

@jreback

jreback Apr 27, 2017

Contributor

can we move more of this logic out of here an into the formats dir somewhere?

This comment has been minimized.

@jnothman

jnothman Apr 27, 2017

Contributor

Do you mean explicitly moving the number format logic out? Yes, perhaps that's a worthwhile refactoring. I think we should also be calculating the number format from the display.precision config. For which reason, I believe all of those changes belong in a different PR.

Or are you talking about moving this mapping logic out? Well currently we assume nested style dicts as an interchange format, which are well-suited to openpyxl but need conversion for all writers. The stuff in formats/ should remain relatively writer-agnostic.

This comment has been minimized.

@jreback

jreback Apr 27, 2017

Contributor

ideally all of this logic would just be a single call here and the logic elsewhere.

This comment has been minimized.

@jnothman

jnothman Apr 27, 2017

Contributor

I don't think I've grokked your vision, given that this is writer specific. Do you mean that there should be more refactoring across writers? Except for this number formatting, it's already quite factored, as they each have different syntaxes for creating and formatting cells.

This comment has been minimized.

@jreback

jreback Apr 27, 2017

Contributor

yes i think think excel should be refactored into a subdir of writer code and style things should live there

maybe make an issue about this
it's a bit of work to split it then adding things like style should be easy

@jnothman

This comment has been minimized.

Contributor

jnothman commented Apr 29, 2017

I'm happy to make an issue aiming to refactor excel writing code. But how do you feel about this PR?

@jnothman jnothman referenced this pull request May 8, 2017

Closed

Use cssdecl package for resolving CSS #16170

4 of 4 tasks complete
@jreback

This comment has been minimized.

Contributor

jreback commented Jun 10, 2017

can you rebase.

jnothman added some commits Jun 14, 2017

@jnothman

This comment has been minimized.

Contributor

jnothman commented Jun 14, 2017

I've moved the what's new to 0.20.

style_dict = style_dict.copy()
style_dict['border'] = style_dict.pop('borders')
for src, dst in self.STYLE_MAPPING:

This comment has been minimized.

@jreback

jreback Jun 28, 2017

Contributor

so this only is triggered if there is styling (IOW this won't cause a perf issue for 'regular' excel)?

This comment has been minimized.

@jnothman

jnothman Aug 17, 2017

Contributor

A few lines above we return if style_dict is None; a few lines above that we return if num_format_str is None and style_dict is None. I think that is sufficient.

Btw, I think even the default to_excel has some styling of headers, so this function will always be called, but will be returned early where possible.

There are ways to make this faster, though:

  • store STYLE_MAPPING as a trie and descend recursively only where a prefix is matched.
  • flatten style_dict and store STYLE_MAPPING as a dict so that their keys match. But to be deterministic in case of multiple competing styles, STYLE_MAPPING would need to store the matched index, and the results would need to be sorted.

This comment has been minimized.

@jnothman

jnothman Aug 17, 2017

Contributor

I'm pushing a faster variant.

@@ -1609,6 +1609,68 @@ def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0,
startcol + cell.col,
val, style)
# Map from openpyxl-oriented styles to flatter xlsxwriter representation

This comment has been minimized.

@jreback

jreback Jun 28, 2017

Contributor

I think the code would be simpler to make this style formatting into a separate class (rather than have it live in functions sitting in the main excel code). can you refactor to make this cleaner.

@jreback

This comment has been minimized.

Contributor

jreback commented Aug 17, 2017

this looks reasonable, can you rebase

@jnothman

This comment has been minimized.

Contributor

jnothman commented Aug 17, 2017

Yes, sorry I've not managed to do the refactoring you'd like to see. I've been unsure what you would like, and have had my attentions elsewhere.

@jnothman

AppVeyor failure looks like someone else's problem.

style_dict = style_dict.copy()
style_dict['border'] = style_dict.pop('borders')
for src, dst in self.STYLE_MAPPING:

This comment has been minimized.

@jnothman

jnothman Aug 17, 2017

Contributor

A few lines above we return if style_dict is None; a few lines above that we return if num_format_str is None and style_dict is None. I think that is sufficient.

Btw, I think even the default to_excel has some styling of headers, so this function will always be called, but will be returned early where possible.

There are ways to make this faster, though:

  • store STYLE_MAPPING as a trie and descend recursively only where a prefix is matched.
  • flatten style_dict and store STYLE_MAPPING as a dict so that their keys match. But to be deterministic in case of multiple competing styles, STYLE_MAPPING would need to store the matched index, and the results would need to be sorted.
style_dict = style_dict.copy()
style_dict['border'] = style_dict.pop('borders')
for src, dst in self.STYLE_MAPPING:

This comment has been minimized.

@jnothman

jnothman Aug 17, 2017

Contributor

I'm pushing a faster variant.

@jreback

This comment has been minimized.

Contributor

jreback commented Aug 18, 2017

@jnothman

This comment has been minimized.

Contributor

jnothman commented Oct 16, 2017

Merge into 0.21 and avoid future what's new conflicts?

@jreback

This comment has been minimized.

Contributor

jreback commented Oct 16, 2017

you need to rebase and some comments to respond

@jnothman

This comment has been minimized.

Contributor

jnothman commented Oct 16, 2017

Well, your comments suggested a refactor, and I've previously commented that I do not understand your vision of a refactor in this space, and certainly have not found capacity within my other FOSS commitments to to do a general code quality improvement here.

Since those comments, you announced "this looks reasonable". I'll have another very brief look at refactoring.

@jnothman

This comment has been minimized.

Contributor

jnothman commented Oct 16, 2017

I made the requested change as I interpret it. I do not find it improves code quality.

@jnothman

This comment has been minimized.

Contributor

jnothman commented Oct 29, 2017

I haven't understood the cause of the travis failure, apparently in lint.sh but without error message that I can see.

I've merged in master and moved what's new to the next version.

@jreback

This comment has been minimized.

Contributor

jreback commented Oct 29, 2017

Linting *.py
pandas/io/excel.py:1794:1: E305 expected 2 blank lines after class or function definition, found 1

you can check this locally via:

make lint-diff (or just directly run flake8 on that file)

@jnothman

This comment has been minimized.

Contributor

jnothman commented Oct 29, 2017

@TomAugspurger

The clipboard failure on travis looks unrelated.

All good @jreback?

@jreback jreback added this to the 0.22.0 milestone Oct 31, 2017

@@ -22,7 +22,7 @@ New features
Other Enhancements
^^^^^^^^^^^^^^^^^^
-
- Better support for ``Dataframe.style.to_excel()`` output with the ``xlsxwriter`` engine. (:issue:`16149`)

This comment has been minimized.

@jreback

jreback Oct 31, 2017

Contributor

ok for now. it might make sense to enhance the excel docs with what is better now? (or maybe a listing of styles that work). but for a followup.

@jreback jreback merged commit 5d096f7 into pandas-dev:master Oct 31, 2017

2 of 3 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@jreback

This comment has been minimized.

Contributor

jreback commented Oct 31, 2017

thanks @jnothman

peterpanmj added a commit to peterpanmj/pandas that referenced this pull request Oct 31, 2017

No-Stream added a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment