Cell Styles like background color #36

Closed
jokerslab opened this issue Oct 28, 2013 · 19 comments
@jokerslab

How do I get the background color of a cell?

@elad

elad commented May 28, 2014

Very important for what I need to do. :) What can I do to help make this happen?

@SheetJSDev
Contributor

@elad love your enthusiasm :)

As is usually the case, the hardest part is settling on a JS representation. For example, XLSB uses a bit field for representing certain properties whereas XLSX uses its own kinda-sorta-like-HTML-but-not-quite rich text format.

My initial thought was to save an HTML representation for each cell (and in fact, XLSX does generate HTML by parsing the rich text runs), but that makes the reverse process somewhat tricky (how do you handle CSS styles? Do you parse the CSS to figure out if the text is bold?).

NOTE: Since the XLSX writer recomputes styles anyway, we don't need to stick to a style array or some other representation that is tightly coupled to the actual file representation.

Extracting the style information is not too difficult, once we settle on a representation:

ECMA-376 (47MB)

MS-XLSB (41MB)

@elad what do you think is the best way to store this information? Note that we don't necessarily have to store a pretty format: we can (and should) write functions that "parse" the intermediate representation and give output.

@jokerslab @gcoonrod @artemryzhov since you raised issues on the matter, hopefully you can chime in as well :)

@elad

elad commented May 28, 2014

(keep in mind I'm not very familiar with this stuff yet :)

I'm not sure HTML representation is the best way to do this. Some libraries seem to just expose the "raw" values for each field, a few examples:

http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.backgroundcolor.aspx

http://stackoverflow.com/questions/10756206/getting-cell-backgroundcolor-in-excel-with-open-xml-2-0

http://stackoverflow.com/questions/12043973/how-to-read-the-xlsx-color-infomation-by-using-openpyxl

By the way, I say "raw" because I printed the data in parse_sty_xml to see what's in it and I see that the relevant tags/attributes in the XML also appear in the different APIs. (What I'm still trying to figure out is what maps each cell's formatting to that style data...)

So it seems like the best way would be to at least give each cell a style object that contains the raw values... Makes sense?

@SheetJSDev
Contributor

What I'm still trying to figure out is what maps each cell's formatting to that style data..

The overall cell style is linked to the cell's "s" attribute (page 1589 of the ECMA-376 PDF I linked to). The relevant logic here is in the worksheet processing: https://github.com/SheetJS/js-xlsx/blob/master/bits/72_wsxml.js#L75-L80

...
            if(cell.s && styles.CellXf) {
                var cf = styles.CellXf[cell.s];
...
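For readers following along, the lookup chain can be sketched roughly as below. This is only an illustration, not the library's code: the `styles` object is hand-built, `Fills` and the `fillId` field are assumed names for the parsed fill table and the xf attribute, and `fillForCell` is a hypothetical helper.

```javascript
// Hypothetical parsed styles: CellXf mirrors <cellXfs>, Fills mirrors <fills>.
// All values here are made up for illustration.
const styles = {
  CellXf: [
    { numFmtId: 0, fillId: 0 },
    { numFmtId: 14, fillId: 0 },
    { numFmtId: 0, fillId: 2 }
  ],
  Fills: [
    { patternType: 'none' },
    { patternType: 'gray125' },
    { patternType: 'solid', fgColor: { theme: 8 }, bgColor: { indexed: 64 } }
  ]
};

// Resolve a cell's "s" index to a fill object, or null when there is none.
function fillForCell(cell, styles) {
  if (cell.s === undefined || !styles.CellXf) return null;
  const xf = styles.CellXf[cell.s];
  if (!xf || !xf.fillId) return null; // fillId 0 is treated as "no fill"
  return styles.Fills[xf.fillId] || null;
}

// "s" holds the style index read from the cell's "s" attribute.
const cell = { v: '2.4.2014', t: 's', s: 2 };
const fill = fillForCell(cell, styles);
console.log(fill);
```

The point is just that the cell never stores the fill itself: it stores an index into the xf table, and the xf entry in turn stores an index into the fill table.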

So it seems like the best way would be to at least give each cell a style object that contains the raw values... Makes sense?

If you want to see the raw value, it's already exposed by default in the (.r) field:

> require('xlsx').readFile('rich_text_stress.xlsx').Sheets.Sheet1.B13
{ v: 'this text is double accounting underlined sure enough',
  t: 's',
  r: '<r><t xml:space="preserve">this text is </t></r><r><rPr><u val="doubleAccounting"/><sz val="12"/><color theme="1"/><rFont val="Calibri"/><scheme val="minor"/></rPr><t>double accounting underlined</t></r><r><rPr><sz val="12"/><color theme="1"/><rFont val="Calibri"/><family val="2"/><scheme val="minor"/></rPr><t xml:space="preserve"> sure enough</t></r>',
  h: 'this text is <span style="">double accounting underlined</span><span style=""> sure enough</span>',
  w: 'this text is double accounting underlined sure enough' }

Unfortunately, XLS and XLSB and XLML use different representations :/ There are three ways around this:

  1. Convert everything to/from HTML

  2. Convert everything to/from the XLSX representation (in XLSX, it's already exposed as .r)

  3. Devise a new representation.

I'll think about it a bit more.

@elad do you know specifically what you need from the styles? In particular, do you need something like an HTML representation or just certain vitals (like background color, font, etc.)? In the latter case, we could probably craft a short style object with the basic details.

@elad

elad commented May 28, 2014

First, thanks for being so quick on the replies! :)

I had a feeling the mapping happened through s, although one of my rows had one cell with a different index (specifically, A1 through J1 had s="2" except for B1, which had s="3"), so I wasn't 100% sure.

I think the example you provided works only for cell-specific styling. For example, my worksheet applies styling to the entire row, and this is what I get:

{ v: '2.4.2014',
  t: 's',
  r: '<t>2.4.2014</t>',
  h: '2.4.2014',
  w: '2.4.2014' }

The color used is Aqua, Accent 5, Lighter 40%, which I believe corresponds to this:

<fill>
    <patternFill patternType="solid">
        <fgColor theme="8" tint="0.39997558519241921"/>
        <bgColor indexed="64"/>
    </patternFill>
</fill>

What am I missing in order to access this styling through .r?

What I need from the styles is the background color. I suspect a lot of people use row colors to signify meaning that isn't otherwise conveyed through an actual column. In my case, the row color represents "type" and in order to import data from Excel to a database I need to figure out what the row color is.

I think having something simple like you suggest would cover the needs of most folks who are interested in this feature, and if not would serve as a great foundation to further expand.

@SheetJSDev
Contributor

@elad row-level information is currently not made available via .r. :/

Let's settle on putting each cell's background in cell.s.bgcolor (and other one-offs, like foreground color, general bolding, will also be in the .s field).

Upon reflection, it requires a bit of work (because the themes are not currently processed). I will take a stab at it later today:

  1. The themes.xml file should be parsed to find the actual colors. Unfortunately, that is not currently done, but it would follow the same pattern as parsing styles (actually, somewhat easier since XLSB also uses themes.xml)

  2. You'll see the comment /* fills CT_Fills ? */ in parse_sty_xml. Here, the fills should be parsed (in the ECMA spec you'll see CT_Fills defined somewhere below -- for now, you can just focus on the patternFill, fgColor, bgColor). You can mirror the approach in cellXfs and numFmts

  3. In parse_cellXfs, when you see an <xf, check if it has a fillId. If it is nonzero, then add the fill object (just like how the number format is added).

  4. In parse_ws_xml there's a "formatting" comment. In the following block, it tries to find the cell format. At that point, add the fill information to the cell.
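Step 2 could look roughly like the sketch below. It is regex-based for brevity and only handles a patternFill with fgColor and bgColor; `parseFill` and `parseColorAttrs` are hypothetical names, not functions in the library, and the input is the fill fragment quoted earlier in this thread.

```javascript
// Turn the attribute text of a color tag into an object.
// "rgb" stays a string; other attributes (theme, tint, indexed) become numbers.
function parseColorAttrs(attrs) {
  const color = {};
  const re = /(\w+)="([^"]*)"/g;
  let m;
  while ((m = re.exec(attrs)) !== null) {
    color[m[1]] = m[1] === 'rgb' ? m[2] : Number(m[2]);
  }
  return color;
}

// Parse a single <fill> element; returns null if it has no typed patternFill.
function parseFill(xml) {
  const pattern = /<patternFill\b[^>]*patternType="([^"]*)"/.exec(xml);
  if (!pattern) return null;
  const fill = { patternType: pattern[1] };
  for (const name of ['fgColor', 'bgColor']) {
    const tag = new RegExp('<' + name + '\\b([^>]*)>').exec(xml);
    if (tag) fill[name] = parseColorAttrs(tag[1]);
  }
  return fill;
}

const fillXml = [
  '<fill>',
  '    <patternFill patternType="solid">',
  '        <fgColor theme="8" tint="0.39997558519241921"/>',
  '        <bgColor indexed="64"/>',
  '    </patternFill>',
  '</fill>'
].join('\n');

const parsedFill = parseFill(fillXml);
console.log(parsedFill);
```

A real implementation would also need to cover gradientFill and fills without a patternType, which this sketch ignores.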

@elad

elad commented May 29, 2014

Okay, I did as you said - except for the themes.xml part, because I'm not yet sure how to do that - and this is what I get:

{ v: '2.4.2014',
  t: 's',
  r: '<t>2.4.2014</t>',
  h: '2.4.2014',
  w: '2.4.2014',
  s: 
   { patternType: 'solid',
     fgColor: { theme: 8, tint: 0.3999755851924192 },
     bgColor: { indexed: 64 } } }

So now the cell's s field has the relevant fill data as it appears in the raw XML I printed earlier. Is this what you meant?

Hopefully it is, in which case: what do we do about themes.xml? I assume it requires changes in parse_zip, but I see there are type-specific parsing routines there (parse_sst, parse_sty, parse_wb, etc.). Does it require a similar parse_themes function to be written?

@elad

elad commented May 29, 2014

Following up... It seems the answer to my question is "yes."

I looked and saw that dir.themes is an array:

themes: [ '/xl/theme/theme1.xml' ]

So I printed the contents of this XML file and found the clrScheme collection the spec mentions (page 1757), and indeed, at index 8 (counting from 0) was the RGB value of the color for my row, sans tint! Given the spec also shows how to calculate the final color from the RGB value + tint (pages 1757-1758), I think we're good to go.
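A minimal sketch of that lookup, assuming the clrScheme children are read in document order. The XML is a hand-written fragment shaped like a typical theme1.xml (only accent5 is taken from this thread; the other values are illustrative), and `parseClrScheme` is a hypothetical name rather than the parse_themes function mentioned below.

```javascript
// Hand-written fragment; sysClr entries carry the color in lastClr, srgbClr in val.
const themeXml = [
  '<a:clrScheme name="Office">',
  '<a:dk1><a:sysClr val="windowText" lastClr="000000"/></a:dk1>',
  '<a:lt1><a:sysClr val="window" lastClr="FFFFFF"/></a:lt1>',
  '<a:dk2><a:srgbClr val="1F497D"/></a:dk2>',
  '<a:lt2><a:srgbClr val="EEECE1"/></a:lt2>',
  '<a:accent1><a:srgbClr val="4F81BD"/></a:accent1>',
  '<a:accent2><a:srgbClr val="C0504D"/></a:accent2>',
  '<a:accent3><a:srgbClr val="9BBB59"/></a:accent3>',
  '<a:accent4><a:srgbClr val="8064A2"/></a:accent4>',
  '<a:accent5><a:srgbClr val="4BACC6"/></a:accent5>',
  '<a:accent6><a:srgbClr val="F79646"/></a:accent6>',
  '<a:hlink><a:srgbClr val="0000FF"/></a:hlink>',
  '<a:folHlink><a:srgbClr val="800080"/></a:folHlink>',
  '</a:clrScheme>'
].join('\n');

// Collect { name, rgb } for each scheme entry, in document order.
function parseClrScheme(xml) {
  const colors = [];
  const entry = /<a:(\w+)>\s*<a:(srgbClr|sysClr)\b([^>]*?)\/>\s*<\/a:\1>/g;
  let m;
  while ((m = entry.exec(xml)) !== null) {
    const key = m[2] === 'sysClr' ? 'lastClr' : 'val';
    const value = new RegExp(key + '="([0-9A-Fa-f]{6})"').exec(m[3]);
    colors.push({ name: m[1], rgb: value ? value[1].toUpperCase() : null });
  }
  return colors;
}

const themeColors = parseClrScheme(themeXml);
console.log(themeColors[8]); // entry at index 8, counting from 0
```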

I'm now writing parse_themes (a single function for parsing XML, with no binary/XML differentiation, because I understand that's not necessary). I'll soon fork this tree and push my changes so you can take a look.

@SheetJSDev
Contributor

@elad it sounds like you have the right idea :) Looking forward to the PR

@elad

elad commented May 29, 2014

I think I got it. I added basic support for parsing the theme and tested it. I also added some utility functions that return the RGB color with the tint applied, to make it easier to get the actual color. Either the tinting algorithm in the specification is incorrect or I'm missing something, though, because it doesn't work if I use the version from the specification verbatim. Also, this probably needs a lot more testing, because I'm sure my use case isn't the only one. :)

Output:

$ node parser
A1:
 { v: '2.4.2014',
  t: 's',
  r: '<t>2.4.2014</t>',
  h: '2.4.2014',
  w: '2.4.2014',
  s: 
   { patternType: 'solid',
     fgColor: { theme: 8, tint: 0.3999755851924192, rgb: '9ED2E0' },
     bgColor: { indexed: 64 } } }
color:
 { name: 'accent5', rgb: '4BACC6' }
$

This shows the data for the A1 cell, including the RGB value with the tint applied (9ED2E0), as well as the theme color scheme definition for index 8, which in this case is (correctly) Accent 5, 4BACC6.
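For reference, here is a sketch of tinting as a luminance change in HLS space, which is how ECMA-376 describes it: with luminance scaled to 0..1, a negative tint gives L * (1 + tint) and a positive tint gives L * (1 - tint) + tint. This is not the code from the pull request; the helper names are hypothetical, the RGB/HSL conversion is the common textbook one, and the result for the example color need not match the value printed above.

```javascript
function hexToRgb(hex) {
  return [0, 2, 4].map((i) => parseInt(hex.slice(i, i + 2), 16));
}

function rgbToHex(rgb) {
  return rgb
    .map((c) => Math.round(c).toString(16).padStart(2, '0'))
    .join('')
    .toUpperCase();
}

// Returns [h, s, l], each in the range 0..1.
function rgbToHsl([r, g, b]) {
  r /= 255; g /= 255; b /= 255;
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const l = (max + min) / 2;
  if (max === min) return [0, 0, l]; // gray: no hue or saturation
  const d = max - min;
  const s = l > 0.5 ? d / (2 - max - min) : d / (max + min);
  let h;
  if (max === r) h = (g - b) / d + (g < b ? 6 : 0);
  else if (max === g) h = (b - r) / d + 2;
  else h = (r - g) / d + 4;
  return [h / 6, s, l];
}

// Returns [r, g, b] in the range 0..255 (not yet rounded).
function hslToRgb([h, s, l]) {
  if (s === 0) return [l * 255, l * 255, l * 255];
  const q = l < 0.5 ? l * (1 + s) : l + s - l * s;
  const p = 2 * l - q;
  const channel = (t) => {
    if (t < 0) t += 1;
    if (t > 1) t -= 1;
    if (t < 1 / 6) return p + (q - p) * 6 * t;
    if (t < 1 / 2) return q;
    if (t < 2 / 3) return p + (q - p) * (2 / 3 - t) * 6;
    return p;
  };
  return [h + 1 / 3, h, h - 1 / 3].map((t) => channel(t) * 255);
}

// tint is expected in the range -1..1: negative darkens, positive lightens.
function applyTint(hex, tint) {
  const [h, s, l] = rgbToHsl(hexToRgb(hex));
  const lum = tint < 0 ? l * (1 + tint) : l * (1 - tint) + tint;
  return rgbToHex(hslToRgb([h, s, lum]));
}

console.log(applyTint('4BACC6', 0.3999755851924192));
```

Only the luminance changes; hue and saturation are carried through, so a tint of 1 always ends at white and a tint of -1 at black.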

Will submit a pull request shortly. Please note that at the very least it should be marked as experimental. :)

@SheetJSDev
Contributor

@elad add an option cellStyles that defaults to false (see bits/84_defaults.js). If it is true, parse the themes file and populate the s field (so there should be a check in the parse_zip function as well as in the parse_ws_xml function).
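Assuming the option lands as described here, reading a file with `require('xlsx').readFile(path, { cellStyles: true })` would populate the s field, and a caller interested in row colors could walk the sheet as in the sketch below. The sheet object is hand-built to mirror the output shown earlier in this thread, and `fillColorsByRow` is a hypothetical helper, not part of the library.

```javascript
// Hand-built stand-in for workbook.Sheets.Sheet1 after parsing with cellStyles enabled.
const tintedFill = {
  patternType: 'solid',
  fgColor: { theme: 8, tint: 0.3999755851924192, rgb: '9ED2E0' },
  bgColor: { indexed: 64 }
};
const sheet = {
  '!ref': 'A1:B2',
  A1: { v: '2.4.2014', t: 's', s: tintedFill },
  B1: { v: 'some value', t: 's', s: tintedFill },
  A2: { v: 'no fill here', t: 's' }
};

// Map each row number to the fill color of the first filled cell found in it.
function fillColorsByRow(sheet) {
  const rows = {};
  for (const address of Object.keys(sheet)) {
    if (address[0] === '!') continue; // skip metadata keys such as !ref
    const match = /^[A-Z]+(\d+)$/.exec(address);
    const fill = sheet[address].s;
    if (!match || !fill || !fill.fgColor || !fill.fgColor.rgb) continue;
    const row = Number(match[1]);
    if (!(row in rows)) rows[row] = fill.fgColor.rgb;
  }
  return rows;
}

const rowColors = fillColorsByRow(sheet);
console.log(rowColors);
```

This covers the use case raised above, where the row color stands in for a "type" column during import.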

@elad

elad commented Oct 15, 2014

The code implementing the functionality requested by this issue has been merged; I think it's safe to close it.

@SheetJSDev
Contributor

@elad we'll close once XLSB and ODS also use the same format

@elad

elad commented Oct 15, 2014

Gotcha. If you manage to find a moment, please provide some status on date/style issues raised elsewhere - I'd like to help with the code but since I know you're working on a new version I'm afraid changes would be conflicting. :)

@tomasdev

Wow! Thanks @elad for this. My question now would be... how much effort would it take to add "XML Styles" compatibility?

Example:

Given I have

<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Arial Unicode MS" ss:Size="11" ss:Color="#000000"/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s58">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
</Style>
<Style ss:ID="s59">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Borders/>
<Font ss:Bold="1"/>
<Interior ss:Color="#99CC00" ss:Pattern="Solid"/>
<NumberFormat/>
<Protection/>
</Style>
<Style ss:ID="s60">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Borders/>
<Font ss:FontName="Calibri" ss:Size="14" ss:Bold="1"/>
<Interior ss:Color="#FF6600" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s61">
<Alignment ss:Horizontal="Left" ss:Vertical="Bottom"/>
<Font ss:FontName="Calibri" ss:Size="16" ss:Color="#FFFFFF" ss:Bold="1"/>
<Interior ss:Color="#333333" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s62">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<Font ss:Color="#FFFFFF" />
<Interior ss:Color="#333333" ss:Pattern="Solid"/>
</Style>
<Style ss:ID="s63">
<Alignment ss:Horizontal="Center" ss:Vertical="Bottom"/>
<NumberFormat ss:Format="0.0%"/>
</Style>
<Style ss:ID="s64" ss:Parent="s62">
<NumberFormat ss:Format="0.0%"/>
</Style>
</Styles>

I would like to keep those styles (or convert them) when reading an XML and writing an XLSX.
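To make the request concrete, here is a rough sketch of reading the interior color out of such a Styles block, following ss:Parent when a style does not define its own. It is not a feature of the library: `parseStyles` and `interiorColor` are hypothetical helpers, the parsing is regex-based, and only the explicit ss:Parent chain is followed (whether the Default style should also be consulted is left open).

```javascript
// Three of the styles from the example above, condensed onto single lines.
const stylesXml = [
  '<Styles>',
  '<Style ss:ID="s59"><Font ss:Bold="1"/><Interior ss:Color="#99CC00" ss:Pattern="Solid"/></Style>',
  '<Style ss:ID="s62"><Font ss:Color="#FFFFFF" /><Interior ss:Color="#333333" ss:Pattern="Solid"/></Style>',
  '<Style ss:ID="s64" ss:Parent="s62"><NumberFormat ss:Format="0.0%"/></Style>',
  '</Styles>'
].join('\n');

// Index styles by ss:ID, keeping the parent ID and interior color (or null).
function parseStyles(xml) {
  const styles = {};
  const re = /<Style\b([^>]*)>([\s\S]*?)<\/Style>/g;
  let m;
  while ((m = re.exec(xml)) !== null) {
    const id = /ss:ID="([^"]*)"/.exec(m[1]);
    if (!id) continue;
    const parent = /ss:Parent="([^"]*)"/.exec(m[1]);
    const color = /<Interior\b[^>]*ss:Color="([^"]*)"/.exec(m[2]);
    styles[id[1]] = {
      parent: parent ? parent[1] : null,
      interiorColor: color ? color[1] : null
    };
  }
  return styles;
}

// Walk up the ss:Parent chain until a style defines an interior color.
function interiorColor(styles, id) {
  const seen = new Set(); // guards against a parent cycle
  while (id && styles[id] && !seen.has(id)) {
    if (styles[id].interiorColor) return styles[id].interiorColor;
    seen.add(id);
    id = styles[id].parent;
  }
  return null;
}

const parsedStyles = parseStyles(stylesXml);
console.log(interiorColor(parsedStyles, 's64'));
```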

@ashish-agarwal24

Can I get a demo of how these styles can be applied in the latest version?

@benwinding

Is there no link on how the styles can be applied to cells?

@koundinyag25

Hello all, it would be really awesome to have an example of how to provide style metadata, for instance:
let data = ['a','b'];
let style = ["bgColor: blue","bgColor: green"];
Is there a way to achieve this? A code snippet would be a great help.
Thank you in advance.

@SheetJSDev
Contributor

We offer this in the Pro compendium. Since companies have paid for the features already, it would be unfair to them if we turned around and made it available as open source. We have a longer comment in a gist.

SheetJS locked and limited conversation to collaborators on Feb 22, 2023