[WIP] basic style operations #10

Merged
merged 42 commits into from
Jan 9, 2022

JanMarvin

@JanMarvin JanMarvin commented Dec 28, 2021

A few basic functions to modify a sheet style

handle styles.xml

Functions to convert styles.xml nodes to R dataframe objects and back to xml nodes. Future styles functions should use these functions:

cellxfs_df <- openxlsx2:::read_xf(read_xml(wb$styles$cellXfs))
openxlsx2:::write_xf(cellxfs_df)

cellstylexfs_df <- openxlsx2:::read_xf(read_xml(wb$styles$cellStyleXfs))
openxlsx2:::write_xf(cellstylexfs_df)

numfmt_df <- openxlsx2:::read_numfmt(read_xml(wb$styles$numFmts))
openxlsx2:::write_numfmt(numfmt_df)

font_df <- openxlsx2:::read_font(read_xml(wb$styles$fonts))
openxlsx2:::write_font(font_df)

border_df <- openxlsx2:::read_border(read_xml(wb$styles$borders))
openxlsx2:::write_border(border_df)

fill_df <- openxlsx2:::read_fill(read_xml(wb$styles$fills))
openxlsx2:::write_fill(fill_df)

cellstyle_df <- openxlsx2:::read_cellStyle(read_xml(wb$styles$cellStyles))
openxlsx2:::write_cellStyle(cellstyle_df)

colors_df <- openxlsx2:::read_colors(read_xml(wb$styles$colors))
openxlsx2:::write_colors(colors_df)

# # missing
# extLst (not needed?)

new example functions to modify sheets

  • cloneSheetStyle copy style information from one sheet to another

    • basic style information
    • inline styles
    • combined columns
    • col information (hidden, not shown)?
    • row information (hidden rows)
  • cleanSheet remove data from a sheet; this can be numeric data, character data or styles

    • remove images
    • remove other overlays

These are currently demo functions to show what is possible, but should motivate me to further look into the style functions.

example

library(openxlsx2)
xlsxFile <- system.file("extdata", "loadExample.xlsx", package = "openxlsx2")
wb <- loadWorkbook(xlsxFile)

# wb$sheet_names
# 
# cloneWorksheet(wb, "Clone1", "Sheet1")

addWorksheet(wb, "Clone1")

writeData(wb, "Clone1", mtcars, 1, 1)

# bug, most likely due to dimensions: everything outside the mtcars frame is blank
cloneSheetStyle(wb, "testing", "Clone1")

# prepare a few clones
cloneWorksheet(wb, "Clone2", "testing")
cloneWorksheet(wb, "Clone3", "testing")
cloneWorksheet(wb, "Clone4", "testing")
cloneWorksheet(wb, "Clone5", "testing")

# clean sheets
cleanSheet(wb, "Clone2", numbers = TRUE, characters = FALSE, styles = FALSE)
cleanSheet(wb, "Clone3", numbers = FALSE, characters = TRUE, styles = FALSE)
cleanSheet(wb, "Clone4", numbers = FALSE, characters = FALSE, styles = TRUE)
cleanSheet(wb, "Clone5", numbers = TRUE, characters = TRUE, styles = FALSE)

# wb <- update_cell(mtcars, wb, "Clone5", cell = "H10:R41")
writeData(wb, "Clone5", mtcars, 8, 10)
# fix style on this sheet
cloneSheetStyle(wb, "testing", "Clone5")

saveWorkbook(wb, "/tmp/test.xlsx", overwrite = TRUE)

summary

We now have basic functions in place that are capable of reading parts of styles.xml into different dataframe objects. When loading the workbook nothing is changed; the xml is simply imported as is into different nodes. If a modification is requested, the node can be converted into a dataframe. Once the modification is in place, the dataframe can be converted back into xml nodes and assigned back to the workbook.
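
A minimal sketch of that round trip (node to dataframe, modify, dataframe back to node), written in base R so it runs without the package. The helpers `get_attr()`, `sketch_read_numfmt()` and `sketch_write_numfmt()` are hypothetical stand-ins for illustration only; they are not the Rcpp-based `read_numfmt()` / `write_numfmt()` from this PR and do not handle XML escaping.

```r
# Hypothetical base-R helpers; NOT the openxlsx2 implementation.
numfmt_xml <- c(
  '<numFmt formatCode="#,##0," numFmtId="165" />',
  '<numFmt formatCode="#0" numFmtId="166" />'
)

# Pull one attribute value out of each node string (NA if absent).
get_attr <- function(nodes, attr) {
  pattern <- sprintf('%s="([^"]*)"', attr)
  m <- regmatches(nodes, regexec(pattern, nodes))
  vapply(m, function(x) if (length(x) == 2) x[2] else NA_character_, character(1))
}

# xml nodes -> dataframe (attributes are kept as character columns)
sketch_read_numfmt <- function(nodes) {
  data.frame(
    formatCode = get_attr(nodes, "formatCode"),
    numFmtId   = get_attr(nodes, "numFmtId"),
    stringsAsFactors = FALSE
  )
}

# dataframe -> xml nodes
sketch_write_numfmt <- function(df) {
  sprintf('<numFmt formatCode="%s" numFmtId="%s" />', df$formatCode, df$numFmtId)
}

df <- sketch_read_numfmt(numfmt_xml)
df$formatCode[df$numFmtId == "166"] <- "0.00"  # the requested modification
new_xml <- sketch_write_numfmt(df)             # ready to be assigned back
```

Only the node that was touched changes; an unmodified dataframe writes back to the same strings it was read from.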

@JanMarvin JanMarvin added the enhancement 😀 New feature or request label Dec 28, 2021

# copy entire attributes from original sheet to new sheet
wb$worksheets[[id_new]]$sheet_data$row_attr <-
wb$worksheets[[id_org]]$sheet_data$row_attr

what about rows that are not present in the to_sheet. Do we have to adjust the attributes list? Should test with some examples

@JanMarvin

I am planning to get rid of the styleObjects element of the workbook. Creating these potentially large lists serves no benefit to me, especially when modifying them the way we did before.

I was looking for a certain functionality:

  • loading a workbook
  • saving the workbook while maintaining the same style
  • cloning sheets in that workbook
  • modifying said cloned sheets
  • cloning styles from one sheet to another
  • saving that file

This is now possible.

@JanMarvin

Ofc a few of the other functions should return, but for now I simply removed them. A user should be able to modify a style or add a new style, but I want to do this without the struggle of the styleObjects.

@JanMarvin

We have to understand the id handling of the styles xml, but that should not be impossible. Once that is done, all that is left are user functions to add style information to the existing xml tables and add a value to the table in sheet_data$cc. That could save us from a lot of trouble dealing with the styleObjects.
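
A rough sketch of how that id handling could look, using toy dataframes in base R. In SpreadsheetML the style attribute of a cell is a zero-based index into cellXfs, and `fontId` in an xf is a zero-based index into fonts; everything else here is an assumption for illustration: the helper `append_entry()`, the reduced column sets, and the `c_s` column name used for the style id in `cc`.

```r
# Toy stand-ins for the style tables; column names are assumptions.
fonts   <- data.frame(b = c("", ""), sz = c("11", "11"), stringsAsFactors = FALSE)
cellXfs <- data.frame(fontId = c("0", "0"), numFmtId = c("0", "164"),
                      stringsAsFactors = FALSE)
cc      <- data.frame(r = c("A1", "B1"), c_s = c("0", "1"), stringsAsFactors = FALSE)

# Hypothetical helper: append a row, return the table and the
# zero-based id of the new entry (its position in the xml list).
append_entry <- function(tbl, entry) {
  tbl <- rbind(tbl, entry)
  list(table = tbl, id = as.character(nrow(tbl) - 1L))
}

# 1. add a bold font; its id is what an xf refers to via fontId
res_font <- append_entry(
  fonts, data.frame(b = "1", sz = "11", stringsAsFactors = FALSE)
)
fonts <- res_font$table

# 2. add an xf that references the new font; its id is the cell style id
res_xf <- append_entry(
  cellXfs, data.frame(fontId = res_font$id, numFmtId = "0", stringsAsFactors = FALSE)
)
cellXfs <- res_xf$table

# 3. point cell A1 at the new style
cc$c_s[cc$r == "A1"] <- res_xf$id
```

Appending instead of editing rows in place keeps all existing ids valid, so cells that already reference a style are not affected.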

// https://docs.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.cellformat?view=openxml-2.8.1

// openxml 2.8.1
Rcpp::CharacterVector nams = {

Currently these fields are hard coded. I've looked them up in the doc above, but ofc they could be simply pushed down as a character vector.

auto n = df_xf.nrow();
Rcpp::CharacterVector z(n);

Rcpp::CharacterVector xf_nams = {

Same as above, hard coded fields.

characters matching sheet_data
  std::ostringstream oss;
  worksheet.print(oss, " ", pugi::format_raw);
  z[itr] = oss.str();

and something like this to write:
  xf.load_string()
fonts_df <- openxlsx2:::read_font(read_xml(wb$styles$fonts))
openxlsx2:::write_font(fonts_df)
@JanMarvin

> xlsx <- system.file("extdata", "oxlsx2_sheet.xlsx", package = "openxlsx2")
> wb <- loadWorkbook(xlsx)
> 
> numFmts_df <- openxlsx2:::read_numfmt(read_xml(wb$styles$numFmts))
> numFmts_df
  formatCode numFmtId
0   mmm\\-yy      164
1     #,##0,      165
2         #0      166
3   0.00\\ %      167
4      0\\ %      168
> read_xml(openxlsx2:::write_numfmt(numFmts_df))
<numFmt formatCode="mmm\-yy" numFmtId="164" />
<numFmt formatCode="#,##0," numFmtId="165" />
<numFmt formatCode="#0" numFmtId="166" />
<numFmt formatCode="0.00\ %" numFmtId="167" />
<numFmt formatCode="0\ %" numFmtId="168" />

@JanMarvin

Next TODO: for border, I assume the attributes diagonalDown, diagonalUp and outline, and the children bottom, diagonal, end, horizontal, start, top and vertical are enough. Could provide a node_writer(name, attributes, children).
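
One possible shape for such a `node_writer(name, attributes, children)`, sketched in base R. This is a hypothetical helper, not something that exists in the package; it assumes attributes arrive as a named character vector and children as already serialized xml strings, and it does no escaping.

```r
# Hypothetical helper named after the idea above; not part of openxlsx2.
node_writer <- function(name, attributes = character(), children = character()) {
  # drop empty attributes so they are not written as attr=""
  attributes <- attributes[!is.na(attributes) & nzchar(attributes)]
  attr_str <- if (length(attributes)) {
    paste0(" ", names(attributes), '="', attributes, '"', collapse = "")
  } else {
    ""
  }
  children <- children[nzchar(children)]
  if (length(children)) {
    paste0("<", name, attr_str, ">", paste(children, collapse = ""), "</", name, ">")
  } else {
    paste0("<", name, attr_str, "/>")  # self-closing node without children
  }
}

# a border node built from its attributes and child nodes
border <- node_writer(
  "border",
  attributes = c(diagonalUp = "1", diagonalDown = "", outline = ""),
  children = c(
    node_writer("start"),
    node_writer("end"),
    node_writer("top", c(style = "thin")),
    node_writer("bottom"),
    node_writer("diagonal")
  )
)
```

Because children are plain strings, the same function can be reused for the nested nodes and for the outer node.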

@JanMarvin

library(openxlsx2)

xlsx <- system.file("extdata", "oxlsx2_sheet.xlsx", package = "openxlsx2")
wb <- loadWorkbook(xlsx)

x <- wb_to_df(wb, sheet = "SUM", detectDates = TRUE, dims = "B5:I16")

cloneWorksheet(wb, clonedSheet = "SUM", sheetName = "CLONE")
cleanSheet(wb, "CLONE",
           numbers = TRUE,
           characters = FALSE,
           styles = FALSE,
           merged_cells = FALSE)

# create some new data
x$V1_abs <- 1:11*1e6
x$V1_rel <- seq(0,1, length.out=11)

x$V2_abs <- rep(2, 11)
x$V2_rel <- rep(0.2, 11)

x$Dollar <- rep(NA, 11)

x$V3_abs <- rep(3, 11)
x$V3_rel <- rep(0.3, 11)

# TODO might want to add some kind of merge styles option
writeData(wb, "CLONE", x, startCol = 2, startRow = 5, colNames = TRUE)
cloneSheetStyle(wb, "SUM", "CLONE")

saveWorkbook(wb, file = "/tmp/test.xlsx", overwrite = TRUE)
openXL("/tmp/test.xlsx")

* read_border
* write_border
* read_fill & write_fill
* read_cellStyle & write_cellStyle
* read_tableStyle & write_tableStyle
* read_dxf & write_dxf
* read_colors & write_colors
…ation to the wb$styles nodes to actually change styles
@JanMarvin JanMarvin merged commit 52cf66f into main Jan 9, 2022
@JanMarvin JanMarvin deleted the basic_style branch January 9, 2022 18:27
@JanMarvin

I think that much of the Rcpp code can be simplified. Beyond that, there are many functions that can be further optimized. But now we have a few basic functions that can be used to implement userspace functions.

@JanMarvin JanMarvin mentioned this pull request Jan 13, 2022