Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Clear Content (delete data but keep formatting) #562

Closed
DavZim opened this issue Mar 24, 2023 · 11 comments
Closed

Feature Request: Clear Content (delete data but keep formatting) #562

DavZim opened this issue Mar 24, 2023 · 11 comments

Comments

@DavZim
Copy link

DavZim commented Mar 24, 2023

First, I love how this fixes many of the issues of openxlsx that I faced... Your work is highly appreciated.

It would be great to have a function that allows me to clear the contents, ie delete the data but keep the formatting.
At the moment I use this wrapper. But maybe there is a better way...

clear_content <- function(w, sheet, range) {
  st <- w$get_cell_style(sheet = 1, range)
  openxlsx2::delete_data(w, sheet = 1, cols = 1, rows = 8)
  w$set_cell_style(sheet = 1, dims = range, st)
}
@JanMarvin
Copy link
Owner

Hi @DavZim , please have a look at wb_clean_sheet() Is this what you want?

@DavZim
Copy link
Author

DavZim commented Mar 24, 2023

Very close, but I want to clean only a given range.
The use case it that I have a formatted sheet where I want to write multiple smaller and separated data to. Sometimes the data that I want to write to is smaller than the existing data, therefore I want to first clear the existing data and then write the new data.

@JanMarvin
Copy link
Owner

Got it, I can see if the existing function can be extended to include dims. This should be simple for data and styles, but a bit more tricky for merged cells iirc. I'll have a look later.

@DavZim
Copy link
Author

DavZim commented Mar 24, 2023

One use case that was reliable in producing bugs was to have one whole column with a given format (red background) with some values in rows 1-10.
Then with openxlsx2 clear the contents of that column in rows 5-100000.
I would expect the contents to be cleared in 5-10, still have formatting in the whole column, and have a used range in the sheet that still only goes to 10 and not 100k (as can be seen by the scroller on the right and on the filesize, hence the large amount of rows).
Does that make sense? Otherwise I am happy to throw together some example sheets.

Note that my solution on top fails for this use case (last part regarding used sizes).

@JanMarvin
Copy link
Owner

Hi again, I was traveling earlier and and my patience with github on the smartphone was limited. I am glad you find openxlsx2 useful. We have nothing but love for openxlsx and the developers who have allowed it to grow over the years into such a useful package, and of course hope we can do some justice to the big footprints.

I have opened a pull request in #563. Maybe you can have a look if this solves your issue?

I don't really get your latest reply. Could you craft a reproducible example or mail me one?

@JanMarvin
Copy link
Owner

One use case that was reliable in producing bugs was to have one whole column with a given format (red background) with some values in rows 1-10. Then with openxlsx2 clear the contents of that column in rows 5-100000. I would expect the contents to be cleared in 5-10, still have formatting in the whole column, and have a used range in the sheet that still only goes to 10 and not 100k (as can be seen by the scroller on the right and on the filesize, hence the large amount of rows). Does that make sense? Otherwise I am happy to throw together some example sheets.

Note that my solution on top fails for this use case (last part regarding used sizes).

Maybe you mean something like this:

# create example data
wb <- wb_workbook() %>% 
  wb_add_worksheet() %>% 
  wb_add_fill(dims = "D1:D11", color = wb_colour("red")) %>% 
  wb_add_data(x = matrix(1:100, 10, 10))

# apply style 1 as column color (this is only the visual)
wb$worksheets[[1]]$cols_attr <- "<col min=\"4\" max=\"4\" width=\"8.83203125\" style=\"1\"/>"

# remove the red numbers
wb$clean_sheet(dims = "D2:D10000", styles = FALSE)

## you see this (original dimensions A1:J11)
wb_to_df(wb)

## you expect this? (dimensions A1:J100000)
wb$add_data(dims = "A12", x = matrix(NA, nrow = 1e5-12, ncol = 10), na.strings = NULL, colNames = FALSE)
wb_to_df(wb)

In this case I would rather not change the current behavior. We had to jump through some hoops to deal with the openxml way of having uninitialized cells. If you need them and are not sure if they are available, you will have to create them yourself.

@DavZim
Copy link
Author

DavZim commented Mar 24, 2023

I can only agree to the respect towards the openxlsx authors and maintainers! Regardless, I am happy to have found this fork/project...

I have put together a MWE with the file below:

  • the source sheet (ex1) has a background from 7 to the max rows with data going from 4 to row 13 (see screenshot1)
  • the goal is to delete the data but not the contents from rows 7 til 100000 (or some other arbitrary large number). This also means that the used-range does not change, eg because we touched row 100k (eg we should still see the slider on the right as "large")
  • write some other data into some of these rows (see screenshot 2)

screenshot 1 with the original data
image

screenshot 2 with the deleted data
image

The file is attached here.

example.xlsx

From a code perspective, I would expect the following code to replicate screenshot2 and the target sheet.

library(openxslx2)

f <- "example.xlsx"
s <- "ex1"

wb <- wb_load(f)

# clear the data
wb$clean_sheet(sheet = s, dims = "A7:A10000", styles = FALSE)

# write some new data
wb$add_data(sheet = s, dims = "A7", x = matrix(c("a", "b", "c"), ncol = 1), colNames = FALSE)

# save the workbook
wb_open(wb)
wb_save(wb, "example_result.xlsx")

Unfortunately, I am (at the moment) not able to checkout the PR code above. But I hope this highlights what I mean, if not, please let me know!

For good measure, the counterfactual is the following, which highlights that the used range has been extended to row 100k
image

@JanMarvin
Copy link
Owner

If I understand you correctly, this should now be possible. I have recreated your example below:

library(openxlsx2)
  
# create example data
wb <- wb_workbook() %>% 
  wb_add_worksheet("ex1") %>% 
  wb_add_data(x = c(rep(NA, 3), 1:10), na.strings = NULL) %>% 
  wb_add_fill(dims = "A7:A13", color = wb_colour("red"))

# apply style from A7 as column color
# (this is a visual overlay for uninintialized cells)
style <- wb %>% wb_get_cell_style(dims = "A7")
wb$worksheets[[1]]$cols_attr <- 
  sprintf("<col min=\"1\" max=\"1\" width=\"8.83203125\" style=\"%s\"/>", style)

# clone, clean and update sheet
wb$clone_worksheet(new = "target")$
  clean_sheet(dims = "A7:A100000", styles = FALSE)$
  add_data(dims = "A7", x = letters[1:3])

# wb$open()

One issue remains, if you write more rows than were previously initialized in the worksheet, you have to manually set a style for these cells:

# clone, clean and update sheet
wb$clone_worksheet(new = "target2")$
  clean_sheet(dims = "A7:A100000", styles = FALSE)$
  add_data(dims = "A7", x = letters)

# we write a larger vector and have to set styles accordingly
dims_to_style <- sprintf("A7:A%s", 6 + nrow(as.data.frame(letters)))
wb$set_cell_style(dims = dims_to_style, style = style)

# wb$open()

@DavZim
Copy link
Author

DavZim commented Mar 25, 2023

That looks promising. I'll try the solution but it might take some time. Thank you for the quick responses! Highly appreciated!

@JanMarvin
Copy link
Owner

We have all the time in the world ;) I've merged the PR. A new CRAN release should follow next weekend.

@DavZim
Copy link
Author

DavZim commented Mar 25, 2023

Just to give you some quick feedback. That works as expected!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants