Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Merge repeated cells by one or more variables #341

Closed
wants to merge 8 commits into from

Conversation

darentsai
Copy link

@darentsai darentsai commented Apr 10, 2022

When data have one or more categorical variables, it's common to sort them and merge repeated cells together when creating an excel file. The argument mergeCols added in writeData() controls which columns need to be merged. Its order indicates how your data is grouped and sorted. An example using esoph data is demonstrated as follows:

df <- with(esoph, esoph[order(agegp, alcgp, tobgp), ])[1:30, ]
wb <- createWorkbook()
addWorksheet(wb, "esoph")
writeData(wb, "esoph", df, borders = "all", mergeCols = 1:3)
addStyle(wb, "esoph", createStyle(halign = "center", valign = "center"),
         rows = 1:(nrow(df)+1), cols = 1:ncol(df), gridExpand = TRUE, stack = TRUE)
saveWorkbook(wb, "writeDataExample.xlsx", overwrite = TRUE)

example

@JanMarvin
Copy link
Collaborator

Hi @darentsai , lets begin with "it's common": I have never seen someone doing something like this and I've seen people doing many strange things. Therefore I have do disagree that it's common ;)

Otherwise it looks good. Just a suggestion, maybe the sorting should also be done via writeData? So that we keep the input and output in sync. Are there any drawbacks that I am overlooking?

Final remarks:

  • a test for this would be awesome (something like your code snippet and for example some detection of how many merged cells the workbook contains) and
  • a rerun of roxygen.

@codecov-commenter
Copy link

codecov-commenter commented Apr 10, 2022

Codecov Report

Merging #341 (8286fca) into master (2ace59a) will decrease coverage by 0.08%.
The diff coverage is 7.69%.

@@            Coverage Diff             @@
##           master     #341      +/-   ##
==========================================
- Coverage   67.76%   67.67%   -0.09%     
==========================================
  Files          34       34              
  Lines        8931     8944      +13     
==========================================
+ Hits         6052     6053       +1     
- Misses       2879     2891      +12     
Impacted Files Coverage Δ
R/build_workbook.R 97.36% <ø> (ø)
R/writeData.R 82.90% <7.69%> (-5.44%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3ceeb84...8286fca. Read the comment docs.

@darentsai
Copy link
Author

darentsai commented Apr 11, 2022

@JanMarvin Thanks for your kind reply. My statements may make some misunderstanding. If the data written to an excel file is used for further analysis or modeling, this feature is not practical. But, if the data have been grouped, summarised by some categorical variables and going to be presented in a report, this feature can make the appearance of table much tidier.(I update a new example in my previous comment using another dataset.)

As for the sorting process, I also thought about doing it in writeData. In my opinion, a method that prints data to an external file should not change the input structure (e.g. order). Hence, I presume users have arranged data as they want before writing it out. mergeCols only works on how the data is rendered to excel.

@JanMarvin
Copy link
Collaborator

Alright, I agree on the sorting part. Give @ycphs or me a ping when to review (still like to see a test, even if it's just a tiny expect_silent(write...)).

@JanMarvin
Copy link
Collaborator

Hi @darentsai , what is the state of this? Ready and just missing a test or did you find some larger issue that is holding this pull request back?

@darentsai
Copy link
Author

@JanMarvin Thanks for the reminder. I have tried several kinds of data and adjusted different sets of arguments. I don't find anything wrong so far. I think it's ready to be used but lacks for a test. I'm sorry that I'm not familiar with the test mechanism of package development, so I haven't figured out a proper test example to examine this new feature.

Copy link

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Dec 30, 2023
@darentsai darentsai closed this by deleting the head repository May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants