Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML error if tab names are string that starts with an integer #124

Closed
pbbgss opened this issue Mar 26, 2024 · 3 comments · Fixed by #125
Closed

XML error if tab names are string that starts with an integer #124

pbbgss opened this issue Mar 26, 2024 · 3 comments · Fixed by #125
Labels
bug Something isn't working

Comments

@pbbgss
Copy link

pbbgss commented Mar 26, 2024

If any of the elements in the character vector passed to the tab_names argument of a11ytables::create_a11ytable() start with an integer it causes an XML issues when creating the xlsx workbook. When the file is opened it needs to be "repaired" This XML file does not appear to have any style information associated with it.

Not sure if is specific to the environment where I'm working (MoJ Analytical Platform) and it is definitely not a major issue but took me a while to figure out it was the tab names causing it. Also not sure if it is an {a11ytables} or {openxlsx} thing. Might it be worth a note in the documentation in case other people run into the issue or perhaps one for inclusion in the "automatic tab name correction"?

Really appreciate your work on this package!

Package versions:

  • {a11ytables} 0.3.0.90001
  • {openxlsx} 4.2.5.2

Example

This is the example from the {a11ytables} vignette with just the tab names changed

cover_list <- list(
  "Section 1" = c("First row of Section 1.", "Second row of Section 1."),
  "Section 2" = glue::glue("The only row of Section 2."),
  "Section 3" = c(
    "[Website](https://co-analysis.github.io/a11ytables/)",
    "[Email address](mailto:fake.address@a11ytables.com)"
  )
)

contents_df <- data.frame(
  "Sheet name" = c("Notes", "Table_1", "Table_2"),
  "Sheet title" = c(
    "Notes used in this workbook",
    "First Example Sheet",
    "Second Example Sheet"
  ),
  check.names = FALSE
)

notes_df <- data.frame(
  "Note number" = paste0("[note ", 1:3, "]"),
  "Note text" = c("First note.", "Second note.", "Third note."),
  check.names = FALSE
)

table_1_df <- data.frame(
  Category = LETTERS[1:10],
  "Numeric [note 1]" = 1:10,
  "Numeric suppressed" = c(1:4, "[c]", 6:9, "[x]"),
  "Numeric thousands" = abs(round(rnorm(10), 4) * 1e5),
  "Numeric decimal" = abs(round(rnorm(10), 5)),
  "This column has a very long name that means that the column width needs to be widened" = 1:10,
  Notes = c("[note 1]", rep(NA_character_, 4), "[note 2]", rep(NA_character_, 4)),
  check.names = FALSE
)

table_2_df <- data.frame(Category = LETTERS[1:10], Numeric = 1:10)

my_a11ytable <- 
  a11ytables::create_a11ytable(
    tab_titles = c("Cover", "Contents", "Notes", "1_Table 1", "1_Table_2"),
    sheet_types = c("cover", "contents", "notes", "tables", "tables"),
    sheet_titles = c(
      "The 'a11ytables' Demo Workbook",
      "Table of contents",
      "Notes",
      "Table 1: First Example Sheet",
      "Table 2: Second Example Sheet"
    ),
    blank_cells = c(
      rep(NA_character_, 3),
      "Blank cells indicate that there's no note in that row.",
      NA_character_
    ),
    custom_rows = list(
      NA_character_,
      NA_character_,
      "A custom row.",
      c(
        "First custom row [with a hyperlink.](https://co-analysis.github.io/a11ytables/)",
        "Second custom row."
      ),
      "A custom row."
    ),
    sources = c(
      rep(NA_character_, 3),
      "[The Source Material., 2024](https://co-analysis.github.io/a11ytables/)",
      "The Source Material, 2024."
    ),
    tables = list(cover_list, contents_df, notes_df, table_1_df, table_2_df)
  )

my_wb <- a11ytables::generate_workbook(my_a11ytable)

openxlsx::saveWorkbook(my_wb, "publication.xlsx", overwrite = TRUE)
@matt-dray
Copy link
Collaborator

Curious! Thank you for digging into it and reporting it.

To confirm: I can replicate it locally on my machine too. Odd, of course, because I'm pretty sure tab names can start with numbers when you create a workbook manually.

I think this should be an error. We could warn and automatically prepend with an underscore or something, but I think the change should be at the user's discretion. And for sure, a note in the function documentation, vignettes and website would be a good idea.

Cheers! I hope to do it this week.

@pbbgss
Copy link
Author

pbbgss commented Mar 27, 2024

Yea I had checked previously that I could have an Excel tab staring with a number and that it wold survive conversion to ODS.

MS Excel seemed to be able to recover the workbook OK despite the error but I decided it is probably better for accessibility/ general usability purposes to have "Table" in the tab name rather than just the table number anyway.

Thanks for taking a look at it so quickly!

@matt-dray
Copy link
Collaborator

Cracking, just merged and released v0.3.1, which should do the trick. Thanks again for filing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants