Add functionality to get information on formatting properties of runs. #576

Merged: 4 commits, May 19, 2024

trekonom
Contributor

... and yet another one. (;

This PR adds a new parameter, detailed, to docx_summary(). When detailed = TRUE, the summary gains a run column that lists the formatting properties of each paragraph's runs.

The reprex illustrates the new functionality:

library(officer)

doc <- read_docx()

fpar_ <- fpar(
  ftext("Formatted ", prop = fp_text(bold = TRUE, color = "red")),
  ftext("paragraph ", prop = fp_text(
    shading.color = "blue"
  )),
  ftext("with multiple runs.",
    prop = fp_text(italic = TRUE, font.size = 20, font.family = "Arial")
  )
)

doc <- body_add_fpar(doc, fpar_, style = "Normal")

fpar_ <- fpar(
  "Unformatted ",
  "paragraph ",
  "with multiple runs."
)

doc <- body_add_fpar(doc, fpar_, style = "Normal")

doc <- body_add_par(doc, "Single Run", style = "Normal")

doc <- body_add_fpar(
  doc,
  fpar(
    "Single formatetd run ",
    fp_t = fp_text(bold = TRUE, color = "red")
  )
)

doc_sum <- docx_summary(doc, detailed = TRUE)

doc_sum$run
#> [[1]]
#>                  text  bold italic underline sz szCs  color shading
#> 1          Formatted   true  false      none 20   20 FF0000    <NA>
#> 2          paragraph  false  false      none 20   20 000000   clear
#> 3 with multiple runs. false   true      none 40   40 000000    <NA>
#>   shading_color shading_fill id
#> 1          <NA>         <NA>  1
#> 2          auto       0000FF  2
#> 3          <NA>         <NA>  3
#> 
#> [[2]]
#>                  text bold italic underline sz szCs color shading shading_color
#> 1        Unformatted  <NA>   <NA>      <NA> NA   NA  <NA>    <NA>          <NA>
#> 2          paragraph  <NA>   <NA>      <NA> NA   NA  <NA>    <NA>          <NA>
#> 3 with multiple runs. <NA>   <NA>      <NA> NA   NA  <NA>    <NA>          <NA>
#>   shading_fill id
#> 1         <NA>  1
#> 2         <NA>  2
#> 3         <NA>  3
#> 
#> [[3]]
#>         text bold italic underline sz szCs color shading shading_color
#> 1 Single Run <NA>   <NA>      <NA> NA   NA  <NA>    <NA>          <NA>
#>   shading_fill id
#> 1         <NA>  1
#> 
#> [[4]]
#>                    text bold italic underline sz szCs  color shading
#> 1 Single formatetd run  true  false      none 20   20 FF0000    <NA>
#>   shading_color shading_fill id
#> 1          <NA>         <NA>  1

@davidgohel
Owner

Thank you. I suggest one additional enhancement.

See https://github.com/davidgohel/officer/commits/trekonom-docx-summary-detail/

coco.docx

library(officer)

doc <- read_docx("coco.docx")
doc_sum <- docx_summary(doc, detailed = TRUE)

doc_sum$run

I modified these two lines so that w:b and w:i tags with a missing val attribute (which means TRUE) are read correctly.

# In the run-properties extraction: a bare <w:b/> or <w:i/> means TRUE
bold = val_child(node, "w:rPr/w:b", default = TRUE),
italic = val_child(node, "w:rPr/w:i", default = TRUE),

#' @importFrom xml2 xml_has_attr
val_child <- function(node, child_path, default = NULL) {
  child_node <- xml_child(node, child_path)
  # Tag absent: the property is not set on this run
  if (inherits(child_node, "xml_missing")) return(NA_character_)
  # Tag present without a val attribute: fall back to the default
  if (!xml_has_attr(child_node, "val")) default
  else xml_attr(child_node, "val")
}
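
To see why the default matters, here is a minimal sketch that runs val_child on a hand-written run. The XML snippet is an assumption made up for illustration (it is not taken from coco.docx): it has a bare <w:b/>, an explicit <w:i w:val="false"/>, and no w:u tag at all.

```r
library(xml2)

# val_child as posted in this comment
val_child <- function(node, child_path, default = NULL) {
  child_node <- xml_child(node, child_path)
  if (inherits(child_node, "xml_missing")) return(NA_character_)
  if (!xml_has_attr(child_node, "val")) default
  else xml_attr(child_node, "val")
}

# Hypothetical run: bold without a val attribute, italic explicitly off
run <- read_xml(
  '<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
     <w:rPr><w:b/><w:i w:val="false"/></w:rPr>
     <w:t>Hello</w:t>
   </w:r>'
)

bold <- val_child(run, "w:rPr/w:b", default = TRUE)   # tag present, no val: default
italic <- val_child(run, "w:rPr/w:i", default = TRUE) # tag present, val read as text
underline <- val_child(run, "w:rPr/w:u")              # tag absent: NA

str(list(bold = bold, italic = italic, underline = underline))
```

Note that bold comes back as a logical while italic comes back as a character string, so the two columns still need to be brought to a common type afterwards.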

* Make bold and italic booleans. Account for 0/1 and off/on.
@trekonom
Contributor Author

Thanks David. I just added some additional enhancements. For consistency I use val_child for all properties. To this end I added an attr= parameter. Additionally, properties bold and italic are now returned as booleans.
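
For readers who want to see what "Account for 0/1 and off/on" could look like, here is a rough base R sketch. The helper name as_ooxml_bool is hypothetical and this is not necessarily the code merged in this PR; it only assumes that on/off values may appear as true/false, 1/0, or on/off, and that NA should stay NA.

```r
# Hypothetical helper, shown for illustration only
as_ooxml_bool <- function(x) {
  x <- tolower(as.character(x))
  out <- rep(NA, length(x))                     # logical NA by default
  out[x %in% c("true", "1", "on")] <- TRUE
  out[x %in% c("false", "0", "off")] <- FALSE
  out                                           # anything else stays NA
}

res <- as_ooxml_bool(c("true", "0", "on", "off", NA))
print(res)
```

Unrecognised strings are mapped to NA here rather than raising an error; whether that is the right choice depends on how strictly the document should be validated.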

@davidgohel davidgohel merged commit 9461342 into davidgohel:master May 19, 2024
3 checks passed
@davidgohel
Owner

thank you @trekonom !
