remove_slide doesn't remove images from ppt/media #565

Closed
SteffenWolfFA-ERF opened this issue Apr 8, 2024 · 4 comments

Comments

@SteffenWolfFA-ERF

> library(officer)
> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] officer_0.6.3

Hello!

When a slide is removed with remove_slide(), the images on that slide aren't removed from the ppt/media folder. This results in unnecessarily large .pptx files.
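
One way to observe this is to list the archive contents and keep only the entries under ppt/media. The following is a minimal sketch in base R. The helper name `media_entries` is hypothetical and not part of officer; it only assumes the `Name` and `Length` columns that `utils::unzip(..., list = TRUE)` returns.

```r
# Hypothetical helper (not part of officer): keep only the ppt/media entries
# from an archive listing and sort them by size, largest first.
media_entries <- function(listing) {
  keep <- grepl("^ppt/media/", listing$Name)
  out <- listing[keep, c("Name", "Length"), drop = FALSE]
  out[order(-out$Length), , drop = FALSE]
}

# Usage, assuming a file "x.pptx" exists in the working directory:
# media_entries(utils::unzip("x.pptx", list = TRUE))
```

Comparing the listing before and after removing a slide should show whether the image files are still in the package.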

@davidgohel
Owner

Thanks, you can now use rm_images = TRUE. In some cases the same image is reused on several slides, so this is a parameter you can set to TRUE rather than forced behavior.

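library(officer)

# read the presentation
my_pres <- read_pptx("x.pptx")
# remove the current slide (no index given) together with its image files
my_pres <- remove_slide(my_pres, rm_images = TRUE)
# write the result to a new file
print(my_pres, target = "y.pptx")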

@SteffenWolfFA-ERF
Author

Thanks for this fast fix! This is very welcome!

But I think it would make more sense to check for files in ppt/media that aren't referenced by any slide and remove them. In my opinion this would be more general and wouldn't require an additional parameter.
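
A rough sketch of that idea in base R, working on an already unzipped package directory. This is not how officer implements anything; `unreferenced_media` and `pkg_dir` are hypothetical names. It assumes images are referenced through `Target="..."` attributes in the `.rels` files under `ppt/`, and it scans the text with a regular expression instead of parsing the XML.

```r
# Hypothetical helper (not part of officer): list files in ppt/media that no
# relationship (.rels) file under ppt/ points to.
# `pkg_dir` is a directory holding an unzipped .pptx.
unreferenced_media <- function(pkg_dir) {
  media_files <- list.files(file.path(pkg_dir, "ppt", "media"))

  # relationship parts of slides, layouts, masters, ...
  rels_files <- list.files(
    file.path(pkg_dir, "ppt"),
    pattern = "\\.rels$", recursive = TRUE, full.names = TRUE
  )

  referenced <- character(0)
  for (f in rels_files) {
    xml <- paste(readLines(f, warn = FALSE), collapse = "\n")
    # extract every Target="..." attribute value
    targets <- regmatches(xml, gregexpr('Target="[^"]*"', xml))[[1]]
    targets <- sub('^Target="', "", sub('"$', "", targets))
    # keep only targets that point into a media folder
    is_media <- grepl("(^|/)media/", targets)
    referenced <- c(referenced, basename(targets[is_media]))
  }

  setdiff(media_files, unique(referenced))
}
```

The files returned would be the candidates for deletion before the package is zipped again. Because the check covers all `.rels` files, an image that is still used by another slide or by a layout is kept.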

Greetings!

@davidgohel
Owner

That's a nice proposal! I will be happy to review your PR.

  • Please extend the work to docx, where the same issue needs to be resolved,
  • I think we also need to check whether some images have the same signature, in which case the duplicates should be removed,
  • and apply the same approach to external embedded files.
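
The duplicate check in the second point could be sketched as follows. This is only an illustration: it assumes that "signature" means a content hash and uses `tools::md5sum()` for it, and `duplicate_media` is a hypothetical name, not an officer function.

```r
# Hypothetical helper (not part of officer): group the files in a media
# folder by MD5 checksum and return the groups with more than one file.
duplicate_media <- function(media_dir) {
  files <- list.files(media_dir, full.names = TRUE)
  if (length(files) == 0) {
    return(list())
  }
  sums <- tools::md5sum(files)
  # files with the same checksum end up in the same group
  groups <- split(basename(files), unname(sums))
  unname(groups[lengths(groups) > 1])
}
```

Detecting duplicates is only the first half. Actually removing them would also mean pointing every relationship at the copy that is kept, which this sketch does not attempt.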

@SteffenWolfFA-ERF
Author

Valid point. I will try to arrange some time for working on this feature.
