Delete slides #956

Open
luchux opened this issue Mar 13, 2024 · 3 comments

Comments

@luchux

luchux commented Mar 13, 2024

I managed to remove every slide outside the range from_page to to_page (i.e. from_slide_number, to_slide_number).
It works. The problem I'm facing is that, although I remove the relationships, the smaller PPTX file is the same size in MB as the original. I can't find a way to reclaim the space used by those unlinked elements.
If anybody could give me a hint, it would be appreciated!

def _keep_slides_from_to(presentation, from_page, to_page):
    """Keep slides at 0-based positions from_page..to_page (inclusive); drop the rest."""
    xml_slides = presentation.slides._sldIdLst  # <p:sldIdLst>, one <p:sldId> per slide
    slides = list(xml_slides)

    # Slide entries whose position falls outside the range to keep
    slides_to_remove = [
        slide_id
        for pos, slide_id in enumerate(slides)
        if pos < from_page or pos > to_page
    ]
    # Collect the relationship ids before detaching the elements
    rel_ids_to_remove = [slide_id.rId for slide_id in slides_to_remove]

    # Remove the slide entries from the presentation's slide list
    for slide_id in slides_to_remove:
        xml_slides.remove(slide_id)

    # Remove the corresponding presentation -> slide relationships
    # (relies on private attributes, so this may vary between python-pptx versions)
    rels = presentation.part.rels
    for rel_id in rel_ids_to_remove:
        rels._rels.pop(rel_id)

    return presentation
@MartinPacker

It strikes me that the code hasn't actually removed any data, other than some of the XML. Essentially you've orphaned some parts.

I don't believe there's a general API for removing unwanted parts such as graphics from what is, after all, a zip file.

That is something I'd like to see - within python-pptx.

@scanny
Owner

scanny commented Mar 14, 2024

@MartinPacker if you remove a relationship I believe you'll find that the orphaned part is not saved. Also, if you don't have or retain a reference to the part instance then it will be garbage collected.

So there's no real reason to somehow destroy an orphaned part, that should take care of itself.
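The idea that an orphaned part "takes care of itself" can be pictured as a reachability walk: only parts that can still be reached through relationships are written out. The sketch below is a toy model with plain dicts, not python-pptx's actual code; `reachable_parts` and the part names are made up for illustration.

```python
def reachable_parts(rels, root="/ppt/presentation.xml"):
    """Return the set of part names reachable from `root`.

    `rels` maps a source part name to the list of part names it has
    relationships to (a toy stand-in for a package's relationship graph).
    """
    seen = set()
    stack = [root]
    while stack:
        part = stack.pop()
        if part in seen:
            continue
        seen.add(part)
        stack.extend(rels.get(part, []))
    return seen


rels = {
    "/ppt/presentation.xml": ["/ppt/slides/slide1.xml", "/ppt/slides/slide2.xml"],
    "/ppt/slides/slide1.xml": ["/ppt/media/image1.png"],
    "/ppt/slides/slide2.xml": ["/ppt/media/image2.png"],
}
before = reachable_parts(rels)

# Drop the presentation -> slide2 relationship, as when a slide is removed.
rels["/ppt/presentation.xml"].remove("/ppt/slides/slide2.xml")
after = reachable_parts(rels)

print(sorted(before - after))
# → ['/ppt/media/image2.png', '/ppt/slides/slide2.xml']
```

In this model, slide2 and the image only it referenced both drop out once the single relationship to slide2 is gone.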

One possible problem though is when a part is related-to by more than one other part. I vaguely remember this being the case in some instances, like maybe a slide-layout being referenced by a slide-master and also being referenced by slides that use it. So you would need to remove all those relationships to actually orphan the slide-layout part.

Other parts, images in particular, can be referenced by multiple parts on purpose. For example, if you rubber-stamped copies of an image (maybe a logo) on, say, 20 different slides for visual effect, that image should only be stored once, even though there are 20 relationships to it from other parts (slides or slide-layouts, maybe, in this case).

@MartinPacker

Thank you @scanny; I wasn't aware of that.
