Extract some transparent PNG images, discoloration occurs #670

Closed
mc373906408 opened this issue Sep 28, 2020 · 7 comments
@mc373906408

mc373906408 commented Sep 28, 2020

def recoveImage(xref, smake):
    def getimage(pix):
        if pix.colorspace.n != 4:
            return pix
        tpix = fitz.Pixmap(fitz.csRGB, pix)
        return tpix

    pix1 = fitz.Pixmap(self.mu_Document, xref)
    pix2 = fitz.Pixmap(self.mu_Document, smake)

    pix = fitz.Pixmap(pix1)
    pix.setAlpha(pix2.samples)
    pix1 = pix2 = None

    return getimage(pix)

The original image:
image3

Extracted image:
image

@JorjMcKie
Collaborator

The logic for recovering the alpha channel is known to be incomplete.
I am still investigating which cases need to be distinguished, so I will remove the bug label and mark this as an enhancement.
For your particular case, this should help:

def recoveImage(xref, smake):
    def getimage(pix):
        # convert CMYK (4 color components) to RGB; return anything else unchanged
        if pix.colorspace.n != 4:
            return pix
        tpix = fitz.Pixmap(fitz.csRGB, pix)
        return tpix

    pix1 = fitz.Pixmap(self.mu_Document, xref)  # base image
    pix2 = fitz.Pixmap(self.mu_Document, smake)  # mask image
    pix = fitz.Pixmap(pix1, 1)  # add alpha channel
    ba = bytearray(pix2.samples)
    # force every nonzero mask value to fully opaque
    for i in range(len(ba)):
        if ba[i] > 0:
            ba[i] = 255
    pix.setAlpha(ba)
    pix1 = pix2 = None
    return getimage(pix)

JorjMcKie added enhancement and removed bug labels Sep 28, 2020
@mc373906408
Author

For now I will use PIL to restore the transparency. Looking forward to an improvement.

@JorjMcKie
Collaborator

JorjMcKie commented Sep 28, 2020

Ah, ok.
Could you please let me know which PIL feature helped or solved the problem?

@mc373906408
Author

mc373906408 commented Sep 28, 2020

from io import BytesIO

from PIL import Image

    ...

    pix1 = fitz.Pixmap(self.mu_Document, xref)
    pix2 = fitz.Pixmap(self.mu_Document, smake)

    # the base image mode depends on whether it already carries an alpha channel
    mode = "RGBA" if pix1.alpha else "RGB"
    pix = Image.frombytes(mode, (pix1.width, pix1.height), pix1.samples)
    mask = Image.frombytes("L", (pix2.width, pix2.height), pix2.samples)
    # paste the image onto a fully transparent canvas, using the mask as transparency
    tpix = Image.new("RGBA", pix.size)
    tpix.paste(pix, None, mask)
    bf = BytesIO()
    tpix.save(bf, "png")

    return bf.getvalue()
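
To illustrate the underlying idea without Pillow, here is a rough sketch that attaches a mask to raw RGB samples as a straight alpha channel. It is not what `Image.paste` does internally, only a simplified illustration. It assumes 8-bit RGB samples without alpha and a mask of the same pixel dimensions with one byte per pixel. The name `merge_alpha` is hypothetical.

```python
def merge_alpha(rgb, mask):
    """Interleave RGB samples with mask bytes to form RGBA samples.

    Hypothetical helper for illustration. Assumes 3 bytes per pixel in
    `rgb` and 1 byte per pixel in `mask`, both covering the same pixels.
    """
    if len(rgb) != 3 * len(mask):
        raise ValueError("rgb and mask do not describe the same number of pixels")
    out = bytearray(4 * len(mask))
    out[0::4] = rgb[0::3]  # red
    out[1::4] = rgb[1::3]  # green
    out[2::4] = rgb[2::3]  # blue
    out[3::4] = mask       # alpha taken directly from the mask
    return bytes(out)
```

The size check matters in practice: a mask whose dimensions differ from the base image would need to be resized first, which this sketch does not attempt.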

@JorjMcKie
Collaborator

Great, thanks.
I regard this whole business as being outside the scope of PyMuPDF. The base C library, MuPDF, also offers no solution here.
The image extraction scripts in the PyMuPDF-Utilities repo are examples, not solutions for which I can take responsibility.
Nevertheless, I will test your solution a bit more and then modify the script accordingly.
Why reinvent the wheel when we have Pillow?

@mc373906408
Author

OK

@JorjMcKie
Collaborator

I have changed the example scripts extract-imga.py and extract-imgb.py to make use of PIL/Pillow in case of transparent images.
