Need an example of apply watermark #110

Closed
StevenLOL opened this issue Jul 15, 2020 · 15 comments

StevenLOL commented Jul 15, 2020

hi,

Need an example of applying watermark.

I tested with the following code. It doesn't work as I expected: the "watermark" is not in the right location.

import pikepdf  # needed for pikepdf.open() and pikepdf.new() below
from reportlab.pdfgen import canvas
from pikepdf import Array, Dictionary, Name, Pdf, PdfMatrix, Stream


INPUT_PDF="./test3.pdf"
WATERMARK_PDF='./test4.pdf'
OUTPUT_PDF='./test3_test4.pdf'

def generate_watermark(msg,fileName,x=55,y=220):
    c = canvas.Canvas(fileName, bottomup=0)
    c.setFontSize(32)
    c.setFillColorCMYK(0, 0, 0, 0, alpha=0.7)
    c.rect(204, 199, 157, 15, stroke=0, fill=1)
    c.setFillColorCMYK(0, 0, 0, 100, alpha=0.7)
    c.drawString(x, y,msg )
    c.save()
   
# generate two pdfs
generate_watermark('file3',INPUT_PDF,100,100)
generate_watermark('file4',WATERMARK_PDF)


with pikepdf.open(INPUT_PDF) as input_pdf, \
            pikepdf.open(WATERMARK_PDF) as watermark_pdf, \
            open(OUTPUT_PDF, 'wb') as output_stream:
        
        # Create new output PDF
        output_pdf = pikepdf.new()


        for i in range(len(input_pdf.pages)):
            #load and insert watermark
            input_pdf.pages[i].page_contents_add(watermark_pdf.pages[0].Contents)
            input_pdf.pages[i].page_contents_coalesce()

        output_pdf.pages.extend(input_pdf.pages)
        output_pdf.save(output_stream)  # save to a new file

StevenLOL changed the title from "Unable" to "Need an example of apply watermark" on Jul 15, 2020

@jbarlow83
Member

It's usually best to capture a page as a Form XObject:

dictpage = watermark_pdf.pages[0]
page = pikepdf.Page(dictpage)
formx = page.as_form_xobject()

Then attach it to input_pdf as a resource and draw it.

The reason it's not working is likely the alpha channel. Transparency, like so many PDF features, was awkwardly bolted on after the original spec. The alpha channel information goes into a /Resources /ExtGState object (extended graphics state), and the content stream activates it. However, if you merge the watermark's content with the input page, you need to check the input page's resource dictionary for name conflicts and possibly edit the content stream. It may be that reportlab put other interesting details into ExtGState as well.
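
The name-conflict check mentioned above can be sketched without any PDF library. The helper below is hypothetical (it is not part of pikepdf). It assumes the resource names already used on the target page have been collected into a set of strings such as `/GS0`, and it picks the first unused name with a given prefix.

```python
def pick_unused_name(existing_names, prefix="/GS"):
    """Return the first name of the form prefix + integer not in existing_names."""
    index = 0
    while f"{prefix}{index}" in existing_names:
        index += 1
    return f"{prefix}{index}"


# Names assumed to be already present in the page's /ExtGState dictionary.
page_ext_gstate_names = {"/GS0", "/GS1"}
new_name = pick_unused_name(page_ext_gstate_names)
print(new_name)
```

If the watermark's content stream referred to its graphics state under a name that is already taken, the stream would also have to be rewritten to use the new name. Capturing the watermark as a Form XObject sidesteps this, since a form carries its own resource dictionary.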

pmg007 commented Dec 24, 2020

Hello @jbarlow83
I am trying to add a watermark to a pdf.
I have created a watermark PDF using reportlab, and now I am trying to add it to the input PDF using page_contents_add, almost exactly as in Steven's code snippet above. I am having no luck getting it to watermark all pages: the first page, which is a simple one, gets watermarked, while the others do not.
I referred to the following:
#110
#95
#42
#43

All of them try to use page_contents_add but have no luck. Some comments recommend adding the watermark as a resource, or copying it into the Resources section of the input PDF, but I am not sure how to achieve that. The usage is not clear. Could you please provide an example or a sample code snippet/template showing how to do that?
Thanks.
References:
#42 (comment)
#110 (comment)

Also, @StevenLOL, if you were able to get the PDF watermarked, any help would be appreciated!

jbarlow83 commented Dec 24, 2020

Here's an improved, fully functioning example. It is also an improvement over the past versions that combined the content streams. Instead, this one captures the watermark in a "form XObject" (sort of like a sub-page), which isolates it from the rest of the document and makes it less likely to alter the rest of the document.

In a future release I will add add_resource to pikepdf's codebase (or some variation thereof) which will simplify this.

from io import BytesIO  # generate_watermark() writes the PDF to an in-memory buffer

import pikepdf
from pikepdf import Name
from reportlab.pdfgen import canvas

def generate_watermark(msg,xy):
    x, y = xy
    buf = BytesIO()
    c = canvas.Canvas(buf, bottomup=0)
    c.setFontSize(32)
    c.setFillColorCMYK(0, 0, 0, 0, alpha=0.7)
    c.rect(204, 199, 157, 15, stroke=0, fill=1)
    c.setFillColorCMYK(0, 0, 0, 100, alpha=0.7)
    c.drawString(x, y, msg)
    c.save()
    buf.seek(0)
    return buf

wm = generate_watermark('Watermark', (100, 100))
txt = generate_watermark('Document text', (200, 200))

with pikepdf.open(wm) as pdf_wm, pikepdf.open(txt) as pdf_txt:
    wm_page = pikepdf.Page(pdf_wm.pages[0])
    wm_formx = wm_page.as_form_xobject()

    formx = pdf_txt.copy_foreign(wm_formx)
    page = pdf_txt.pages[0]
    formx_page = pikepdf.Page(page)
    formx_name = formx_page.add_resource(formx, Name.XObject)
    
    draw_watermark_content_stream = pdf_txt.make_stream(b'q 1 0 0 1 0 0 cm %s Do Q' % formx_name)

    pdf_txt.pages[0].page_contents_add(draw_watermark_content_stream, prepend=True)
    pdf_txt.save('out.pdf')

pmg007 commented Dec 28, 2020

Thanks @jbarlow83, I will try this out.

pmg007 commented Jan 5, 2021

Hi,
I tried this template out and it has helped to an extent, thanks! For some PDFs, though, I am not able to see the watermark. When I search using Cmd+F, I can see that the text is present at the bottom of every page in the PDF, but it is not visible to the eye. Any hints or guesses as to what might cause that?

jbarlow83 commented Jan 5, 2021 via email

pmg007 commented Jan 6, 2021

Thanks for the quick reply. What do you mean by adding the watermark last? In the code template above, we are adding a watermark to the existing page afterward, or am I mistaken?
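
Content streams on a page are painted in order, so anything drawn later can cover what was drawn earlier. In the example above, `prepend=True` places the watermark stream before the page's own content, which means opaque page content (a scanned image or a filled background, for instance) could hide it. A minimal sketch of the ordering, using plain byte strings as stand-ins for real content streams (the stream contents and the `compose` helper are illustrative assumptions, not pikepdf API):

```python
def compose(page_streams, watermark_stream, watermark_on_top):
    """Return the streams in painting order; later entries paint over earlier ones."""
    if watermark_on_top:
        return list(page_streams) + [watermark_stream]
    return [watermark_stream] + list(page_streams)


# Placeholder streams: PDF comments standing in for real drawing operators.
page_streams = [b"% page background image", b"% page text"]
watermark = b"q 1 0 0 1 0 0 cm /Fm0 Do Q"

underneath = compose(page_streams, watermark, watermark_on_top=False)
on_top = compose(page_streams, watermark, watermark_on_top=True)
```

With pikepdf, drawing the watermark last would presumably correspond to calling `page_contents_add` without `prepend=True`.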

jbarlow83 commented Jan 6, 2021 via email

pmg007 commented Jan 6, 2021

Got it thanks, it mostly works now except in one of the faulty PDFs where it shows a mirror image of the watermark on the next page. Anyways, thanks a lot @jbarlow83
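
One possible cause of a mirrored or displaced watermark, stated here as an assumption rather than a diagnosis of that particular file: the page's own content changes the coordinate system (for example with a flipping `cm` matrix) and never restores it, so anything appended afterwards inherits the change. A common defensive step is to wrap the existing content in a `q` ... `Q` pair (save and restore graphics state) before appending the watermark. The sketch below works on plain byte strings; `isolate_and_append` is a hypothetical helper, not a pikepdf function.

```python
def isolate_and_append(page_content, watermark_ops):
    """Wrap existing page content in q/Q, then append the watermark operators."""
    return b"q\n" + page_content + b"\nQ\n" + watermark_ops


# Illustrative page content that flips the y-axis and never restores the state.
page_content = b"1 0 0 -1 0 792 cm BT /F1 12 Tf (Hello) Tj ET"
watermark_ops = b"q 1 0 0 1 0 0 cm /Fm0 Do Q"

combined = isolate_and_append(page_content, watermark_ops)
print(combined.decode("ascii"))
```

Because the flip now happens between `q` and `Q`, the watermark that follows is drawn in the page's default coordinate system.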

@jbarlow83
Member

I updated the example to demonstrate the use of pikepdf.Page.add_resource, which is available in pikepdf 2.3.0.

pmg007 commented Jan 6, 2021

That's great! Thanks for the new release with add_resource in it!

@basileos

Thank you very much, @jbarlow83 you rock!

@Sofa0908

Thank you so much @jbarlow83, this helped out a lot in my case as well.

However, I'm curious about the purpose and meaning of the b'q 1 0 0 1 0 0 cm %s Do Q' % formx_name in
draw_watermark_content_stream = pdf_txt.make_stream(b'q 1 0 0 1 0 0 cm %s Do Q' % formx_name)

I've checked the documentation on make_stream, but the examples and descriptions there don't come close to what you have written here. Could you point me to where I can read more about this kind of usage and what each part means?

Please and thank you.
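
For readers with the same question: the byte string is a fragment of PDF content stream syntax, which is described in the PDF specification (ISO 32000) rather than in pikepdf's documentation. `q` saves the graphics state, `cm` multiplies the current transformation matrix by the six numbers before it (`1 0 0 1 0 0` is the identity, so nothing moves), `Do` paints the named XObject, and `Q` restores the saved state. A small sketch of building such a fragment with a different matrix follows; the `draw_xobject` helper and the name `/Fm0` are illustrative assumptions.

```python
def draw_xobject(name, scale=1.0, tx=0.0, ty=0.0):
    """Build content stream bytes that paint the XObject `name`.

    The matrix [scale 0 0 scale tx ty] scales uniformly, then translates.
    """
    matrix = f"{scale:g} 0 0 {scale:g} {tx:g} {ty:g}"
    return f"q {matrix} cm {name} Do Q".encode("ascii")


identity = draw_xobject("/Fm0")
half_size_shifted = draw_xobject("/Fm0", scale=0.5, tx=72, ty=144)
print(identity)
print(half_size_shifted)
```

Translations are in PDF user space units, which default to 1/72 inch, so `tx=72, ty=144` moves the form one inch right and two inches up in an unflipped coordinate system.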

jbarlow83 commented Apr 20, 2021 via email

@jbarlow83
Member

This is now officially implemented: https://pikepdf.readthedocs.io/en/latest/topics/overlays.html
