Ghostscript - completely REMOVE METADATA from pdf files 2 #117

Open
Geo-Van opened this issue Sep 27, 2023 · 2 comments
Geo-Van commented Sep 27, 2023

Hello,
This post is similar to #114, but we have revised it based on what we learned after examining the matter and getting advice from several experts.
We are posting it here to confirm that we understand the matter correctly, because we are new to Ghostscript.
We have simplified the whole matter to the following, and we would appreciate your comments:

I have a PDF file (named input.pdf) and I want to convert it to a new PDF file (named output.pdf) using Ghostscript.
The only purpose of the conversion is to remove all of its old metadata (both classical and XMP).
So I run the command:

gsc.exe -o output.pdf -sDEVICE=pdfwrite input.pdf pdfmark.txt

With the following pdfmark.txt:
[ /Title ()
/Author ()
/Subject ()
/Creator ()
/ModDate ()
/Producer ()
/Keywords ()
/CreationDate ()
/DOCINFO pdfmark
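
For repeatable runs, the command and the pdfmark.txt above could be driven from a small script. The following is only a sketch under stated assumptions: the function names (`build_pdfmark`, `build_command`, `strip_metadata`) are hypothetical, and the executable name `gsc.exe` is taken as-is from the command above. It adds no Ghostscript options beyond the ones already shown, and it says nothing about whether those options are sufficient.

```python
import subprocess
from pathlib import Path

# The eight Info keys blanked in pdfmark.txt above, in the same order.
INFO_KEYS = ["Title", "Author", "Subject", "Creator",
             "ModDate", "Producer", "Keywords", "CreationDate"]


def build_pdfmark(keys=INFO_KEYS):
    """Return pdfmark text that sets every listed key to an empty string."""
    lines = [f"/{key} ()" for key in keys]
    lines[0] = "[ " + lines[0]          # open the pdfmark with "["
    lines.append("/DOCINFO pdfmark")    # close it as a DOCINFO pdfmark
    return "\n".join(lines) + "\n"


def build_command(gs_exe, src, dst, pdfmark_path):
    """Return the argument list for the command shown in the post."""
    return [gs_exe, "-o", str(dst), "-sDEVICE=pdfwrite",
            str(src), str(pdfmark_path)]


def strip_metadata(gs_exe, src, dst, pdfmark_path="pdfmark.txt"):
    """Write the pdfmark file, then run Ghostscript on src to produce dst."""
    Path(pdfmark_path).write_text(build_pdfmark(), encoding="ascii")
    subprocess.run(build_command(gs_exe, src, dst, pdfmark_path), check=True)
```

Calling `strip_metadata("gsc.exe", "input.pdf", "output.pdf")` would then reproduce the command above; `check=True` makes the script raise an error if Ghostscript exits with a failure code.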

Could you please confirm that the above command does what I want, namely:
it removes all of the old metadata (classical and XMP), so that the newly created file has its own new metadata with no trace of the old file's metadata.
(I know that the only exceptions are that the Producer will be Ghostscript and that the Creation Date is set when the converted file is created and cannot be changed. We are happy with this.)

We would very much appreciate your comments on whether the above command does the job we want.
Thank you very much!

@jamie-lemon
Collaborator

Hi there,

I think you could run the command on the PDF you want, then use a service to quickly check the file's metadata, e.g. https://app.pdf.co/request-tester (you'll have to create an account first, but it's easy and you get free credits). Just run the PDF info command on the file (see https://apidocs.pdf.co/02-pdf-info-reader) and it will tell you what it sees.

Hope this helps.

@Geo-Van
Author

Geo-Van commented Sep 28, 2023

Thank you.
I tried it with ExifTool, and it seems to do the job: no metadata from the old PDF is visible.
So I think that the above Ghostscript command does the job we want.
Any other comments regarding the Ghostscript command used, please?
Will the command do the job we want?
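
The ExifTool check described above could also be automated by comparing the metadata of input.pdf and output.pdf tag by tag. This is a rough sketch, not a definitive test: the helper names (`read_metadata`, `surviving_tags`) are hypothetical, the list of tag names is an assumption about how ExifTool labels PDF metadata, and the comparison only covers what ExifTool reports, so it cannot prove that nothing else survives inside the file.

```python
import json
import subprocess

# Assumed ExifTool tag names for the usual PDF metadata fields.
CHECKED_TAGS = ["Title", "Author", "Subject", "Keywords",
                "Creator", "Producer", "CreateDate", "ModifyDate"]


def read_metadata(path, exiftool="exiftool"):
    """Return the tags ExifTool reports for one file, as a dict."""
    result = subprocess.run([exiftool, "-j", str(path)],
                            capture_output=True, text=True, check=True)
    # With -j, ExifTool prints a JSON array with one object per file.
    return json.loads(result.stdout)[0]


def surviving_tags(old, new, tags=CHECKED_TAGS):
    """List the tags whose non-empty old value is unchanged in the new file."""
    survivors = []
    for tag in tags:
        old_value = old.get(tag)
        if old_value not in (None, "") and new.get(tag) == old_value:
            survivors.append(tag)
    return survivors
```

For example, `surviving_tags(read_metadata("input.pdf"), read_metadata("output.pdf"))` returning an empty list would be consistent with the result reported above, at least for the tags that were checked.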
