Skip to content

jwatchdog/pdf-util

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

37 Commits
 
 
 
 
 
 
 
 

Repository files navigation

PDF Compare Utility

MVN Dependency:

<dependency>
   <groupId>com.testautomationguru.pdfutil</groupId>
   <artifactId>pdf-util</artifactId>
   <version>0.0.2</version>
</dependency>

Getting pdfutil.jar

Build

Requires maven.

git clone https://github.com/jwatchdog/pdf-util.git
cd pdf-util
mvn package

Usage

  • Using command line
java -jar target/pdf-util.jar file1.pdf file2.pdf [Optional:image-destination-path [Optional: create PDF, default false]]

Examples

  • To generate diffs as PNG files
java -jar target/pdf-util.jar file1.pdf file2.pdf c:/Users/jwatchdog/output
  • To generate a PDF with diffs
java -jar target/pdf-util.jar file1.pdf file2.pdf c:/Users/jwatchdog/output true

Coding Examples

  • To get page count
import com.testautomationguru.utility.PDFUtil;

PDFUtil pdfUtil = new PDFUtil();
pdfUtil.getPageCount("c:/sample.pdf"); //returns the page count
  • To get page content as plain text
//returns the pdf content - all pages
pdfUtil.getText("c:/sample.pdf");

// returns the pdf content from page number 2
pdfUtil.getText("c:/sample.pdf",2);

// returns the pdf content from page number 5 to 8
pdfUtil.getText("c:/sample.pdf", 5, 8);

  • To extract attached images from PDF
//set the path where we need to store the images
 pdfUtil.setImageDestinationPath("c:/imgpath");
 pdfUtil.extractImages("c:/sample.pdf");

// extracts &amp; saves the pdf content from page number 3
pdfUtil.extractImages("c:/sample.pdf", 3);

// extracts &amp; saves the pdf content from page 2
pdfUtil.extractImages("c:/sample.pdf", 2, 2);

  • To store PDF pages as images
//set the path where we need to store the images
 pdfUtil.setImageDestinationPath("c:/imgpath");
 pdfUtil.savePdfAsImage("c:/sample.pdf");
  • To compare PDF files in text mode (faster – But it does not compare the format, images etc in the PDF)
String file1="c:/files/doc1.pdf";
String file1="c:/files/doc2.pdf";

// compares the pdf documents &amp; returns a boolean
// true if both files have same content. false otherwise.
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);
  • To exclude certain text while comparing PDF files in text mode
String file1="c:/files/doc1.pdf";
String file1="c:/files/doc2.pdf";

//pass all the possible texts to be removed before comparing
pdfutil.excludeText("1998", "testautomation");

//pass regex patterns to be removed before comparing
// \\d+ removes all the numbers in the pdf before comparing
pdfutil.excludeText("\\d+");

// compares the pdf documents &amp; returns a boolean
// true if both files have same content. false otherwise.
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);
  • To compare PDF files in Visual mode (slower – compares PDF documents pixel by pixel – highlights pdf difference & store the result as image)
String file1="c:/files/doc1.pdf";
String file1="c:/files/doc2.pdf";

// compares the pdf documents &amp; returns a boolean
// true if both files have same content. false otherwise.
// Default is CompareMode.TEXT_MODE
pdfUtil.setCompareMode(CompareMode.VISUAL_MODE);
pdfUtil.compare(file1, file2);

// compare the 3rd page alone
pdfUtil.compare(file1, file2, 3, 3);

// compare the pages from 1 to 5
pdfUtil.compare(file1, file2, 1, 5);

//if you need to store the result
pdfUtil.highlightPdfDifference(true);
pdfUtil.setImageDestinationPath("c:/imgpath");
pdfUtil.compare(file1, file2);
  • For example, I have 2 PDF documents which have exact same content except the below differences in the charts. pdf1 pdf2

The difference is shown as diff

Copyright Information Outdated

This repo contains changes originally authored by diptman as well as the base pdf-util from uselvvi. I, jwatchdog, have merely implemented stuff I needed on top of that. Work in progress.

About

PDF Compare Utility

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%