Docx4JSRUtil - Search and replace util for Docx4J


The Docx4JSearchAndReplaceUtil library helps you search and replace text inside .docx documents parsed by Docx4J.

What it does


When you write a placeholder such as ${NAME} inside a .docx document, there is no guarantee that the string ends up "as is" in the underlying XML. Word may split it across several runs, with style markup in between.

As a result, ${NAME} is often stored as separate pieces such as $ + { + NAME + }, each in its own Text object (a Docx4J type). A simple string replace on a single Text object therefore cannot find the placeholder.
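
The effect can be reproduced without Docx4J. The following is a minimal sketch in which plain strings stand in for Text objects; the class and method names are hypothetical and not part of the library.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SplitPlaceholderDemo {

    // Replaces the placeholder in each fragment separately, the way a
    // String.replace on every single Text object would.
    static List<String> naiveReplace(List<String> fragments, String placeholder, String value) {
        return fragments.stream()
                .map(fragment -> fragment.replace(placeholder, value))
                .collect(Collectors.toList());
    }

    // Joins all fragments first; only in this view is the placeholder
    // visible as one contiguous string.
    static String joinedReplace(List<String> fragments, String placeholder, String value) {
        return String.join("", fragments).replace(placeholder, value);
    }
}
```

With the fragments "Hello ", "$", "{", "NAME", "}", "!", naiveReplace leaves every fragment unchanged because no single fragment contains ${NAME}, while joinedReplace finds it. The joined view alone loses the mapping back to the individual fragments, which is the part the library has to handle.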

Docx4JSRUtil solves this problem.

PS: This is not yet available on Maven Central. (TODO)

Usage:

WordprocessingMLPackage template = WordprocessingMLPackage.load(new FileInputStream(new File("document.docx")));

// replace all placeholders in the document (Map.of requires Java 9 or newer)
Docx4JSRUtil.searchAndReplace(template, Map.of(
        "${NAME}", "Philipp",
        "${SURNAME}", "Schuster",
        "${PLACE_OF_BIRTH}", "GERMANY"
));
// that's it; you can now save `template`, export it as PDF or whatever you want to do

How it works internally

  1. Retrieve the list of all Text-objects (in document order) from Docx4J.
  2. Create a "complete string" by concatenating the text of all Text-objects.
  3. Build lookup information that maps an index in the complete string to the corresponding Text-object.
  4. Search for the placeholders in the complete string.
  5. Build a List<ReplaceCommand> ordered from the last index in the complete string to the first. This order is important: replacing from back to front does not invalidate the indices of the remaining ReplaceCommands.
  6. Determine which Text-objects have to change and perform the actual replacement.
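
The steps above can be sketched in plain Java. This is an illustration under stated assumptions, not the library's actual code: StringBuilder instances stand in for Docx4J Text objects, the ReplaceCommand class here is a hypothetical stand-in with assumed fields, and matches are assumed not to overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class SearchAndReplaceSketch {

    /** Hypothetical stand-in: replace [start, end) of the complete string with value. */
    static final class ReplaceCommand {
        final int start;
        final int end;
        final String value;

        ReplaceCommand(int start, int end, String value) {
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    /** Mutates the fragments in place. Assumes matches do not overlap. */
    static void searchAndReplace(List<StringBuilder> fragments, Map<String, String> replacements) {
        // Steps 1-3: concatenate all fragments and record where each one starts.
        int[] offsets = new int[fragments.size() + 1];
        StringBuilder complete = new StringBuilder();
        for (int i = 0; i < fragments.size(); i++) {
            offsets[i] = complete.length();
            complete.append(fragments.get(i));
        }
        offsets[fragments.size()] = complete.length();

        // Step 4: find every occurrence of every placeholder in the complete string.
        List<ReplaceCommand> commands = new ArrayList<>();
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            String placeholder = entry.getKey();
            if (placeholder.isEmpty()) {
                continue; // an empty placeholder would match everywhere
            }
            int idx = complete.indexOf(placeholder);
            while (idx >= 0) {
                commands.add(new ReplaceCommand(idx, idx + placeholder.length(), entry.getValue()));
                idx = complete.indexOf(placeholder, idx + placeholder.length());
            }
        }

        // Step 5: order from last match to first so earlier indices stay valid.
        commands.sort(Comparator.comparingInt((ReplaceCommand c) -> c.start).reversed());

        // Step 6: the first fragment a match touches receives the value;
        // the matched characters in all following fragments are deleted.
        for (ReplaceCommand command : commands) {
            boolean valueWritten = false;
            for (int k = 0; k < fragments.size(); k++) {
                int from = Math.max(command.start, offsets[k]);
                int to = Math.min(command.end, offsets[k + 1]);
                if (from >= to) {
                    continue; // this fragment is not part of the match
                }
                fragments.get(k).replace(from - offsets[k], to - offsets[k],
                        valueWritten ? "" : command.value);
                valueWritten = true;
            }
        }
    }
}
```

Writing the value into the first touched fragment is a design choice of this sketch; this README does not state which Text-object the library uses for the replacement text.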

Limitations for placeholders

Placeholders can be any string pattern; they do not have to use the ${} syntax.
