Regular expressions that expand Adobe Acrobat's search-and-redact functions. This collection adds patterns for:
- P.O. boxes
- U.S. address number and street names
- ZIP and ZIP+4 codes
- State of Alaska trial court case numbers
Adobe Acrobat's redact tool can search for patterns, but only a few are built in. The built-in English (US) patterns are:
- Social Security numbers
- email addresses
- phone numbers
- dates
- credit card numbers
Anyone unfortunate enough to have a light understanding of regex can add patterns. I have such an understanding. Read on and you might too.
📂 Open the search redact patterns file (after backing it up)
📝 Create or edit a set
💾 Save the edited file
🔏 Restart Acrobat and redact away
Rick Borstein wrote about creating and using custom redaction patterns on the now-defunct blog Acrolaw. A good chunk of this walk-through duplicates Rick's great primer, which seems to have been carried over to Adobe's general blog.
Redaction patterns are stored in XML files. These files begin with the prolog <?xml ... ?> and end with the closing tag </asf>.
Before opening or editing the search redact patterns file, make a backup copy. You might save the original, unedited version as SearchRedactPatterns-backup.xml.
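If you'd rather script the backup, here is a minimal Python sketch. The function name back_up_patterns is made up for illustration; the only thing taken from this walk-through is the SearchRedactPatterns-backup.xml naming. It refuses to overwrite an existing backup, on the assumption that the first copy is the pristine one.

```python
import shutil
from pathlib import Path


def back_up_patterns(patterns_file: Path) -> Path:
    """Copy the patterns file to <name>-backup.xml in the same folder.

    Hypothetical helper: the name and behavior are illustrative only.
    """
    backup = patterns_file.with_name(
        patterns_file.stem + "-backup" + patterns_file.suffix
    )
    if backup.exists():
        # Never overwrite an earlier backup; it may be the only unedited copy.
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(patterns_file, backup)  # copy2 also preserves timestamps
    return backup
```

Call it with the full path to your SearchRedactPatterns.xml; it returns the path of the new backup file.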
macOS:
/Users/<username>/Library/Preferences/Acrobat/<version>/Redaction/<locale>/SearchRedactPatterns.xml
Windows Vista and newer:
\Users\<username>\AppData\Roaming\Adobe\Acrobat\<version>\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
The AppData folder is hidden by default. To navigate to it, type %AppData% into the File Explorer (Windows Explorer) address bar, which opens AppData\Roaming directly.
Windows XP:
\Documents and Settings\<username>\Application Data\Adobe\Acrobat\<version>\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
Note that there is also a SearchRedactPatterns.xml one level up, in ...\<version>\Redaction\, and changing that file won't make your patterns appear.
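The locations above can be assembled in code. This is a sketch, not an official lookup: patterns_path is a hypothetical helper, it assumes the folder layouts listed here are still accurate for your Acrobat version, and it relies on the %AppData% environment variable on Windows (which covers both the Vista-and-newer and XP layouts).

```python
import os
import sys
from pathlib import Path


def patterns_path(version: str, locale: str, platform: str = sys.platform) -> Path:
    """Return the expected per-locale patterns file (hypothetical helper).

    version is the folder name such as "DC" or "11"; locale is e.g. "ENU".
    """
    if platform == "darwin":
        root = Path.home() / "Library" / "Preferences" / "Acrobat" / version
    elif platform == "win32":
        # %AppData% points at ...\AppData\Roaming (Vista and newer)
        # or ...\Application Data (XP).
        root = Path(os.environ["APPDATA"]) / "Adobe" / "Acrobat" / version / "Preferences"
    else:
        raise ValueError(f"no known patterns location for platform {platform!r}")
    # The <locale> folder matters: the copy directly in Redaction is not the one to edit.
    return root / "Redaction" / locale / "SearchRedactPatterns.xml"
```

For example, patterns_path("DC", "ENU") builds the path for the English (US) locale of Acrobat DC on the current machine. It only builds the path; it does not check that the file exists.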
Locales
For each version of Acrobat on the machine, there is a pattern file for each locale that has been used. For example, if my computer has both Acrobat DC and Acrobat XI, /Adobe/Acrobat will contain both /DC and /11. If I have used Acrobat XI to search in Japanese and United States locales, then /11 will contain /Redaction/JPN/SearchRedactPatterns.xml and /Redaction/ENU/SearchRedactPatterns.xml.
To change locale, or to force Acrobat to create the SearchRedactPatterns.xml file you want to edit, open Acrobat, go to Preferences > Documents, and select your desired locale from the dropdown menu under the Redaction heading.
You can also select "Choose different locale for patterns" in the redact search window.
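To see which version and locale folders already have a patterns file, something like the following sketch may help. installed_locales is a hypothetical helper, and acrobat_root is assumed to be the folder that holds the version folders (such as /DC and /11). It skips the copy that sits directly in the Redaction folder.

```python
from pathlib import Path


def installed_locales(acrobat_root: Path) -> dict[str, list[str]]:
    """Map each version folder to the locales that have a patterns file."""
    found: dict[str, list[str]] = {}
    for path in acrobat_root.rglob("SearchRedactPatterns.xml"):
        if path.parent.parent.name != "Redaction":
            continue  # not inside a <locale> folder, so not the file to edit
        version = path.relative_to(acrobat_root).parts[0]
        found.setdefault(version, []).append(path.parent.name)
    return {version: sorted(locales) for version, locales in sorted(found.items())}
```

With the example from this section, the result would contain an entry for 11 listing ENU and JPN.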
Each pattern is stored as a "set" with the following syntax:
<set name="Entry4">
<str name="displayName">
<val>Email Addresses</val>
</str>
<str name="regEx" translate="no">
<val>([a-zA-Z0-9_])([a-zA-Z0-9_\-\.])*@([a-zA-Z0-9\-])+\.([a-zA-Z\.]+)</val>
</str>
<str name="examples">
<val>This pattern will search for email addresses.
For example:
John.Doe@acme.com
John_Doe_1234@acme.gov
j-doe@marketing.acme.net</val>
</str>
</set>
Simply copy a set, such as the one above, and change the entry number, pattern name, pattern, and description.
Pretty self-explanatory. Don't change the file's name or location. Do have a backup!
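If you prefer generating a set over hand-editing one, here is a sketch using Python's standard xml.etree.ElementTree. The make_set helper, the "ZIP Codes" display name, and the ZIP pattern itself are illustrative assumptions, not necessarily the patterns shipped with this project, and I haven't verified that Acrobat's engine accepts every construct in it. The element and attribute names mirror the email set shown above.

```python
import xml.etree.ElementTree as ET


def make_set(entry: str, display_name: str, regex: str, examples: str) -> ET.Element:
    """Build a <set> element shaped like the built-in entries (hypothetical helper)."""
    new_set = ET.Element("set", name=entry)
    fields = (("displayName", display_name), ("regEx", regex), ("examples", examples))
    for field, text in fields:
        attrs = {"name": field}
        if field == "regEx":
            attrs["translate"] = "no"  # mirrors the built-in email entry
        val = ET.SubElement(ET.SubElement(new_set, "str", attrs), "val")
        val.text = text
    return new_set


# Illustrative pattern only: five digits, optionally followed by -dddd.
zip_set = make_set(
    "Entry5",
    "ZIP Codes",
    r"[0-9]{5}(-[0-9]{4})?",
    "This pattern will search for ZIP and ZIP+4 codes.\nFor example:\n99501\n99501-1234",
)
ET.indent(zip_set)  # pretty-print; requires Python 3.9+
print(ET.tostring(zip_set, encoding="unicode"))
```

The printed fragment can be pasted into the patterns file next to the existing sets. One reason to serialize this way is that characters such as < and & in a pattern or description get escaped for you.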
After saving the file, restart Acrobat. Open a PDF and choose the redact tool. When you search for a pattern, the dropdown menu now includes your new options!
- Save a copy of the original XML file before you begin tinkering.
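- The only attribute value you need to change is the set name (e.g., "Entry5"). Leave the "displayName", "regEx", and "examples" attributes as they are and edit only the text inside each <val>.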
- In Acrobat 9 and up, different patterns exist for different countries and languages ("locales"). See Locales above.
- I thought that Acrobat used a Perl regex engine and that you might be able to switch it to Java. This was based on documentation and forums on other Adobe products, specifically InDesign and ColdFusion. That ColdFusion page has a feature comparison table of the two engines. Another user said it uses JavaScript RegExp. If the differences matter to you, then you're probably beyond my help.
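Whatever engine Acrobat uses, it can save a restart or two to sanity-check a pattern before adding it. The sketch below runs the built-in email pattern against its own examples using Python's re module. Python's engine is not Acrobat's, so treat a match here as a rough check and not as proof the pattern behaves the same in Acrobat.

```python
import re

# The built-in email pattern, copied from the set shown earlier.
EMAIL = re.compile(r"([a-zA-Z0-9_])([a-zA-Z0-9_\-\.])*@([a-zA-Z0-9\-])+\.([a-zA-Z\.]+)")

samples = [
    "John.Doe@acme.com",
    "John_Doe_1234@acme.gov",
    "j-doe@marketing.acme.net",
    "not an address",
]

for text in samples:
    match = EMAIL.search(text)
    # Show what would be redacted, or note that nothing matched.
    print(f"{text!r}: {match.group(0) if match else 'no match'}")
```

Swap in your own pattern and a few strings that should and should not match.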