Automated Document Template Processing in R

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md
JonathanConrad98/docket

Introduction

The 'docket' package is designed to streamline document creation by allowing users to populate data from an R environment into templates created in third-party software such as 'Microsoft Word'.

Guide:

  1. Setup: Create a template document and insert flags to be populated with R data.

  2. Create a Dictionary: Call getDictionary("/path/to/template.docx") to create an empty dictionary.

  3. Fill Dictionary: Fill in the replacement dictionary values with data from R environment.

  4. Validate Dictionary (Optional): Use checkDictionary(dictionary) to ensure dictionary meets requirements.

  5. Process Document: Call docket("/path/to/template.docx", dictionary, "/path/to/output.docx") to create the populated document.

  6. Create Multiple Documents: To create multiple output documents from one template, use the "batch" versions of the docket functions to improve processing efficiency.

Setup

Create a template in a third-party word processor. Identify where R data should be inserted and add a flag by enclosing a string with guillemet characters (Example: «FirstName», «LastName», «Address»).
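
To make the flag convention concrete, here is a minimal base R sketch that finds guillemet-delimited flags in a plain string. The `template_text` value is a made-up stand-in for a document's text, and the regular expression is an assumption for illustration; it is not necessarily how 'docket' scans a .docx file.

```r
# Hypothetical text standing in for the contents of a template
# (\u00ab and \u00bb are the escapes for the guillemets « and »)
template_text <- "Dear \u00abFirstName\u00bb \u00abLastName\u00bb, your address is \u00abAddress\u00bb."

# Match an opening guillemet, one or more non-guillemet characters,
# and a closing guillemet
pattern <- "\u00ab[^\u00ab\u00bb]+\u00bb"
flags <- unique(regmatches(template_text, gregexpr(pattern, template_text))[[1]])

print(flags)
```

In practice getDictionary() does this discovery step for you; the sketch only shows what counts as a flag.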

Included Example Template


library(docket)

template_path <- system.file("template_document", "Template.docx", package="docket")

print(template_path)

Formatting and Repeated Values

For performance reasons, use only one flag per distinct piece of data in a template, even when that data appears in several places or with different formatting.

Incorrect

  1. «Today» <- as.character(Sys.Date())
  2. «Today_2» <- as.character(Sys.Date())
  3. «Today_Italics» <- as.character(Sys.Date())
  4. «Author» <- "John Doe"
  5. «Author_Bold» <- "John Doe"
  6. «Author_Italics» <- "John Doe"

Correct

  1. «Today» <- as.character(Sys.Date())
  2. «Author» <- "John Doe"

'docket' replacement values inherit the formatting of the flags they replace and can be duplicated across a document.
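
The reuse of a single flag can be sketched with plain string substitution in base R. The helper `fill_flags` and the sample text are hypothetical, and the sketch only shows that one dictionary entry can fill every occurrence of its flag; it does not reproduce the formatting inheritance that 'docket' performs on a real document.

```r
# Hypothetical helper: replace every occurrence of each flag in a string
fill_flags <- function(text, flags, values) {
  for (i in seq_along(flags)) {
    # fixed = TRUE treats the flag as literal text, not a regular expression
    text <- gsub(flags[i], values[i], text, fixed = TRUE)
  }
  text
}

# Made-up text in which the Author flag appears twice
text <- "Report by \u00abAuthor\u00bb. Approved by \u00abAuthor\u00bb on \u00abToday\u00bb."
flags <- c("\u00abAuthor\u00bb", "\u00abToday\u00bb")
values <- c("John Doe", "2024-01-31")

filled <- fill_flags(text, flags, values)
print(filled)
```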

library(knitr)
template_screenshot <- system.file("template_document", "Document_Template.png", package="docket")
include_graphics(template_screenshot)

Note how «Author» and «Date» appear twice with different formatting. To ensure formatting propagates into the final document, format the entire flag - including the guillemets («») - in the template.

Create a Dictionary

A dictionary is a two-column data frame: the first column contains the flags identified in the template document, and the second contains the data that replaces each flag. Call getDictionary() with a template's file path to generate that document's dictionary.

Example

myDictionary <- getDictionary(template_path)

print(myDictionary)
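
For orientation, a dictionary of this shape could also be built by hand, as in the base R sketch below. The flags are invented, and the second column's name (`value`) is an assumption; the text above only specifies that the first column holds the flags.

```r
# Hand-built stand-in for the kind of data frame getDictionary() returns
dictionary <- data.frame(
  flag = c("\u00abAuthor\u00bb", "\u00abToday\u00bb"),  # flags as they appear in the template
  value = c("", ""),                                    # column name "value" is an assumption
  stringsAsFactors = FALSE
)

str(dictionary)
```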

Fill Dictionary

getDictionary() returns a dictionary whose first column lists the flags identified in the template document and whose second column is empty. Assign a value to the second column of each row to replace the corresponding flag in the finished document.

myDictionary <- getDictionary(template_path)

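# Set dictionary values (rows follow the order of the flags in the dictionary)
# Note: USERNAME is set on Windows; Unix-like systems use USER instead
myDictionary[1,2] <- Sys.getenv("USERNAME") # Author name
myDictionary[2,2] <- as.character(Sys.Date()) # Date report created
myDictionary[3,2] <- 123
myDictionary[4,2] <- 456
myDictionary[5,2] <- 789
myDictionary[6,2] <- sum(as.numeric(myDictionary[3:5,2])) # Total of rows 3 to 5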

print(myDictionary)
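
Filling by row index depends on the order in which flags were found. A possible alternative, sketched below in base R, looks each flag up by name. The helper `set_value`, the flag names, and the hand-built data frame are all hypothetical stand-ins, not part of 'docket'.

```r
# Hand-built stand-in for a dictionary; the flags are made up
dictionary <- data.frame(
  flag = c("\u00abAuthor\u00bb", "\u00abItemA\u00bb", "\u00abItemB\u00bb", "\u00abTotal\u00bb"),
  value = "",
  stringsAsFactors = FALSE
)

# Hypothetical helper: set the replacement for one flag, looked up by name
set_value <- function(dict, flag_name, value) {
  flag <- paste0("\u00ab", flag_name, "\u00bb")
  row <- match(flag, dict[[1]])
  if (is.na(row)) stop("Flag not found in dictionary: ", flag)
  dict[row, 2] <- as.character(value)  # store every value as text
  dict
}

items <- c(ItemA = 123, ItemB = 456)
dictionary <- set_value(dictionary, "Author", "John Doe")
dictionary <- set_value(dictionary, "ItemA", items[["ItemA"]])
dictionary <- set_value(dictionary, "ItemB", items[["ItemB"]])
dictionary <- set_value(dictionary, "Total", sum(items))

print(dictionary)
```

Looking flags up by name fails loudly when a flag is missing, instead of silently writing a value into the wrong row.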

OPTIONAL: Verify Dictionary Meets Requirements

Use checkDictionary() to ensure that the dictionary meets the requirements to be processed into the document.

Example

#check the dictionary to ensure it is valid
print(checkDictionary(myDictionary))

Dictionary Requirements

  1. The dictionary is a data frame consisting of 2 columns
  2. The first column is named "flag"
  3. Each entry in the first column is a string surrounded by guillemets («»)

A dictionary does not have to be generated by getDictionary(), but dictionaries that do not meet these requirements will be rejected.
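
The three requirements above can be expressed as a small base R check. This is a sketch under assumptions: `meets_requirements` is a hypothetical helper written for illustration, and checkDictionary() may apply additional or different tests.

```r
# Hypothetical check mirroring the three listed requirements
meets_requirements <- function(dict) {
  is.data.frame(dict) &&
    ncol(dict) == 2 &&                                         # requirement 1
    names(dict)[1] == "flag" &&                                # requirement 2
    all(grepl("^\u00ab.+\u00bb$", as.character(dict[[1]])))    # requirement 3
}

good <- data.frame(flag = c("\u00abAuthor\u00bb", "\u00abToday\u00bb"),
                   value = c("John Doe", "2024-01-31"),
                   stringsAsFactors = FALSE)

# The second entry is missing its guillemets
bad <- data.frame(flag = c("\u00abAuthor\u00bb", "Today"),
                  value = c("John Doe", "2024-01-31"),
                  stringsAsFactors = FALSE)

print(meets_requirements(good))
print(meets_requirements(bad))
```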

Null Values

Any flag that is not assigned a value is left unchanged in the output document. To represent a null value, insert it as a string.

Incorrect

  1. «Character» <- "John Doe"
  2. «Title» <- NA
  3. «Items» <- NULL

Correct

  1. «Character» <- "John Doe"
  2. «Title» <- "NA"
  3. «Items» <- "NULL"
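
One way to apply this rule before filling a dictionary is to convert missing values to text first. The base R sketch below uses a hypothetical helper, `as_dictionary_text`, and made-up values; it handles single values only.

```r
# Hypothetical helper: turn a single raw value into text that is safe to
# store in a dictionary
as_dictionary_text <- function(x) {
  if (is.null(x)) return("NULL")
  if (length(x) == 1 && is.na(x)) return("NA")
  as.character(x)
}

# Made-up raw values, including a missing value and a NULL
raw_values <- list(Character = "John Doe", Title = NA, Items = NULL)
clean_values <- vapply(raw_values, as_dictionary_text, character(1))

print(clean_values)
```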

Process Document

Use docket() to replace the identified flags in the template with R data.

Example

output_path <- paste0(dirname(template_path), "/output document.docx")

# If docket accepts the input dictionary as valid, create a filled template

if (isTRUE(checkDictionary(myDictionary))) {
  docket(template_path, myDictionary, output_path)
}

print(file.exists(output_path))

OPTIONAL: Inspect Output

Open the output document in a word processor such as 'Microsoft Word', 'Wordpad', or 'OpenOffice Writer' and inspect its contents.

print(output_path)
output_screenshot <- system.file("template_document", "Processed_Document.png", package="docket")
include_graphics(output_screenshot)

Generating Multiple Documents

To create multiple populated documents at once, use the 'batch' versions of the docket functions to significantly improve processing efficiency.

getBatchDictionary()

A batch dictionary is similar to a normal dictionary in that each row represents a flag value to be replaced. However, batch dictionaries also include the name of the output file in row 1, and additional columns containing the values for each document to be populated.

# Path to the sample template file included in the package
template_path <- system.file("batch_document", "batchTemplate.docx", package="docket")
output_paths <- as.list(paste0(dirname(template_path), paste0("/batch document", 1:5, ".docx")))

# Create a batch dictionary by calling getBatchDictionary() on the sample template file
myBatchDictionary <- getBatchDictionary(template_path, output_paths)

checkBatchDictionary()

checkBatchDictionary() checks that the batch dictionary meets the requirements of a normal dictionary and contains the output file names in row 1. Fill in the batch dictionary first, then validate it.

myBatchDictionary[2,2:ncol(myBatchDictionary)] <- Sys.getenv("USERNAME") #Author name
myBatchDictionary[3,2:ncol(myBatchDictionary)] <- as.character(Sys.Date())
myBatchDictionary[4,2:ncol(myBatchDictionary)] <- 123
myBatchDictionary[5,2:ncol(myBatchDictionary)] <- 456
myBatchDictionary[6,2:ncol(myBatchDictionary)] <- 789
myBatchDictionary[7,2:ncol(myBatchDictionary)] <- sum(as.numeric(myBatchDictionary[4:6,2]))

print(checkBatchDictionary(myBatchDictionary))
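
The example above writes the same value into every document column. To give each document its own values, a vector with one element per document can be assigned across a row, as in the base R sketch below. The data frame is a hand-built stand-in: its column names and the label in the first cell of row 1 are made up, since getBatchDictionary() determines the real layout.

```r
# Hand-built stand-in for a batch dictionary with three documents
batch <- data.frame(
  flag = c("output_file", "\u00abAuthor\u00bb", "\u00abTotal\u00bb"),  # "output_file" is a made-up label
  doc1 = c("report_1.docx", "", ""),
  doc2 = c("report_2.docx", "", ""),
  doc3 = c("report_3.docx", "", ""),
  stringsAsFactors = FALSE
)

doc_cols <- 2:ncol(batch)

# One value per document, assigned left to right across the row
batch[2, doc_cols] <- c("Ann", "Ben", "Cy")
batch[3, doc_cols] <- as.character(c(10, 20, 30))

print(batch)
```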

batchDocket()

Call batchDocket() with the template path and the batch dictionary as arguments.

# Create multiple populated documents
if (isTRUE(checkBatchDictionary(myBatchDictionary))) {
  batchDocket(template_path, myBatchDictionary)
}

# Verify documents exist
for (i in seq_along(output_paths)) {
  if (file.exists(output_paths[[i]])) {
    print(paste("docket", i, "Successfully Created"))
  }
}

Cleanup

if (file.exists(output_path)) {
  file.remove(output_path)
}

for (i in seq_along(output_paths)) {
  if (file.exists(output_paths[[i]])) {
    file.remove(output_paths[[i]])
  }
}
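
Because file.exists() and file.remove() accept vectors of paths, the verification and cleanup loops can also be written without explicit iteration. The base R sketch below uses empty files in a temporary directory as stand-ins for batch output; the directory and file names are made up.

```r
# Temporary directory standing in for the output location
out_dir <- tempfile("docket_demo_")
dir.create(out_dir)
paths <- file.path(out_dir, paste0("batch document", 1:3, ".docx"))

# Stand-in for batch output: create only the first two files
invisible(file.create(paths[1:2]))

# One call checks every path and returns a logical vector
created <- file.exists(paths)
print(data.frame(file = basename(paths), created = created))

# Remove the files that exist, then the temporary directory itself
removed <- file.remove(paths[created])
unlink(out_dir, recursive = TRUE)
```

Writing to a temporary directory also avoids leaving files inside the installed package folder.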

Summary

Using flags, users can easily populate data from an R environment into templates built in third-party software. Since 'docket' replacement values inherit the formatting of the flags they replace, there is no need to specify attributes such as italics, text size, color, or highlighting.
