Clean up personal information from a doc file and convert a doc file to PDF through the Microsoft Office (2007 and 2010) COM API

astroza/office_service


office_service

Office Service is a server for working with Microsoft Office Word files through an HTTP interface. Currently, it can convert a Word file to PDF and remove personal information from a Word file's metadata. It uses the Microsoft Office COM API to access a Word Application instance and act on documents programmatically.

Motivation

OpenOffice provides an API called UNO for manipulating documents programmatically. One OpenOffice feature is converting Word files to PDF, but the rendering step fails: the result is a poor-looking PDF that does not match the original. Office Service uses Microsoft Word itself for the conversion instead.

Dependencies

How it works

HTTP Interface

There are two URLs:

/to/pdf

It receives a Word file (.doc or .docx) via the POST method and sends a PDF file as the response. Example: convert.html

/cleanup/word

It receives a Word file (.doc or .docx) via the POST method and sends back another Word file without personal information (but otherwise identical to the first one). Example: cleanup.html

Each request is sent to the Work Queue for processing by Microsoft Word; a response is returned to the client once processing finishes.
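
A client for the two URLs could look roughly like the sketch below. It is only an illustration: it assumes the server accepts a standard multipart/form-data upload, and the form field name `file`, the host, and the port are all hypothetical (check convert.html and cleanup.html for the real field name).

```python
import os
import uuid
import urllib.request


def build_multipart(field_name, filename, payload):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_request(url, filename, payload, field_name="file"):
    """Build a POST request carrying the document.

    field_name="file" is an assumption, not taken from the project.
    """
    body, content_type = build_multipart(field_name, filename, payload)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def post_document(base_url, path, src_path, dst_path):
    """Send src_path to base_url + path and save the response to dst_path."""
    with open(src_path, "rb") as fh:
        payload = fh.read()
    request = build_request(base_url + path, os.path.basename(src_path), payload)
    with urllib.request.urlopen(request) as response, open(dst_path, "wb") as out:
        out.write(response.read())
```

With a server running at a hypothetical address, `post_document("http://localhost:8080", "/to/pdf", "report.docx", "report.pdf")` would request a conversion, and the same call with `/cleanup/word` would request a cleaned copy.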

Work Queue

Microsoft Word handles one request at a time, so a queue is needed to deliver one job at a time. While requests keep arriving, the Word instance is kept alive and serves each queued request (for best performance). When the queue is empty, the Word instance is destroyed to clean up (for best stability).
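
The queueing behaviour described above can be sketched as a single worker thread that creates the Word instance lazily and destroys it whenever the queue runs dry. This is not the project's actual code: the class `WordWorkQueue` and the `start_word` / `stop_word` callbacks are hypothetical names, used so that the sketch runs without Microsoft Word installed.

```python
import queue
import threading


class WordWorkQueue:
    """Run jobs one at a time against a lazily created Word instance."""

    def __init__(self, start_word, stop_word):
        # Assumption: with pywin32, start_word might wrap
        # win32com.client.Dispatch("Word.Application") and stop_word might
        # call the instance's Quit() method. Both are placeholders here.
        self._start_word = start_word
        self._stop_word = stop_word
        self._jobs = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, job):
        """Enqueue job(word) and block until its result is available."""
        done = threading.Event()
        box = {}
        self._jobs.put((job, box, done))
        done.wait()
        if "error" in box:
            raise box["error"]
        return box["result"]

    def _run(self):
        word = None
        while True:
            try:
                item = self._jobs.get_nowait()
            except queue.Empty:
                # Queue is empty: destroy the instance, then wait for work.
                if word is not None:
                    self._stop_word(word)
                    word = None
                item = self._jobs.get()
            job, box, done = item
            try:
                if word is None:
                    # Created on the worker thread, which is the only
                    # thread that ever touches the instance.
                    word = self._start_word()
                box["result"] = job(word)
            except Exception as exc:
                box["error"] = exc
            finally:
                done.set()
```

An HTTP handler would call `submit` with a function that performs the conversion or cleanup on the instance it is given. Jobs that arrive while another is running reuse the same instance; a new instance is only started after the queue has emptied.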
