Office Service is a server for working with Microsoft Office Word files through an HTTP interface. It can currently convert a Word file to PDF and strip personal information from a Word file's metadata. It uses the Microsoft Office COM API to drive a Word Application instance and operate on documents programmatically.
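As a rough illustration of driving Word through COM, here is a minimal sketch using pywin32. It is not taken from this project's source: the function names `with_word_document`, `convert_to_pdf` and `cleanup` are hypothetical, and it assumes a Windows machine with Word and pywin32 installed. The numeric constants are the Word object model values for `wdFormatPDF` and `wdRDIAll`.

```python
import os

# Word enumeration values from the Word object model.
WD_FORMAT_PDF = 17   # WdSaveFormat.wdFormatPDF
WD_RDI_ALL = 99      # WdRemoveDocInfoType.wdRDIAll


def with_word_document(path, action):
    """Open `path` in Word, run `action(doc)`, then close everything.

    Hypothetical helper, not part of Office Service itself.
    """
    # Imported lazily: pywin32 is only available on Windows.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word expects absolute paths.
        doc = word.Documents.Open(os.path.abspath(path))
        try:
            return action(doc)
        finally:
            doc.Close(False)  # False: discard any unsaved changes
    finally:
        word.Quit()


def convert_to_pdf(src, dst):
    """Save the Word file `src` as a PDF at `dst`."""
    with_word_document(
        src,
        lambda doc: doc.SaveAs(os.path.abspath(dst), FileFormat=WD_FORMAT_PDF),
    )


def cleanup(src, dst):
    """Write a copy of `src` to `dst` with document information removed."""
    def action(doc):
        doc.RemoveDocumentInformation(WD_RDI_ALL)
        doc.SaveAs(os.path.abspath(dst))

    with_word_document(src, action)
```

This sketch starts and quits Word on every call, which is simple but slow. It does not show how the service reuses a Word instance across requests.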
OpenOffice provides an API called UNO for manipulating documents programmatically. OpenOffice can convert Word files to PDF, but its rendering falls short: the resulting PDF looks poor and does not match the original document.
- Python (latest Python 2.7 for Windows: http://python.org/download/)
- pip (http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows)
- Bottle (pip install bottle)
- Paste (pip install paste)
- pywin32 (http://sourceforge.net/projects/pywin32/files/pywin32/Build%20218/)
- A legal copy [ :-) ] of Microsoft Office 2007 (tested) or 2010 (untested)
There are two URLs:
It receives a Word file (.doc or .docx) via the POST method and sends a PDF file as the response. Example: convert.html
It receives a Word file (.doc or .docx) via the POST method and sends back another Word file without personal information (but otherwise identical to the first one). Example: cleanup.html
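For clients that do not use the example HTML forms, the upload can be sketched with the Python 3 standard library alone. Everything specific here is an assumption for illustration: the URL `http://localhost:8080/convert`, the form field name `file`, and the helper name `build_upload_request` are hypothetical, so check the example pages for the real values.

```python
import uuid
import urllib.request


def build_upload_request(url, filename, data, field="file"):
    """Build a multipart/form-data POST request carrying one file.

    `field` is a hypothetical form field name; use the one the
    service's example form actually declares.
    """
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{f}"; filename="{n}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).format(b=boundary, f=field, n=filename).encode("utf-8")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + data + tail,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )


# Usage (assumed URL; requires the service to be running):
#
#   with open("report.docx", "rb") as f:
#       req = build_upload_request(
#           "http://localhost:8080/convert", "report.docx", f.read())
#   with urllib.request.urlopen(req) as resp, open("report.pdf", "wb") as out:
#       out.write(resp.read())
```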
The request is sent to the Work Queue for processing by Microsoft Word; once processing finishes, a response is returned to the client.
Microsoft Word handles one request at a time, so a queue is needed to deliver one job at a time. Under load, the Word instance is kept alive and serves each enqueued request (for best performance). When the queue is empty, the Word instance is destroyed to clean up (for best stability).
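The queueing behaviour described above can be sketched as a single worker thread that creates the application lazily and quits it when the queue drains. This is an illustrative sketch, not the project's actual code: `WorkQueue`, `FakeWord` and `app_factory` are hypothetical names, and `FakeWord` stands in for the real Word COM instance so the example runs anywhere. It targets Python 3 (on Python 2.7 the `queue` module is named `Queue`).

```python
import queue
import threading


class FakeWord:
    """Hypothetical stand-in for the Word COM instance."""
    created = 0  # counts how many instances were started

    def __init__(self):
        FakeWord.created += 1

    def process(self, job):
        return "processed " + job

    def quit(self):
        pass


class WorkQueue:
    """Serialize jobs through one worker that owns the application."""

    def __init__(self, app_factory):
        self._jobs = queue.Queue()
        self._app_factory = app_factory
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, job):
        """Enqueue a job and block until the worker has handled it."""
        done = threading.Event()
        slot = {}
        self._jobs.put((job, slot, done))
        done.wait()
        if "error" in slot:
            raise slot["error"]
        return slot["result"]

    def _run(self):
        app = None
        while True:
            job, slot, done = self._jobs.get()  # blocks while idle
            if app is None:
                app = self._app_factory()       # start on first demand
            try:
                slot["result"] = app.process(job)
            except Exception as exc:
                slot["error"] = exc
            finally:
                done.set()
            if self._jobs.empty():
                app.quit()                      # queue drained: clean up
                app = None


work_queue = WorkQueue(FakeWord)
print(work_queue.submit("a.doc"))
print(work_queue.submit("b.doc"))
```

Because only the worker thread ever touches the application object, HTTP handler threads can call `submit` concurrently without competing for the application.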