Deep Learning on Historical Manuscripts
Java Protocol Buffer
Deep learning

Dev Environment setup (by Arttu, for Ubuntu 14.10 & Eclipse Luna)

Disclaimer: this is approximately how I set this up. Your mileage may vary. It should also be entirely possible to develop without Eclipse, but I find it quite useful.

  • install Eclipse (
  • install Maven (sudo apt-get install maven)
  • in Eclipse, install Maven plugin
    • Help -> Install new software -> Luna -> search for Maven
    • select m2e; if it is listed several times (e.g. under Collaboration), any of the entries will do
  • clone the project from GitHub (git clone
  • compile: in DeepManuscripts/deeplearning, do mvn package
  • import the project to Eclipse
    • File -> Import -> Maven -> Existing Maven Projects -> Next
    • select DeepManuscripts/deeplearning as the root directory
    • Finish
  • if you want to run the code locally, you also need to install Spark
  • to test if everything works:
    • create a file called test_in.txt, with content 123 456 789 (tabs, not spaces)
    • in DeepManuscripts/deeplearning, do (supposing Spark's bin is in $PATH) spark-submit --class --master local[1] target/DeepManuscriptLearning-0.0.1.jar test_in.txt test_out
    • it should execute, and you should end up with folder test_out with file part-00000 containing [123.0,456.0,789.0]
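Editors often convert tabs to spaces silently, so it can be safer to generate the test input programmatically. The following is a minimal sketch, not part of the project: the class name MakeTestInput and its methods are hypothetical, and it assumes that a trailing newline after the single line is acceptable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of DeepManuscripts): writes the tab-separated test input.
class MakeTestInput {

    // Joins the values with tab characters (never spaces) and ends the line with a newline.
    static String tabSeparatedLine(int... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(values[i]);
        }
        return sb.append('\n').toString();
    }

    // Writes the line to the given file, replacing any existing content.
    static Path write(Path target, int... values) throws IOException {
        byte[] bytes = tabSeparatedLine(values).getBytes(StandardCharsets.UTF_8);
        return Files.write(target, bytes);
    }

    public static void main(String[] args) throws IOException {
        Path out = write(Paths.get("test_in.txt"), 123, 456, 789);
        System.out.println("wrote " + out.toAbsolutePath());
    }
}
```

Run it from DeepManuscripts/deeplearning so that test_in.txt ends up where the spark-submit command expects it.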

Protocol buffers

Before proceeding, please read the basic tutorial: It is written concisely.

The .proto file contains the description/layout of your settings. This is the file you need to modify if you want to add or remove settings. The .proto file is compiled with the protoc tool, which generates a .java file. The .java file contains all the classes generated from your .proto file definition.

  • Compiling the .proto file requires you to install the protoc packages. Please read the 'Compiling .proto files' section from the provided link
    • Note that you do not need to compile the .proto file if you do not want to add changes to it
    • If you modified the .proto file, run the command below in the deeplearning folder for the changes to take effect:
      protoc -I=src/main/java --java_out=src/ src/main/java/deep_model_settings.proto
    • This will create a class in the src/main/java folder named
    • You need to run mvn package again after you update
  • Modifying the .proto file
    • The .proto file must be compiled after every change.
    • Every message is compiled into a Java class
    • A field of the message can be optional, required or repeated
    • Every field of a message generates accessor methods of the form
      • hasField(): true if the field is set (always true for required fields, can be false for optional fields; repeated fields have no hasField(), use getFieldCount() instead)
      • getField() / getFieldList(): returns the field value or message / a list of values or messages for repeated fields
  • Creating a .prototxt file //TODO
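The accessor naming described above can be illustrated with a small hand-written stand-in. This is only a sketch: it is not protoc output and does not use the protobuf library, and the message name LayerSettings and the fields learning_rate and hidden_units are hypothetical, not taken from deep_model_settings.proto. It mirrors the shapes the Java generator produces: has/get for an optional field, and a list getter plus a count for a repeated field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in mimicking protoc-style accessors; NOT generated code.
// LayerSettings, learning_rate and hidden_units are hypothetical names.
class AccessorSketch {

    static final class LayerSettings {
        private final Double learningRate;       // optional double learning_rate; null means unset
        private final List<Integer> hiddenUnits; // repeated int32 hidden_units

        LayerSettings(Double learningRate, List<Integer> hiddenUnits) {
            this.learningRate = learningRate;
            this.hiddenUnits = Collections.unmodifiableList(new ArrayList<>(hiddenUnits));
        }

        // Optional field: has...() reports presence, get...() falls back to the default (0.0).
        boolean hasLearningRate() { return learningRate != null; }
        double getLearningRate() { return learningRate != null ? learningRate : 0.0; }

        // Repeated field: a list getter and a count, but no has...() method.
        List<Integer> getHiddenUnitsList() { return hiddenUnits; }
        int getHiddenUnitsCount() { return hiddenUnits.size(); }
    }

    // Typical read pattern: check has...() before trusting the value of an optional field.
    static String describe(LayerSettings s) {
        String rate = s.hasLearningRate() ? String.valueOf(s.getLearningRate()) : "unset";
        return "learning_rate=" + rate + ", hidden_units=" + s.getHiddenUnitsList();
    }

    public static void main(String[] args) {
        System.out.println(describe(new LayerSettings(0.01, Arrays.asList(64, 32))));
        System.out.println(describe(new LayerSettings(null, Collections.<Integer>emptyList())));
    }
}
```

Checking has...() first matters because the getter of an unset optional field silently returns a default value, which is easy to mistake for a real setting.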
