OptimusCrime/master-thesis-code

Rorschach :: Code

Code for the master's thesis of Thomas Gautvedt.

Information

We use Docker to make installation of the system easy and platform independent. Below are the requirements for running the system, the setup process, and an explanation of how to configure the system to run the tests and models you want.

Requirements

Note that Docker Compose is included in the Docker Toolbox, so if you install the Toolbox you do not need to install the standalone version.

We supply a Makefile to make this setup easier to use. UNIX-based systems should have Make installed by default. On Windows, Make must be installed manually; see the section below. It is also possible to use this setup without Make, but you will have to type out some of the tasks manually instead of relying on the provided shortcuts.

Additional Setup for Windows Machines

Make

Install the GNU make utility for Windows.

You now need to add the install location to your system Path. Copy the location where the binary files were placed; by default this should be C:\Program Files (x86)\GnuWin32\bin. Open the System Properties. On Windows 10 you can do this by right-clicking the Windows icon in the bottom-left corner and selecting System. Proceed to click Advanced System Settings and switch to the Advanced tab. Click the Environment Variables button. Find Path in the bottom list, highlight it, and click Edit... Click the New button in the new window and paste the path copied earlier. Click Ok to close the windows.

Share drive

You must also make sure to share the drive on which the system files reside. If this is the C: drive, you must share this drive in the Docker for Windows application. This is done in the Settings for the Docker for Windows application, under the tab Shared Drives.

Setup

To start the system for the first time, type

make

This is a shortcut for the following commands

make build
make up
make prepare
make run
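
The shortcut behaves like running the four targets one after another. The following is a minimal Python sketch of that sequencing, assuming the targets run in the listed order and the chain stops at the first failure. The helper run_steps and its runner parameter are hypothetical and not part of the repository.

```python
import subprocess

# The four targets listed above, in the order they are listed.
STEPS = ["build", "up", "prepare", "run"]


def run_steps(steps, runner=subprocess.run):
    """Run `make <step>` for each step, stopping at the first failure.

    `runner` is injectable (an assumption made for this sketch) so the
    sequencing can be exercised without Docker or Make being installed.
    """
    completed = []
    for step in steps:
        result = runner(["make", step])
        if result.returncode != 0:
            raise RuntimeError(
                f"make {step} failed with exit code {result.returncode}"
            )
        completed.append(step)
    return completed


# Usage (requires Make and Docker): run_steps(STEPS)
```

Calling run_steps(STEPS) from the repository root would invoke the same targets as a plain make, provided Make is available.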

To stop the machine, type

make stop

If you have already built and prepared the system, you can start it up again with

make start

When you want to run the system to train/test, type

make run

If you want to open a shell inside the container, you can do this with

make bash

There are a few more shortcuts in the Makefile. Browse the file to see these.

If you are unable to use the make commands on your system, the best alternative is to run each command found in the Makefile manually, in order. The file and its content should be fairly self-explanatory.

The prepare Command

This command runs various subcommands to set up the system correctly:

  • Installs all the requirements in requirements.txt.
  • Downloads wordlists and fonts from the resource repository.
  • Creates a new config file with a default setup.

Config File Explanation

The Rorschach system first parses the default config file config/config-default.yaml; do not edit this file. You can override any value in this default config by creating a new file named config/config.yaml.

We provide a sample config file, which is copied to the config/ directory. This config file runs the model named EncDecAtt in the thesis. With this config, datasets of sizes 10k, 1k, and 10k are created for the training, validation, and test sets, respectively.

To override a setting, use the same YAML structure in the override file as in the default file, since the configs are built hierarchically.
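
The hierarchical override can be pictured as a recursive merge of two nested mappings. The sketch below is an illustration only, not the project's actual loader: it works on plain dictionaries rather than parsed YAML, the function name merge_config is hypothetical, and the keys "general" and "epochs" are invented for the example.

```python
import copy


def merge_config(default, override):
    """Return a new dict: `default` with values from `override` applied.

    Nested dicts are merged key by key; any other value in `override`
    replaces the default value outright.
    """
    merged = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical keys, for illustration only; the real ones are defined in
# config/config-default.yaml.
default = {"general": {"mode": "train", "epochs": 10}, "uid": None}
override = {"general": {"epochs": 50}}

print(merge_config(default, override))
# prints {'general': {'mode': 'train', 'epochs': 50}, 'uid': None}
```

Under this reading, the override file only needs the keys it changes, but each key must sit under the same parent keys as in the default file; a key placed at the wrong level would be added as a new entry instead of replacing the intended value.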

Data Location

The location where all the data is placed is defined by the config file. By default, the data is placed in three subdirectories of the data/ directory, but you can override this if you like.

Three directories separate the data:

  • data/: This directory holds the data related to the datasets. This includes pickled binary files, as well as all the words in the datasets encoded as JSON files. There are a few more files for various purposes, but those are less important.
  • image/: This directory holds the images of the dataset that was generated (if specified in the config file). Three subdirectories hold the original, large images (image/canvas/), the cropped images (image/cropped/), and the signatures (image/signatures/).
  • output/: Holds the output for each run of the system, either testing or training. Every run gets its own unique id (uid) that contains the date, time, and a random string of letters. See the usage of this uid below. These directories contain various information, logs, and the weights for the models.

Handling uids

Say you have trained a model and the uid for this run is "2017-05-16-11-58-31-jkavbo". If you want to continue, predict, or test on this run, set the mode in the config file accordingly; when you run the system, you are asked to input the uid that you wish to use. You can also specify the uid at the root level of the config YAML file, which prevents the system from prompting for it.
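
To make the uid format concrete, here is a small Python sketch that builds and checks strings shaped like the example above. It is inferred from that single example and is not the project's own code: the suffix length of six lowercase letters, the function names make_uid and is_valid_uid, and the pattern UID_PATTERN are all assumptions.

```python
import random
import re
import string
from datetime import datetime

# Shape inferred from the example uid: year, then five two-digit fields
# (month, day, hour, minute, second), then a run of lowercase letters.
UID_PATTERN = re.compile(r"^\d{4}(-\d{2}){5}-[a-z]+$")


def make_uid(now=None, suffix_length=6, rng=random):
    """Build a uid such as 2017-05-16-11-58-31-jkavbo."""
    now = now or datetime.now()
    suffix = "".join(
        rng.choice(string.ascii_lowercase) for _ in range(suffix_length)
    )
    return f"{now:%Y-%m-%d-%H-%M-%S}-{suffix}"


def is_valid_uid(uid):
    """Check that a string has the same shape as the example uid."""
    return UID_PATTERN.match(uid) is not None
```

A check like is_valid_uid could be used to catch a mistyped uid before it is entered at the prompt or written into the config file.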
