Readme (#9)

* Update README.md: slight improvement and a description of the input format. Addresses #8
MarvinTeichmann · Feb 16, 2017 · 3927ad3
1 parent a1bf8a4
commit 3927ad3

Showing 2 changed files with 46 additions and 11 deletions.
diff --git a/README.md b/README.md
@@ -57,11 +57,30 @@ Run: `python train.py` to train a new model on the Kitti Data.
 
 If you would like to understand the code, I recommend looking at [demo.py](demo.py) first. I have documented each step as thoroughly as possible in this file.
 
-### Modifying Model & Train on your own data
 
-The model is controlled by the file `hypes/KittiSeg.json`. Modifying this file should be enough to train the model on your own data and adjust the architecture according to your needs. You can create a new file `hypes/my_hype.json` and train that architecture using:
+### Manage Data Storage
+
+KittiSeg allows you to separate data storage from code. This is very useful in many server environments. By default, the data is stored in the folder `KittiSeg/DATA` and the output of runs in `KittiSeg/RUNS`. This behaviour can be changed by setting the bash environment variables `$TV_DIR_DATA` and `$TV_DIR_RUNS`.
+
+Include `export TV_DIR_DATA="/MY/LARGE/HDD/DATA"` in your `.profile` and all data will be downloaded to `/MY/LARGE/HDD/DATA/data_road`. Include `export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS"` in your `.profile` and all runs will be saved to `/MY/LARGE/HDD/RUNS/KittiSeg`.
+
+### RUNDIR and Experiment Organization
+
+KittiSeg helps you organize a large number of experiments. To do so, the output of each run is stored in its own rundir. Each rundir contains:
 
-`python train.py --hypes hypes/my_hype.json`
+* `output.log`: a copy of the training output that was printed to your screen.
+* `tensorflow events`: tensorboard can be run in the rundir.
+* `tensorflow checkpoints`: the trained model can be loaded from the rundir.
+* `[dir] images`: a folder containing example output images. `image_iter` controls how often the whole validation set is dumped.
+* `[dir] model_files`: a copy of all source code needed to build the model. This can be very useful if you have many versions of the model.
+
+To keep track of all the experiments, you can give each rundir a unique name with the `--name` flag. The `--project` flag stores the run in a separate subfolder, which allows you to run different series of experiments. As an example, `python train.py --project batch_size_bench --name size_5` will use the following dir as rundir: `$TV_DIR_RUNS/KittiSeg/batch_size_bench/size_5_KittiSeg_2017_02_08_13.12`.
+
+Use the flag `--nosave` if you do not want to save all output in a rundir. This is very useful for debugging, when you are not interested in the actual output and do not want to clutter your `rundir`. `--nosave` uses the folder `$TV_DIR_RUNS/debug` as output, so you can still view the rundir, but it will be overwritten by the next `--nosave` run.
+
+### Modifying Model & Train on your own data
+
+The model is controlled by the file `hypes/KittiSeg.json`. Modifying this file should be enough to train the model on your own data and adjust the architecture according to your needs. A description of the expected input format can be found [here](inputs/inputs.md). I would advise creating a new hype file `hypes/my_hype.json` for your input data and starting training by running: `python train.py --hypes hypes/my_hype.json`
 
 
 
@@ -80,15 +99,7 @@ For advanced modifications, the code is controlled by 5 different modules, which
 Those modules operate independently. This allows easy experiments with different datasets (`input_file`), encoder networks (`architecture_file`), etc. Also see [TensorVision](http://tensorvision.readthedocs.io/en/master/user/tutorial.html#workflow) for a specification of each of those files.
 
 
-## Managing Folders
-
-By default, the data is stored in the folder `KittiSeg/DATA` and the output of runs in `KittiSeg/RUNS`. This behaviour can be changed by setting the bash environment variables: `$TV_DIR_DATA` and `$TV_DIR_RUNS`.
-
-Include  `export TV_DIR_DATA="/MY/LARGE/HDD/DATA"` in your `.profile` and the all data will be downloaded to `/MY/LARGE/HDD/DATA/data_road`. Include `export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS"` in your `.profile` and all runs will be saved to `/MY/LARGE/HDD/RUNS/KittiSeg`
-
-For organizing multiple experiments the flags `--project` and `--name` are very helpful. 
 
-`python train.py --project batch_size_bench --name size_5` will save all training output to:  `$TV_DIR_RUNS/KittiSeg/batch_size_bench/size_5_KittiSeg_2017_02_08_13.12`.
 
 
 ## Utilize TensorVision backend

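A note on the README text above: the directory handling and rundir naming it describes can be sketched in a few lines of Python. This is only an illustration, not the code KittiSeg or TensorVision actually runs. The helper names `resolve_dir` and `build_rundir` are hypothetical, and the timestamp pattern is an assumption inferred from the single example path `size_5_KittiSeg_2017_02_08_13.12`.

```python
import os
from datetime import datetime


def resolve_dir(env_var, default):
    """Return the directory named by env_var, or default if it is unset or empty.

    Hypothetical helper: mirrors the documented behaviour of
    $TV_DIR_DATA and $TV_DIR_RUNS, not the real implementation.
    """
    return os.environ.get(env_var) or default


def build_rundir(runs_root, project, name, when, model="KittiSeg"):
    """Compose <runs_root>/<model>/<project>/<name>_<model>_<timestamp>.

    The timestamp format is an assumption based on the README example.
    """
    stamp = when.strftime("%Y_%m_%d_%H.%M")
    run_name = "{}_{}_{}".format(name, model, stamp)
    return os.path.join(runs_root, model, project, run_name)


if __name__ == "__main__":
    # Falls back to the documented default folder when the variable is unset.
    runs_root = resolve_dir("TV_DIR_RUNS", os.path.join("KittiSeg", "RUNS"))
    print(build_rundir(runs_root, "batch_size_bench", "size_5", datetime.now()))
```

With `TV_DIR_RUNS` set to `/MY/LARGE/HDD/RUNS` and a run started on 2017-02-08 at 13:12, this sketch reproduces the path shape shown in the README example.
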
diff --git a/inputs/inputs.md b/inputs/inputs.md
@@ -0,0 +1,24 @@
+## How to train on your own data
+
+### Easy way
+
+The easiest way is to provide your data in the same layout as the Kitti data. To do that, create files `train` and `val` similar to [train3.txt](../data/train3.txt). Each line of such a file is expected to contain a path to an image and a path to the corresponding ground truth.
+
+The ground truth file is assumed to be an image. By default, `red` is treated as `background` and `purple` as foreground. All other colours are treated as 'unknown', and the loss from those pixels is ignored during training. You can configure these colours in the `hype` file by changing:
+
+```
+  "data": {
+    "road_color" : [255,0,255],
+    "background_color" : [255,0,0]
+  },
+```
+
+
+### Hard way
+
+The disadvantage of the easy way is that it only works for binary segmentation problems (i.e. two classes). The alternative is to write your own input producer and evaluation file. All other files are independent of the data.
+
+In [kitti_seg_input.py](kitti_seg_input.py) the actual data is loaded in the functions `_make_data_gen` and `_load_gt_file`. If you modify those, you should be able to load any kind of dataset.
+
+The eval file `kitti_eval.py` is designed to use the original evaluation code provided by the Kitti road detection benchmark. If you train on your own data with different evaluation metrics, I recommend using your own evaluation code.
+
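
A note on the input format described in `inputs/inputs.md`: the "easy way" can be illustrated with a small Python sketch. It is not taken from `kitti_seg_input.py`. The function names `parse_file_list` and `classify_pixel` are hypothetical, and the sketch assumes that the two paths on each line are separated by whitespace, which the description above does not state explicitly.

```python
def parse_file_list(lines):
    """Yield (image_path, gt_path) pairs from the lines of a train/val file.

    Assumption: the two paths are separated by whitespace, so paths that
    contain spaces are not supported by this sketch.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_path, gt_path = line.split()
        yield image_path, gt_path


def classify_pixel(rgb, road_color=(255, 0, 255), background_color=(255, 0, 0)):
    """Map one ground-truth pixel to 'road', 'background' or 'unknown'.

    The default colours are the ones from the `data` section of the hype
    file; pixels of any other colour are 'unknown' and would be ignored
    by the loss.
    """
    rgb = tuple(rgb)
    if rgb == tuple(road_color):
        return "road"
    if rgb == tuple(background_color):
        return "background"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical example paths, for illustration only.
    for image_path, gt_path in parse_file_list(["images/a.png gt/a.png\n"]):
        print(image_path, gt_path)
    print(classify_pixel([255, 0, 255]))
```

A line that does not contain exactly two whitespace-separated entries raises a `ValueError` in this sketch, which surfaces malformed file lists early.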