Model inference #62

delhomer · 2018-04-27T16:45:27Z

This PR introduces a new module that makes inference easier to reuse in future developments (for instance, in a web app).

Until now, inference was only done after training a model. Running inference alone was possible by passing 0 to nb_epochs or, if a backup existed, a number smaller than the checkpoint training step. This was quite impractical, as the program arguments are training-focused.

Here we can predict labels on a given image by simply entering the following command (as an example):

python deeposlandia/inference.py -D dataset -i path_to_image/image_name.png

Additionally, the prepare_folders function has been split into three more precise functions:

prepare_input_folder, which is called during dataset generation (cf. deeposlandia/datagen.py).
prepare_prepro_folder, which is called during dataset generation and during training: these directories are filled during the former, and the images they contain are scanned during the latter.
prepare_output_folder, which is called when results are produced, either during training (model backup creation) or during inference (model backup recovery).

…reation Control the input, preprocessed and output repositories separately; use a temp directory to do the tests

Split the legacy function `utils.prepare_folders` into three smaller functions: `utils.prepare_input_folder`, `utils.prepare_preprocessed_folder` and `utils.prepare_output_folder`. This modification makes the code more flexible, as we may need only one or two of them: *e.g.* when creating a dataset, there is no need to consider the `output` subdirectory. This commit is test-compliant.
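As a rough illustration of why the split helps, here is a minimal sketch checked against a temporary directory. The function bodies, the `(datapath, dataset)` signatures and the `input` and `preprocessed` folder names are assumptions made for the example, not the actual `utils` implementation.

```python
import os
import tempfile


def _make_folder(*parts):
    """Create the folder built from `parts` if needed and return its path."""
    folder = os.path.join(*parts)
    os.makedirs(folder, exist_ok=True)
    return folder


# Simplified stand-ins: signatures and folder layout are hypothetical.
def prepare_input_folder(datapath, dataset):
    return _make_folder(datapath, dataset, "input")


def prepare_preprocessed_folder(datapath, dataset):
    return _make_folder(datapath, dataset, "preprocessed")


def prepare_output_folder(datapath, dataset):
    return _make_folder(datapath, dataset, "output")


# Dataset creation only needs the input folder, so nothing else is created.
with tempfile.TemporaryDirectory() as tmp:
    input_folder = prepare_input_folder(tmp, "dataset")
    created = sorted(os.listdir(os.path.join(tmp, "dataset")))
```

With separate functions, a unit test can call just one of them in a temporary directory and check that no unrelated subdirectory appears.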

…n functions Modify the `deeposlandia/kerastrain.py` and `deeposlandia/datagen.py` scripts to account for the split of the `utils.prepare_folders` function and for the new way of returning folders in the dataset creation and model training programs.

This commit creates a module that predicts labels on images passed as arguments. The image argument can be a list, and it supports glob patterns: several images can be passed with a path like `datapath/img_000*.png`. For now, the resulting labels are only printed to the console.
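A minimal sketch of how such an argument could be expanded, assuming the patterns are resolved with `glob`. The `-i` flag and the `image_paths` attribute appear in this PR; the `--image-paths` long option, `build_parser` and `expand_image_paths` are hypothetical names used only for illustration, and the sorting is an addition to get a deterministic order.

```python
import argparse
import glob
import itertools


def expand_image_paths(patterns):
    """Expand every glob pattern and return one flat, sorted list of paths."""
    matches = (glob.glob(pattern) for pattern in patterns)
    return sorted(itertools.chain.from_iterable(matches))


def build_parser():
    """Hypothetical parser accepting one or more image paths or patterns."""
    parser = argparse.ArgumentParser(description="Predict labels on images")
    parser.add_argument("-i", "--image-paths", nargs="+", required=True,
                        help="image paths or glob patterns")
    return parser


# Parse an explicit argument list instead of sys.argv for the example.
args = build_parser().parse_args(["-i", "datapath/img_000*.png"])
image_paths = expand_image_paths(args.image_paths)
print(image_paths)
```

Quoting the pattern on the command line leaves the expansion to the program rather than to the shell; unquoted, the shell may already expand it into several arguments, which `nargs="+"` also accepts.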

garaud · 2018-04-30T08:21:00Z

deeposlandia/inference.py

+    parser = add_instance_arguments(parser)
+    args = parser.parse_args()
+
+    image_paths = [item for sublist in [glob.glob(f) for f in args.image_paths]


You can turn these list comprehensions into generator expressions, such as:

(item for sublist in (glob.glob(f) for f in args.image_paths) for item in sublist)

Moreover, it was not clear to me, at least on first reading, that you want to flatten the list of image files. May I propose something like:

image_paths = (glob.glob(f) for f in args.image_paths)
# flatten the result, e.g. [[image1, image2], [image10, image11]]
image_paths = itertools.chain(*image_paths)

using the great itertools package.

I agree, itertools is really convenient. However, I'm not sure I want an iterator there.

The iterator is consumed in the first loop, when the x_test variable is built. Its items are then no longer available in the last loop, when the results are printed: how would we get the image filenames with such a structure?
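The concern can be reproduced with a small standalone sketch. The file names are placeholders, and the upper-casing merely stands in for whatever the first loop does with each path; wrapping the chain in `list` is one possible way to keep itertools while still being able to iterate twice.

```python
import itertools

# Placeholder result of one glob call per command-line pattern.
nested = [["image1.png", "image2.png"], ["image10.png", "image11.png"]]

# A lazy iterator can only be traversed once.
lazy_paths = itertools.chain.from_iterable(nested)
first_pass = [path.upper() for path in lazy_paths]  # consumes the iterator
second_pass = list(lazy_paths)                      # nothing is left

# Materializing the flattened paths keeps them available for later loops.
image_paths = list(itertools.chain.from_iterable(nested))
loaded = [path.upper() for path in image_paths]
for path, result in zip(image_paths, loaded):
    print(path, result)
```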

garaud · 2018-04-30T08:21:32Z

deeposlandia/inference.py

+                                                      image_size,
+                                                      aggregate_value)
+
+    print( prepro_folder['training_config'] )


Remove the print, or replace it with a logging message.

It was a useless 'print' statement; it has been removed.
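For reference, the logging alternative could look like the sketch below. It uses only the standard `logging` module; the `report_training_config` helper and the example path are hypothetical and only illustrate the suggestion.

```python
import logging

logger = logging.getLogger(__name__)


def report_training_config(prepro_folder):
    """Log the training configuration path instead of printing it."""
    logger.info("Training configuration: %s", prepro_folder["training_config"])


# Example usage with a placeholder path.
logging.basicConfig(level=logging.INFO)
report_training_config({"training_config": "path/to/training.json"})
```

Unlike a bare print, the message can then be silenced or redirected through the logging configuration.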

Before this commit, the checkpoint path was recovered as the maximum of the checkpoint names in alphanumeric order. However, this relied on `os.listdir`, which also returns subdirectories, so checkpoint recovery could fail when the maximum item was a directory. The new implementation fixes this by considering only files.
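A minimal sketch of the idea, not the actual implementation: keep only regular files before taking the maximum. The `latest_checkpoint` name and the `None` return value for an empty folder are assumptions made for the example.

```python
import os


def latest_checkpoint(checkpoint_dir):
    """Return the path of the last checkpoint file in alphanumeric order.

    Subdirectories are ignored; None is returned if there is no file.
    """
    files = [name for name in os.listdir(checkpoint_dir)
             if os.path.isfile(os.path.join(checkpoint_dir, name))]
    if not files:
        return None
    return os.path.join(checkpoint_dir, max(files))
```

With this filter, a subdirectory whose name sorts after every checkpoint file no longer wins the `max`.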

Remove a useless print statement, and make the `image_paths` variable construction clearer.

delhomer added 5 commits April 27, 2018 11:11

tests: add a bunch of unit test to control the dataset repositories c…

35483a4

training: remove useless call to prepare_input_folder into kerastrain.py

b690c82

delhomer requested a review from garaud April 27, 2018 16:45

delhomer mentioned this pull request Apr 27, 2018

Prepare 0.4 release #63

Closed

5 tasks

garaud suggested changes Apr 30, 2018

delhomer added 2 commits May 2, 2018 11:42

inference: consider code review comments

b2763ea

garaud approved these changes May 2, 2018

delhomer merged commit dbb2958 into master May 2, 2018

delhomer deleted the model_inference branch May 2, 2018 13:36
