This repository has been archived by the owner on Jun 3, 2020. It is now read-only.

Model inference #62

Merged
merged 7 commits into from
May 2, 2018

Conversation

delhomer
Contributor

This PR introduces a new module that will make inference easier in future developments (for instance, in a web app).

Until now, inference was only done after training a model. Running inference alone was possible by passing 0 to nb_epochs or, if a backup exists, a number smaller than the checkpoint training step. This was quite impractical, as the program arguments are training-focused.

Here we can predict labels on a given image by simply entering the following command (as an example):

python deeposlandia/inference.py -D dataset -i path_to_image/image_name.png

Additionally, the prepare_folders function has been split into three more precise functions:

  • prepare_input_folder, which is called during dataset generation (cf. deeposlandia/datagen.py).
  • prepare_prepro_folder, which is called during both dataset generation and the training process: the directories are filled during the former, and the images they contain are scanned during the latter.
  • prepare_output_folder, which is called when results are produced, either during the training process (model backup creation) or during inference (model backup recovery).

…reation

Control the input, preprocessed and output directories separately; use a temp directory for the tests
Split the legacy function `utils.prepare_folders` into three smaller
functions: `utils.prepare_input_folder`,
`utils.prepare_preprocessed_folder` and
`utils.prepare_output_folder`. This modification makes the code more
flexible, as we may need only one or two of them: *e.g.* when we
create a dataset, there is no need to consider the `output`
subdirectory. This commit is test compliant.
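
As a rough illustration of the split described above, here is a minimal sketch. The function names come from the commit message, but the signatures, the subdirectory names (`input`, `preprocessed`, `output`) and the layout are assumptions made for the example, not the project's actual implementation. A temporary directory stands in for the data path, as in the tests.

```python
import os
import tempfile


def prepare_input_folder(datapath, dataset):
    """Create (if needed) and return the raw-data folder (hypothetical layout)."""
    folder = os.path.join(datapath, dataset, "input")
    os.makedirs(folder, exist_ok=True)
    return folder


def prepare_preprocessed_folder(datapath, dataset, image_size):
    """Create and return the folder for preprocessed images (hypothetical layout)."""
    folder = os.path.join(datapath, dataset, "preprocessed", str(image_size))
    os.makedirs(folder, exist_ok=True)
    return folder


def prepare_output_folder(datapath, dataset, model_name):
    """Create and return the folder for model backups (hypothetical layout)."""
    folder = os.path.join(datapath, dataset, "output", model_name)
    os.makedirs(folder, exist_ok=True)
    return folder


# Dataset creation only needs the input folder: no "output" directory appears.
with tempfile.TemporaryDirectory() as tmp_datapath:
    input_dir = prepare_input_folder(tmp_datapath, "demo")
    created = sorted(os.listdir(os.path.join(tmp_datapath, "demo")))
    print(created)
```

Each caller creates only the directories it needs, which is the flexibility the commit message refers to.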
…n functions

Modify the `deeposlandia/kerastrain.py` and `deeposlandia/datagen.py` scripts to account for the split of the
`utils.prepare_folders` function and the new way of returning folders in
the dataset creation and model training programs.
This commit creates a module that predicts labels on images
passed as arguments. The image argument can be a list, and it
accepts glob patterns; it is possible to pass several images with a path
like `datapath/img_000*.png`. For now, the resulting labels are
only printed to the console.
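
The argument handling described in this commit message could look roughly like the sketch below. The `-D` and `-i` flags and the `args.image_paths` attribute appear in this PR; the long option names and the helper names `build_parser` and `expand_image_paths` are hypothetical, introduced only for illustration.

```python
import argparse
import glob


def build_parser():
    """Build a parser with the -D and -i flags (long names are assumptions)."""
    parser = argparse.ArgumentParser(description="Predict labels on images")
    parser.add_argument("-D", "--dataset", required=True)
    # nargs="+" lets the user pass one or several paths or glob patterns
    parser.add_argument("-i", "--image-paths", nargs="+", required=True)
    return parser


def expand_image_paths(patterns):
    """Expand each glob pattern and flatten all matches into a single list."""
    paths = []
    for pattern in patterns:
        # glob.glob returns matches in arbitrary order, so sort each group
        paths.extend(sorted(glob.glob(pattern)))
    return paths
```

With this sketch, quoting the pattern on the command line (`-i "datapath/img_000*.png"`) leaves the expansion to `glob`, while an unquoted pattern is expanded by the shell and arrives as several arguments; both cases end up as one flat list.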
@delhomer delhomer requested a review from garaud April 27, 2018 16:45
@delhomer delhomer mentioned this pull request Apr 27, 2018
5 tasks
parser = add_instance_arguments(parser)
args = parser.parse_args()

image_paths = [item for sublist in [glob.glob(f) for f in args.image_paths]
Contributor

You can turn these list comprehensions into generator expressions, such as:

(item for sublist in (glob.glob(f) for f in args.image_paths) for item in sublist)

Moreover, it wasn't clear to me, at least on first reading, that you want to flatten the list of image files. Can I propose something like:

nested_paths = (glob.glob(f) for f in args.image_paths)
# flatten a nested result such as [[image1, image2], [image10, image11]]
image_paths = itertools.chain(*nested_paths)

using the great itertools module.

Contributor Author

I agree, itertools is really convenient. However, I'm not sure I want an iterator there.

The iterator is consumed in the first loop, when the x_test variable is built. But its items are not available any more in the last loop, when results are printed: how do we get the image filenames if such a structure is chosen?
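
The concern raised here can be reproduced with a minimal sketch. The file names and the "feature" computation are placeholders, not the module's real data or preprocessing; the point is only that a lazy iterator is empty on its second traversal, whereas a materialized list is not.

```python
import itertools

# Stand-in for the per-pattern glob results (hypothetical file names)
nested = [["a.png", "b.png"], ["c.png"]]

# A lazy iterator is exhausted after a single pass...
lazy_paths = itertools.chain.from_iterable(nested)
first_pass = [path.upper() for path in lazy_paths]
second_pass = list(lazy_paths)  # nothing left to iterate over

# ...whereas a list can be traversed as many times as needed.
image_paths = list(itertools.chain.from_iterable(nested))
features = [len(path) for path in image_paths]  # placeholder for building x_test
report = list(zip(image_paths, features))  # file names are still available
```

Wrapping the `itertools.chain` result in `list(...)` keeps the readable flattening while still allowing the file names to be reused when results are printed.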

image_size,
aggregate_value)

print( prepro_folder['training_config'] )
Contributor

Remove the print, or replace it with a logging message.

Contributor Author

That was a useless `print` statement; it has been removed.

Before this commit, the checkpoint path was recovered as the max of the
checkpoints in alphanumeric order. However, this relied on `os.listdir`,
which also returns subdirectories, so checkpoint recovery could fail when the
max item was a directory. The new implementation fixes this by
considering only files.
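
A minimal sketch of the files-only selection described above, assuming checkpoints sit directly in one folder. The helper name `latest_checkpoint` is hypothetical and this is not the code from the commit itself.

```python
import os


def latest_checkpoint(folder):
    """Return the alphanumerically greatest regular file in folder, or None."""
    files = [
        name for name in os.listdir(folder)
        # os.listdir also returns subdirectories, so keep regular files only
        if os.path.isfile(os.path.join(folder, name))
    ]
    return max(files) if files else None
```

Filtering with `os.path.isfile` before taking `max` means a subdirectory whose name sorts last can no longer be mistaken for a checkpoint.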
Remove a useless print statement, and make the `image_paths`
variable construction clearer.
@delhomer delhomer merged commit dbb2958 into master May 2, 2018
@delhomer delhomer deleted the model_inference branch May 2, 2018 13:36