# Mobile Real-time Portrait Segmentation

Follow the instructions below to train and download the models.

You can also follow the same instructions to train and convert the models on your own machine. In that case, the following libraries are required:

- TensorFlow-GPU, minimum v1.11.0
- TensorFlow-CPU, minimum v1.11.0
- Python 2.7 or 3.5
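
Before starting on a local machine, it can help to confirm that the installed TensorFlow meets the minimum version listed above. The following is a minimal sketch, not part of the repository: `parse_version` and `meets_minimum` are hypothetical helper names, and the check only tests the lower bound of 1.11.0.

```python
def parse_version(version):
    """Turn a string such as '1.13.1' or '1.13.0-rc1' into a tuple of three ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as '-rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions such as '1.11'
    return tuple(parts)


def meets_minimum(installed, minimum="1.11.0"):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

With TensorFlow installed, the check would be called as `meets_minimum(tf.__version__)` after `import tensorflow as tf`.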

First, clone the UNet repository.

In [1]:
!git clone https://github.com/gallifilo/final-year-project/

Cloning into 'final-year-project'...
remote: Enumerating objects: 216, done.
remote: Counting objects: 100% (216/216), done.
remote: Compressing objects: 100% (150/150), done.
remote: Total 6213 (delta 77), reused 179 (delta 60), pack-reused 5997
Receiving objects: 100% (6213/6213), 507.45 MiB | 32.58 MiB/s, done.
Resolving deltas: 100% (669/669), done.
Checking out files: 100% (5174/5174), done.


The repository already includes the Portraits dataset, ready to use.

We can now enlarge the Portraits dataset by performing **image augmentation**.

Image augmentation adds more images to the dataset by flipping, cropping, and adding grain to the existing ones.
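
To make the three operations concrete, here is a dependency-free sketch of flipping, cropping, and adding grain. It is an illustration only and does not reproduce `IncreaseImage.py`, whose implementation is not shown here: the function names, the three-quarter crop size, and the noise level are all assumptions, and images are plain nested lists of grayscale values so the example runs without any imaging library. The same operations map directly to array slicing in NumPy. One point that holds for any segmentation dataset: geometric changes (flip, crop) must be applied identically to the image and its mask, while grain is applied to the image only.

```python
import random


def flip_horizontal(pixels):
    """Mirror a 2-D grid of pixels left to right."""
    return [row[::-1] for row in pixels]


def crop(pixels, top, left, height, width):
    """Cut out a height x width window whose corner is at (top, left)."""
    return [row[left:left + width] for row in pixels[top:top + height]]


def add_grain(pixels, sigma, rng):
    """Add Gaussian noise to every grayscale value, clamped to 0..255."""
    return [[min(255, max(0, round(value + rng.gauss(0, sigma)))) for value in row]
            for row in pixels]


def augment(image, mask, rng, grain_sigma=8.0):
    """Return (image, mask) pairs: the original plus three augmented variants."""
    height, width = len(image), len(image[0])
    # Hypothetical choice: crop to three quarters of each dimension.
    crop_h, crop_w = max(1, height * 3 // 4), max(1, width * 3 // 4)
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [
        (image, mask),
        # Flip and crop change geometry, so the mask gets the same transform.
        (flip_horizontal(image), flip_horizontal(mask)),
        (crop(image, top, left, crop_h, crop_w), crop(mask, top, left, crop_h, crop_w)),
        # Grain changes pixel values only, so the mask is left untouched.
        (add_grain(image, grain_sigma, rng), mask),
    ]
```

Passing a seeded `random.Random` instance as `rng` keeps the crops and the grain reproducible between runs.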

In [2]:
%cd final-year-project/unetTensorflowLite
!ls

/content/final-year-project/unetTensorflowLite
1.jpg		    main.py		    testEpochs32.csv
data_set	    MaskSquared1.jpg	    testImageDataset.py
epochsResults.txt   model_classes	    test_model.py
freeze_graph.py     model_classes_infer     tflite_create.txt
freezepbscript.txt  models		    tflite_test.py
images		    model_test_accuracy.py  tf_test.py
IncreaseImage.py    README.md		    trainEpochs32.csv
LICENSE		    ReleaseNew		    util
LossSet.csv	    runEvaluation.command
main_infer.py	    Squared1.jpg


In [0]:
!python3 IncreaseImage.py

To train, run `main.py` with the following arguments:

* `--model_id int`: which architecture to train: 1 is Standard UNet Squared, 2 is Aspect Ratio, 3 is Half Conv and 4 is Bigger Strides
* `--batchsize int`: the batch size
* `--gpu`: run the training on the GPU
* `--epoch int`: number of epochs
* `--trainrate float`: fraction of the data used for training; the default is 0.85 (85% training, 15% testing)
* `--augmentation`: required if you have augmented the dataset
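
The arguments listed above could be declared with `argparse` roughly as follows. This is a sketch under assumptions rather than the actual contents of `main.py`: only the flag names, the model names, and the 0.85 default for `--trainrate` come from the list; the other defaults and the `split_counts` helper are hypothetical.

```python
import argparse

# Architecture names as given in the argument list.
MODEL_NAMES = {
    1: "Standard UNet Squared",
    2: "Aspect Ratio",
    3: "Half Conv",
    4: "Bigger Strides",
}


def build_parser():
    """Build a parser that accepts the documented training flags."""
    parser = argparse.ArgumentParser(description="Train a portrait segmentation model")
    parser.add_argument("--model_id", type=int, choices=sorted(MODEL_NAMES),
                        default=1, help="architecture to train (default is assumed)")
    parser.add_argument("--batchsize", type=int, default=32,
                        help="batch size (default is assumed)")
    parser.add_argument("--gpu", action="store_true", help="train on the GPU")
    parser.add_argument("--epoch", type=int, default=2,
                        help="number of epochs (default is assumed)")
    parser.add_argument("--trainrate", type=float, default=0.85,
                        help="fraction of the data used for training")
    parser.add_argument("--augmentation", action="store_true",
                        help="set when the dataset has been augmented")
    return parser


def split_counts(total, trainrate):
    """Return (training, testing) sample counts for a given train rate."""
    n_train = int(round(total * trainrate))
    return n_train, total - n_train
```

For example, `build_parser().parse_args(["--gpu", "--model_id", "2"])` yields a namespace with `gpu=True`, `model_id=2`, and `trainrate=0.85`.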






In [3]:
!python3 main.py --gpu --batchsize 32 --epoch 2 --model_id 2

[96, 128]
Model ID: 2
Squared: False
test: False
Loading original images........ Completed
Loading segmented images........ Completed
Casting to one-hot encoding... Done
palette
None
2019-04-08 00:37:52.671443: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-04-08 00:37:52.671851: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x31eb340 executing computations on platform Host. Devices:
2019-04-08 00:37:52.671890: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-08 00:37:52.832235: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-04-08 00:37:52.832788: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x31eadc0 executing computations on platform CUDA. Devices:
2019-04-08 00:37:52.832841: I tensorflow/compiler/

After the model is trained, we need to compress it and save it as a `.pbtxt` file called `semanticsegmentation_person.pbtxt` in the `models` folder.

To do this, run `main_infer.py` and pass `--model_id` the same integer that was used for training: 1 is Standard UNet Squared, 2 is Aspect Ratio, 3 is Half Conv and 4 is Bigger Strides.


In [4]:
!python main_infer.py --model_id 2


For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

Instructions for updating:
Use keras.layers.conv2d instead.
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use keras.layers.batch_normalization instead.
Instructions for updating:
Use keras.layers.max_pooling2d instead.
Instructions for updating:
Use keras.layers.conv2d_transpose instead.
2019-04-08 00:40:17.910901: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-04-08 00:40:17.911194: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x1f1be40 executing computations on platform Host. Devices:
2019-04-08 00:40:17.911234: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-08 00:40:18.01144

Then we need to freeze the model, which creates a GraphDef `.pb` file in the `models` folder.

In [5]:
!python3 freeze_graph.py \
--input_graph=models/semanticsegmentation_person.pbtxt \
--input_checkpoint=models/deployfinal.ckpt \
--output_graph=models/semanticsegmentation_frozen_person_latest.pb \
--output_node_names=output/BiasAdd \
--input_binary=False


Instructions for updating:
Use tf.gfile.GFile.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-04-08 00:40:26.153410: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-04-08 00:40:26.153670: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x280d860 executing computations on platform Host. Devices:
2019-04-08 00:40:26.153710: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-08 00:40:26.247916: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-04-08 00:40:26.248500: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x280d440 executing computations on platform CUDA. Devices:
2019-04-08 00:40:26.248540: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor devi

Finally, convert the GraphDef model into a `.tflite` model.

It is important to specify the `--input_shapes` of the model:

For model_id 1, the Standard UNet Squared, the input shape is `--input_shapes=1,128,128,3`

For model_id 2, 3 and 4, it is `--input_shapes=1,128,96,3`
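
Since a wrong shape is an easy mistake to make when switching between models, the rule above can be captured in a small lookup. This is a hypothetical convenience helper, not something shipped in the repository; `INPUT_SHAPES` and `input_shapes_flag` are illustrative names, and the shapes are copied from the two lines above.

```python
# Input shapes quoted above, keyed by model_id.
INPUT_SHAPES = {
    1: (1, 128, 128, 3),  # Standard UNet Squared
    2: (1, 128, 96, 3),   # Aspect Ratio
    3: (1, 128, 96, 3),   # Half Conv
    4: (1, 128, 96, 3),   # Bigger Strides
}


def input_shapes_flag(model_id):
    """Return the --input_shapes flag to pass to tflite_convert for a model."""
    try:
        shape = INPUT_SHAPES[model_id]
    except KeyError:
        raise ValueError("model_id must be 1, 2, 3 or 4, got %r" % (model_id,))
    return "--input_shapes=" + ",".join(str(dim) for dim in shape)
```

Calling `input_shapes_flag(2)` gives the flag used in the conversion cell that follows.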

In [6]:
!tflite_convert \
--graph_def_file=models/semanticsegmentation_frozen_person_latest.pb  \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--output_file=models/semanticsegmentation_frozen_quantized_32_new.tflite \
--input_shapes=1,128,96,3 \
--inference_type=FLOAT \
--input_type=FLOAT \
--input_arrays=input \
--output_arrays=output/BiasAdd \
--post_training_quantize


2019-04-08 00:40:37.981578: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-04-08 00:40:37.982080: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x558f3e583760 executing computations on platform Host. Devices:
2019-04-08 00:40:37.982179: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-08 00:40:38.096574: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-04-08 00:40:38.097294: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x558f3e584aa0 executing computations on platform CUDA. Devices:
2019-04-08 00:40:38.097340: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2019-04-08 00:40:38.097871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:143

After the conversion, the tflite file can be found in the `final-year-project/unetTensorflowLite/models` folder under the name `semanticsegmentation_frozen_quantized_32_new.tflite`.

Right-click the file to download it.


You can also run a quick accuracy test on the tflite model you have just generated.

Make sure to use
`--init_size 96 128` for the model with id 1

or `--init_size 96 128 --no_squared` for models with id 2, 3 and 4
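
The flag rule above can also be expressed as a helper that assembles the full command for a given model. This is an illustrative sketch: `accuracy_test_command` is a hypothetical name, and the flags are limited to those that appear in this notebook (`--model_path`, `--tflite`, `--init_size`, `--no_squared`).

```python
import shlex


def accuracy_test_command(model_path, model_id):
    """Build the model_test_accuracy.py command line for a tflite model."""
    if model_id not in (1, 2, 3, 4):
        raise ValueError("model_id must be 1, 2, 3 or 4, got %r" % (model_id,))
    command = [
        "python3", "model_test_accuracy.py",
        "--model_path", model_path,
        "--tflite",
        "--init_size", "96", "128",
    ]
    if model_id != 1:
        # Models 2, 3 and 4 are not square, so they need --no_squared.
        command.append("--no_squared")
    return command


def as_shell_line(command):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in command)
```

The returned list can be passed to `subprocess.run`, or rendered with `as_shell_line` and pasted into a notebook cell after a leading `!`.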


In [7]:
!python3 model_test_accuracy.py \
--model_path models/semanticsegmentation_frozen_quantized_32_new.tflite \
--tflite \
--init_size 96 128 \
--no_squared

(96, 128)
Squared: False
tflite: True
test: False
Loading original images........ Completed
Loading segmented images........ Completed
Casting to one-hot encoding... tcmalloc: large alloc 1569603584 bytes == 0x5fcc0000 @  0x7f31673df1e7 0x7f3165115a41 0x7f3165178c13 0x7f3165178cda 0x7f3165207cc4 0x7f31652080d0 0x506b39 0x502209 0x502f3d 0x506859 0x504c28 0x502540 0x502f3d 0x506859 0x504c28 0x501b2e 0x591461 0x54b813 0x555421 0x5a730c 0x503073 0x507641 0x504c28 0x502540 0x502f3d 0x507641 0x502209 0x502f3d 0x506859 0x504c28 0x506393
Done
palette
None
2019-04-08 00:42:16.728870: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-04-08 00:42:16.729139: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2ab3340 executing computations on platform Host. Devices:
2019-04-08 00:42:16.729181: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-08 00:42:16.825728: I tensorflow/stream_