# Training Models on Colab

Install TensorFlow and NumPy

In [None]:
!pip install --upgrade pip
!pip install --upgrade protobuf 

Requirement already up-to-date: pip in /usr/local/lib/python3.6/dist-packages (20.2.2)
Requirement already up-to-date: protobuf in /usr/local/lib/python3.6/dist-packages (3.13.0)


In [None]:
# Colab's magic only switches the major version, so `1.15` is read as `1.x`.
%tensorflow_version 1.15
import tensorflow as tf
print(tf.__version__)

!pip install numpy

`%tensorflow_version` only switches the major version: 1.x or 2.x.
You set: `1.15`. This will be interpreted as: `1.x`.


TensorFlow 1.x selected.
1.15.2


Check GPU status

In [None]:
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


In [None]:
# memory footprint support libraries/code
!ln -sf /opt/bin/nvidia-smi /usr/bin/nvidia-smi
!pip install gputil
!pip install psutil
!pip install humanize
import psutil
import humanize
import os
import GPUtil as GPU
GPUs = GPU.getGPUs()
# Colab exposes at most one GPU, and one is not guaranteed to be assigned.
if not GPUs:
    raise SystemError('No GPU detected by GPUtil')
gpu = GPUs[0]

def printm():
    """Print free system RAM, this process's size, and GPU memory usage."""
    process = psutil.Process(os.getpid())
    print("Gen RAM Free: " + humanize.naturalsize(psutil.virtual_memory().available), " | Proc size: " + humanize.naturalsize(process.memory_info().rss))
    print("GPU RAM Free: {0:.0f}MB | Used: {1:.0f}MB | Util {2:3.0f}% | Total {3:.0f}MB".format(gpu.memoryFree, gpu.memoryUsed, gpu.memoryUtil*100, gpu.memoryTotal))

printm()

Gen RAM Free: 12.5 GB  | Proc size: 465.4 MB
GPU RAM Free: 14968MB | Used: 111MB | Util   1% | Total 15079MB


Mount Google Drive folder

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

# change to working tensorflow directory on the drive
%cd '/content/gdrive/My Drive/models/'

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
/content/gdrive/My Drive/models


Install the protobuf compiler, compile the .proto files, and run setup.py

In [None]:
!apt-get install protobuf-compiler python-pil python-lxml python-tk
!pip install Cython
%cd /content/gdrive/My Drive/models/research/
!protoc object_detection/protos/*.proto --python_out=.

import os
os.environ['PYTHONPATH'] += ':/content/gdrive/My Drive/models/research/:/content/gdrive/My Drive/models/research/slim'

!python setup.py build
!python setup.py install

Reading package lists... Done
Building dependency tree       
Reading state information... Done
protobuf-compiler is already the newest version (3.0.0-9.1ubuntu1).
python-lxml is already the newest version (4.2.1-1ubuntu0.1).
python-pil is already the newest version (5.1.0-1ubuntu0.3).
python-tk is already the newest version (2.7.17-1~18.04).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
/content/gdrive/My Drive/models/research
running build
running build_py
copying object_detection/protos/argmax_matcher_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/anchor_generator_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/box_predictor_pb2.py -> build/lib/object_detection/protos
copying object_detection/protos/calibration_pb2.py -> build/lib/object_detection/protos
copying object_detecti
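
The cell above extends `PYTHONPATH` with `+=`, which raises `KeyError` if the variable is unset, and the training cell later appends the same paths a second time. A minimal sketch of a safer helper follows. The name `append_to_pythonpath` and its `environ` parameter are hypothetical, introduced here only for illustration.

```python
import os


def append_to_pythonpath(paths, environ=None):
    """Append paths to PYTHONPATH, skipping entries that are already present.

    environ defaults to os.environ; passing a plain dict makes the helper
    easy to try out without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    # An unset or empty PYTHONPATH becomes an empty list instead of a KeyError.
    existing = [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    for path in paths:
        if path not in existing:
            existing.append(path)
    environ["PYTHONPATH"] = os.pathsep.join(existing)
    return environ["PYTHONPATH"]
```

In this notebook the call would be `append_to_pythonpath(['/content/gdrive/My Drive/models/research/', '/content/gdrive/My Drive/models/research/slim'])`, and repeating it before training would leave the variable unchanged.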

Check remaining GPU time

In [None]:
import time, psutil
# Assumes the 12-hour session limit starts counting when the VM boots.
elapsed = time.time() - psutil.boot_time()
remaining = 12 * 3600 - elapsed
print('Time remaining for this session is: ', remaining / 3600)

Time remaining for this session is:  11.311591857274374
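
The same calculation can be wrapped in a small function so that it never reports a negative value. This is a sketch only: the name `hours_remaining` is hypothetical, and the 12-hour default simply mirrors the constant used in the cell above rather than a limit queried from Colab.

```python
import time


def hours_remaining(boot_time, now=None, session_limit_hours=12.0):
    """Return the hours left before an assumed session limit is reached.

    boot_time and now are Unix timestamps in seconds. The 12-hour default
    is an assumption taken from the notebook, not a value read from Colab.
    """
    if now is None:
        now = time.time()
    elapsed_hours = (now - boot_time) / 3600
    # Clamp at zero so an expired session never reports negative time.
    return max(0.0, session_limit_hours - elapsed_hours)


# Example with fixed timestamps: a VM that booted 1.5 hours ago.
print(hours_remaining(boot_time=0.0, now=1.5 * 3600))  # 10.5
```

Inside the notebook, `hours_remaining(psutil.boot_time())` gives the live value.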


Start training

In [None]:
!pip install tf_slim
%cd /content/gdrive/My Drive/models/research/object_detection
os.environ['PYTHONPATH'] += ':/content/gdrive/My Drive/models/research/:/content/gdrive/My Drive/models/research/slim'

!python train.py --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config --logtostderr

Streaming output truncated to the last 5000 lines.
I0830 10:58:32.461506 140698845296512 learning.py:512] global step 8815: loss = 0.9356 (0.755 sec/step)
INFO:tensorflow:global step 8816: loss = 1.0581 (0.556 sec/step)
I0830 10:58:33.019310 140698845296512 learning.py:512] global step 8816: loss = 1.0581 (0.556 sec/step)
INFO:tensorflow:global step 8817: loss = 0.6653 (0.463 sec/step)
I0830 10:58:33.484908 140698845296512 learning.py:512] global step 8817: loss = 0.6653 (0.463 sec/step)
INFO:tensorflow:global step 8818: loss = 0.9778 (0.557 sec/step)
I0830 10:58:34.044604 140698845296512 learning.py:512] global step 8818: loss = 0.9778 (0.557 sec/step)
INFO:tensorflow:global step 8819: loss = 0.9110 (0.685 sec/step)
I0830 10:58:34.731525 140698845296512 learning.py:512] global step 8819: loss = 0.9110 (0.685 sec/step)
INFO:tensorflow:global step 8820: loss = 0.6711 (1.464 sec/step)
I0830 10:58:36.208013 140698845296512 learning.py:512] global s
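
The training log above repeats one pattern per step, so the loss values can be extracted for plotting or a quick sanity check. The sketch below uses a regular expression inferred from the lines shown. It assumes the loss is always printed as a plain decimal, and `parse_training_log` is a hypothetical helper name.

```python
import re

# Pattern inferred from the log lines shown above.
STEP_RE = re.compile(r"global step (\d+): loss = ([\d.]+) \(([\d.]+) sec/step\)")


def parse_training_log(lines):
    """Return a sorted list of (step, loss, sec_per_step), one per step."""
    seen = {}
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            step = int(match.group(1))
            # Each step is logged twice (INFO: and I0830 ...); keep one entry.
            seen[step] = (float(match.group(2)), float(match.group(3)))
    return [(step,) + values for step, values in sorted(seen.items())]


sample = [
    "INFO:tensorflow:global step 8816: loss = 1.0581 (0.556 sec/step)",
    "I0830 10:58:33.019310 140698845296512 learning.py:512] "
    "global step 8816: loss = 1.0581 (0.556 sec/step)",
    "INFO:tensorflow:global step 8817: loss = 0.6653 (0.463 sec/step)",
]
print(parse_training_log(sample))
# [(8816, 1.0581, 0.556), (8817, 0.6653, 0.463)]
```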

Export inference graph

In [None]:
# Before each export, set --trained_checkpoint_prefix to the newest model.ckpt-<step> in training/.
# Update --pipeline_config_path whenever the model changes.
!python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/ssd_mobilenet_v1_pets.config --trained_checkpoint_prefix training/model.ckpt-11262 --output_directory new_graph

Instructions for updating:
Please use `layer.__call__` method instead.
W0830 11:27:34.137526 139632531666816 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0830 11:27:35.581749 139632531666816 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0830 11:27:35.620036 139632531666816 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0830 11:27:35.655674 139632531666816 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv befo
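
Because the checkpoint prefix has to be edited by hand before every export, it can help to look up the newest one programmatically. The sketch below scans a directory for `model.ckpt-<step>.index` files and returns the prefix with the highest step. It assumes TensorFlow's usual checkpoint layout, where each prefix has a matching `.index` file, and `latest_checkpoint_prefix` is a hypothetical helper name. TensorFlow's own `tf.train.latest_checkpoint` may serve the same purpose.

```python
import glob
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None.

    Assumes each checkpoint has a model.ckpt-<step>.index file in train_dir.
    """
    best_step, best_prefix = -1, None
    for index_file in glob.glob(os.path.join(train_dir, "model.ckpt-*.index")):
        match = re.search(r"model\.ckpt-(\d+)\.index$", index_file)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # Strip the '.index' suffix to get the prefix the exporter expects.
            best_prefix = index_file[: -len(".index")]
    return best_prefix
```

The returned value, for example `training/model.ckpt-11262`, can then be passed to `--trained_checkpoint_prefix`.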

Zip the exported graph in Google Drive

In [2]:
!zip -r model_graph.zip new_graph


zip error: Nothing to do! (try: zip -r model_graph.zip . -i new_graph)
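
The `Nothing to do!` error means `zip` found no files to add. One likely cause is that the cell ran from a directory that does not contain `new_graph`, for example in a fresh session before the `%cd` cell. The sketch below checks for the folder first and then builds the archive with the standard library. `zip_directory` is a hypothetical helper name.

```python
import os
import shutil


def zip_directory(source_dir, archive_name):
    """Zip source_dir into <archive_name>.zip and return the archive path.

    Raises FileNotFoundError when source_dir is missing, which is more
    explicit than zip's "Nothing to do!" message.
    """
    if not os.path.isdir(source_dir):
        raise FileNotFoundError(
            "{} not found in {}".format(source_dir, os.getcwd()))
    parent, folder = os.path.split(os.path.abspath(source_dir))
    # root_dir/base_dir keep the folder name as the top-level archive entry.
    return shutil.make_archive(archive_name, "zip", root_dir=parent, base_dir=folder)
```

From the `object_detection` directory, `zip_directory('new_graph', 'model_graph')` would produce `model_graph.zip`.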
