Added Mikula finetuning script #132

oumayb · 2018-09-21T15:17:49Z

Notebook for the Mikula dataset

mathieuboudreau · 2018-09-21T15:27:50Z

@oumayb Could you maybe add the config files as well, so that we can see the differences between _from_scratch and _finetuned?

coveralls · 2018-09-21T22:46:46Z

Coverage remained the same at 82.738% when pulling c1f42a8 on dev_mikula into a7da6a1 on master.

oumayb · 2018-09-22T22:36:52Z

@mathieuboudreau sure!
I was thinking about where to add them. Maybe I could also add the trained model for the Mikula data along with the config files, as is done with the existing TEM and SEM models?

mathieuboudreau · 2018-09-24T22:30:03Z

@oumayb I don't know if the trained models should all be included here – are the files large?

As for config files, it may be worth having a separate folder (e.g. axondeepseg/configs/) where we can sort and store them separately from the core source code (currently the SEM/TEM config files are inside the source code folder axondeepseg/AxonDeepSeg/models/). @jcohenadad thoughts?

jcohenadad · 2018-09-25T00:20:55Z

I would not copy Mikula's model into the GitHub repository, because (i) it is still preliminary and (ii) we should instead store the models somewhere else (along with the config files, which I believe should sit side-by-side with the model). New issue here: #133.

@oumayb could you please indicate where the model currently is?

alexfoias · 2018-09-25T13:10:31Z

@jcohenadad & @mathieuboudreau for the moment the trained models & config files are located in duke/projects/axondeepseg/20180921_mikula. Do you want me to create an OSF project similar to the one for SCT? Maybe we can also migrate the current models from here to OSF.

mathieuboudreau · 2018-11-05T18:53:47Z

@oumayb I ran your Jupyter Notebook with the data located in duke/projects/axondeepseg/20180921_mikula as suggested by @alexfoias. The "From Scratch" section trained fine, but I encountered an error in the "Finetuned" section. As you didn't specify an exact folder location for the TEM model you were training from, I tried both the one located in this repo (axondeepseg/AxonDeepSeg/models/default_TEM_model_v1/), and the one on duke (duke/projects/axondeepseg/baselines/baseline_tem512-7678/).

The error was not the same on every run: the key reported as missing differed slightly each time. See the following for three cases:

NotFoundError (see above for traceback): Key cconv-d2-c2/convolution/bn/moving_mean not found in checkpoint
NotFoundError (see above for traceback): Key cconv-d2-c2/convolution/bn/moving_variance not found in checkpoint
NotFoundError (see above for traceback): Key cconv-d3-c2/convolution/bn/beta not found in checkpoint

And see below for the full error logs:

Run 1


('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
Total number of parameters to train: 1953219
INFO:tensorflow:Restoring parameters from TEM_model_v1/model.ckpt
---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1326     try:
-> 1327       return fn(*args)
   1328     except errors.OpError as e:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1305                                    feed_dict, fetch_list, target_list,
-> 1306                                    status, run_metadata)
   1307 

/usr/lib64/python3.6/contextlib.py in __exit__(self, type, value, traceback)
     87             try:
---> 88                 next(self.gen)
     89             except StopIteration:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
    465           compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466           pywrap_tensorflow.TF_GetCode(status))
    467   finally:

NotFoundError: Key cconv-d2-c2/convolution/bn/moving_mean not found in checkpoint
	 [[Node: save/RestoreV2_42 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_42/tensor_names, save/RestoreV2_42/shape_and_slices)]]
	 [[Node: save/RestoreV2_141/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_330_save/RestoreV2_141", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
<ipython-input-5-2736b74cfde9> in <module>
----> 1 train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)

~/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py in train_model(path_trainingset, path_model, config, path_model_init, save_trainable, gpu, debug_mode, gpu_per)
    327         if path_model_init:
    328             folder_restored_model = path_model_init
--> 329             saver.restore(session, folder_restored_model + "/model.ckpt")
    330 
    331             if save_trainable:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
   1558     logging.info("Restoring parameters from %s", save_path)
   1559     sess.run(self.saver_def.restore_op_name,
-> 1560              {self.saver_def.filename_tensor_name: save_path})
   1561 
   1562   @staticmethod

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1122     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1123       results = self._do_run(handle, final_targets, final_fetches,
-> 1124                              feed_dict_tensor, options, run_metadata)
   1125     else:
   1126       results = []

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1319     if handle is None:
   1320       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321                            options, run_metadata)
   1322     else:
   1323       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1338         except KeyError:
   1339           pass
-> 1340       raise type(e)(node_def, op, message)
   1341 
   1342   def _extend_graph(self):

NotFoundError: Key cconv-d2-c2/convolution/bn/moving_mean not found in checkpoint
	 [[Node: save/RestoreV2_42 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_42/tensor_names, save/RestoreV2_42/shape_and_slices)]]
	 [[Node: save/RestoreV2_141/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_330_save/RestoreV2_141", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op 'save/RestoreV2_42', defined at:
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 505, in start
    self.io_loop.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/platform/asyncio.py", line 132, in start
    self.asyncio_loop.run_forever()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 422, in run_forever
    self._run_once()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 1432, in _run_once
    handle._run()
  File "/usr/lib64/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1233, in inner
    self.run()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
    yielded = self.gen.send(value)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 357, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 267, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 534, in execute_request
    user_expressions, allow_stdin,
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2819, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2845, in _run_cell
    return runner(coro)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 67, in _pseudo_sync_runner
    coro.send(None)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3020, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3191, in run_ast_nodes
    if (yield from self.run_code(code, result)):
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-5-2736b74cfde9>", line 1, in <module>
    train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)
  File "/home/mabou_local/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py", line 283, in train_model
    saver = tf.train.Saver(tf.model_variables())
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
    self.build()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1172, in build
    filename=self._filename)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 688, in build
    restore_sequentially, reshape)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key cconv-d2-c2/convolution/bn/moving_mean not found in checkpoint
	 [[Node: save/RestoreV2_42 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_42/tensor_names, save/RestoreV2_42/shape_and_slices)]]
	 [[Node: save/RestoreV2_141/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_330_save/RestoreV2_141", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Run 2


('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
Total number of parameters to train: 1953219
INFO:tensorflow:Restoring parameters from TEM_model_v1/model.ckpt
---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1326     try:
-> 1327       return fn(*args)
   1328     except errors.OpError as e:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1305                                    feed_dict, fetch_list, target_list,
-> 1306                                    status, run_metadata)
   1307 

/usr/lib64/python3.6/contextlib.py in __exit__(self, type, value, traceback)
     87             try:
---> 88                 next(self.gen)
     89             except StopIteration:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
    465           compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466           pywrap_tensorflow.TF_GetCode(status))
    467   finally:

NotFoundError: Key cconv-d2-c2/convolution/bn/moving_variance not found in checkpoint
	 [[Node: save/RestoreV2_43 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_43/tensor_names, save/RestoreV2_43/shape_and_slices)]]

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
<ipython-input-8-2736b74cfde9> in <module>
----> 1 train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)

~/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py in train_model(path_trainingset, path_model, config, path_model_init, save_trainable, gpu, debug_mode, gpu_per)
    327         if path_model_init:
    328             folder_restored_model = path_model_init
--> 329             saver.restore(session, folder_restored_model + "/model.ckpt")
    330 
    331             if save_trainable:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
   1558     logging.info("Restoring parameters from %s", save_path)
   1559     sess.run(self.saver_def.restore_op_name,
-> 1560              {self.saver_def.filename_tensor_name: save_path})
   1561 
   1562   @staticmethod

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1122     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1123       results = self._do_run(handle, final_targets, final_fetches,
-> 1124                              feed_dict_tensor, options, run_metadata)
   1125     else:
   1126       results = []

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1319     if handle is None:
   1320       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321                            options, run_metadata)
   1322     else:
   1323       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1338         except KeyError:
   1339           pass
-> 1340       raise type(e)(node_def, op, message)
   1341 
   1342   def _extend_graph(self):

NotFoundError: Key cconv-d2-c2/convolution/bn/moving_variance not found in checkpoint
	 [[Node: save/RestoreV2_43 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_43/tensor_names, save/RestoreV2_43/shape_and_slices)]]

Caused by op 'save/RestoreV2_43', defined at:
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 505, in start
    self.io_loop.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/platform/asyncio.py", line 132, in start
    self.asyncio_loop.run_forever()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 422, in run_forever
    self._run_once()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 1432, in _run_once
    handle._run()
  File "/usr/lib64/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1233, in inner
    self.run()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
    yielded = self.gen.send(value)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 357, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 267, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 534, in execute_request
    user_expressions, allow_stdin,
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2819, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2845, in _run_cell
    return runner(coro)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 67, in _pseudo_sync_runner
    coro.send(None)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3020, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3191, in run_ast_nodes
    if (yield from self.run_code(code, result)):
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-2736b74cfde9>", line 1, in <module>
    train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)
  File "/home/mabou_local/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py", line 283, in train_model
    saver = tf.train.Saver(tf.model_variables())
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
    self.build()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1172, in build
    filename=self._filename)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 688, in build
    restore_sequentially, reshape)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key cconv-d2-c2/convolution/bn/moving_variance not found in checkpoint
	 [[Node: save/RestoreV2_43 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_43/tensor_names, save/RestoreV2_43/shape_and_slices)]]

Run 3


('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 2, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 2, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 2, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 2, 'Features: ', [128, 128])
('Size:', 3)
Total number of parameters to train: 1953219
INFO:tensorflow:Restoring parameters from TEM_model_v1/model.ckpt
---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1326     try:
-> 1327       return fn(*args)
   1328     except errors.OpError as e:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
   1305                                    feed_dict, fetch_list, target_list,
-> 1306                                    status, run_metadata)
   1307 

/usr/lib64/python3.6/contextlib.py in __exit__(self, type, value, traceback)
     87             try:
---> 88                 next(self.gen)
     89             except StopIteration:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in raise_exception_on_not_ok_status()
    465           compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466           pywrap_tensorflow.TF_GetCode(status))
    467   finally:

NotFoundError: Key cconv-d3-c2/convolution/bn/beta not found in checkpoint
	 [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]
	 [[Node: save/RestoreV2_39/_233 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_560_save/RestoreV2_39", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
<ipython-input-26-2736b74cfde9> in <module>
----> 1 train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)

~/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py in train_model(path_trainingset, path_model, config, path_model_init, save_trainable, gpu, debug_mode, gpu_per)
    327         if path_model_init:
    328             folder_restored_model = path_model_init
--> 329             saver.restore(session, folder_restored_model + "/model.ckpt")
    330 
    331             if save_trainable:

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py in restore(self, sess, save_path)
   1558     logging.info("Restoring parameters from %s", save_path)
   1559     sess.run(self.saver_def.restore_op_name,
-> 1560              {self.saver_def.filename_tensor_name: save_path})
   1561 
   1562   @staticmethod

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1122     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1123       results = self._do_run(handle, final_targets, final_fetches,
-> 1124                              feed_dict_tensor, options, run_metadata)
   1125     else:
   1126       results = []

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1319     if handle is None:
   1320       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321                            options, run_metadata)
   1322     else:
   1323       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

~/venv_ads2/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1338         except KeyError:
   1339           pass
-> 1340       raise type(e)(node_def, op, message)
   1341 
   1342   def _extend_graph(self):

NotFoundError: Key cconv-d3-c2/convolution/bn/beta not found in checkpoint
	 [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]
	 [[Node: save/RestoreV2_39/_233 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_560_save/RestoreV2_39", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op 'save/RestoreV2_55', defined at:
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 505, in start
    self.io_loop.start()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/platform/asyncio.py", line 132, in start
    self.asyncio_loop.run_forever()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 422, in run_forever
    self._run_once()
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 1432, in _run_once
    handle._run()
  File "/usr/lib64/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1233, in inner
    self.run()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
    yielded = self.gen.send(value)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 357, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 267, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 534, in execute_request
    user_expressions, allow_stdin,
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2819, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2845, in _run_cell
    return runner(coro)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 67, in _pseudo_sync_runner
    coro.send(None)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3020, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3191, in run_ast_nodes
    if (yield from self.run_code(code, result)):
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-26-2736b74cfde9>", line 1, in <module>
    train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)
  File "/home/mabou_local/neuropoly/github/axondeepseg/AxonDeepSeg/train_network.py", line 283, in train_model
    saver = tf.train.Saver(tf.model_variables())
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1140, in __init__
    self.build()
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1172, in build
    filename=self._filename)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 688, in build
    restore_sequentially, reshape)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/mabou_local/venv_ads2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key cconv-d3-c2/convolution/bn/beta not found in checkpoint
	 [[Node: save/RestoreV2_55 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_55/tensor_names, save/RestoreV2_55/shape_and_slices)]]
	 [[Node: save/RestoreV2_39/_233 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_560_save/RestoreV2_39", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

@oumayb do you know what might be causing this error? Maybe @perone too?

oumayb · 2018-11-05T19:09:57Z

@mathieuboudreau this usually happens when we're finetuning a model using another one that doesn't have the exact same architecture, so I think the problem comes from the config files that were used. I'll look more into it and get back to you!
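As a rough illustration of that diagnosis (not code from ADS): a restore fails with "Key ... not found in checkpoint" when the graph defines a variable that the checkpoint does not contain. The sketch below compares two plain lists of variable names. The helper `diff_variable_names` and every sample name except `cconv-d3-c2/convolution/bn/beta` are made up for the example. In TensorFlow the checkpoint names could presumably be listed with `tf.train.list_variables`, which is deliberately not called here so that the snippet runs without TensorFlow.

```python
def diff_variable_names(graph_names, checkpoint_names):
    """Return (missing_in_checkpoint, unused_in_graph) as sorted lists."""
    graph, ckpt = set(graph_names), set(checkpoint_names)
    return sorted(graph - ckpt), sorted(ckpt - graph)


# Hypothetical names modelled on the key reported in the traceback.
graph_names = [
    "cconv-d3-c0/convolution/bn/beta",
    "cconv-d3-c1/convolution/bn/beta",
    "cconv-d3-c2/convolution/bn/beta",  # only exists with 3 convolutions per layer
]
checkpoint_names = [
    "cconv-d3-c0/convolution/bn/beta",
    "cconv-d3-c1/convolution/bn/beta",
]

missing, unused = diff_variable_names(graph_names, checkpoint_names)
print("missing from checkpoint:", missing)
print("unused checkpoint keys:", unused)
```

Any name in the first list is a variable the new graph expects but the model being finetuned from never saved.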

mathieuboudreau · 2018-11-05T19:11:08Z

@oumayb Ok that makes sense – I was wondering how we could have reused the TEM model to finetune the SEM one, which has a different architecture. Thanks!

oumayb · 2018-11-05T19:12:39Z

@mathieuboudreau yes exactly, they differ in the depth and in the number of convolutions per layer, which should be taken into account in the config files.

mathieuboudreau · 2018-11-05T19:14:05Z

@oumayb Ok! That wasn't clear to me when reading the Jupyter Notebook + using the files that @alexfoias suggested. If you kept the most up-to-date files somewhere, please let me know!

oumayb · 2018-11-05T19:21:14Z

@mathieuboudreau sure! I'll get back to you soon with the files

oumayb · 2018-11-11T22:24:56Z

@mathieuboudreau convolution_per_layer should be [2, 2, 2, 2] instead of [3, 3, 3, 3] when finetuning from the TEM model. I hope this works now!
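A small sanity check along these lines could catch the mismatch before training starts. This is only a sketch: `architecture_mismatches` is a hypothetical helper, the `depth` key name is an assumption (only `convolution_per_layer` is named above), and the dictionaries stand in for the real config files.

```python
def architecture_mismatches(new_config, base_config,
                            keys=("depth", "convolution_per_layer")):
    """Return {key: (new_value, base_value)} for architecture keys that differ."""
    return {
        key: (new_config.get(key), base_config.get(key))
        for key in keys
        if new_config.get(key) != base_config.get(key)
    }


# Illustrative values only; depth=4 is inferred from the four-entry lists.
finetune_config = {"depth": 4, "convolution_per_layer": [3, 3, 3, 3]}
tem_config = {"depth": 4, "convolution_per_layer": [2, 2, 2, 2]}

before_fix = architecture_mismatches(finetune_config, tem_config)
print(before_fix)

# After applying the suggested value, the two configs agree.
finetune_config["convolution_per_layer"] = [2, 2, 2, 2]
after_fix = architecture_mismatches(finetune_config, tem_config)
print(after_fix)
```

An empty result means the two configs describe the same layer layout, at least for the keys that were compared.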

mathieuboudreau · 2018-11-30T19:54:27Z

Thanks @oumayb ! That helped me get a little bit further, but now I'm encountering another error:


('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
('Layer: ', 0, ' Conv: ', 0, 'Features: ', [1, 16])
('Size:', 5)
('Layer: ', 0, ' Conv: ', 1, 'Features: ', [16, 16])
('Size:', 5)
('Layer: ', 1, ' Conv: ', 0, 'Features: ', [16, 32])
('Size:', 3)
('Layer: ', 1, ' Conv: ', 1, 'Features: ', [32, 32])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 0, 'Features: ', [32, 64])
('Size:', 3)
('Layer: ', 2, ' Conv: ', 1, 'Features: ', [64, 64])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 0, 'Features: ', [64, 128])
('Size:', 3)
('Layer: ', 3, ' Conv: ', 1, 'Features: ', [128, 128])
('Size:', 3)
Total number of parameters to train: 1552387
INFO:tensorflow:Restoring parameters from TEM_model_v1/model.ckpt
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-42-2736b74cfde9> in <module>
----> 1 train_model(path_trainingset=path_trainingset, path_model=path_model, config=config, path_model_init=path_model_init)

~/neuropoly/github/axondeepseg_mikula/AxonDeepSeg/train_network.py in train_model(path_trainingset, path_model, config, path_model_init, save_trainable, gpu, debug_mode, gpu_per)
    333 
    334             file = open(folder_restored_model + '/evolution.pkl', 'r')
--> 335             evolution_restored = pickle.load(file)
    336             last_epoch = evolution_restored["steps"][-1]
    337         # Else, initializing the variables

TypeError: a bytes-like object is required, not 'str'

It looks like it has trouble opening the evolution file of the old model that I'm trying to finetune from. You didn't indicate where the model you were finetuning from was located (just its name TEM_model_v1), so I simply tried using the baseline TEM one (with the green label) on duke (duke/projects/axondeepseg/baselines/baseline_tem5127678). I also tried using the default one in our repo (axondeepseg/AxonDeepSeg/models/default_TEM_model_v1), but since the project's gitignore file used to ignore evolution.pkl files, I can't finetune from it either since evolution.pkl is missing from that folder.

@oumayb or @perone have you encountered this issue? I'm not sure whether this is a Python 2 vs. 3 behavior difference (it seems the file is not being opened in binary mode by default?). I tried opening the file on the command line with the lines below instead, which works, but I want to make sure that this is a bug before trying to fix it:

file = open(path_model_init + '/evolution.pkl', 'rb')
evolution_restored = pickle.load(file, encoding='bytes')

@oumayb which version of ADS were you using when doing your finetuning work, do you know? Were you using the Python 2 or Python 3 version?

mathieuboudreau · 2018-11-30T20:11:41Z

Hmm, loading with the lines above doesn't seem to work correctly: the field "steps", which is accessed on the next line, isn't loaded. There must be something else I'm missing that you were doing, @oumayb?
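One likely explanation, sketched under assumptions: with `encoding='bytes'`, Python 3 returns Python 2 `str` objects as `bytes`, so a dictionary key written as `'steps'` comes back as `b'steps'` and a lookup with `"steps"` fails. The byte string below is a hand-written protocol-0 pickle in the form Python 2 would produce for a small dictionary; it is not the real evolution.pkl.

```python
import pickle

# Protocol-0 pickle of {'steps': [1, 2, 3]} as written by Python 2 (illustrative).
PY2_PICKLE = b"(dp0\nS'steps'\np1\n(lp2\nI1\naI2\naI3\nas."

as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")
as_text = pickle.loads(PY2_PICKLE, encoding="latin1")

print(list(as_bytes))        # keys are bytes objects
print("steps" in as_bytes)   # the str key is not found
print(as_text["steps"][-1])  # with latin1 the key is a str again
```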

mathieuboudreau · 2018-11-30T22:47:19Z

Ok, finally got it working. You must have been using the Python 2 version of ADS when originally training your finetuned models.

The old model which we're finetuning from was saved using Python 2, and there is a mismatch in how pickle handles pickling/unpickling files (and in particular, numpy arrays) between Python 2 and 3. I think I dealt with this once before while writing the unit tests, but the lines mentioned above were some of the few that don't have coverage yet, so this bug has existed since the Python 3 upgrade.

I had to change these lines:

https://github.com/neuropoly/axondeepseg/blob/432af8265acc02401d8edd1e93e6e45b19471dbe/AxonDeepSeg/train_network.py#L334-L335

to this:

file = open(path_model_init + '/evolution.pkl', 'rb')
evolution_restored = pickle.load(file, encoding='latin1')

and then the training started successfully.

I'll have to make a similar change, write one or more tests, and commit it. The lines above might be enough; I just need to check that they also work correctly with an evolution file pickled in Python 3.
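A minimal sketch of such a check, assuming a hypothetical helper `load_evolution` (not a function in ADS) and a throwaway file written by Python 3. The `encoding` argument only affects how 8-bit strings pickled by Python 2 are decoded, so a Python 3 pickle should load unchanged.

```python
import os
import pickle
import tempfile


def load_evolution(folder_model):
    """Load evolution.pkl from folder_model, whichever Python version wrote it."""
    path = os.path.join(folder_model, "evolution.pkl")
    # pickle needs binary mode in Python 3; the context manager closes the file.
    with open(path, "rb") as f:
        # latin1 decodes Python 2 str objects; Python 3 pickles ignore it.
        return pickle.load(f, encoding="latin1")


# Write a stand-in evolution file with Python 3 and read it back.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "evolution.pkl"), "wb") as f:
        pickle.dump({"steps": [10, 20, 30]}, f)
    evolution = load_evolution(folder)

last_epoch = evolution["steps"][-1]
print(last_epoch)
```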

jcohenadad · 2018-12-02T01:37:46Z

Thank you so much for taking care of that @mathieuboudreau!!! 👍

mathieuboudreau · 2019-11-11T14:01:04Z

I think this PR has gotten too stale to merge. In particular, I think @vs74's new Keras implementation means we'll need to use a different way to finetune. While we should still keep this PR in consideration moving forward, I'm going to close it for now.

Added Mikula finetuning script

280ad01

oumayb requested a review from jcohenadad September 21, 2018 15:17

oumayb requested a review from mathieuboudreau September 22, 2018 22:34

mathieuboudreau added 5 commits September 25, 2018 15:18

Increment version, set it dev

5d9003a

Merge branch 'master' into dev_mikula

e397dec

Merge branch 'master' into dev_mikula

4a2611e

Merge branch 'master' into dev_mikula

d80619b

Merge branch 'master' into dev_mikula

2a4c7f0

Merge branch 'master' into dev_mikula

880ecca

jcohenadad added the priority:LOW label Mar 13, 2019

vasudev-sharma added 2 commits July 27, 2019 15:20

Merge branch 'master' into dev_mikula

59de039

Merge branch 'master' into dev_mikula

c1f42a8

mathieuboudreau closed this Nov 11, 2019

vasudev-sharma deleted the dev_mikula branch November 4, 2020 15:04
