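# 1. Naming and scopes

## Naming variables and tensors

Every call to tf.get_variable() needs a new, unique name for the variable. In fact, every tensor in the graph needs a unique name as well. The name is available through the .name attribute of tensors, operations, and variables. In most cases the name is generated automatically: for example, a constant node is named Const, and further constant nodes are named Const_1, Const_2, and so on. A node's name can also be set explicitly with the name= argument. A numeric suffix is still appended automatically when the name is already taken: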

In [1]:
import tensorflow as tf

In [2]:
a = tf.constant(0.)
b = tf.constant(1.)
c = tf.constant(2., name='cool_const')
d = tf.constant(3., name='cool_const')

In [3]:
print('a.name:',a.name)
print('b.name:',b.name)
print('c.name:',c.name)
print('d.name:',d.name)

a.name: Const:0
b.name: Const_1:0
c.name: cool_const:0
d.name: cool_const_1:0
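
The numbering rule visible in the output above can be imitated in a few lines of plain Python. This is only a minimal sketch of the idea, not TensorFlow's actual implementation: the `NameRegistry` class and its `unique_name` method are hypothetical names made up for illustration, and the sketch ignores edge cases such as a user explicitly requesting a name like `Const_1`.

```python
from collections import defaultdict


class NameRegistry:
    """Toy stand-in for a graph's name bookkeeping (hypothetical, not TensorFlow code)."""

    def __init__(self):
        # How many times each base name has been requested so far.
        self._counts = defaultdict(int)

    def unique_name(self, base):
        """Return `base` the first time, then `base_1`, `base_2`, and so on."""
        n = self._counts[base]
        self._counts[base] += 1
        return base if n == 0 else f"{base}_{n}"


registry = NameRegistry()
requested = ("Const", "Const", "cool_const", "cool_const")
names = [registry.unique_name(base) for base in requested]
print(names)  # ['Const', 'Const_1', 'cool_const', 'cool_const_1']
```

The sketch covers only the operation name; the `:0` that appears in the tensor names above is not modeled here.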


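The :0 suffix on each name is the index of the operation output that the tensor corresponds to. Naming nodes is not required, but it is very useful when debugging. When TensorFlow code crashes, the error trace points to a specific operation. If there are many operations of the same type, it is hard to tell which one failed. Naming each node explicitly produces a more informative error trace and makes the problem faster to identify.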

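## Using scopes

As the graph grows more complex, naming everything by hand becomes increasingly difficult. TensorFlow provides tf.variable_scope, which makes a graph easier to organize by dividing it into smaller pieces. When a section of graph-building code is wrapped in a with tf.variable_scope(scope_name): statement, every node created inside is automatically prefixed with the scope_name string. Scopes also stack: a scope created inside another scope simply chains the prefixes together, separated by slashes.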

In [4]:
with tf.variable_scope('first_scope'):
    c = a + b
    d = tf.constant(2.,name='cool_const')
    coef1 = tf.get_variable('coef',[],initializer=tf.constant_initializer(2.))
    with tf.variable_scope('second_scope'):
        e = coef1 * d
        coef2 = tf.get_variable('coef',[],initializer=tf.constant_initializer(3.))
        f = tf.constant(1.)
        g = coef2 * f
        

In [5]:
print('a.name:',a.name)
print('b.name:',b.name)
print('c.name:',c.name)
print('d.name:',d.name)
print('e.name:',e.name)
print('f.name:',f.name)
print('g.name:',g.name)
print('coef1.name:',coef1.name)
print('coef2.name:',coef2.name)

a.name: Const:0
b.name: Const_1:0
c.name: first_scope/add:0
d.name: first_scope/cool_const:0
e.name: first_scope/second_scope/mul:0
f.name: first_scope/second_scope/Const:0
g.name: first_scope/second_scope/mul_1:0
coef1.name: first_scope/coef:0
coef2.name: first_scope/second_scope/coef:0


coef1 and coef2 can both be named coef because they live in different scopes, but they are two distinct variables.

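# 2. Saving and loading

A trained neural network consists of two basic parts:  
1. the network weights, which have been optimized for some task  
2. the network graph, which specifies how to use the weights to produce a result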

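## Saving a model

When there is only a single model, TensorFlow's built-in tools for saving and loading are easy to use: just create a tf.train.Saver(). Like tf.train.Optimizer, tf.train.Saver is not itself a node, but a higher-level class that performs useful operations on an existing graph.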

In [6]:
a = tf.get_variable('a',[])
b = tf.get_variable('b',[])
init = tf.global_variables_initializer()
saver = tf.train.Saver()
sess = tf.Session()
sess.run(init)


In [7]:
saver.save(sess,'./model/tfcp.model')

'./model/tfcp.model'

![image](./image/model.png)

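1. checkpoint is not actually needed to rebuild the model, but if several versions of the model are saved during training, it keeps track of all of them  
2. tfcp.model.data-00000-of-00001 contains the model weights  
3. tfcp.model.meta holds the network structure and contains all the information needed to rebuild the graph  
4. tfcp.model.index is an index linking the network structure to the model weights; it is used to locate each node's parameters in the data file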
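
Note that the path passed to saver.save() acts as a file-name prefix rather than as a single file. As a rough illustration, the helper below derives the four file names listed above from a save path. It is a sketch under stated assumptions: `expected_checkpoint_files` is a hypothetical function, it assumes forward-slash paths, and it assumes the single data shard (`00000-of-00001`) seen in this example.

```python
def expected_checkpoint_files(save_path):
    """List the files this tutorial's save produced for a given save path.

    Assumes '/'-separated paths and a single data shard, as in the example above.
    """
    directory, _, prefix = save_path.rpartition("/")
    names = [
        "checkpoint",                       # bookkeeping of saved versions
        prefix + ".data-00000-of-00001",    # the weights
        prefix + ".index",                  # index into the data file
        prefix + ".meta",                   # the graph structure
    ]
    return [f"{directory}/{name}" if directory else name for name in names]


files = expected_checkpoint_files("./model/tfcp.model")
for path in files:
    print(path)
```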

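## Loading a model

The first step is to recreate the variables, with the same names, shapes, and types as when they were saved;  
The second step is to create a tf.train.Saver as before and call its restore function.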

In [9]:
# a = tf.get_variable('a',[])
# b = tf.get_variable('b',[])
# saver = tf.train.Saver()
# sess = tf.Session()
saver.restore(sess,'./model/tfcp.model')
sess.run([a,b])

INFO:tensorflow:Restoring parameters from ./model/tfcp.model


[0.68942225, 0.95850384]

The variables do not need to be initialized here: the restore operation moves the values from the file into the session's variables.

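## Selecting variables

When a tf.train.Saver is initialized, it looks at the current graph and collects its list of variables. We can inspect that list through the private ._var_list attribute: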

In [10]:
print(saver._var_list)

[<tf.Variable 'first_scope/coef:0' shape=() dtype=float32_ref>, <tf.Variable 'first_scope/second_scope/coef:0' shape=() dtype=float32_ref>, <tf.Variable 'a:0' shape=() dtype=float32_ref>, <tf.Variable 'b:0' shape=() dtype=float32_ref>]


## Loading a modified model

In [21]:
tf.reset_default_graph()

Load only part of the model

In [22]:
import tensorflow as tf
a = tf.get_variable('a', [])
init = tf.global_variables_initializer()
saver = tf.train.Saver()
sess = tf.Session()
sess.run(init)
saver.restore(sess, './model/tfcp.model')
sess.run(a)

INFO:tensorflow:Restoring parameters from ./model/tfcp.model


0.68942225

Load variable a, skip variable b, and create a new variable d

In [25]:
tf.reset_default_graph()

In [26]:
import tensorflow as tf
a = tf.get_variable('a', [])
d = tf.get_variable('d', [])
init = tf.global_variables_initializer()
saver = tf.train.Saver()
sess = tf.Session()
sess.run(init)
saver.restore(sess, './model/tfcp.model')


INFO:tensorflow:Restoring parameters from ./model/tfcp.model


NotFoundError: Key d not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
	 [[Node: save/RestoreV2/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_8_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'save/RestoreV2', defined at:
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-26-eef43957cbe1>", line 5, in <module>
    saver = tf.train.Saver()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1338, in __init__
    self.build()
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1347, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1384, in _build
    build_save=build_save, build_restore=build_restore)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 835, in _build_internal
    restore_sequentially, reshape)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 472, in _AddRestoreOps
    restore_sequentially)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 886, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1546, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
    op_def=op_def)
  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Key d not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
	 [[Node: save/RestoreV2/_3 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_8_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]


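The error occurs because d does not appear in the checkpoint. To fix it, pass var_list as a dict that maps the checkpoint key 'a' to the variable d, so that the value saved under a is restored into d: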

In [31]:
tf.reset_default_graph()

In [32]:
import tensorflow as tf
d = tf.get_variable('d', [])
init = tf.global_variables_initializer()
saver = tf.train.Saver(var_list={'a': d})
sess = tf.Session()
sess.run(init)
saver.restore(sess, './model/tfcp.model')
sess.run(d)

INFO:tensorflow:Restoring parameters from ./model/tfcp.model


0.68942225
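
The difference between the failing restore and the one that works comes down to which key each variable is looked up under. The plain-Python model below is a sketch of that lookup only, not of how tf.train.Saver is implemented: the `restore` function, the dict-based `checkpoint`, and the one-element list standing in for a variable are all hypothetical simplifications, and the checkpoint holds just a subset of the saved values shown earlier.

```python
# Toy checkpoint: saved name -> saved value (values copied from the outputs above).
checkpoint = {"a": 0.68942225, "b": 0.95850384}


def restore(var_list, checkpoint):
    """Fill each variable from the checkpoint entry stored under its key.

    `var_list` maps a checkpoint key to a one-element list that stands in
    for a variable. Raises KeyError when a key was never saved.
    """
    for key, variable in var_list.items():
        if key not in checkpoint:
            raise KeyError(f"Key {key} not found in checkpoint")
        variable[0] = checkpoint[key]


d = [None]  # stand-in for the new variable named 'd'

# Default behaviour: the variable is looked up under its own name, which fails.
try:
    restore({"d": d}, checkpoint)
    failed = False
except KeyError:
    failed = True

# Explicit mapping: read the value that was saved as 'a' into d.
restore({"a": d}, checkpoint)
print(failed, d[0])
```

In this model, b is simply never requested, which mirrors how the earlier cell restored a alone without any complaint about the unused entry.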

## Inspecting a model

To see how the original variables were named, use tf.contrib.framework.list_variables():

In [33]:
print(tf.contrib.framework.list_variables('./model/tfcp.model'))

[('a', []), ('b', []), ('first_scope/coef', []), ('first_scope/second_scope/coef', [])]
