# variables

following along with [this page](https://www.tensorflow.org/programmers_guide/variables)

a variable is a way to "represent shared, persistent state", which is not a bad idea

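they are defined by the `tf.Variable` class, which represents a tensor whose value can be changed by running ops on it (think of it as a mutable tensor, though it is not literally a subclass of `tf.Tensor`). another important difference: most `tf.Tensor` objects exist only within a single `tf.Session.run` call, whereas a variable exists outside of them as well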

In [1]:
import tensorflow as tf

import utils


## creating a variable

easy:

In [2]:
my_variable = tf.get_variable(
    name='my_variable',
    shape=[1, 2, 3]
)
my_variable

<tf.Variable 'my_variable:0' shape=(1, 2, 3) dtype=float32_ref>

In [3]:
utils.inspect(my_variable, init_global=True)

[[[ 1.0836813  -0.8812319   0.80602884]
  [-0.4607755  -0.9072799  -0.53925633]]]


other params for `tf.get_variable` include `dtype`, `initializer`, `regularizer`, and more. full docs:

In [4]:
tf.get_variable?

In [5]:
my_int_variable = tf.get_variable(
    name='my_int_variable',
    shape=[1, 2, 3],
    dtype=tf.int32,
    initializer=tf.zeros_initializer
)
my_int_variable

<tf.Variable 'my_int_variable:0' shape=(1, 2, 3) dtype=int32_ref>

In [6]:
utils.inspect(my_int_variable, init_global=True)

[[[0 0 0]
  [0 0 0]]]


there are "many" initializers (full section on it below). a common one is to initialize to a known tensor:

In [7]:
other_variable = tf.get_variable(
    "other_variable",
    # note: no shape=, because shape is / must be inferred from initializer tensor
    dtype=tf.int32,
    initializer=tf.constant([23, 42])
)
other_variable

<tf.Variable 'other_variable:0' shape=(2,) dtype=int32_ref>

In [8]:
utils.inspect(other_variable, init_global=True)

[23 42]


### variable collections

we've seen and accessed collections before, but this is the first direct discussion of them. they write:

> Because disconnected parts of a TensorFlow program might want to create variables, it is sometimes useful to have a single way to access all of them. For this reason TensorFlow provides collections, which are named lists of tensors or other objects, such as `tf.Variable` instances.

so `collections` are ways for users to group shared variables together *a la* namespaces.

by default, everything goes in the following two collections:

In [10]:
tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)

[<tf.Variable 'my_variable:0' shape=(1, 2, 3) dtype=float32_ref>,
 <tf.Variable 'my_int_variable:0' shape=(1, 2, 3) dtype=int32_ref>,
 <tf.Variable 'other_variable:0' shape=(2,) dtype=int32_ref>]

In [11]:
tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

[<tf.Variable 'my_variable:0' shape=(1, 2, 3) dtype=float32_ref>,
 <tf.Variable 'my_int_variable:0' shape=(1, 2, 3) dtype=int32_ref>,
 <tf.Variable 'other_variable:0' shape=(2,) dtype=int32_ref>]

it is possible to create a variable that is not put into `tf.GraphKeys.TRAINABLE_VARIABLES`. the guide's first suggestion is to add it to the `tf.GraphKeys.LOCAL_VARIABLES` collection explicitly:

In [12]:
my_local = tf.get_variable(
    'my_local',
    shape=(),
    collections=[tf.GraphKeys.LOCAL_VARIABLES]
)

In [13]:
tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

[<tf.Variable 'my_variable:0' shape=(1, 2, 3) dtype=float32_ref>,
 <tf.Variable 'my_int_variable:0' shape=(1, 2, 3) dtype=int32_ref>,
 <tf.Variable 'other_variable:0' shape=(2,) dtype=int32_ref>,
 <tf.Variable 'my_local:0' shape=() dtype=float32_ref>]

In [14]:
tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES)

[<tf.Variable 'my_local:0' shape=() dtype=float32_ref>]

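sooooo that doesn't work as advertised: `my_local` is in `LOCAL_VARIABLES`, but it is still in `TRAINABLE_VARIABLES` too. the reason is that `trainable` defaults to `True`, which adds the variable to `TRAINABLE_VARIABLES` no matter what is passed as `collections`.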

or pass flag `trainable=False`:

In [15]:
my_non_trainable = tf.get_variable(
    "my_non_trainable",
    shape=(),
    trainable=False
)

In [16]:
tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

[<tf.Variable 'my_variable:0' shape=(1, 2, 3) dtype=float32_ref>,
 <tf.Variable 'my_int_variable:0' shape=(1, 2, 3) dtype=int32_ref>,
 <tf.Variable 'other_variable:0' shape=(2,) dtype=int32_ref>,
 <tf.Variable 'my_local:0' shape=() dtype=float32_ref>]

In [17]:
tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES)

[<tf.Variable 'my_local:0' shape=() dtype=float32_ref>]

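whereas that is doing what i expected: `my_non_trainable` shows up in neither list (by default it still goes into `GLOBAL_VARIABLES`). as for whether `LOCAL_VARIABLES` are trained even though they are in `TRAINABLE_VARIABLES`: optimizers default their `var_list` to the `TRAINABLE_VARIABLES` collection, so `my_local` would be trained if a loss depended on it. to keep a local variable out of training, pass `trainable=False` as well.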

custom collections are obviously supported:

In [18]:
tf.add_to_collection('my_collection', my_local)

In [19]:
tf.get_collection('my_collection')

[<tf.Variable 'my_local:0' shape=() dtype=float32_ref>]
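
to make the `my_local` surprise above concrete, here is a toy, pure-python model of how `collections=` and `trainable=` seem to interact. this is a sketch, not tensorflow's implementation: `ToyGraph`, its methods, and the string keys are all hypothetical stand-ins. the one rule it encodes is the documented one, namely that `collections` defaults to the global collection and `trainable=True` (the default) *also* adds the variable to the trainable collection.

```python
from collections import defaultdict

# stand-ins for the tf.GraphKeys constants (the real key strings may differ)
GLOBAL, LOCAL, TRAINABLE = "GLOBAL_VARIABLES", "LOCAL_VARIABLES", "TRAINABLE_VARIABLES"


class ToyGraph:
    """Toy stand-in for a graph's collections: named lists of objects."""

    def __init__(self):
        self._collections = defaultdict(list)

    def add_to_collection(self, key, value):
        self._collections[key].append(value)

    def get_collection(self, key):
        return list(self._collections[key])

    def get_variable(self, name, collections=None, trainable=True):
        # `collections` falls back to GLOBAL only when nothing is passed
        keys = list(collections) if collections is not None else [GLOBAL]
        # trainable=True *also* adds TRAINABLE, whatever `collections` says
        if trainable and TRAINABLE not in keys:
            keys.append(TRAINABLE)
        for key in keys:
            self.add_to_collection(key, name)
        return name


g = ToyGraph()
g.get_variable("my_variable")
g.get_variable("my_local", collections=[LOCAL])
g.get_variable("my_non_trainable", trainable=False)

print(g.get_collection(GLOBAL))     # ['my_variable', 'my_non_trainable']
print(g.get_collection(TRAINABLE))  # ['my_variable', 'my_local']
print(g.get_collection(LOCAL))      # ['my_local']
```

under this model, getting a variable that is local *and* untrained takes both arguments: `collections=[LOCAL]` and `trainable=False`.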

### device placement

variables can be pushed to specific devices using the `tf.device` context manager:

```python
with tf.device('/device:GPU:1'):
    v = tf.get_variable('v', [1])
```

this is a lot to remember and it's pretty important you get it right (what's the use of a GPU if you don't put the variables you use on it, or have them replicated across all cores?).

`tensorflow` provides `tf.train.replica_device_setter` to automatically place variables in parameter servers for you:

```python
cluster_spec = {
    "ps": ["ps0:2222", "ps1:2222"],
    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]
}
with tf.device(tf.train.replica_device_setter(cluster=cluster_spec)):
    v = tf.get_variable("v", shape=[20, 20])
```
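
as far as i can tell from the docs, the default strategy of `tf.train.replica_device_setter` is round-robin over the `ps` tasks. a toy, pure-python sketch of that idea follows. `toy_round_robin_placement` is a hypothetical helper, it only models where variables land (not worker ops), and it is not how tensorflow implements it internally.

```python
from itertools import cycle

cluster_spec = {
    "ps": ["ps0:2222", "ps1:2222"],
    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"],
}


def toy_round_robin_placement(variable_names, cluster):
    """Map each variable name to a ps task, cycling through the ps tasks in order."""
    ps_devices = cycle(
        ["/job:ps/task:{}".format(i) for i in range(len(cluster["ps"]))]
    )
    return {name: next(ps_devices) for name in variable_names}


placement = toy_round_robin_placement(["v", "w", "b"], cluster_spec)
for name, device in placement.items():
    print(name, "->", device)
# v -> /job:ps/task:0
# w -> /job:ps/task:1
# b -> /job:ps/task:0
```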

## initializing variables

all variables must be initialized. for low-level / core api programs, you will need to do this explicitly. for higher-level apis (ex: `tf.contrib.slim`, `tf.estimator.Estimator`, `keras`), this is done for you automatically.

explicit initialization might be a good idea if

1. initialization is computationally expensive (e.g. reloading a checkpointed model)
1. you're seeking deterministic behavior for a randomly initialized value in a distributed setting

you can initialize every variable in the `tf.GraphKeys.GLOBAL_VARIABLES` collection (trainable or not) in one go using `tf.global_variables_initializer()`. in addition, every variable has an `initializer` attribute, which is a per-variable initializer op.

it is also possible to see which variables exist that have not been initialized at any given point in the computation:

In [20]:
utils.inspect(tf.report_uninitialized_variables())

[b'my_variable' b'my_int_variable' b'other_variable' b'my_non_trainable'
 b'my_local']


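compare this to the list when I allow for global variables to be initialized (`my_local` is left over because it only lives in `LOCAL_VARIABLES`, which is covered by `tf.local_variables_initializer()` instead):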

In [21]:
utils.inspect(tf.report_uninitialized_variables(), init_global=True)

[b'my_local']


another tricky piece: `tf.global_variables_initializer` doesn't order variables in any particular way, so if one variable's initial value depends on another's, initialization can fail unless you take care of the ordering yourself.

the recommendation here seems to be to rely on the tensor-esque initializer syntax, directly passing the formula for the dependent tensor in as the initializer and using the `initialized_value` method of each variable to chain them together:

In [22]:
v = tf.get_variable('v', shape=(), initializer=tf.zeros_initializer())

In [23]:
w = tf.get_variable('w', initializer=v.initialized_value() + 1)
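
why this helps, as a toy pure-python model. `ToyVariable` is a hypothetical class, and tensorflow gets the same effect through graph dependencies rather than python laziness, so treat this as an analogy only. the point is that reading a dependency through `initialized_value()` makes the result independent of the order in which initializers happen to run.

```python
class ToyVariable:
    """Toy variable: holds a value that must be initialized before it is read."""

    def __init__(self, name, init_fn):
        self.name = name
        self._init_fn = init_fn  # zero-argument callable producing the initial value
        self.value = None        # None means "not initialized yet"

    def initialize(self):
        self.value = self._init_fn()

    def read(self):
        if self.value is None:
            raise RuntimeError("{} is uninitialized".format(self.name))
        return self.value

    def initialized_value(self):
        # make sure this variable is initialized before handing out its value
        if self.value is None:
            self.initialize()
        return self.value


v = ToyVariable("v", lambda: 0.0)
fragile = ToyVariable("fragile", lambda: v.read() + 1)
safe = ToyVariable("safe", lambda: v.initialized_value() + 1)

# worst-case order: the dependents are initialized before v
try:
    fragile.initialize()
except RuntimeError as err:
    print(err)      # v is uninitialized

safe.initialize()
print(safe.value)   # 1.0
```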

## using variables

treat it like any ol' tensor

In [24]:
w = v + 1

assign values using the `.assign`, `.assign_add`, etc methods:

In [28]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    assignment = v.assign_add(1)
    print('v = {}'.format(v.eval()))
    print('w = {}'.format(w.eval()))
    sess.run(assignment)
    print('v = {}'.format(v.eval()))
    print('w = {}'.format(w.eval()))

v = 0.0
w = 1.0
v = 1.0
w = 2.0


most optimizers (the `tf.train.Optimizer` subclasses) are built on ops like these: they update variable values according to some gradient-descent-like algorithm.

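finally, because variables are mutable, it can matter which version of the value you are reading. you can force a re-read after an assignment by calling `read_value` under a control dependency: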

In [32]:
with tf.Session() as sess:
    assignment = v.assign_add(1)
    with tf.control_dependencies([assignment]):
        w = v.read_value()

## sharing variables

there are two ways of passing variables around

1. explicitly: create a `python` variable in a shared `python` scope
1. implicitly: create the variable inside a `tf.variable_scope` context

the second method is offered as a convenience (cf. `tf.layers` and `tf.metrics` for examples of this method in use). these variable scopes play the same role as namespaces in base `python`: you will often want to create variable scopes to differentiate between variables that share similar names. the provided example is weights and biases in different layers of similar structure:

In [33]:
def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable(
        "weights", 
        kernel_shape,
        initializer=tf.random_normal_initializer()
    )
    
    # Create variable named "biases".
    biases = tf.get_variable(
        "biases",
        bias_shape,
        initializer=tf.constant_initializer(0.0)
    )
    
    conv = tf.nn.conv2d(
        input,
        weights,
        strides=[1, 1, 1, 1], 
        padding='SAME'
    )
    
    return tf.nn.relu(conv + biases)

In [34]:
input1 = tf.random_normal([1,10,10,32])
input2 = tf.random_normal([1,20,20,32])

In [35]:
# this will succeed because it's the first time we're
# creating the weights and biases variables:
x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32])

In [36]:
# this will fail because it's the second time:
x = conv_relu(x, kernel_shape=[5, 5, 32, 32], bias_shape=[32])

ValueError: Variable weights already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:

  File "<ipython-input-33-700f8a5e1cfb>", line 6, in conv_relu
    initializer=tf.random_normal_initializer()
  File "<ipython-input-35-2e6dcb6e8031>", line 3, in <module>
    x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32])
  File "/usr/local/lib/python3.5/dist-packages/IPython/core/interactiveshell.py", line 2963, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)


we can fix this by creating the variables within a different scope:

In [40]:
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
        
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

In [41]:
x = my_image_filter(input1)

sometimes we wish to re-use variables created in one scope later on. this can also be done in one of two ways. the first is to re-open a scope of the same name with `reuse=True`:

In [42]:
with tf.variable_scope("model"):
    output1 = my_image_filter(input1)
    
with tf.variable_scope("model", reuse=True):
    output2 = my_image_filter(input2)

the second is by explicitly taking the current scope (as aliased in the context manager expression) and calling the `reuse_variables` method:

In [44]:
with tf.variable_scope("model2") as scope:
    output1 = my_image_filter(input1)
    scope.reuse_variables()
    output2 = my_image_filter(input2)
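
the get-or-reuse rule behind all of this (including the `ValueError` in the traceback above) can be modeled in a few lines of plain python. this is a toy sketch under my own assumptions: `ToyVariableStore` is a hypothetical name, it ignores shapes, initializers and `tf.AUTO_REUSE`, and it is not tensorflow's code.

```python
class ToyVariableStore:
    """Toy model of get_variable: create once, share only when reuse is on."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, full_name, reuse=False):
        exists = full_name in self._vars
        if exists and not reuse:
            # same situation as calling conv_relu twice in one scope
            raise ValueError(
                "Variable {} already exists, disallowed.".format(full_name)
            )
        if not exists and reuse:
            # with reuse on, nothing new may be created
            raise ValueError("Variable {} does not exist.".format(full_name))
        if not exists:
            self._vars[full_name] = object()  # placeholder for the real variable
        return self._vars[full_name]


store = ToyVariableStore()
first = store.get_variable("model/conv1/weights")
shared = store.get_variable("model/conv1/weights", reuse=True)
print(first is shared)  # True

try:
    store.get_variable("model/conv1/weights")  # second creation, no reuse
except ValueError as err:
    print(err)  # Variable model/conv1/weights already exists, disallowed.
```

the scope prefix (`model/conv1/`) is what keeps `conv1` and `conv2` from colliding, and `reuse` is what lets two calls under the same prefix share one object.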

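check out the name here: `model2` is reused as-is, while `conv2` shows up as `conv2_1`. the variables themselves are shared, but the ops built on the second call get a fresh, uniquified name scope: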

In [47]:
output2

<tf.Tensor 'model2/conv2_1/Relu:0' shape=(1, 20, 20, 32) dtype=float32>
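
the `_1` suffix in `conv2_1` looks like the result of op name scopes being made unique each time a scope is entered, while variable names are looked up and reused. a toy sketch of that kind of uniquifier is below. `ToyNameScopes` is a hypothetical class, and unlike the real thing it does not guard against a user explicitly picking a name like `conv2_1`.

```python
from collections import Counter


class ToyNameScopes:
    """Toy uniquifier: first use keeps the name, later uses get _1, _2, ..."""

    def __init__(self):
        self._uses = Counter()

    def unique_name(self, name):
        count = self._uses[name]
        self._uses[name] += 1
        return name if count == 0 else "{}_{}".format(name, count)


names = ToyNameScopes()
print(names.unique_name("model2/conv2"))  # model2/conv2
print(names.unique_name("model2/conv2"))  # model2/conv2_1
print(names.unique_name("model2/conv1"))  # model2/conv1
```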

`tf.variable_scope` also lets you build a context from a previous variable scope object rather than just a name (good idea):

In [48]:
with tf.variable_scope("model3") as scope:
    output1 = my_image_filter(input1)

with tf.variable_scope(scope, reuse=True):
    output2 = my_image_filter(input2)

In [49]:
output2

<tf.Tensor 'model3_1/conv2/Relu:0' shape=(1, 20, 20, 32) dtype=float32>

# summary

this page was short and to the point

+ variables are mutable tensors that can pass state around throughout your program
+ they can be (and are, by default) put into *collections*
    + you may define your own
    + some default collections have particular influence in the greater architecture
        + e.g. every variable in the `tf.GraphKeys.GLOBAL_VARIABLES` collection will be initialized by the `tf.global_variables_initializer()` op
+ speaking of which, variables must be initialized within each independent computation session
    + be careful about dependencies between variables (dependent initialization must happen in order, and `tf.global_variables_initializer()` doesn't guarantee one)
+ variables can be explicitly pushed to different devices (gpu, tpu, etc)
    + care must be taken to make sure they're on the right device
    + some helper functions exist to make this easier
+ variables are used just like tensors
+ variables can be shared
    + explicitly (simplest)
    + implicitly (via variable scopes)
        + this allows reuse of names and more generally readable / reusable code