# TensorFlow Basics

Source: https://github.com/vahidk/EffectiveTensorflow#tf_learn

Assume that we have samples from a curve, say $f(x) = 5x^2 + 3$, and we want to estimate $f(x)$ from these samples. We define a parametric function $g(x, w) = w_0 x^2 + w_1 x + w_2$ of the input $x$ and latent parameters $w$. Our goal is then to find the latent parameters such that $g(x, w) \approx f(x)$. This can be done by minimizing the loss function $L(w) = \sum (f(x) - g(x, w))^2$. Although there is a closed-form solution for this simple problem, we opt for a more general approach that applies to any differentiable function: stochastic gradient descent. We compute the average gradient of $L(w)$ with respect to $w$ over a set of sample points and move in the opposite direction.
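
To make the update rule concrete before bringing in TensorFlow, here is a minimal pure-Python sketch of the same idea. It is a simplification, not the method used in the rest of this notebook: it uses full-batch gradient descent on a fixed grid of points rather than random samples, and it assumes inputs in $[-1, 1]$ so that a fixed step size of 0.5 is stable. The helper names `features`, `predict`, and `mean_gradient` are hypothetical.

```python
def features(x):
    # Basis of the parametric model g(x, w) = w0*x^2 + w1*x + w2.
    return [x * x, x, 1.0]

def predict(x, w):
    return sum(wi * fi for wi, fi in zip(w, features(x)))

def mean_gradient(xs, ys, w):
    # Average over the samples of d/dw (y - g(x, w))^2 = -2 * (y - g(x, w)) * features(x).
    grad = [0.0, 0.0, 0.0]
    for x, y in zip(xs, ys):
        residual = y - predict(x, w)
        for i, fi in enumerate(features(x)):
            grad[i] += -2.0 * residual * fi / len(xs)
    return grad

# Assumption: 100 evenly spaced inputs in [-1, 1] instead of random draws.
xs = [-1.0 + 2.0 * i / 99 for i in range(100)]
ys = [5.0 * x * x + 3.0 for x in xs]

w = [0.0, 0.0, 0.0]
step_size = 0.5  # assumed value, chosen for this input range
for _ in range(500):
    grad = mean_gradient(xs, ys, w)
    # Move against the gradient.
    w = [wi - step_size * gi for wi, gi in zip(w, grad)]

print(w)  # approaches [5, 0, 3]
```

With the wider input range used later in the notebook, the features $x^2$, $x$, and $1$ have very different scales and a single fixed step size works poorly, which is one reason to reach for an adaptive optimizer.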

In [3]:
import numpy as np
import tensorflow as tf

In [4]:
# Placeholders are used to feed values from Python to TensorFlow ops. We define
# two placeholders, one for input feature x, and one for output y.
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

In [5]:
# Assuming we know that the desired function is a polynomial of 2nd degree, we
# allocate a vector of size 3 to hold the coefficients. The variable will be
# automatically initialized with random noise.
w = tf.get_variable("w", shape=[3, 1])

In [16]:
# We define yhat to be our estimate of y.
f = tf.stack([tf.square(x), x, tf.ones_like(x)], 1)
yhat = tf.squeeze(tf.matmul(f, w), 1)
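
The shapes in this cell are easy to lose track of, so the sketch below repeats the same three steps with NumPy, whose `np.stack`, `np.matmul`, and `np.squeeze` behave analogously for this purpose. The sample inputs and the choice of `w` equal to the true coefficients are assumptions made only for illustration.

```python
import numpy as np

# Assumed example inputs; w is set to the true coefficients (5, 0, 3).
x = np.array([-2.0, -1.0, 0.0, 3.0])
w = np.array([[5.0], [0.0], [3.0]])           # shape (3, 1)

# Stacking along axis 1 gives one row [x^2, x, 1] per sample.
features = np.stack([np.square(x), x, np.ones_like(x)], 1)  # shape (4, 3)

# (4, 3) times (3, 1) gives one prediction per row.
column = np.matmul(features, w)               # shape (4, 1)

# Dropping the trailing axis makes yhat line up with y.
yhat = np.squeeze(column, 1)                  # shape (4,)

print(features.shape, column.shape, yhat.shape)
print(yhat)
```

Because `w` holds the true coefficients here, `yhat` equals $5x^2 + 3$ at each input.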



In [7]:
# The loss is the squared L2 distance between our estimate of y and its true
# value (tf.nn.l2_loss returns half the sum of squares). We also add a shrinkage
# term to keep the resulting weights small.
loss = tf.nn.l2_loss(yhat - y) + 0.1 * tf.nn.l2_loss(w)
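
As a quick check on what this expression evaluates to, the following pure-Python sketch reproduces the arithmetic. It relies on the documented definition of `tf.nn.l2_loss`, which is `sum(t ** 2) / 2`; the function names and example numbers are hypothetical.

```python
def l2_loss(values):
    # Half the sum of squares, matching the documented tf.nn.l2_loss formula.
    return sum(v * v for v in values) / 2.0

def regularized_loss(yhat, y, w, shrinkage=0.1):
    # Data term plus the shrinkage (weight decay) term from the cell above.
    residuals = [a - b for a, b in zip(yhat, y)]
    return l2_loss(residuals) + shrinkage * l2_loss(w)

# Assumed toy values: residuals (1, 2) and weights (2, 0, 0).
example = regularized_loss([1.0, 2.0], [0.0, 0.0], [2.0, 0.0, 0.0])
print(example)  # (1 + 4) / 2 + 0.1 * (4 / 2), about 2.7
```

Note that the data term is a sum over samples rather than a mean, so its weight relative to the shrinkage term grows with the batch size.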

In [8]:
# We use the Adam optimizer with learning rate set to 0.1 to minimize the loss.
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)
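
For readers who have not met Adam before, here is a minimal sketch of the textbook update rule for a single scalar parameter. It is not TensorFlow's implementation, which differs in some details; the defaults `beta1=0.9`, `beta2=0.999`, and `epsilon=1e-8` are the commonly cited ones, and `adam_minimize` and the quadratic test objective are hypothetical.

```python
import math

def adam_minimize(grad_fn, w, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=1000):
    m = 0.0  # running average of gradients
    v = 0.0  # running average of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # The step is scaled by the recent gradient magnitude.
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w

# Assumed toy objective (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
print(w_final)  # close to 3
```

Because each step is normalized by the gradient magnitude, the same learning rate behaves similarly for parameters whose gradients differ widely in scale, which suits the badly scaled features $x^2$, $x$, and $1$ in this example.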

In [17]:
def generate_data():
    """
    Generates data from the actual curve i.e. f(x)
    """
    x_val = np.random.uniform(-10.0, 10.0, size=100)
    y_val = 5 * np.square(x_val) + 3
    return x_val, y_val

In [18]:
sess = tf.Session()
# Since we are using variables we first need to initialize them.
sess.run(tf.global_variables_initializer())
for _ in range(1000):
    x_val, y_val = generate_data()
    _, loss_val = sess.run([train_op, loss], {x: x_val, y: y_val})
    # print(loss_val)
print(sess.run([w]))

[array([[  4.98046732e+00],
       [ -9.78034455e-04],
       [  4.13006067e+00]], dtype=float32)]


This is a relatively close approximation to the true parameters $(5, 0, 3)$, although the constant term (4.13 versus 3) is the least accurate.

This is just the tip of the iceberg of what TensorFlow can do. Many problems, such as optimizing large neural networks with millions of parameters, can be implemented efficiently in TensorFlow in just a few lines of code. TensorFlow takes care of scaling across multiple devices and threads, and supports a variety of platforms.

# Difference between tf.Variable and tf.get_variable (TODO)

In [13]:
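# tf.Variable always creates a new variable. A repeated name gets a numeric
# suffix, and the suffix keeps growing if this cell is re-run in the same graph.
v1 = tf.Variable(10, name="name1")
v2 = tf.Variable(10, name="name1")
assert v1 is not v2
print(v1.name)  # name1:0 in a fresh graph
print(v2.name)  # name1_1:0 in a fresh graph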

name1_2:0
name1_3:0
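
The contrast with `tf.get_variable` can be sketched without TensorFlow using a toy registry. This is a hypothetical model, not TensorFlow code: `ToyGraph`, `new_variable`, and `get_or_create` are invented for illustration. In TensorFlow 1.x, `tf.get_variable` by default raises an error when the name already exists and returns the existing variable only when the enclosing variable scope enables reuse; the toy models the get-or-create behavior of `reuse=tf.AUTO_REUSE`.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph's variable bookkeeping."""

    def __init__(self):
        self.used_names = {}  # base name -> number of times it was requested
        self.shared = {}      # name -> variable created via get_or_create

    def unique_name(self, name):
        count = self.used_names.get(name, 0)
        self.used_names[name] = count + 1
        return name if count == 0 else "%s_%d" % (name, count)

    def new_variable(self, value, name):
        # Like tf.Variable: always a new object, with the name made unique.
        return {"name": self.unique_name(name), "value": value}

    def get_or_create(self, name, value):
        # Like tf.get_variable under AUTO_REUSE: look up by name first.
        if name not in self.shared:
            self.shared[name] = self.new_variable(value, name)
        return self.shared[name]

graph = ToyGraph()
v1 = graph.new_variable(10, "name1")
v2 = graph.new_variable(10, "name1")
s1 = graph.get_or_create("shared", 10)
s2 = graph.get_or_create("shared", 99)  # existing variable wins; 99 is ignored

print(v1["name"], v2["name"])  # name1 name1_1
print(s1 is s2)                # True
```

The practical consequence is that name-based lookup lets separate pieces of code share one set of weights, whereas constructing variables directly always allocates new ones.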
