# Basic models in TensorFlow

### Phase 1: Assemble our graph

Step 1: read in data

Step 2: create placeholders for inputs and labels

`tf.placeholder(dtype, shape=None, name=None)`

Step 3: create weight and bias

`tf.get_variable(name, shape=None, dtype=None, initializer=None, ...)`

Step 4: inference

`Y_predicted = w * X + b`

Step 5: specify loss function

* Linear regression (L2 loss): 
```
loss = tf.square(Y - Y_predicted, name='loss')
```

* Logistic regression (cross-entropy loss): 
```
# TF 1.x requires keyword arguments here; positional arguments raise an error
entropy = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(entropy)
```
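To make the cross-entropy loss concrete, here is a minimal plain-Python sketch of what the softmax cross-entropy and the mean over a batch compute. It is an illustration only, not TensorFlow's implementation; the function names `softmax_cross_entropy` and `mean_loss` are hypothetical, and labels are assumed to be one-hot (or otherwise sum to 1).

```python
import math

def softmax_cross_entropy(labels, logits):
    """Cross-entropy between a label distribution and softmax(logits) for one example."""
    # Subtract the max logit for numerical stability; softmax is shift-invariant.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log softmax(z)_i = z_i - logsumexp(z)
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(labels, log_probs))

def mean_loss(batch_labels, batch_logits):
    """Average the per-example entropies, mirroring the reduce_mean step."""
    entropies = [softmax_cross_entropy(y, z)
                 for y, z in zip(batch_labels, batch_logits)]
    return sum(entropies) / len(entropies)
```

With equal logits over two classes the model is maximally unsure, so the loss for a one-hot label equals `log(2)`.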

Step 6: create optimizer

`optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)`

### Phase 2: Train our model

Step 1: initialize variables

`sess.run(tf.global_variables_initializer())`

Step 2: run optimizer

`_, loss_ = sess.run([optimizer, loss], feed_dict={X: x, Y: y})`
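The two phases can be traced end to end without TensorFlow. The sketch below is a plain-Python stand-in for the linear regression case, assuming scalar inputs and one sample per update step. The gradients are written out by hand here, whereas TensorFlow's optimizer derives them automatically; `train_linear_regression` is a hypothetical name.

```python
def train_linear_regression(xs, ys, learning_rate=0.001, n_epochs=100):
    """Fit y = w * x + b with per-sample gradient descent on the squared loss."""
    w, b = 0.0, 0.0  # Phase 1, step 3: weight and bias, initialized to zero
    mean_loss = 0.0
    for _ in range(n_epochs):
        total_loss = 0.0
        for x, y in zip(xs, ys):          # one (x, y) pair per step, like a feed_dict
            y_predicted = w * x + b       # Phase 1, step 4: inference
            error = y - y_predicted
            total_loss += error ** 2      # Phase 1, step 5: squared (L2) loss
            # Phase 2, step 2: d(loss)/dw = -2 * error * x, d(loss)/db = -2 * error
            w += learning_rate * 2 * error * x
            b += learning_rate * 2 * error
        mean_loss = total_loss / len(xs)
    return w, b, mean_loss

# Noise-free data generated from y = 2x + 1
w, b, final_loss = train_linear_regression(
    [0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0],
    learning_rate=0.01, n_epochs=2000)
```

On this noise-free data the fitted `w` and `b` approach 2 and 1.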



### Write log files using a FileWriter

`writer = tf.summary.FileWriter('./graphs/linear_reg', tf.get_default_graph())`

In the terminal:
```
$ python <filename>.py
$ tensorboard --logdir='./graphs/linear_reg'
```

### TF control flow
E.g.: `tf.cond(pred, fn1, fn2, name=None)`
```
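def huber_loss(labels, predictions, delta=14.0):
    # tf.cond expects a scalar predicate, so labels and predictions must be scalars here
    residual = tf.abs(labels - predictions)
    # quadratic branch, used when the residual is small
    def f1(): return 0.5 * tf.square(residual)
    # linear branch, used when the residual is large
    def f2(): return delta * residual - 0.5 * tf.square(delta)
    return tf.cond(residual < delta, f1, f2)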
```
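To check the piecewise formula by hand, the same Huber loss can be written for a single pair of numbers in plain Python, where an ordinary `if` plays the role of `tf.cond`. This is a sketch for checking values, not a replacement for the graph version.

```python
def huber_loss(label, prediction, delta=14.0):
    """Huber loss for one scalar label/prediction pair."""
    residual = abs(label - prediction)
    if residual < delta:
        # small residual: quadratic, like the L2 loss
        return 0.5 * residual ** 2
    # large residual: linear, so outliers are penalized less than with L2
    return delta * residual - 0.5 * delta ** 2
```

The two branches agree at `residual == delta` (both give `0.5 * delta ** 2`), so the loss is continuous where it switches from quadratic to linear.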

### tf.data
Instead of building inference on placeholders and feeding in data later, build it directly on the data.

```
tf.data.Dataset
tf.data.Iterator
```

Store data in a `tf.data.Dataset`:

```
tf.data.Dataset.from_tensor_slices((features, labels))
tf.data.Dataset.from_generator(gen, output_types, output_shapes)
```
You can also create a dataset from files:
```
tf.data.TextLineDataset(filenames)
tf.data.FixedLengthRecordDataset(filenames, record_bytes)
tf.data.TFRecordDataset(filenames)
```
Iterate over the data:
```python
iterator = dataset.make_one_shot_iterator()
# Iterates through the dataset exactly once. No initialization needed.
iterator = dataset.make_initializable_iterator()
# Iterates through the dataset as many times as we want. Must be re-initialized at the start of each epoch.
```
Example:
```python
iterator = dataset.make_initializable_iterator()
X, Y = iterator.get_next()
...
for i in range(100):
    sess.run(iterator.initializer)
    total_loss = 0
    try:
        while True:
            # run one training step on the next element and accumulate its loss
            _, l = sess.run([optimizer, loss])
            total_loss += l
    except tf.errors.OutOfRangeError:
        pass
```
Handling data in TensorFlow:
```python
dataset = dataset.shuffle(1000) # shuffle using a buffer of 1000 elements
dataset = dataset.repeat(100) # repeat this dataset 100 times
dataset = dataset.batch(128) # combine 128 consecutive elements of this dataset into one batch
dataset = dataset.map(lambda x: tf.one_hot(x, 10)) # convert each element to a one-hot vector of depth 10
```
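The argument to `shuffle` is a buffer size, not a sample count, which is easy to misread. The plain-Python generators below are a simplified model of these transformations, written only to make the semantics visible; they are assumptions about behavior at sketch level, not TensorFlow's code, and the names `shuffle`, `batch`, and `one_hot` are local helpers.

```python
import random

def shuffle(elements, buffer_size, seed=None):
    """Yield every element once, drawing randomly from a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for element in elements:
        if len(buffer) < buffer_size:
            buffer.append(element)      # fill the buffer first
        else:
            index = rng.randrange(buffer_size)
            yield buffer[index]         # emit a random buffered element...
            buffer[index] = element     # ...and put the new one in its slot
    rng.shuffle(buffer)                 # input exhausted: drain what is left
    yield from buffer

def batch(elements, batch_size):
    """Group consecutive elements; the last batch may be smaller."""
    current = []
    for element in elements:
        current.append(element)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current

def one_hot(index, depth):
    """Vector of length depth with 1.0 at position index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(depth)]
```

In this model a buffer of size 1 leaves the order unchanged, and the first element emitted always comes from the first `buffer_size` inputs, which is why a buffer much smaller than the dataset gives only a local shuffle.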

Should we always use tf.data?
* For prototyping, `feed_dict` can be faster and easier to write (more Pythonic).
* `tf.data` is tricky to use when you have complicated preprocessing or multiple data sources.
* NLP data is normally just a sequence of integers. In this case, transferring the data to the GPU is quick, so the speedup from `tf.data` isn't that large.


How to separate train and test data?
```python
# define iterator
iterator = tf.data.Iterator.from_structure(train_data.output_types, train_data.output_shapes)
img, label = iterator.get_next()
# define initializers
train_init = iterator.make_initializer(train_data)
test_init = iterator.make_initializer(test_data)

with tf.Session() as sess:
    ...
    for i in range(n_epochs):
        # train the model
        sess.run(train_init) # run train data initializer
        try:
            while True:
                _, l = sess.run([optimizer, loss]) # fetch optimizer/loss
        except tf.errors.OutOfRangeError:
            pass
        # test the model
        sess.run(test_init) # run test data initializer
        try:
            while True:
                sess.run(accuracy) # fetch accuracy
        except tf.errors.OutOfRangeError:
            pass                
```

### Optimizers
The session looks at all `trainable` variables that the loss depends on and updates them.
```
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)
_, l = sess.run([optimizer, loss], feed_dict={X: x, Y: y})
```
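The idea that only trainable variables are updated can be illustrated with a small plain-Python sketch. It swaps in central-difference numerical gradients for TensorFlow's automatic differentiation, purely for illustration; `gradient_descent_step` and the `trainable` mapping are hypothetical names.

```python
def gradient_descent_step(variables, trainable, loss_fn,
                          learning_rate=0.001, eps=1e-6):
    """Return a copy of variables after one gradient descent update.

    variables: dict of name -> float
    trainable: dict of name -> bool (names not listed count as trainable)
    loss_fn:   function taking such a dict and returning a float
    """
    updated = dict(variables)
    for name, value in variables.items():
        if not trainable.get(name, True):
            continue  # non-trainable variables (e.g. a step counter) are left alone
        plus = dict(variables, **{name: value + eps})
        minus = dict(variables, **{name: value - eps})
        # central-difference estimate of d(loss)/d(variable)
        grad = (loss_fn(plus) - loss_fn(minus)) / (2 * eps)
        updated[name] = value - learning_rate * grad
    return updated

# One sample (x=1, y=3) with squared loss; global_step is marked non-trainable.
state = {"w": 0.0, "b": 0.0, "global_step": 0.0}
new_state = gradient_descent_step(
    state, {"global_step": False},
    lambda v: (3.0 - (v["w"] * 1.0 + v["b"])) ** 2,
    learning_rate=0.1)
```

Here `w` and `b` each move by `0.1 * 6` toward the target, while `global_step` keeps its value because it is excluded from the update.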