## Logistic Regression in TensorFlow

Dataset: MNIST Database

Each image is a 28x28 array, flattened into a 1-D tensor of size 784.
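
To make the flattening step concrete, here is a minimal plain-Python sketch. It assumes row-major order (the order a NumPy-style reshape would produce); `flatten_image` is a hypothetical helper for illustration, not part of `utils`.

```python
def flatten_image(image):
    """Flatten a 2-D list of pixel rows into a 1-D list in row-major order."""
    return [pixel for row in image for pixel in row]


# A dummy 28x28 "image" whose pixel value encodes its position (row * 28 + col).
image = [[row * 28 + col for col in range(28)] for row in range(28)]
flat = flatten_image(image)
print(len(flat))  # 784
```

With this ordering, pixel `(row, col)` of the original image ends up at index `row * 28 + col` of the flattened vector.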

X: image of a handwritten digit
Y: value of the digit

Task: recognize the digit in the image

Model: Y_predicted = softmax(X * w + b)

Loss function (cross-entropy loss): -log(Y_predicted), taken at the probability assigned to the true class
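
The model and loss above can be illustrated for a single example with a small plain-Python sketch. The three-class scores and the one-hot label are made-up values, and `softmax` and `cross_entropy` are illustrative helpers rather than the TensorFlow ops used later.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot_label):
    """-sum(y * log(p)); with a one-hot label this is -log(p[true class])."""
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs) if y > 0)


scores = [2.0, 1.0, 0.1]  # hypothetical values of X * w + b for three classes
probs = softmax(scores)
label = [1, 0, 0]         # the true class is class 0
loss = cross_entropy(probs, label)
print(probs, loss)
```

The loss shrinks toward 0 as the probability of the true class approaches 1, and grows without bound as that probability approaches 0.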

1. Process the data

In [1]:
import utils
import tensorflow as tf
import time
learning_rate = 0.01
batch_size = 128
n_epochs = 30
n_train = 60000
n_test = 10000

mnist_folder = 'data/mnist'
utils.download_mnist(mnist_folder) 
train, val, test = utils.read_mnist(mnist_folder, flatten=True)

train_data = tf.data.Dataset.from_tensor_slices(train)
train_data = train_data.shuffle(10000)  # shuffle using a buffer of 10,000 elements
train_data = train_data.batch(batch_size)

test_data = tf.data.Dataset.from_tensor_slices(test)
test_data = test_data.batch(batch_size)

data/mnist/train-images-idx3-ubyte.gz already exists
data/mnist/train-labels-idx1-ubyte.gz already exists
data/mnist/t10k-images-idx3-ubyte.gz already exists
data/mnist/t10k-labels-idx1-ubyte.gz already exists
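
The `batch(batch_size)` calls in the cell above split each dataset into consecutive batches. The following plain-Python sketch shows the arithmetic, assuming the final partial batch is kept (the default, `drop_remainder=False`); `batch_sizes` is a hypothetical helper.

```python
def batch_sizes(n_examples, batch_size):
    """Sizes of consecutive batches when the final partial batch is kept."""
    full, remainder = divmod(n_examples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


# n_test = 10000 and batch_size = 128, as set in the cell above.
sizes = batch_sizes(10000, 128)
print(len(sizes), sizes[-1])  # 79 16
```

Because the last batch is smaller than the others, counting correct predictions per batch and dividing by the total number of examples at the end gives an exact accuracy, whereas averaging per-batch accuracies would slightly over-weight the last batch.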


2. Create an iterator and decide how to initialize it.

In [2]:
iterator = tf.data.Iterator.from_structure(train_data.output_types, train_data.output_shapes)
img, label = iterator.get_next()
train_init = iterator.make_initializer(train_data)
test_init = iterator.make_initializer(test_data)
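
The idea behind the cell above is that one iterator feeds the model and can be pointed at either the training or the test data. As a loose plain-Python analogy (not how TensorFlow implements it), the toy class below plays that role; `ReinitializableIterator` and the batch names are hypothetical.

```python
class ReinitializableIterator:
    """Toy stand-in: one iterator object that can be pointed at different datasets."""

    def __init__(self):
        self._source = None

    def initialize(self, dataset):
        # Analogous to running train_init or test_init: restart from the first element.
        self._source = iter(dataset)

    def get_next(self):
        # Raises StopIteration when exhausted, playing the role of OutOfRangeError.
        return next(self._source)


iterator = ReinitializableIterator()
iterator.initialize(["train_batch_0", "train_batch_1"])
first = iterator.get_next()

# Switching datasets does not require rebuilding anything that calls get_next().
iterator.initialize(["test_batch_0"])
switched = iterator.get_next()
print(first, switched)
```

In the TensorFlow version, `img` and `label` are defined once from `iterator.get_next()`, so the same model graph serves both training and testing.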

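3. Create the model parameters w and b, with shapes chosen to match the size of img. w is initialized from a normal distribution with mean 0 and standard deviation 0.01, and b is initialized to 0.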

In [3]:
w = tf.get_variable(name='weight', shape=(784,10), initializer=tf.random_normal_initializer(0,0.01))
b = tf.get_variable(name='bias', shape=(1,10), initializer=tf.zeros_initializer())
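
For readers who want to see the initialization without TensorFlow, here is a rough standard-library sketch of the same shapes and distributions. The seed is an arbitrary choice made only so the sketch is repeatable.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed, only for repeatability of this sketch

n_features, n_classes = 784, 10

# w: 784 x 10, drawn from a normal distribution with mean 0 and stddev 0.01.
w = [[random.gauss(0.0, 0.01) for _ in range(n_classes)] for _ in range(n_features)]

# b: 1 x 10, all zeros; the single row is added to every row of the logits.
b = [[0.0] * n_classes]

all_weights = [value for row in w for value in row]
print(statistics.mean(all_weights), statistics.pstdev(all_weights))
```

The printed sample mean should be close to 0 and the sample standard deviation close to 0.01, which is why the initial logits are all near zero and the initial predictions are close to uniform.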

4. Define the logits and the loss function. The softmax is applied inside the cross-entropy op.

In [4]:
logits = tf.matmul(img,w) + b

entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=label, name='entropy')
loss = tf.reduce_mean(entropy, name='loss')
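
One reason to compute the loss directly from logits, rather than taking the log of a softmax output, is numerical stability. The sketch below uses the log-sum-exp trick in plain Python; it illustrates the idea and is not TensorFlow's actual implementation. The two-class logits and labels are made-up values.

```python
import math


def softmax_cross_entropy(logits, one_hot):
    """Cross-entropy computed from logits via log-sum-exp (labels must sum to 1)."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
    # -log(softmax(z)[k]) equals log_sum_exp - z[k]
    return sum(y * (log_sum_exp - z) for y, z in zip(one_hot, logits))


batch_logits = [[0.0, 0.0], [1000.0, 0.0]]
batch_labels = [[1, 0], [0, 1]]

per_example = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
mean_loss = sum(per_example) / len(per_example)  # same role as tf.reduce_mean
print(per_example, mean_loss)
```

In the second example the softmax probability of the true class underflows to 0, so a naive `-log(p)` would fail, while the log-sum-exp form still returns a finite loss.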

5. Use the Adam optimizer.

In [5]:
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
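
To give a feel for what an Adam step does, here is a sketch of the textbook update rule applied to a one-dimensional toy problem. It assumes the commonly cited defaults (beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8); TensorFlow's implementation places epsilon slightly differently, so this illustrates the algorithm rather than reproducing `tf.train.AdamOptimizer`. The function names are hypothetical.

```python
import math


def adam_minimize(grad_fn, x, learning_rate=0.01, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=1):
    """Run `steps` textbook Adam updates on a scalar parameter x."""
    m, v = 0.0, 0.0  # running averages of the gradient and squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return x


def grad_of_square(x):
    """Gradient of f(x) = x^2."""
    return 2.0 * x


x_after_one = adam_minimize(grad_of_square, 1.0, steps=1)
x_after_many = adam_minimize(grad_of_square, 1.0, steps=500)
print(x_after_one, x_after_many)
```

On the first step the bias-corrected ratio is about 1 in magnitude, so the parameter moves by roughly the learning rate regardless of how large the gradient is.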

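6. Define the prediction op, check whether each prediction is correct, and define the op used to compute accuracy. Note that `accuracy` counts the correct predictions in a batch; the division by the number of test examples happens after the test loop.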

In [6]:
preds = tf.nn.softmax(logits)
correct_preds = tf.equal(tf.argmax(preds, 1), tf.argmax(label, 1))
accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))
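
The counting logic in the cell above can be mirrored in plain Python. The three predictions and one-hot labels below are made-up values, and `argmax` and `count_correct` are illustrative helpers.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def count_correct(pred_probs, one_hot_labels):
    """Number of examples whose most probable class matches the label."""
    return sum(1 for p, y in zip(pred_probs, one_hot_labels)
               if argmax(p) == argmax(y))


pred_probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
labels = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]

correct = count_correct(pred_probs, labels)
print(correct, correct / len(labels))
```

Since softmax preserves the ordering of the logits, taking the argmax of the logits directly would select the same class.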

7. Now write the graph out for visualization in TensorBoard and define what the session runs.

In [7]:
writer = tf.summary.FileWriter('./graphs/logreg', tf.get_default_graph())

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
if 'session' in locals() and session is not None:
    print('Close interactive session')
    session.close()
with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
    start_time = time.time()
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    # Train
    for i in range(n_epochs):
        sess.run(train_init) # drawing samples from train_data
        total_loss = 0
        n_batches = 0
        try:
            while True:
                _, l = sess.run([optimizer, loss])
                total_loss += l
                n_batches += 1
        except tf.errors.OutOfRangeError:
            pass
        print('Average loss epoch {0}: {1}'.format(i, total_loss/n_batches))
    print('Total time: {0} seconds'.format(time.time() - start_time))

    # test the model
    sess.run(test_init) # drawing samples from test_data
    total_correct_preds = 0
    try:
        while True:
            accuracy_batch = sess.run(accuracy)
            total_correct_preds += accuracy_batch
    except tf.errors.OutOfRangeError:
        pass

    print('Accuracy {0}'.format(total_correct_preds/n_test))
writer.close()

Average loss epoch 0: 0.3655088067747826
Average loss epoch 1: 0.2940110842155856
Average loss epoch 2: 0.286005012431117
Average loss epoch 3: 0.27987676671771117
Average loss epoch 4: 0.27525817329107327
Average loss epoch 5: 0.27111212872835094
Average loss epoch 6: 0.26818984052123024
Average loss epoch 7: 0.26815499448499014
Average loss epoch 8: 0.266046321981175
Average loss epoch 9: 0.26476025451407875
Average loss epoch 10: 0.2631033247293428
Average loss epoch 11: 0.2619289387623931
Average loss epoch 12: 0.26319200791591824
Average loss epoch 13: 0.25912395159518997
Average loss epoch 14: 0.2600291490208271
Average loss epoch 15: 0.26063522691296975
Average loss epoch 16: 0.25603776347152024
Average loss epoch 17: 0.2581001413596231
Average loss epoch 18: 0.25983193417967754
Average loss epoch 19: 0.2560788070912971
Average loss epoch 20: 0.2563442733918512
Average loss epoch 21: 0.25523489374060965
Average loss epoch 22: 0.2543823371619679
Average loss epoch 23: 0.255280771