# Using MoXing to Build a Handwritten Digit Image Recognition Application

This document describes how to use MoXing to train and test a handwritten digit image recognition model.

### 1. Prepare the Data

Download the MNIST dataset and upload it to an OBS bucket, as follows:

**Step 1**  Download the MNIST dataset. The dataset files are:
- t10k-images-idx3-ubyte.gz: validation set, 10,000 samples. <a href = "https://dls-obs.obs.cn-north-1.myhwclouds.com/mnist_example/mnist_data/t10k-images-idx3-ubyte.gz">Download</a>
- t10k-labels-idx1-ubyte.gz: validation set labels, the class labels of the 10,000 samples. <a href = "https://dls-obs.obs.cn-north-1.myhwclouds.com/mnist_example/mnist_data/t10k-labels-idx1-ubyte.gz">Download</a>
- train-images-idx3-ubyte.gz: training set, 60,000 samples. <a href = "https://dls-obs.obs.cn-north-1.myhwclouds.com/mnist_example/mnist_data/train-images-idx3-ubyte.gz">Download</a>
- train-labels-idx1-ubyte.gz: training set labels, the class labels of the 60,000 samples. <a href = "https://dls-obs.obs.cn-north-1.myhwclouds.com/mnist_example/mnist_data/train-labels-idx1-ubyte.gz">Download</a>

The .gz files do not need to be decompressed. Following the <a href = "https://support.huaweicloud.com/usermanual-dls/dls_01_0040.html">"Uploading Service Data"</a> section, upload each file to a Huawei Cloud OBS bucket (this document assumes the bucket path s3://zzy/zzy/data/mnist/).
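
Before uploading, it can be worth confirming that each download is intact. The following is a minimal sketch, not part of the MoXing workflow, that reads the header of a gzip-compressed IDX file with the Python standard library. The helper name `read_idx_header` is hypothetical; the magic numbers are those of the standard MNIST IDX format.

```python
import gzip
import struct

IMAGE_MAGIC = 2051  # IDX3 file of unsigned bytes (images)
LABEL_MAGIC = 2049  # IDX1 file of unsigned bytes (labels)


def read_idx_header(path):
    """Return (magic, sample_count) read from a gzip-compressed IDX file.

    Hypothetical helper for a local sanity check before upload.
    """
    with gzip.open(path, 'rb') as f:
        # The header starts with two big-endian unsigned 32-bit integers.
        magic, count = struct.unpack('>II', f.read(8))
    return magic, count
```

For example, `read_idx_header('train-images-idx3-ubyte.gz')` should report the image magic number and the 60,000 samples listed above if the file downloaded completely.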

### 2. Train the Model

Import MoXing's TensorFlow module, `moxing.tensorflow`.

In [1]:
# __future__ imports must be the first statements in the cell
from __future__ import print_function
from __future__ import unicode_literals
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import moxing.tensorflow as mox
import os

**Note 1**  The second argument of `tf.flags.DEFINE_string('data_url', 's3://zzy/zzy/data/mnist', 'Dir of dataset')` is the dataset path. The second argument of `tf.flags.DEFINE_string('train_url', 's3://obs-dls-mnist-example/log/', 'Train Url')` is the path where the logs and the generated model are stored.

In [2]:
tf.flags.DEFINE_string('data_url', 's3://zzy/zzy/data/mnist', 'Dir of dataset')
tf.flags.DEFINE_string('train_url', 's3://obs-dls-mnist-example/log/', 'Train Url')

flags = tf.flags.FLAGS

work_directory = flags.data_url
filenames = ['train-images-idx3-ubyte.gz','train-labels-idx1-ubyte.gz','t10k-images-idx3-ubyte.gz',
             't10k-labels-idx1-ubyte.gz']

for filename in filenames:
  filepath = os.path.join(work_directory, filename)
  if not mox.file.exists(filepath):
    raise ValueError('MNIST dataset file %s not found in %s' % (filepath, work_directory))

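The training `main` function consists of three parts: input definition, model definition, and execution.

1) Input function: `input_fn(run_mode, **kwargs)`. Write it to suit your own input data. In this example, it draws batches from the dataset through a generator.


2) Model definition: `def model_fn(inputs, run_mode, **kwargs)` defines the model structure and returns a `mox.ModelSpec()`. Which fields the returned `mox.ModelSpec` must contain depends on the run mode: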

- For `run_mode == ModeKeys.TRAIN`: `loss` is required.
- For `run_mode == ModeKeys.EVAL`: `log_info` is required.
- For `run_mode == ModeKeys.PREDICT`: `output_info` is required.
- For `run_mode == ModeKeys.EXPORT`: `export_spec` is required.
  

3) Run training: `mox.run()`. During training you can specify optimizer settings, the training batch size, and other options. The parameters are:


- Input function, `input_fn`: An input_fn defined by the user. Accepts TFRecord or Python data. Returns a list of input tensors.
- Model function, `model_fn`: A model_fn defined by the user. Returns `mox.ModelSpec`.
- Optimizer definition, `optimizer_fn`: An optimizer_fn defined by the user. Returns an optimizer.
- Run mode, `run_mode`: Only takes `mox.ModeKeys.TRAIN`, `mox.ModeKeys.EVAL`, or `mox.ModeKeys.PREDICT`.
- Batch size, `batch_size`: Mini-batch size.
- Automatic batching, `auto_batch`: If True, an extra dimension of batch_size is expanded to the first dimension of the return value from `get_split`. Defaults to True.
- Log and checkpoint location, `log_dir`: The directory in which summaries and checkpoints are saved.
- Maximum number of steps, `max_number_of_steps`: Maximum steps for each worker.
- Log printing, `log_every_n_steps`: Step period at which logs are printed to standard output.
- Model export, `export_model`: True or False. Whether to export the model after running the job. (The example in this document passes `mox.ExportKeys.TF_SERVING`.)
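
To make the interplay between the generator-style input and the step-related parameters concrete, here is a minimal pure-Python sketch. It is an illustration only and not how MoXing implements `mox.run()`: `batch_generator` and `run_steps` are hypothetical names, and logging on every step divisible by `log_every_n_steps` is an assumption about the schedule.

```python
import itertools


def batch_generator(samples, batch_size):
    """Yield mini-batches forever, wrapping around the sample list."""
    cycle = itertools.cycle(samples)
    while True:
        yield [next(cycle) for _ in range(batch_size)]


def run_steps(batches, step_fn, max_number_of_steps, log_every_n_steps):
    """Apply step_fn to at most max_number_of_steps batches.

    Returns the (step, value) pairs recorded every log_every_n_steps steps.
    """
    logged = []
    limited = itertools.islice(batches, max_number_of_steps)
    for step, batch in enumerate(limited, start=1):
        value = step_fn(batch)
        if step % log_every_n_steps == 0:
            logged.append((step, value))
    return logged


# Toy run: 7 samples, batches of 3, stop after 6 steps, log every 2 steps.
# The built-in sum stands in for a real training step.
log = run_steps(batch_generator(list(range(7)), 3), sum, 6, 2)
```

The generator never terminates by itself, so the step limit is what ends the run. The training example in this document follows the same pattern with `max_number_of_steps=1000`.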



In [3]:
def main(*args):
  mnist = input_data.read_data_sets(flags.data_url, one_hot=True)


  # define the input dataset, return image and label
  def input_fn(run_mode, **kwargs):
    def gen():
      while True:
        yield mnist.train.next_batch(50)
    ds = tf.data.Dataset.from_generator(
        gen, output_types=(tf.float32, tf.int64),
        output_shapes=(tf.TensorShape([None, 784]), tf.TensorShape([None, 10])))
    return ds.make_one_shot_iterator().get_next()


  # define the model for training or evaluation.
  def model_fn(inputs, run_mode, **kwargs):
    x, y_ = inputs
    W = tf.get_variable(name='W', initializer=tf.zeros([784, 10]))
    b = tf.get_variable(name='b', initializer=tf.zeros([10]))
    y = tf.matmul(x, W) + b
    cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
    predictions = tf.argmax(y, 1)
    correct_predictions = tf.equal(predictions, tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
    export_spec = mox.ExportSpec(inputs_dict={'images': x}, outputs_dict={'predictions': predictions}, version='model')
    return mox.ModelSpec(loss=cross_entropy, log_info={'loss': cross_entropy, 'accuracy': accuracy},
                         export_spec=export_spec)


  mox.run(input_fn=input_fn,
          model_fn=model_fn,
          optimizer_fn=mox.get_optimizer_fn('sgd', learning_rate=0.01),
          run_mode=mox.ModeKeys.TRAIN,
          batch_size=50,
          auto_batch=False,
          log_dir=flags.train_url,
          max_number_of_steps=1000,
          log_every_n_steps=10,
          export_model=mox.ExportKeys.TF_SERVING)

if __name__ == '__main__':
  tf.app.run(main=main)

Extracting s3://zzy/zzy/data/mnist/train-images-idx3-ubyte.gz
Extracting s3://zzy/zzy/data/mnist/train-labels-idx1-ubyte.gz
Extracting s3://zzy/zzy/data/mnist/t10k-images-idx3-ubyte.gz
Extracting s3://zzy/zzy/data/mnist/t10k-labels-idx1-ubyte.gz
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Restoring parameters from s3://obs-dls-mnist-example/log/model.ckpt-1000
INFO:tensorflow:Running will end at step: 1000
INFO:tensorflow:Saving checkpoints for 1000 into s3://obs-dls-mnist-example/log/model.ckpt.
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:Restoring parameters from s3://obs-dls-mnist-example/log/model.ckpt-1000
INFO:tensorflow:SavedModel written to: s3://obs-dls-mnist-example/log/model/saved_model.pb


SystemExit: 

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
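
The quantities that `model_fn` reports, the softmax cross-entropy loss and the arg-max prediction of the linear model `y = xW + b`, can be reproduced for a single sample in plain Python. This is a sketch for intuition, not the TensorFlow implementation, and the helper names are hypothetical.

```python
import math


def logits(x, W, b):
    """Compute y = xW + b for one flattened image x."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def softmax_cross_entropy(label_one_hot, y):
    """Cross entropy between a one-hot label and softmax(y).

    Uses the log-sum-exp trick so large logits do not overflow.
    """
    m = max(y)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in y))
    return -sum(t * (v - log_sum_exp) for t, v in zip(label_one_hot, y))


def argmax(values):
    """Index of the largest value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda k: values[k])


# With W and b initialized to zeros, as in model_fn, all 10 classes are
# equally likely, so the loss before any training step is ln(10).
zero_W = [[0.0] * 10 for _ in range(784)]
zero_b = [0.0] * 10
label_seven = [1.0 if k == 7 else 0.0 for k in range(10)]
initial_loss = softmax_cross_entropy(label_seven,
                                     logits([0.5] * 784, zero_W, zero_b))
```

A loss that starts near ln(10) ≈ 2.303 and then decreases is therefore a quick check that training is making progress.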


### 3. Prediction



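Building on the training above, the trained model can be used directly for prediction, for example to recognize a digit image stored in an OBS bucket (assumed location: 's3://obs-dls-mnist-example/7.jpg'). `input_fn` applies simple preprocessing to the input image to produce the input tensor the network accepts, and `model_fn` defines what is predicted. An output-handling function, `output_fn`, must also be defined; in this example it prints the result.

The following parameters must also be passed to `mox.run()`: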
  
- Output function, `output_fn`: A callback that receives the results from sess.run as its argument.
- Model loading location, `checkpoint_path`: Directory or file path of the checkpoint to restore when `run_mode` is 'evaluation'. Not used when `run_mode` is 'train'. The example in this document also passes it in PREDICT mode.
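
As a minimal sketch of the `output_fn` callback described above, the variant below collects results instead of printing them. It assumes, based on how the example in this document iterates over `outputs`, that the callback receives a list of dicts keyed by the names used in `output_info`. The factory name `make_collecting_output_fn` is hypothetical.

```python
def make_collecting_output_fn(key, sink):
    """Build an output_fn that appends output[key] of every result to sink."""
    def output_fn(outputs):
        for output in outputs:
            sink.append(output[key])
    return output_fn


collected = []
output_fn = make_collecting_output_fn('predict', collected)

# Simulated results in the assumed shape: one dict per prediction step.
output_fn([{'predict': [7]}])
```

Collecting into a list makes the predictions available to later code, which is convenient when more than one image is processed.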

In [7]:
def predict(*args):
  def input_fn(run_mode, **kwargs):
    image = tf.read_file('s3://obs-dls-mnist-example/7.jpg')
    img = tf.image.decode_jpeg(image, channels=1)
    img = tf.image.resize_images(img, [28, 28], 0)
    img = tf.reshape(img, [784])
    return img

  def model_fn(inputs, run_mode, **kwargs):
    x = inputs
    W1 = tf.get_variable(name='W', initializer=tf.zeros([784, 10]))
    b1 = tf.get_variable(name='b', initializer=tf.zeros([10]))
    y = tf.matmul(x, W1) + b1
    predictions = tf.argmax(y, 1)
    return mox.ModelSpec(output_info={'predict': predictions})

  def output_fn(outputs):
    for output in outputs:
      result = output['predict']
      print("The result：",result)

  mox.run(input_fn=input_fn,
          model_fn=model_fn,
          output_fn=output_fn,
          run_mode=mox.ModeKeys.PREDICT,
          batch_size=1,
          auto_batch=False,
          max_number_of_steps=1,
          output_every_n_steps=1,
          checkpoint_path=flags.train_url)
predict()

INFO:tensorflow:Restoring parameters from s3://obs-dls-mnist-example/log/model.ckpt-1000
The result： [7]
INFO:tensorflow:	[1 examples]
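
The prediction path used above (flatten the image, compute `xW + b`, take the arg max) can be traced on a toy input in plain Python. This is a sketch under simplifying assumptions: a 2x2 "image" and 3 classes replace the 28x28 input and 10 classes, the weights are invented for illustration, and the helper names are hypothetical.

```python
def flatten(image):
    """Flatten a 2-D grid row by row into a single list of pixels."""
    return [pixel for row in image for pixel in row]


def predict_batch(batch, W, b):
    """Return the arg-max class index for each flattened image in batch."""
    results = []
    for x in batch:
        scores = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
                  for j in range(len(b))]
        results.append(max(range(len(scores)), key=scores.__getitem__))
    return results


# Toy weights: only the top-left pixel carries evidence, for class 1.
W = [[0.0, 5.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
b = [0.0, 0.0, 1.0]

lit = flatten([[1.0, 0.0], [0.0, 0.0]])    # top-left pixel on
blank = flatten([[0.0, 0.0], [0.0, 0.0]])  # all pixels off
predictions = predict_batch([lit, blank], W, b)
```

For the blank image only the bias contributes, so the class with the largest bias is chosen.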


The prediction log above shows the result: the image is recognized as the digit 7.



For more information, see <a href ="https://github.com/huaweicloud/dls-example/blob/master/Using%20MoXing%20to%20Create%20a%20MNIST%20Dataset%20Recognition%20Application/README.md">“MNIST_EXAMPLE”</a>.



