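# Implementing the LSTM Algorithm with MindSpore
This experiment builds an LSTM model with MindSpore and applies it to sentiment classification of IMDB movie reviews.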
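## 1 Objectives
1. Understand the LSTM algorithm through a hands-on experiment.

2. Implement the LSTM algorithm in MindSpore.
## 2 How the LSTM Algorithm Works
The four function layers of an LSTM are described below.

(1) First function layer: the forget gate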
    ![jupyter](./Figures/fig001.png)
 
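As time passes, some of the information in the LSTM cell state from the previous time step may become outdated. To keep too much memory from interfering with how the network processes the current input, the cell should selectively forget some components of the previous cell state. This is the job of the forget gate.

Each time a new input arrives, the LSTM first decides which earlier memories to forget, based on the new input and the previous output. The input and the previous output are concatenated into a single vector, passed through a sigmoid layer, and the result is multiplied element-wise with the cell state. Because the sigmoid function squashes any input into the interval (0, 1), the behavior of the gate is intuitive: if a component of the combined vector is close to 0 after the sigmoid layer, the corresponding component of the cell state is also close to 0 after the multiplication, which means the information in that component is forgotten; if a component is close to 1, the cell state keeps that memory intact. Different sigmoid outputs therefore retain or forget different information. In this way the LSTM can remember important information over long spans and adjust its memory dynamically as inputs arrive. The forget gate is computed as follows, where $f_t$ is the output vector of the sigmoid layer:

$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_f)$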
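The forget-gate formula can be tried out on toy numbers. The following is a minimal sketch in plain Python rather than MindSpore code; the weights, sizes, and the helper names `sigmoid` and `forget_gate` are assumptions chosen only to make the effect visible.

```python
import math

def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(W_f, b_f, h_prev, x_t):
    """Compute f_t = sigmoid(W_f . [h_prev, x_t] + b_f) on plain Python lists."""
    concat = h_prev + x_t  # list concatenation plays the role of [h_{t-1}, x_t]
    return [sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
            for row, b in zip(W_f, b_f)]

# Toy sizes (assumed for illustration): hidden size 2, input size 1.
W_f = [[10.0, 0.0, 0.0],
       [-10.0, 0.0, 0.0]]
b_f = [0.0, 0.0]
h_prev, x_t = [1.0, 0.0], [0.5]
c_prev = [3.0, 3.0]

f_t = forget_gate(W_f, b_f, h_prev, x_t)
# Element-wise product with the previous cell state:
# the first component is kept almost entirely, the second is almost erased.
c_kept = [f * c for f, c in zip(f_t, c_prev)]
print(f_t, c_kept)
```

A gate value near 1 leaves the cell-state component nearly unchanged, while a value near 0 wipes it out, which is the behavior described above.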

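(2) Second and third function layers: the memory gate

The memory gate controls whether the data at time t (the present) is merged into the cell state. First, a tanh layer extracts the useful information from the current vector; then a sigmoid function controls how much of that memory enters the cell state. Together, the two layers do the following: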

 ![jupyter](./Figures/fig002.png)
 
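They extract useful information from the current input, then filter it by giving every component a rating between 0 and 1; the higher the rating, the more of that component enters the cell state. The two steps are computed as:

$\widetilde{C_{t}}=\tanh(W_c\cdot[h_{t-1},x_t]+b_c)$

$i_t=\sigma(W_i\cdot[h_{t-1},x_t]+b_i)$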

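(3) Fourth function layer: the output gate

The output gate is the layer the LSTM cell uses to compute its output at the current time step. It first applies a sigmoid function to the vector formed from the current input and the previous output, then squashes the current cell state into the interval (-1, 1) with a tanh function. Multiplying the tanh-processed cell state element-wise with the sigmoid-processed combined vector gives the LSTM output at time t.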
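Written in the same notation as the gate formulas in this section, the output step corresponds to the standard LSTM equations below. The names $W_o$ and $b_o$ are assumed here by analogy with the other gates, and $\odot$ denotes element-wise multiplication.

```latex
\begin{aligned}
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \widetilde{C_{t}} \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
```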

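An LSTM model consists of the input word $x_{t}$ at time t, the cell state $C_{t}$, the candidate cell state $\widetilde{C_{t}}$, the hidden state $h_{t}$, the forget gate $f_{t}$, the memory gate $i_{t}$, and the output gate $o_{t}$. Its computation can be summarized as follows: by forgetting information in the cell state and memorizing new information, the LSTM passes on what is useful for later time steps and discards what is not, and it outputs the hidden state $h_{t}$ at every time step. Forgetting, memorizing, and output are controlled by the forget gate $f_{t}$, the memory gate $i_{t}$, and the output gate $o_{t}$, all of which are computed from the previous hidden state $h_{t-1}$ and the current input $x_{t}$.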
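To make the data flow concrete, here is a hedged sketch of one complete LSTM time step in plain Python. It is not the MindSpore implementation used later; the function names, the parameter dictionary layout, and the all-zero toy weights are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W . v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(params, h_prev, c_prev, x_t):
    """One LSTM time step; params maps a gate name to its (W, b) pair."""
    v = h_prev + x_t                                      # [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(*params["f"], v)]     # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], v)]     # memory gate
    g = [math.tanh(z) for z in affine(*params["c"], v)]   # candidate cell state
    o = [sigmoid(z) for z in affine(*params["o"], v)]     # output gate
    # New cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Toy example (assumed values): hidden size 1, input size 1, all weights zero,
# so every sigmoid gate equals 0.5 and the candidate state equals 0.
zero = ([[0.0, 0.0]], [0.0])
params = {"f": zero, "i": zero, "c": zero, "o": zero}
h_t, c_t = lstm_step(params, h_prev=[0.0], c_prev=[2.0], x_t=[1.0])
print(h_t, c_t)
```

With these toy weights half of the old cell state survives, so the new cell state is 1.0, and the hidden state is half of tanh(1.0).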

The overall LSTM architecture is shown below:
 ![jupyter](./Figures/fig003.png)


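## 3 Experiment Environment
### Environment requirements
Before starting the hands-on work, note the following:
* Make sure the environment is installed correctly, including MindSpore. To install it, open the [MindSpore installation page](https://www.mindspore.cn/install), then download the package and consult the documentation as the installation guide directs. Alternatively, use the table below to find the setup document for your platform and configure the environment by following the setup manual.
* Jupyter Notebook is the recommended interactive environment: it is highly interactive, makes visualization easy, and suits data-analysis experiments that are modified frequently.
* The experiment can also be completed on ModelArts, the one-stop AI development platform of Huawei Cloud.
* Recommended environment: MindSpore 2.0 and Python 3.7.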


| Hardware platform | Operating system | Software environment | Development environment | Setup guide |
| :-----:| :----: | :----: |:----:   |:----:   |
| CPU | Windows-x64 | MindSpore 2.0, Python 3.7.5 | Jupyter Notebook |[MindSpore Environment Setup Lab Manual, Chapter 2 Section 2.1 and Chapter 3 Section 3.1](./MindSpore环境搭建实验手册.docx)|
| GPU CUDA 10.1|Linux-x86_64| MindSpore 2.0, Python 3.7.5 | Jupyter Notebook |[MindSpore Environment Setup Lab Manual, Chapter 2 Section 2.2 and Chapter 3 Section 3.1](./MindSpore环境搭建实验手册.docx)|
| Ascend 910  | Linux-x86_64| MindSpore 2.0, Python 3.7.5 | Jupyter Notebook |[MindSpore Environment Setup Lab Manual, Chapter 4](./MindSpore环境搭建实验手册.docx)|

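## 4 Data Processing
IMDB is a movie website similar to Douban in China, and the dataset used in this experiment consists of user reviews from that site. The IMDB dataset contains 50,000 movie reviews, split into 25,000 for training and 25,000 for testing. Every review is labeled as positive or negative, so the experiment can be treated as a binary classification problem. Dataset home page: [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/).

Option 1: download aclImdb_v1.tar.gz from the Stanford University website and extract it.

Option 2: download aclImdb_v1.tar.gz from Huawei Cloud OBS and extract it.

You also need to download the GloVe file and add a new first line, `400000 300`, at the top of glove.6B.300d.txt. The line states that 400,000 words are read in total and that each word is represented by a 300-dimensional vector. After the change, glove.6B.300d.txt begins as follows: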

400000 300

the -0.071549 0.093459 0.023738 -0.090339 0.056123 0.32547…
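The header can also be added with a short script instead of a text editor. This is a minimal sketch: the function name `add_word2vec_header` and the temporary demo files are hypothetical, and it writes a new file rather than editing in place, so the result would still need to be saved under the file name the notebook expects.

```python
import os
import shutil
import tempfile

def add_word2vec_header(src_path, dst_path, dim):
    """Copy a GloVe text file to dst_path, prepending a '<num_words> <dim>' line."""
    with open(src_path, "r", encoding="utf8") as src:
        num_words = sum(1 for _ in src)          # one word vector per line
    with open(src_path, "r", encoding="utf8") as src, \
         open(dst_path, "w", encoding="utf8") as dst:
        dst.write(f"{num_words} {dim}\n")        # header line
        shutil.copyfileobj(src, dst)             # original content, unchanged
    return num_words

# Demo on a tiny stand-in file with two 3-dimensional vectors (not real GloVe data).
tmp_dir = tempfile.mkdtemp()
raw = os.path.join(tmp_dir, "raw.txt")
with open(raw, "w", encoding="utf8") as f:
    f.write("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
out = os.path.join(tmp_dir, "with_header.txt")
count = add_word2vec_header(raw, out, dim=3)
with open(out, encoding="utf8") as f:
    header = f.readline().strip()
print(header)
```

Counting the lines instead of hard-coding 400000 keeps the header correct if a different GloVe file is used.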

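Choosing the evaluation metric:
Sentiment classification is a typical classification problem, so it can be evaluated like any other classification task. Accuracy, precision, recall, and the F_beta score are all common choices.

Accuracy = number of correctly classified samples / total number of samples

Precision = number of true positives / number of samples predicted as positive

Recall = number of true positives / number of samples that are actually positive

F1 score = (2 × Precision × Recall) / (Precision + Recall)

In the IMDB dataset the numbers of positive and negative samples are close, so accuracy alone is an adequate measure of the classifier.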
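The four definitions above can be checked on a handful of labels. The sketch below is plain Python with made-up labels; `binary_metrics` is a hypothetical helper, not part of MindSpore, and returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Made-up labels: 4 positive and 4 negative reviews.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, prec, rec, f1 = binary_metrics(y_true, y_pred)
print(acc, prec, rec, f1)
```

In this balanced toy case all four metrics equal 0.75, which illustrates why accuracy is a reasonable single metric when the classes are evenly represented.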

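## 5 Model Construction
(1) Import Python libraries and modules and configure the runtime

Import the MindSpore modules and helper modules used to build the model and to configure the MindSpore context (execution mode, device, and so on).

The code is as follows: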

In [1]:
# Standard library
import math
import os
from itertools import chain

# Third-party libraries
import gensim
import numpy as np

# MindSpore: nn layers, ops, data types, and dataset utilities
import mindspore.nn as nn
import mindspore.ops as ops
import mindspore.common.dtype as mstype
from mindspore import dataset as ds
from mindspore import Parameter, ParameterTuple, Tensor
from mindspore.common.initializer import initializer
# Primitive operator classes from the ops module
from mindspore.ops import operations as P
# Execution-context configuration
from mindspore import context
# Model wrapper, accuracy metric, and training callbacks
from mindspore.train import Model, Accuracy
from mindspore.train import LossMonitor, CheckpointConfig, ModelCheckpoint, TimeMonitor
# Checkpoint loading and MindRecord writing
from mindspore import load_param_into_net, load_checkpoint
from mindspore.mindrecord import FileWriter

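(2) Define parameters

* device_target: the target device, Ascend or CPU/GPU.
* pre_trained: path of a checkpoint file to preload.
* preprocess: whether to preprocess the dataset; off by default.
* aclimdb_path: where the dataset is stored.
* glove_path: where the GloVe file is stored.
* preprocess_path: output folder for the preprocessed dataset.
* ckpt_path: checkpoint file path.
* train_url: destination to which the preprocessed dataset is copied.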

In [2]:
# Build the LSTM configuration with the EasyDict library
from easydict import EasyDict

# LSTM configuration items
lstm_cfg = EasyDict({
    'num_classes': 2,
    'learning_rate': 0.1,
    'momentum': 0.9,
    'num_epochs': 20,   
    'batch_size': 64,
    'embed_size': 300,
    'num_hiddens': 100,
    'num_layers': 2,
    'bidirectional': True,
    'save_checkpoint_steps': 390,
    'keep_checkpoint_max': 10,
    'vocab_size':6,
})
# Training arguments
args_train = EasyDict({
    'preprocess': 'true',
    'aclimdb_path': "./aclImdb",
    'glove_path':"./glove",
    'preprocess_path': "./preprocessed",
    'ckpt_path': "./",
    'pre_trained': None,
    'device_target': "CPU",
})
# Test arguments
args_test = EasyDict({
    'preprocess': 'false',
    'aclimdb_path': "./aclImdb",
    'glove_path':  "./glove",
    'preprocess_path':"./preprocessed",
    'ckpt_path': "./lstm-20_390.ckpt",
    'pre_trained': None,
    'device_target':  "CPU",
})

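(3) Reading and processing the data

Process the text dataset (tokenization, encoding, and alignment by padding) and the raw GloVe data so that they fit the network structure.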
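The pipeline sentence -> tokens -> indices -> fixed-length feature can be sketched on two toy sentences before reading the full parser. The helper names below are hypothetical, and sorting the vocabulary is an assumption made here only so that the indices are reproducible.

```python
def tokenize(sentence):
    """Lower-case the sentence and split it on single spaces."""
    return [word.lower() for word in sentence.split(" ")]

def build_vocab(tokenized_sentences):
    """Map each distinct word to an index >= 1; index 0 is reserved for '<unk>'."""
    words = sorted({w for sent in tokenized_sentences for w in sent})
    word_to_idx = {w: i + 1 for i, w in enumerate(words)}
    word_to_idx["<unk>"] = 0
    return word_to_idx

def encode_and_pad(tokens, word_to_idx, maxlen, pad=0):
    """Encode tokens as indices, truncate to maxlen, then right-pad with `pad`."""
    ids = [word_to_idx.get(w, 0) for w in tokens][:maxlen]
    return ids + [pad] * (maxlen - len(ids))

train = [tokenize("A great movie"), tokenize("a boring movie")]
vocab = build_vocab(train)
features = [encode_and_pad(s, vocab, maxlen=5) for s in train]
# Words that never occurred in the training data map to index 0.
unseen = encode_and_pad(tokenize("A great plot twist indeed really"), vocab, maxlen=5)
print(features, unseen)
```

Every feature ends up with the same length, which is what allows the reviews to be stacked into a single integer array later.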

In [3]:
class ImdbParser():
    """
    parse aclImdb data to features and labels.
    sentence->tokenized->encoded->padding->features
    """

    def __init__(self, imdb_path, glove_path, embed_size=300):
        self.__segs = ['train', 'test']
        self.__label_dic = {'pos': 1, 'neg': 0}
        self.__imdb_path = imdb_path
        #self.__glove_dim = embed_size
        self.__glove_dim = 300
        self.__glove_file = os.path.join(glove_path, 'glove.6B.' + str(self.__glove_dim) + 'd.txt')

        # properties
        self.__imdb_datas = {}
        self.__features = {}
        self.__labels = {}
        self.__vacab = {}
        self.__word2idx = {}
        self.__weight_np = {}
        self.__wvmodel = None

    def parse(self):
        """
        parse imdb data to memory
        """
        self.__wvmodel = gensim.models.KeyedVectors.load_word2vec_format(self.__glove_file)

        for seg in self.__segs:
            self.__parse_imdb_datas(seg)
            self.__parse_features_and_labels(seg)
            self.__gen_weight_np(seg)

    def __parse_imdb_datas(self, seg):
        """
        load data from txt
        """
        data_lists = []
        for label_name, label_id in self.__label_dic.items():
            sentence_dir = os.path.join(self.__imdb_path, seg, label_name)
            for file in os.listdir(sentence_dir):
                with open(os.path.join(sentence_dir, file), mode='r', encoding='utf8') as f:
                    sentence = f.read().replace('\n', '')
                    data_lists.append([sentence, label_id])
        self.__imdb_datas[seg] = data_lists

    def __parse_features_and_labels(self, seg):
        """
        parse features and labels
        """
        features = []
        labels = []
        for sentence, label in self.__imdb_datas[seg]:
            features.append(sentence)
            labels.append(label)

        self.__features[seg] = features
        self.__labels[seg] = labels

        # update feature to tokenized
        self.__updata_features_to_tokenized(seg)
        # parse vacab
        self.__parse_vacab(seg)
        # encode feature
        self.__encode_features(seg)
        # padding feature
        self.__padding_features(seg)

    def __updata_features_to_tokenized(self, seg):
        tokenized_features = []
        for sentence in self.__features[seg]:
            tokenized_sentence = [word.lower() for word in sentence.split(" ")]
            tokenized_features.append(tokenized_sentence)
        self.__features[seg] = tokenized_features

    def __parse_vacab(self, seg):
        # vocab
        tokenized_features = self.__features[seg]
        vocab = set(chain(*tokenized_features))
        self.__vacab[seg] = vocab

        word_to_idx = {word: i + 1 for i, word in enumerate(vocab)}
        word_to_idx['<unk>'] = 0
        self.__word2idx[seg] = word_to_idx

    def __encode_features(self, seg):
        """ encode word to index """
        word_to_idx = self.__word2idx['train']
        encoded_features = []
        for tokenized_sentence in self.__features[seg]:
            encoded_sentence = []
            for word in tokenized_sentence:
                encoded_sentence.append(word_to_idx.get(word, 0))
            encoded_features.append(encoded_sentence)
        self.__features[seg] = encoded_features

    def __padding_features(self, seg, maxlen=500, pad=0):
        """ pad all features to the same length """
        padded_features = []
        for feature in self.__features[seg]:
            if len(feature) >= maxlen:
                padded_feature = feature[:maxlen]
            else:
                padded_feature = feature
                while len(padded_feature) < maxlen:
                    padded_feature.append(pad)
            padded_features.append(padded_feature)
        self.__features[seg] = padded_features

    def __gen_weight_np(self, seg):
        """
        generate weight by gensim
        """
        weight_np = np.zeros((len(self.__word2idx[seg]), self.__glove_dim), dtype=np.float32)
        for word, idx in self.__word2idx[seg].items():
            if word not in self.__wvmodel:
                continue
            word_vector = self.__wvmodel.get_vector(word)
            weight_np[idx, :] = word_vector

        self.__weight_np[seg] = weight_np

    def get_datas(self, seg):
        """
        get features, labels, and weight by gensim.
        """
        features = np.array(self.__features[seg]).astype(np.int32)
        labels = np.array(self.__labels[seg]).astype(np.int32)
        weight = np.array(self.__weight_np[seg])
        return features, labels, weight

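Define the function lstm_create_dataset, which creates the training set ds_train and the evaluation set ds_eval.

Define the function convert_to_mindrecord, which converts the dataset to MindRecord format so that MindSpore can read it easily.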
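As a quick check of the batching arithmetic used by `lstm_create_dataset` below, the number of steps per epoch follows from the dataset size and the batch size. The helper `steps_per_epoch` is a hypothetical illustration, not a MindSpore function.

```python
def steps_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches that one pass over the dataset yields."""
    full, rest = divmod(num_samples, batch_size)
    return full if drop_remainder or rest == 0 else full + 1

# 25,000 training reviews with the batch size of 64 from lstm_cfg.
train_steps = steps_per_epoch(25000, 64)                      # incomplete batch dropped
kept_all = steps_per_epoch(25000, 64, drop_remainder=False)   # incomplete batch kept
dropped = 25000 - train_steps * 64                            # samples skipped per epoch
print(train_steps, kept_all, dropped)
```

The result of 390 steps matches the `save_checkpoint_steps` value of 390 in the configuration, which appears to be chosen so that one checkpoint is saved per epoch.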

In [4]:
def lstm_create_dataset(data_home, batch_size, repeat_num=1, training=True):
    """Data operations."""
    ds.config.set_seed(1)
    data_dir = os.path.join(data_home, "aclImdb_train.mindrecord0")
    if not training:
        data_dir = os.path.join(data_home, "aclImdb_test.mindrecord0")

    data_set = ds.MindDataset(data_dir, columns_list=["feature", "label"], num_parallel_workers=4)

    # shuffle, batch, and repeat the dataset
    data_set = data_set.shuffle(buffer_size=data_set.get_dataset_size())
    data_set = data_set.batch(batch_size=batch_size, drop_remainder=True)
    data_set = data_set.repeat(count=repeat_num)

    return data_set

In the function _convert_to_mindrecord, weight.txt is the weight file generated automatically during preprocessing.

In [5]:
def _convert_to_mindrecord(data_home, features, labels, weight_np=None, training=True):
    """
    convert imdb dataset to mindrecord dataset
    """
    if weight_np is not None:
        np.savetxt(os.path.join(data_home, 'weight.txt'), weight_np)

    # write mindrecord
    schema_json = {"id": {"type": "int32"},
                   "label": {"type": "int32"},
                   "feature": {"type": "int32", "shape": [-1]}}

    data_dir = os.path.join(data_home, "aclImdb_train.mindrecord")
    if not training:
        data_dir = os.path.join(data_home, "aclImdb_test.mindrecord")

    def get_imdb_data(features, labels):
        data_list = []
        for i, (label, feature) in enumerate(zip(labels, features)):
            data_json = {"id": i,
                         "label": int(label),
                         "feature": feature.reshape(-1)}
            data_list.append(data_json)
        return data_list

    writer = FileWriter(data_dir, shard_num=4)
    data = get_imdb_data(features, labels)
    writer.add_schema(schema_json, "nlp_schema")
    writer.add_index(["id", "label"])
    writer.write_raw_data(data)
    writer.commit()


def convert_to_mindrecord(embed_size, aclimdb_path, preprocess_path, glove_path):
    """
    convert imdb dataset to mindrecord dataset
    """
    parser = ImdbParser(aclimdb_path, glove_path, embed_size)
    parser.parse()

    if not os.path.exists(preprocess_path):
        print(f"preprocess path {preprocess_path} is not exist")
        os.makedirs(preprocess_path)

    train_features, train_labels, train_weight_np = parser.get_datas('train')
    _convert_to_mindrecord(preprocess_path, train_features, train_labels, train_weight_np)

    test_features, test_labels, _ = parser.get_datas('test')
    _convert_to_mindrecord(preprocess_path, test_features, test_labels, training=False)

(4) Model construction

Define the device types on which a multi-layer LSTM is built by stacking single-layer LSTM operators.

In [6]:
STACK_LSTM_DEVICE = ["CPU"]

Define the function lstm_default_state to initialize the hidden state and cell state of the network.
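The shape of these initial states depends only on the configuration. A small sketch, using a hypothetical helper `lstm_state_shape` and the values from lstm_cfg:

```python
def lstm_state_shape(batch_size, hidden_size, num_layers, bidirectional):
    """Shape of the initial hidden state h (the cell state c has the same shape)."""
    num_directions = 2 if bidirectional else 1
    return (num_layers * num_directions, batch_size, hidden_size)

# Values from lstm_cfg: batch_size=64, num_hiddens=100, num_layers=2, bidirectional=True.
shape = lstm_state_shape(batch_size=64, hidden_size=100, num_layers=2, bidirectional=True)
print(shape)
```

Each layer contributes one state per direction, so a two-layer bidirectional network needs four state slices of size (64, 100).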

In [7]:
def lstm_default_state(batch_size, hidden_size, num_layers, bidirectional):
    """init default input."""
    num_directions = 2 if bidirectional else 1
    h = Tensor(np.zeros((num_layers * num_directions, batch_size, hidden_size)).astype(np.float32))
    c = Tensor(np.zeros((num_layers * num_directions, batch_size, hidden_size)).astype(np.float32))
    return h, c

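Define the function stack_lstm_default_state to initialize the hidden state and cell state needed when single-layer operators are stacked on such platforms.

For these cases, a custom stack of single-layer LSTM operators provides the functionality of the multi-layer LSTM operator.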
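When layers are stacked by hand, each layer needs to know the width of its input. A minimal sketch of that bookkeeping, with a hypothetical helper name and the values from lstm_cfg:

```python
def layer_input_sizes(input_size, hidden_size, num_layers, bidirectional):
    """Input width of every stacked layer.

    The first layer reads the word embeddings; each later layer reads the
    output of the layer below, which is twice as wide when bidirectional.
    """
    num_directions = 2 if bidirectional else 1
    return [input_size] + [hidden_size * num_directions] * (num_layers - 1)

# Values from lstm_cfg: embed_size=300, num_hiddens=100, num_layers=2, bidirectional=True.
sizes = layer_input_sizes(input_size=300, hidden_size=100, num_layers=2, bidirectional=True)
print(sizes)
```

The second layer therefore receives 200 features per time step, the concatenation of the forward and backward outputs of the first layer.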

In [8]:
def stack_lstm_default_state(batch_size, hidden_size, num_layers, bidirectional):
    num_directions = 2 if bidirectional else 1
    h_state = np.zeros((num_layers * num_directions, batch_size, hidden_size), dtype=np.float32)
    c_state = np.zeros((num_layers * num_directions, batch_size, hidden_size), dtype=np.float32)
    return Tensor(h_state, mstype.float32), Tensor(c_state, mstype.float32)

In [28]:
class StackLSTM(nn.Cell):
    """
    Stack single-layer LSTMs to build a multi-layer LSTM.
    """

    def __init__(self,
                 input_size,
                 hidden_size,
                 num_layers=1,
                 has_bias=True,
                 batch_first=False,
                 dropout=0.0,
                 bidirectional=False):
        super(StackLSTM, self).__init__()
        self.num_layers = num_layers
        self.batch_first = batch_first
        self.transpose = P.Transpose()

        # number of directions
        self.num_directions = 2 if bidirectional else 1

        # input size of each layer: the first layer reads the embeddings,
        # later layers read the output of the layer below
        input_size_list = [input_size]
        for i in range(num_layers - 1):
            input_size_list.append(hidden_size * self.num_directions)

        # one single-layer nn.LSTM per layer; each layer owns its weights.
        # batch_first stays False because construct() does the transpose itself.
        layers = []
        for i in range(num_layers):
            layers.append(nn.LSTM(input_size=input_size_list[i],
                                  hidden_size=hidden_size,
                                  num_layers=1,
                                  has_bias=has_bias,
                                  batch_first=False,
                                  dropout=dropout,
                                  bidirectional=bidirectional))
        self.lstms = nn.CellList(layers)

    def construct(self, x, hx):
        """construct"""
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        h, c = hx
        hn = cn = None
        for i in range(self.num_layers):
            # rows [start, end) of h and c hold the initial state of layer i
            start = i * self.num_directions
            end = start + self.num_directions
            x, (hn, cn) = self.lstms[i](x, (h[start:end], c[start:end]))
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        return x, (hn, cn)

Define the SentimentNet network as a subclass of nn.Cell.

In [31]:
class SentimentNet(nn.Cell):
    """Sentiment network structure."""

    def __init__(self,
                 vocab_size,
                 embed_size,
                 num_hiddens,
                 num_layers,
                 bidirectional,
                 num_classes,
                 weight,
                 batch_size):
        super(SentimentNet, self).__init__()
        # Map words to vectors
        self.embedding = nn.Embedding(vocab_size,
                                      embed_size,
                                      embedding_table=weight)
        self.embedding.embedding_table.requires_grad = False
        self.trans = P.Transpose()
        self.perm = (1, 0, 2)

        if context.get_context("device_target") in STACK_LSTM_DEVICE:
            # stack lstm by user
            self.encoder = StackLSTM(input_size=embed_size,
                                     hidden_size=num_hiddens,
                                     num_layers=num_layers,
                                     has_bias=True,
                                     bidirectional=bidirectional,
                                     dropout=0.0)
            self.h, self.c = stack_lstm_default_state(batch_size, num_hiddens, num_layers, bidirectional)
        else:
            # all other devices: standard multi-layer nn.LSTM
            self.encoder = nn.LSTM(input_size=embed_size,
                                   hidden_size=num_hiddens,
                                   num_layers=num_layers,
                                   has_bias=True,
                                   bidirectional=bidirectional,
                                   dropout=0.0)
            self.h, self.c = lstm_default_state(batch_size, num_hiddens, num_layers, bidirectional)

        self.concat = P.Concat(1)
        self.squeeze = P.Squeeze(axis=0)
        if bidirectional:
            self.decoder = nn.Dense(num_hiddens * 4, num_classes)
        else:
            self.decoder = nn.Dense(num_hiddens * 2, num_classes)

    def construct(self, inputs):
        # inputs: (batch, seq_len) word indices, e.g. (64, 500)
        # embeddings: (64, 500, 300), transposed to (500, 64, 300) for the encoder
        embeddings = self.embedding(inputs)
        embeddings = self.trans(embeddings, self.perm)
        output, _ = self.encoder(embeddings, (self.h, self.c))
        # concatenate the first and last time steps (sequence length is fixed at 500);
        # each is (64, 200) when bidirectional, so encoding is (64, 400)
        encoding = self.concat((self.squeeze(output[0:1:1]), self.squeeze(output[499:500:1])))
        outputs = self.decoder(encoding)
        return outputs

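Call convert_to_mindrecord to preprocess the dataset.
After a successful conversion, MindRecord files are generated in the preprocess_path directory. As long as the dataset does not change, this step does not need to be repeated for every training run.

* Files whose names contain aclImdb_train.mindrecord hold the training set in MindRecord format.
* Files whose names contain aclImdb_test.mindrecord hold the test set in MindRecord format.
* weight.txt is the weight file generated automatically during preprocessing.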

In [11]:
#if args_train.device_target == 'Ascend':
 #   cfg = lstm_cfg_ascend
#else:
cfg = lstm_cfg
if args_train.preprocess == "true":
    print("============== Starting Data Pre-processing ==============")
    convert_to_mindrecord(cfg.embed_size, args_train.aclimdb_path, args_train.preprocess_path, args_train.glove_path)



Instantiate SentimentNet to create the network.

In [32]:
cfg = lstm_cfg
embedding_table = np.loadtxt(os.path.join(args_train.preprocess_path, "weight.txt")).astype(np.float32)
if args_train.device_target == 'Ascend':
    pad_num = int(np.ceil(cfg.embed_size / 16) * 16 - cfg.embed_size)
    if pad_num > 0:
        embedding_table = np.pad(embedding_table, [(0, 0), (0, pad_num)], 'constant')
    cfg.embed_size = int(np.ceil(cfg.embed_size / 16) * 16)
cfg.embed_size = embedding_table.shape[1]  # match the width of the (possibly padded) embedding table
network = SentimentNet(vocab_size=embedding_table.shape[0],
                        embed_size=cfg.embed_size,
                        num_hiddens=cfg.num_hiddens,
                        num_layers=cfg.num_layers,
                        bidirectional=cfg.bidirectional,
                        num_classes=cfg.num_classes,
                        weight=Tensor(embedding_table),
                        batch_size=cfg.batch_size)

## 6 Model Training
Train the LSTM model built above:
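The training cell below derives its learning rate from a warm-up schedule. As a standalone illustration of that idea, the sketch here builds a full per-step schedule with linear warm-up followed by linear decay; the function name `warmup_linear_lr` and the step counts are assumptions and differ from the notebook's `get_lr`, which returns a single value that is then used as a constant learning rate.

```python
def warmup_linear_lr(lr_init, lr_max, lr_end, warmup_steps, total_steps):
    """Return one learning rate per step: linear warm-up, then linear decay."""
    schedule = []
    for step in range(total_steps):
        if step < warmup_steps:
            # ramp up from lr_init towards lr_max
            lr = lr_init + (lr_max - lr_init) * step / warmup_steps
        else:
            # decay from lr_max towards lr_end
            lr = lr_max - (lr_max - lr_end) * (step - warmup_steps) / (total_steps - warmup_steps)
        schedule.append(lr)
    return schedule

# Same rate values as the training cell, but with a short toy schedule.
schedule = warmup_linear_lr(lr_init=0.001, lr_max=0.1, lr_end=0.01,
                            warmup_steps=2, total_steps=6)
print(schedule)
```

The rate peaks at `lr_max` exactly when warm-up ends and then falls step by step. When `warmup_steps` equals `total_steps`, as in the notebook's call, the decay branch never runs.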

In [33]:
import numpy as np
steps_per_epoch=5
def get_lr(global_step, lr_init, lr_end, lr_max, warmup_epochs, total_epochs):
    warmup_steps = warmup_epochs * steps_per_epoch 
    total_steps = total_epochs * steps_per_epoch  
    lr_each_step = []
    for i in range(total_steps):
        if i < warmup_steps:
            lr = (lr_max - lr_init) / warmup_steps * i + lr_init  
        else:
            lr = lr_max - (lr_max - lr_end) * (i - warmup_steps) / (total_steps - warmup_steps)  # learning-rate decay
        lr_each_step.append(lr)
    current_step = global_step - 1
    lr = lr_each_step[current_step]
    return lr
if args_train.pre_trained:
    load_param_into_net(network, load_checkpoint(args_train.pre_trained))
ds_train = lstm_create_dataset(args_train.preprocess_path, cfg.batch_size, 1)
loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean')
cfg.dynamic_lr=True
if cfg.dynamic_lr:
    lr = Tensor(get_lr(global_step=5,lr_init=0.001, lr_end=0.01, lr_max=0.1,warmup_epochs=5,
                        total_epochs=5))
else:
    lr = cfg.learning_rate
opt = nn.Momentum(network.trainable_params(), lr, cfg.momentum)
loss_cb = LossMonitor()
model = Model(network, loss, opt, {'acc': Accuracy()})
print("============== Starting Training ==============")
config_ck = CheckpointConfig(save_checkpoint_steps=cfg.save_checkpoint_steps,
                                 keep_checkpoint_max=cfg.keep_checkpoint_max)
ckpoint_cb = ModelCheckpoint(prefix="lstm", directory=args_train.ckpt_path, config=config_ck)
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
if args_train.device_target == "CPU":
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb], dataset_sink_mode=False)
else:
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb])
print("============== Training Success ==============")



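Note: `Model.train` passes only the `feature` column of the dataset to the network, so `construct` must accept `inputs` alone. A `construct(self, inputs, h, c)` signature fails at this point with `TypeError: construct() missing 2 required positional arguments: 'h' and 'c'`.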

## 7 Model Testing
Test the LSTM model on the preprocessed test data:

In [None]:
ds_eval = lstm_create_dataset(args_test.preprocess_path, cfg.batch_size, training=False)
print("============== Starting Testing ==============")
param_dict = load_checkpoint(args_test.ckpt_path)
load_param_into_net(network, param_dict)
if args_test.device_target == "CPU":
    acc = model.eval(ds_eval, dataset_sink_mode=False)
else:
    acc = model.eval(ds_eval)
print("============== {} ==============".format(acc))

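## 8 Summary
This experiment introduced the principles of the LSTM algorithm, implemented it step by step with MindSpore, and included a testing stage that measures the accuracy of the trained model on the IMDB test set.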