# Overview and Environment Setup

We use [transfer-learning-conv-ai](https://github.com/huggingface/transfer-learning-conv-ai) from [huggingface](https://huggingface.co/) to run multi-turn machine dialogue. It builds on [GPT2](https://github.com/openai/gpt-2) as wrapped and adapted by [pytorch-transformers](https://github.com/huggingface/transformers).

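## Conda environment

Everything below runs inside a Conda environment.

### Create a new Conda environment

Prepare a Conda environment named `transfer-learning-conv-ai`.

**If the Conda environment already exists, do not create it again!**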
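
To avoid creating the environment twice, it may help to check first whether it already exists. The following is only a sketch: the helper names `env_exists` and `list_conda_envs` are hypothetical, and it assumes the `conda` executable is on `PATH` and that `conda env list --json` reports environment directories under an `envs` key.

```python
import json
import os
import subprocess


def env_exists(name, env_paths):
    """Return True if one of the environment directories is called `name`."""
    return any(os.path.basename(os.path.normpath(p)) == name for p in env_paths)


def list_conda_envs():
    """Ask conda for the paths of all known environments (needs conda on PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["envs"]


# Example use inside a notebook cell (not executed here):
#     if env_exists("transfer-learning-conv-ai", list_conda_envs()):
#         print("Environment already exists, skip the creation cell.")
```

If the check reports that the environment exists, the creation cell can simply be skipped.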

In [1]:
%conda create --yes --name transfer-learning-conv-ai ipython ipywidgets nb_conda

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/liuxy/miniconda3/envs/transfer-learning-conv-ai

  added / updated specs:
    - ipython
    - ipywidgets
    - nb_conda


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    prompt_toolkit-3.0.0       |             py_0         237 KB  defaults
    pyrsistent-0.15.6          |   py37h7b6447c_0          93 KB  defaults
    python-dateutil-2.8.1      |             py_0         224 KB  defaults
    setuptools-42.0.1          |           py37_0         670 KB  defaults
    ------------------------------------------------------------
                                           Total:         1.2 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      anaconda/pkgs/main/linux-64::_libgcc_mutex-0.1-main
  attrs              anaconda/pkgs/main/noarch::

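### Set the notebook kernel to the Conda environment above

Now set this notebook's kernel to the Conda environment created above.

**All other notebooks in this directory should also use the kernel that belongs to this Conda environment.**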
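
One way to confirm that the kernel really runs inside the intended environment is to look at the interpreter prefix. This is a minimal sketch, assuming the environment directory carries the environment's name (as in the `environment location` shown in the conda output above); `kernel_uses_env` is a hypothetical helper, not part of any library.

```python
import os
import sys

EXPECTED_ENV = "transfer-learning-conv-ai"


def env_name_from_prefix(prefix):
    """Return the last path component of an interpreter prefix."""
    return os.path.basename(os.path.normpath(prefix))


def kernel_uses_env(expected=EXPECTED_ENV, prefix=None):
    """Compare the prefix's directory name with the expected environment name."""
    prefix = sys.prefix if prefix is None else prefix
    return env_name_from_prefix(prefix) == expected


print(f"Kernel interpreter: {sys.executable}")
print(f"Running in expected environment: {kernel_uses_env()}")
```

If the second line prints `False`, the notebook is probably still attached to a different kernel.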

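### Install dependencies

#### CUDA-related packages

Packages such as PyTorch and TensorFlow ship in different builds; choose the one that matches your GPU and CUDA version.

Check the CUDA version: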

In [2]:
# Assumes the toolkit lives under /usr/local/cuda and ships a version.txt;
# newer CUDA releases may provide version.json instead.
cuda_version_file = '/usr/local/cuda/version.txt'
with open(cuda_version_file) as fp:
    cuda_version_text = fp.read().strip()
print(f'CUDA VERSION: {cuda_version_text}')

# Keep only MAJOR.MINOR, e.g. "CUDA Version 10.0.130" -> "10.0"
cuda_version_major_minor = '.'.join(cuda_version_text.split()[-1].split('.')[:2])
print(f'MAJOR.MINOR: {cuda_version_major_minor}')

CUDA VERSION: CUDA Version 10.0.130
MAJOR.MINOR: 10.0


Run the conda installation:

In [6]:
%conda install --yes -c defaults -c conda-forge -c pytorch tensorflow pytorch::pytorch pytorch::torchvision cudatoolkit={cuda_version_major_minor} pytorch::ignite tensorboardX

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/liuxy/miniconda3/envs/transfer-learning-conv-ai

  added / updated specs:
    - cudatoolkit=10.0
    - pytorch::ignite
    - pytorch::pytorch
    - pytorch::torchvision
    - tensorboardx
    - tensorflow


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    protobuf-3.10.1            |   py37he6710b0_0         708 KB  defaults
    ------------------------------------------------------------
                                           Total:         708 KB

The following NEW packages will be INSTALLED:

  _tflow_select      anaconda/pkgs/main/linux-64::_tflow_select-2.3.0-mkl
  absl-py            anaconda/pkgs/main/linux-64::absl-py-0.8.1-py37_0
  astor              anaconda/pkgs/main/linux-64::astor-0.8.0-py37_0
  blas               anaconda/pkgs/main/linux-6
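
Once the installation has finished, a quick sanity check can show whether the installed PyTorch build actually sees the GPU. The sketch below is not part of the original notebook; `describe_torch_cuda` is a hypothetical helper, and it degrades to a short report when `torch` cannot be found in the current kernel.

```python
import importlib.util


def describe_torch_cuda():
    """Summarise the PyTorch install, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch
    return {
        "installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(describe_torch_cuda())
```

If `cuda_available` is `False` although a GPU is present, the `cudatoolkit` version chosen above may not match the installed driver.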

#### Other basic packages

In [7]:
%conda install --yes -c defaults -c conda-forge boto3 requests tqdm regex sacremoses

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/liuxy/miniconda3/envs/transfer-learning-conv-ai

  added / updated specs:
    - boto3
    - regex
    - requests
    - sacremoses
    - tqdm


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    joblib-0.14.0              |             py_0         201 KB  defaults
    sacremoses-0.0.35          |             py_0         448 KB  conda-forge
    tqdm-4.39.0                |             py_0          52 KB  defaults
    ------------------------------------------------------------
                                           Total:         701 KB

The following NEW packages will be INSTALLED:

  asn1crypto         anaconda/pkgs/main/linux-64::asn1crypto-1.2.0-py37_0
  boto3              anaconda/pkgs/main/noarch::boto3-1.9.234-py_0
  botocore           anaconda/p

In [8]:
%pip install sentencepiece

Looking in indexes: https://mirrors.aliyun.com/pypi/simple
Collecting sentencepiece
  Using cached https://mirrors.aliyun.com/pypi/packages/e8/cf/7089b87fdae8f47be81ce8e2e6377b321805c4648f2eb12fbd2987388dac/sentencepiece-0.1.83-cp37-cp37m-manylinux1_x86_64.whl
Installing collected packages: sentencepiece
Successfully installed sentencepiece-0.1.83
Note: you may need to restart the kernel to use updated packages.


### Install huggingface's pytorch-transformers

[transfer-learning-conv-ai](https://github.com/huggingface/transfer-learning-conv-ai) depends on [pytorch-transformers](https://github.com/huggingface/transformers), and we have made minor modifications to [pytorch-transformers](https://github.com/huggingface/transformers). The package therefore has to be installed from our own copy of the source:

In [1]:
%pip install -e /home/kangzh/transformers

Looking in indexes: https://mirrors.aliyun.com/pypi/simple
Obtaining file:///home/kangzh/transformers
Installing collected packages: transformers
  Running setup.py develop for transformers
Successfully installed transformers
Note: you may need to restart the kernel to use updated packages.


### Restart the kernel

Restart the kernel so that the newly installed packages take effect.
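
After the restart, it can be worth confirming that `transformers` resolves to the locally modified checkout rather than to a copy from a package index. This is a rough sketch under the assumption that the editable install points at `/home/kangzh/transformers` (the path used in the install cell); `is_inside` and `package_location` are hypothetical helpers.

```python
import importlib.util
import os

LOCAL_CHECKOUT = "/home/kangzh/transformers"  # path used in the install cell


def is_inside(path, root):
    """Return True if `path` lies within the directory tree rooted at `root`."""
    path = os.path.abspath(path)
    root = os.path.abspath(root)
    return os.path.commonpath([path, root]) == root


def package_location(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin


location = package_location("transformers")
if location is None:
    print("transformers is not importable in this kernel")
else:
    print(f"transformers resolves to {location}")
    print(f"inside local checkout: {is_inside(location, LOCAL_CHECKOUT)}")
```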

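## Run the programs in the transfer-learning-conv-ai project

We have likewise made minor modifications to [transfer-learning-conv-ai](https://github.com/huggingface/transfer-learning-conv-ai). It is distributed as executable scripts rather than as a package, so we need to change into its directory first.

### Change directory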

In [1]:
%cd /home/kangzh/transfer-learning-conv-ai

/home/kangzh/transfer-learning-conv-ai


### Interact

In [None]:
%run interact.py \
    --model_type gpt2_cn \
    --model_checkpoint ./model_checkpoint_117 \
    --dataset_cache ./dataset_cache_GPT2Tokenizer_cn/cache  \
    --min_length 125 \
    --max_length 1000  \
    --temperature 0.6 \
    --top_p 0.9


To use data.metrics please install scikit-learn. See https://scikit-learn.org/stable/index.html
INFO:/home/kangzh/transfer-learning-conv-ai/interact.py:Namespace(dataset_cache='./dataset_cache_GPT2Tokenizer_cn/cache', dataset_path='', device='cuda', max_history=2, max_length=1000, min_length=125, model_checkpoint='./model_checkpoint_117', model_type='gpt2_cn', no_sample=False, seed=42, temperature=0.6, top_k=0, top_p=0.9)
INFO:/home/kangzh/transfer-learning-conv-ai/interact.py:Get pretrained model and tokenizer
INFO:/home/kangzh/transfer-learning-conv-ai/interact.py:load tokenizer....

INFO:transformers.tokenization_utils:Model name './model_checkpoint_117' not found in model shortcut name list (). Assuming './model_checkpoint_117' is a path or url to a directory containing tokenizer files.
INFO:transformers.tokenization_utils:Didn't find file ./model_checkpoint_117/merges.txt. We won't load it.
INFO:transformers.tokenization_utils:Didn't find file ./model_checkpoint_117/added_tokens.j

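>>>  Oh, okay


▁Hi! Has anything been bothering you lately? You can send me a private message, I would be happy to help!


>>>  Well, the thing is, I have fairly bad insomnia, and I am losing hair too.


▁Hello, I am 小媒~ Sending the original poster a warm hug~ Your condition lately is indeed not great. Hair loss, hair loss is normal, but you could first get a physical check-up to see whether there is any other physical discomfort. If the physical discomfort is already affecting your normal life, you could first go to the hospital for a check-up to see whether there is any other physical discomfort; if the physical discomfort is already affecting your life, you could first go to the hospital for a check-up. Best wishes~?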
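
The `--temperature 0.6` and `--top_p 0.9` flags passed to `interact.py` control how the next token is sampled. As an illustration only (this is not the project's own code, and the function names are hypothetical), temperature scaling followed by nucleus (top-p) filtering can be sketched in plain Python as follows. The sketch keeps the smallest set of most likely tokens whose cumulative probability reaches `top_p`; real implementations may differ in boundary details.

```python
import math
import random


def top_p_filter(logits, temperature=0.6, top_p=0.9):
    """Return {token_index: probability} for the nucleus, renormalised to 1."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk tokens from most to least likely until top_p mass is covered.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.6, top_p=0.9, rng=random):
    """Draw one token index from the filtered distribution."""
    nucleus = top_p_filter(logits, temperature, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


example_logits = [2.0, 1.0, 0.0, -5.0]
print(sorted(top_p_filter(example_logits)))
print(sample_token(example_logits, rng=random.Random(42)))
```

Lower temperatures and smaller `top_p` values restrict sampling to fewer candidate tokens, which tends to make replies more conservative and can also make them more repetitive.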
