(I)Document Refinement (#358)
(1)training installation guide refined
(2)training quick start guide refined
Gyx-One committed Jun 26, 2021
1 parent 67bfd49 commit 6664bab
Showing 2 changed files with 91 additions and 45 deletions.
90 changes: 64 additions & 26 deletions docs/markdown/install/training.md
@@ -2,52 +2,90 @@

## Prerequisites
* [anaconda3](https://www.anaconda.com/products/individual)
Anaconda is used to create a virtual environment, which simplifies building the runtime environment and managing library dependencies. Here we mainly use it to create a virtual Python environment and install the CUDA run-time libraries.
* [CUDA](https://developer.nvidia.com/cuda-downloads)

A CUDA environment is essential to run deep learning neural networks on GPUs. The CUDA installation package you download should match your system and your NVIDIA driver version.
## Configure environment
The Hyperpose training library can be used directly by putting Hyperpose in the working directory and importing it.
However, the prerequisite environment has to be installed first to make it available.
There are two ways to install the hyperpose python training library.

The following instructions have been tested on the environments below:
All the following instructions have been tested on the environments below:
* Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 440.33.01, CUDA Version=10.2
* Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 410.79, CUDA Version=10.0
* Ubuntu 18.04, TITAN RTX, Nvidia Driver Version 430.64, CUDA Version=10.1
* Ubuntu 18.04, TITAN Xp, Nvidia Driver Version 430.26, CUDA Version=10.2

First of all, we recommend creating an Anaconda virtual environment, which handles possible conflicts between the libraries already on your computer and the libraries hyperpose needs to install, and also handles the dependencies of the cudatoolkit and cudnn libraries in a very simple way.
To create the virtual environment, run the following command in bash:
```bash
# >>> create virtual environment (choose yes)
conda create -n hyperpose python=3.7
# >>> activate the virtual environment, start installation
conda activate hyperpose
# >>> install cuda and cudnn using conda
# >>> install cudatoolkit and cudnn library using conda
conda install cudatoolkit=10.0.130
conda install cudnn=7.6.0
# >>> install tensorflow of version 2.0.0
pip install tensorflow-gpu==2.0.0
# >>> install the newest version tensorlayer from github
pip install tensorlayer==2.2.3
# >>> install other requirements (numpy<=1.17.0 because newer versions conflict with pycocotools)
pip install opencv-python
pip install numpy==1.16.4
pip install pycocotools
pip install matplotlib
# >>> now the configuration is done, check whether the GPU is available.
python
>>> import tensorflow as tf
>>> import tensorlayer as tl
>>> tf.test.is_gpu_available()
# >>> if the output is True, congratulations! you can import and run hyperpose now
>>> from hyperpose import Config,Model,Dataset
```

After configuring and activating the conda environment, we can begin to install hyperpose.
(I)The first method is to put the hyperpose python module in the working directory and import it.(recommended)
After git-cloning the source [repository](https://github.com/tensorlayer/hyperpose.git), you can directly import the hyperpose python library under the root directory of the cloned repository.
To make the import work, you should install the prerequisite dependencies as follows:
you can either install them according to the requirements.txt in the [repository](https://github.com/tensorlayer/hyperpose.git)
```bash
# install according to the requirements.txt
pip install -r requirements.txt
```
or install libraries one by one
```bash
# >>> install tensorflow of version 2.3.1
pip install tensorflow-gpu==2.3.1
# >>> install tensorlayer of version 2.2.3
pip install tensorlayer==2.2.3
# >>> install other requirements (numpy<=1.17.0 because newer versions conflict with pycocotools)
pip install opencv-python
pip install numpy==1.16.4
pip install pycocotools
pip install matplotlib
```
This method of installation uses the latest source code and thus is less likely to run into compatibility problems.
(II)The second method is to install from the PyPI repository.
We have already uploaded the hyperpose python library to PyPI, so you can install it using pip, which gives you the latest stable version.
```bash
pip install hyperpose
```
This will download and install all dependencies automatically.

Now, after installing the dependent libraries and hyperpose itself, let's check whether the installation succeeded.
Run the following commands in bash:
```bash
# >>> now the configuration is done, check whether the GPU is available.
python
>>> import tensorflow as tf
>>> import tensorlayer as tl
>>> tf.test.is_gpu_available()
# >>> if the output is True, congratulations! you can import and run hyperpose now
>>> from hyperpose import Config,Model,Dataset
```
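
The interactive check above can also be run as a script. The following is a minimal sketch, not part of the hyperpose repository: `missing_packages` and `gpu_count` are hypothetical helper names, and the module names listed (for example `cv2` for opencv-python) are assumptions about how the pip packages above are imported. The GPU query uses `tf.config.list_physical_devices`, which in TensorFlow 2.0 is only available under `tf.config.experimental`, so the sketch falls back to that.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def gpu_count():
    """Return the number of GPUs TensorFlow can see (requires tensorflow)."""
    import tensorflow as tf

    # Newer TF 2.x releases expose list_physical_devices directly; TF 2.0 only
    # has the experimental variant, so fall back to it when needed.
    list_devices = getattr(tf.config, "list_physical_devices", None)
    if list_devices is None:
        list_devices = tf.config.experimental.list_physical_devices
    return len(list_devices("GPU"))


if __name__ == "__main__":
    # Import names assumed for the pip packages installed above.
    required = ["tensorflow", "tensorlayer", "cv2", "numpy", "pycocotools", "matplotlib"]
    missing = missing_packages(required)
    if missing:
        print("missing packages:", ", ".join(missing))
    else:
        print("GPUs visible to tensorflow:", gpu_count())
```

Checking for missing modules first keeps the failure message readable: a missing dependency is reported by name instead of surfacing as an import traceback from inside tensorflow or tensorlayer.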

## Extra configuration for exporting model
For training, the above configuration is enough, but to export the model into **onnx** format for inference, one should install the
following two extra libraries:
* tf2onnx (necessary, used to convert a .pb format model into a .onnx format model) [reference](https://github.com/onnx/tensorflow-onnx)
The hyperpose python training library handles the whole pipeline for developing the pose estimation system, including training, evaluating and testing. Its goal is to produce a .npz file that contains the well-trained model weights. For the training platform, the environment configuration above is enough. However, most inference engines, such as [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html), only accept models in .pb or .onnx format. Thus, one needs to convert the trained model loaded with the .npz weights into .pb or .onnx format for further deployment, which needs the extra configuration below:

(I)Convert to .pb format:
To convert the model into .pb format, we use *@tf.function* to decorate the *infer* function of each model class, so we can use the *get_concrete_function* function from tensorflow to construct the frozen model computation graph and then save it in .pb format.
We already provide a script with a cli to facilitate conversion, which is located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py). All we need here is the **tensorflow** library that we already installed.

(II)Convert to .onnx format:
To convert the model into .onnx format, we first need to convert the model into .pb format, then convert it from .pb format into .onnx format. Two extra libraries are needed:
* tf2onnx
*tf2onnx* is used to convert a .pb format model into a .onnx format model and is necessary here. For detailed information see [reference](https://github.com/onnx/tensorflow-onnx).
Install tf2onnx by running:
```bash
pip install -U tf2onnx
```
* graph_transforms (optional, used to check the input and output nodes of the .pb file if one doesn't know them)
Build graph_transforms according to [reference](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)

* graph_transforms
*graph_transforms* is used to check the input and output nodes of the .pb file if one doesn't know them. When converting a .pb file into a .onnx file using tf2onnx, one is required to provide the input node names and output node names of the computation graph stored in the .pb file, so one may need to use *graph_transforms* to inspect the .pb file to get the node names.
Build graph_transforms according to [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)
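
If building *graph_transforms* with bazel is inconvenient, the node names can often be guessed from the graph structure alone. The sketch below is an assumption-laden alternative, not the *summarize_graph* tool: it treats `Placeholder` nodes as inputs and nodes that no other node consumes as outputs, which is only a heuristic. `guess_io_nodes`, `as_tensor_args` and `load_nodes` are hypothetical helper names; `load_nodes` needs tensorflow, the other two are plain Python and are demonstrated on a toy graph.

```python
def guess_io_nodes(nodes):
    """Guess input and output node names from (name, op, inputs) triples.

    Heuristic: inputs are Placeholder nodes, outputs are non-Placeholder
    nodes whose result is not consumed by any other node in the graph.
    """
    consumed = set()
    for _name, _op, inputs in nodes:
        for ref in inputs:
            # Strip the control-dependency marker and the output index.
            consumed.add(ref.lstrip("^").split(":")[0])
    input_names = [name for name, op, _ in nodes if op == "Placeholder"]
    output_names = [
        name for name, op, _ in nodes
        if name not in consumed and op != "Placeholder"
    ]
    return input_names, output_names


def as_tensor_args(node_names):
    """Join node names as tensor names with output index 0, comma separated."""
    return ",".join(f"{name}:0" for name in node_names)


def load_nodes(pb_path):
    """Read a frozen .pb file into (name, op, inputs) triples (requires tensorflow)."""
    import tensorflow as tf

    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as f:
        graph_def.ParseFromString(f.read())
    return [(n.name, n.op, list(n.input)) for n in graph_def.node]


# Toy graph standing in for a real frozen model (names are made up).
toy = [
    ("x", "Placeholder", []),
    ("w", "Const", []),
    ("conv", "Conv2D", ["x", "w"]),
    ("y1", "Identity", ["conv"]),
    ("y2", "Identity", ["conv:0"]),
]
inputs, outputs = guess_io_nodes(toy)
print("--inputs", as_tensor_args(inputs), "--outputs", as_tensor_args(outputs))
```

For a real model, `guess_io_nodes(load_nodes("your_frozen_model.pb"))` would produce candidate names; they should still be checked against the model definition before being passed to tf2onnx.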



46 changes: 27 additions & 19 deletions docs/markdown/quick_start/training.md
@@ -3,25 +3,25 @@
## Prerequisites
* Make sure you have configured the 'hyperpose' virtual environment following the training installation guide (if not, you can refer to [training installation](../install/training.md)).
* Make sure your GPU is available now (using tf.test.is_gpu_available(), which should return True).
* Make sure the Hyperpose training library is under the root directory of the project (where you write train.py and eval.py).
* Make sure the hyperpose training library is under the root directory of the project (where you write train.py and eval.py) or you have installed hyperpose through pypi.

## Train a model
The training procedure of Hyperpose is to set the model architecture, model backbone and dataset.
Users specify this configuration using the setting functions of the Config module with predefined enum values.
Users specify this configuration using the setup functions of the *Config* module with predefined enum values.
Code as simple as the following is enough for training.
```python
# >>> import modules of hyperpose
from hyperpose import Config,Model,Dataset
# >>> setting the model name is necessary to distinguish models
Config.set_model_name(args.model_name)
# >>> set model architecture(and model backbone when in need)
# >>> set model architecture (and set model backbone when in need)
Config.set_model_type(Config.MODEL.LightweightOpenpose)
Config.set_model_backbone(Config.BACKBONE.Vggtiny)
# >>> set dataset to use
Config.set_dataset_type(Config.DATA.MSCOCO)
# >>> set training type
Config.set_train_type(Config.TRAIN.Single_train)
# >>> configuration is done, get config object to assemble the system
# >>> configuration is done, get config object and assemble the system
config=Config.get_config()
model=Model.get_model(config)
dataset=Dataset.get_dataset(config)
@@ -30,21 +30,22 @@ train=Model.get_train(config)
train(model,dataset)
```
Then the integrated training pipeline will start.
For each model, Hyperpose will save all the related files in the directory
./save_dir/model_name, where *model_name* is the name the user set using *Config.set_model_name*.
For each model, Hyperpose will save all the related files in the directory
*./save_dir/model_name*, where *model_name* is the name the user set using *Config.set_model_name*.
The directory and its contents are listed below:
* directory to save model ./save_dir/model_name/model_dir
* directory to save train result ./save_dir/model_name/train_vis_dir
* directory to save evaluate result ./save_dir/model_name/eval_vis_dir
* directory to save test result ./save_dir/model_name/test_vis_dir
* directory to save dataset visualize result ./save_dir/model_name/data_vis_dir
* file path to save train log ./save_dir/model_name/log.txt
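
The layout above can be written down as a small helper, which is handy when a script needs to find the newest weights or the log file. This is only a sketch: `model_paths` is a hypothetical function, not part of the hyperpose API, `my_openpose` is a made-up model name, and the default `./save_dir` root is taken from the description above.

```python
from pathlib import Path


def model_paths(model_name, save_dir="./save_dir"):
    """Return the files and directories hyperpose uses for one model name."""
    root = Path(save_dir) / model_name
    return {
        "model_dir": root / "model_dir",          # saved model weights
        "train_vis_dir": root / "train_vis_dir",  # training results
        "eval_vis_dir": root / "eval_vis_dir",    # evaluation results
        "test_vis_dir": root / "test_vis_dir",    # test results
        "data_vis_dir": root / "data_vis_dir",    # dataset visualizations
        "log": root / "log.txt",                  # training log
    }


paths = model_paths("my_openpose")
# Location where the newest weights are expected for this model name.
print(paths["model_dir"] / "newest_model.npz")
```

Keeping the layout in one function means that a change of `save_dir` or model name only has to be made in one place.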

The above code section shows the simplest way to use the Hyperpose training library. To make full use of the Hyperpose training library,
you can refer to the [training tutorial](../tutorial/training.md).
We provide a helpful training script with a cli located at [train.py](https://github.com/tensorlayer/hyperpose/blob/master/train.py) to demonstrate the usage of the hyperpose python training library. Users can directly use the script to train their own model or use it as a template for further modification.

## Eval a model
The evaluation procedure using Hyperpose is almost the same as the training procedure:
the model will be loaded from ./save_dir/model_name/model_dir/newest_model.npz
The evaluation procedure using Hyperpose is almost the same as the training procedure;
the model will be loaded from ./save_dir/model_name/model_dir/newest_model.npz.
The code for evaluating is as follows:
```python
# >>> import modules of hyperpose
from hyperpose import Config,Model,Dataset
@@ -68,29 +69,36 @@ It should be noted that:
1.the model architecture, model backbone and dataset type should be the same as the configuration under which the model was trained.
2.the evaluation metrics will follow the official evaluation metrics of the dataset.

The above code section shows the simplest way to use the Hyperpose training library to evaluate a model trained by Hyperpose. To make full use of the Hyperpose training library, you can refer to the [training tutorial](../tutorial/training.md).
We also provide a helpful evaluating script with a cli located at [eval.py](https://github.com/tensorlayer/hyperpose/blob/master/eval.py) to demonstrate how to evaluate a model trained by hyperpose. Users can directly use the script to evaluate their own model or use it as a template for further modification.

The above code sections show the simplest way to use the Hyperpose training library to train and evaluate a model. To make full use of the Hyperpose training library, you can refer to the [training tutorial](../tutorial/training.md).

## Export a model
The trained model weights are saved as a .npz file. For further deployment, one should load the well-trained weights saved in the .npz file into the model and convert it into .pb format and .onnx format.
To export a model trained by Hyperpose, one should follow two steps:
* (1)convert the trained .npz model into .pb format
this can be done by calling export_pb.py from the Hyperpose repo
We use the *@tf.function* decorator to produce the static computation graph and save it into the .pb format.
We already provide a script with a cli to facilitate conversion, which is located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py).
To convert a model with model_type=**your_model_type** and model_name=**your_model_name** developed by hyperpose, one should place the trained model weight file **newest_model.npz** at path *./save_dir/your_model_name/model_dir/newest_model.npz* and run the following command line:
```bash
python export_pb.py --model_type=your_model_type --model_name=your_model_name
python export_pb.py --model_type=your_model_type --model_name=your_model_name
```
then the converted model will be put in ./save_dir/model_name/frozen_model_name.pb
one can also export the model oneself by loading it and using get_concrete_function; please refer to the tutorial for details
* (2)convert the frozen .pb format model by tensorflow-onnx
Then **frozen_your_model_name.pb** will be produced at path *./save_dir/your_model_name/frozen_your_model_name.pb*.
One can also export the model oneself by loading it and using *get_concrete_function*; please refer to the [tutorial](../tutorial/training.md) for more details.
* (2)convert the frozen .pb format model into .onnx format
We use the *tf2onnx* library to convert the .pb format model into .onnx format.
Make sure you have installed the extra requirements for exporting models from [training installation](../install/training.md)<br>
if you don't know the input and output names of the pb model, you should use the *summarize_graph* function
of graph_transforms from tensorflow
If you don't know the input and output node names of the pb model, you should use the *summarize_graph* function
of *graph_transforms* from tensorflow (see [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool) for more details).

```bash
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=your_frozen_model.pb
```
Then, once you know the input and output nodes of your .pb model, use tf2onnx:
```bash
python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs input0:0,input1:0... --outputs output0:0,output1:0,output2:0...
```
The args following inputs and outputs are the names of the input and output nodes in the .pb graph respectively. For example, if the input node name is **x** and the output node names are **y1**,**y2**, then the convert command should be:
The args following *--inputs* and *--outputs* are the names of the input and output nodes in the .pb graph respectively. For example, if the input node name is **x** and the output node names are **y1**,**y2**, then the convert command line should be:
```
python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs x:0 --outputs y1:0,y2:0
```