From a2ee2732173cf99a6b44acc61844233a4629b28c Mon Sep 17 00:00:00 2001
From: Gyx-One <1137743903@qq.com>
Date: Sat, 26 Jun 2021 18:03:47 +0800
Subject: [PATCH] (I)Document Refinement (1)training installation guide refined (2)training quick start guide refined

---
 docs/markdown/install/training.md | 90 +++++++++++++++++++--------
 docs/markdown/quick_start/training.md | 46 ++++++++------
 2 files changed, 91 insertions(+), 45 deletions(-)

diff --git a/docs/markdown/install/training.md b/docs/markdown/install/training.md
index f51b83ae..a975432c 100644
--- a/docs/markdown/install/training.md
+++ b/docs/markdown/install/training.md
@@ -2,52 +2,90 @@
 ## Prerequisites
 * [anaconda3](https://www.anaconda.com/products/individual)
+ Anaconda is used to create a virtual environment, which makes it easier to build the running environment and reduces the complexity of library dependencies. Here we mainly use it to create a virtual Python environment and install the CUDA run-time libraries.
 * [CUDA](https://developer.nvidia.com/cuda-downloads)
-
+ A CUDA environment is essential to run deep learning neural networks on GPUs. The CUDA installation package you download should match your system and your NVIDIA driver version.
 ## Configure environment
-Hyperpose training library can be directly used by putting Hyperpose in the directory and import.
-But it has to install the prerequist environment to make it available.
+There are two ways to install the hyperpose Python training library.
-The following instructions have been tested on the environments below:
+All the following instructions have been tested on the environments below:
 * Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 440.33.01, CUDA Verison=10.2
 * Ubuntu 18.04, Tesla V100-DGXStation, Nvidia Driver Version 410.79, CUDA Verison=10.0
 * Ubuntu 18.04, TITAN RTX, Nvidia Driver Version 430.64, CUDA Version=10.1
 * Ubuntu 18.04, TITAN Xp, Nvidia Driver Version 430.26, CUDA Version=10.2
+First of all, we recommend creating an Anaconda virtual environment, which avoids possible conflicts between the libraries already on your computer and the libraries hyperpose needs to install, and also handles the dependencies of the cudatoolkit and cudnn libraries in a very simple way.
+To create the virtual environment, run the following commands in bash:
 ```bash
 # >>> create virtual environment (choose yes)
 conda create -n hyperpose python=3.7
 # >>> activate the virtual environment, start installation
 conda activate hyperpose
-# >>> install cuda and cudnn using conda
+# >>> install cudatoolkit and cudnn library using conda
 conda install cudatoolkit=10.0.130
 conda install cudnn=7.6.0
-# >>> install tensorflow of version 2.0.0
-pip install tensorflow-gpu==2.0.0
-# >>> install the newest version tensorlayer from github
-pip install tensorlayer==2.2.3
-# >>> install other requirements (numpy<=17.0.0 because it has conflicts with pycocotools)
-pip install opencv-python
-pip install numpy==1.16.4
-pip install pycocotools
-pip install matplotlib
-# >>> now the configuration is done, check whether the GPU is avaliable.
-python
->>> import tensorflow as tf
->>> import tensorlayer as tl
->>> tf.test.is_gpu_available()
-# >>> if the output is true, congratulation! you can import and run hyperpose now
->>> from hyperpose import Config,Model,Dataset
 ```
+
+After configuring and activating the conda environment, we can begin to install hyperpose.
+(I)The first method (recommended) is to put the hyperpose Python module in the working directory and import it.
+After git-cloning the source [repository](https://github.com/tensorlayer/hyperpose.git), you can directly import the hyperpose Python library from the root directory of the cloned repository.
+To make the import work, you should install the prerequisite dependencies as follows:
+You can either install them from the requirements.txt in the [repository](https://github.com/tensorlayer/hyperpose.git)
+```bash
+ # install according to the requirements.txt
+ pip install -r requirements.txt
+```
+or install the libraries one by one
+```bash
+ # >>> install tensorflow of version 2.3.1
+ pip install tensorflow-gpu==2.3.1
+ # >>> install tensorlayer of version 2.2.3
+ pip install tensorlayer==2.2.3
+ # >>> install other requirements (numpy<=17.0.0 because it has conflicts with pycocotools)
+ pip install opencv-python
+ pip install numpy==1.16.4
+ pip install pycocotools
+ pip install matplotlib
+```
+This method of installation uses the latest source code and thus is less likely to run into compatibility problems.
+(II)The second method is to install from PyPI.
+We have already uploaded the hyperpose Python library to PyPI, so you can install it using pip, which gives you the latest stable version.
+```bash
+ pip install hyperpose
+```
+This will download and install all dependencies automatically.
+
+Now, after installing the dependent libraries and hyperpose itself, let's check whether the installation succeeded.
+Run the following commands in bash:
+```bash
+ # >>> now the configuration is done, check whether the GPU is available.
+ python
+ >>> import tensorflow as tf
+ >>> import tensorlayer as tl
+ >>> tf.test.is_gpu_available()
+ # >>> if the output is True, congratulations! you can import and run hyperpose now
+ >>> from hyperpose import Config,Model,Dataset
+```
+
 ## Extra configuration for exporting model
-For training, the above configuration is enough, but to export model into **onnx** format for inference,one should install the
-following two extra library:
-* tf2onnx (necessary ,used to convert .pb format model into .onnx format model) [reference](https://github.com/onnx/tensorflow-onnx)
+The hyperpose Python training library handles the whole pipeline for developing a pose estimation system, including training, evaluating and testing. Its goal is to produce a .npz file that contains the trained model weights. For the training platform, the environment configuration above is enough. However, most inference engines, such as [TensorRT](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html), only accept models in .pb or .onnx format. Thus, one needs to convert the trained model, loaded with the weights from the .npz file, to .pb or .onnx format for further deployment, which needs the extra configuration below:
+
+(I)Convert to .pb format:
+To convert the model into .pb format, we use *@tf.function* to decorate the *infer* function of each model class, so we can use the *get_concrete_function* function from tensorflow to construct the frozen model computation graph and then save it in .pb format.
+We already provide a script with a CLI to facilitate the conversion, located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py). The only thing needed here is the **tensorflow** library, which we have already installed.
+
+(II)Convert to .onnx format:
+To convert the model into .onnx format, we need to first convert the model into .pb format, then convert it from .pb format into .onnx format. Two extra libraries are needed:
+* tf2onnx
+*tf2onnx* is used to convert a .pb format model into a .onnx format model and is necessary here. For details, see the [reference](https://github.com/onnx/tensorflow-onnx).
+Install tf2onnx by running:
 ```bash
 pip install -U tf2onnx
 ```
-* graph_transforms (unnecesary,used to check the input and output node of the .pb file if one doesn't know)
-build graph_transforms according to [reference](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)
+
+* graph_transforms
+*graph_transforms* is used to look up the input and output nodes of the .pb file if they are not known. When converting a .pb file into a .onnx file using tf2onnx, one is required to provide the input and output node names of the computation graph stored in the .pb file, so one may need to use *graph_transforms* to inspect the .pb file and get the node names.
+Build graph_transforms according to [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool)
diff --git a/docs/markdown/quick_start/training.md b/docs/markdown/quick_start/training.md
index 16ebe2ec..19eb3b18 100644
--- a/docs/markdown/quick_start/training.md
+++ b/docs/markdown/quick_start/training.md
@@ -3,25 +3,25 @@
 ## Prerequisites
 * Make sure you have configured 'hyperpose' virtual environment following the training installation guide,(if not, you can refer to [training installation](../install/training.md)).
 * Make sure your GPU is available now(using tf.test.is_gpu_available() and it should return True)
-* Make sure the Hyperpose training Library is under the root directory of the project(where you write train.py and eval.py)
+* Make sure the hyperpose training library is under the root directory of the project (where you write train.py and eval.py), or that you have installed hyperpose through PyPI.
 ## Train a model
 The training procedure of Hyperpose is to set the model architecture, model backbone and dataset.
-User specify these configuration using the seting functions of Config module with predefined enum value.
+Users specify these configurations using the setup functions of the *Config* module with predefined enum values.
 The code for training as simple as following would work.
 ```bash
 # >>> import modules of hyperpose
 from hyperpose import Config,Model,Dataset
 # >>> set model name is necessary to distinguish models (neccesarry)
 Config.set_model_name(args.model_name)
-# >>> set model architecture(and model backbone when in need)
+# >>> set model architecture (and set model backbone when needed)
 Config.set_model_type(Config.MODEL.LightweightOpenpose)
 Config.set_model_backbone(Config.BACKBONE.Vggtiny)
 # >>> set dataset to use
 Config.set_dataset_type(Config.DATA.MSCOCO)
 # >>> set training type
 Config.set_train_type(Config.TRAIN.Single_train)
-# >>> configuration is done, get config object to assemble the system
+# >>> configuration is done, get config object and assemble the system
 config=Config.get_config()
 model=Model.get_model(config)
 dataset=Dataset.get_dataset(config)
@@ -30,21 +30,22 @@ train=Model.get_train(config)
 train(model,dataset)
 ```
 Then the integrated training pipeline will start.
-for each model, Hyperpose will save all the related files in the direatory:
-./save_dir/model_name, where *model_name* is the name user set by using *Config.set_model_name*
+For each model, Hyperpose will save all the related files in the directory:
+*./save_dir/model_name*, where *model_name* is the name the user set with *Config.set_model_name*.
 the directory and its contents are below:
 * directory to save model ./save_dir/model_name/model_dir
 * directory to save train result ./save_dir/model_name/train_vis_dir
 * directory to save evaluate result ./save_dir/model_name/eval_vis_dir
+* directory to save test result ./save_dir/model_name/test_vis_dir
 * directory to save dataset visualize result ./save_dir/model_name/data_vis_dir
 * file path to save train log ./save_dir/model_name/log.txt
-The above code section show the simplest way to use Hyperpose training library, to make full use of Hyperpose training library,
-you can refer to [training tutorial](../tutorial/training.md)
+We provide a training script with a CLI at [train.py](https://github.com/tensorlayer/hyperpose/blob/master/train.py) to demonstrate the usage of the hyperpose Python training library. Users can use the script directly to train their own model or use it as a template for further modification.
 ## Eval a model
-The evaluate procedure using Hyperpose is almost the same to the training procedure:
-the model will be loaded from the ./save_dir/model_name/model_dir/newest_model.npz
+The evaluation procedure using Hyperpose is almost the same as the training procedure;
+the model will be loaded from ./save_dir/model_name/model_dir/newest_model.npz.
+The code for evaluating is as follows:
 ```bash
 # >>> import modules of hyperpose
 from hyperpose import Config,Model,Dataset
@@ -68,21 +69,28 @@ It should be noted that:
 1.the model architecture, model backbone, dataset type should be the same with the configuration under which model was trained.
 2.the evaluation metrics will follow the official evaluation metrics of dataset
-The above code section show the simplest way to use Hyperpose training library to evaluate a model trained by Hyperpose, to make full use of Hyperpose training library, you can refer to [training tutorial](../tutorial/training.md)
+We also provide an evaluation script with a CLI at [eval.py](https://github.com/tensorlayer/hyperpose/blob/master/eval.py) to demonstrate how to evaluate a model trained by hyperpose. Users can use the script directly to evaluate their own model or use it as a template for further modification.
+
+The above code sections show the simplest way to use the Hyperpose training library to train and evaluate a model. To make full use of the Hyperpose training library, you can refer to the [training tutorial](../tutorial/training.md).
 ## Export a model
+The trained model weights are saved as a .npz file. For further deployment, one should load the model with the trained weights saved in the .npz file and convert it into the .pb format and .onnx format.
 To export a model trained by Hyperpose, one should follow two step:
 * (1)convert the trained .npz model into .pb format
- this can be done either call the export_pb.py from Hyperpose repo
+ We use the *@tf.function* decorator to produce the static computation graph and save it into the .pb format.
+ We already provide a script with a CLI to facilitate the conversion, located at [export_pb.py](https://github.com/tensorlayer/hyperpose/blob/master/export_pb.py).
+ To convert a model with model_type=**your_model_type** and model_name=**your_model_name** developed by hyperpose, one should place the trained model weight file **newest_model.npz** at path *./save_dir/your_model_name/model_dir/newest_model.npz*, and run the following command:
 ```bash
- python export_pb.py --model_type=your_model_type --model_name=your_model_name
+ python export_pb.py --model_type=your_model_type --model_name=your_model_name
 ```
- then the converted model will be put in the ./save_dir/model_name/forzen_model_name.pb
- one can also export himself by loading model and using get_concrete_function by himself, please refer the tutorial for details
-* (2)convert the frozen .pb format model by tensorflow-onnx
+ Then **frozen_your_model_name.pb** will be produced at path *./save_dir/your_model_name/frozen_your_model_name.pb*.
+ One can also export manually by loading the model and calling *get_concrete_function*; please refer to the [tutorial](../tutorial/training.md) for more details.
+* (2)convert the frozen .pb format model into .onnx format
+ We use the *tf2onnx* library to convert the .pb format model into .onnx format.
 Make sure you have installed the extra requirements for exporting models from [training installation](../install/training.md)
- if you don't know the input and output name of the pb model,you should use the function *summarize_graph* function
- of graph_transforms from tensorflow
+ if you don't know the input and output node names of the pb model, you should use the *summarize_graph* function
+ of *graph_transforms* from tensorflow (see [tensorflow tools](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms#using-the-graph-transform-tool) for more details).
+
 ```bash
 bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph=your_frozen_model.pb
 ```
@@ -90,7 +98,7 @@ To export a model trained by Hyperpose, one should follow two step:
 ```bash
 python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs input0:0,input1:0... --outputs output0:0,output1:0,output2:0...
 ```
- args follow inputs and outputs are the names of input and output nodes in .pb graph repectly, for example, if the input node name is **x** and output node name is **y1**,**y2**, then the convert bash should be:
+ The arguments following *--inputs* and *--outputs* are the names of the input and output nodes in the .pb graph, respectively. For example, if the input node name is **x** and the output node names are **y1**,**y2**, then the conversion command line should be:
 ```
 python -m tf2onnx.convert --graphdef your_frozen_model.pb --output output_model.onnx --inputs x:0 --outputs y1:0,y2:0
 ```
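
Reviewer note on the last hunk: the values passed to the inputs and outputs options of the tf2onnx command are the graph node names with a `:0` suffix, joined by commas. A minimal Python sketch of assembling that command line follows. The helper name `build_tf2onnx_command` is hypothetical (it is not part of hyperpose or tf2onnx), and it uses only the flags that already appear in the patch.

```python
def build_tf2onnx_command(graphdef, output, input_nodes, output_nodes):
    """Assemble the tf2onnx conversion command quoted in the patch.

    Hypothetical helper for illustration only. Each node name gets the
    ':0' suffix used in the patch's x / y1,y2 example.
    """
    if not input_nodes or not output_nodes:
        raise ValueError("at least one input and one output node name is required")
    inputs = ",".join(f"{name}:0" for name in input_nodes)
    outputs = ",".join(f"{name}:0" for name in output_nodes)
    return [
        "python", "-m", "tf2onnx.convert",
        "--graphdef", graphdef,
        "--output", output,
        "--inputs", inputs,
        "--outputs", outputs,
    ]


# Same placeholder file names and node names as the example in the patch.
cmd = build_tf2onnx_command(
    "your_frozen_model.pb", "output_model.onnx", ["x"], ["y1", "y2"]
)
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once tf2onnx is installed; the sketch only builds the command and never runs the conversion.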