From f037275c2edcf46b8c2e1b24ff290e372766ea68 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 12:16:44 +0000 Subject: [PATCH 01/21] Add Readme for vision results --- docs/api_docs/python/vision_results.md | 68 ++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) create mode 100644 docs/api_docs/python/vision_results.md diff --git a/docs/api_docs/python/vision_results.md b/docs/api_docs/python/vision_results.md new file mode 100644 index 00000000000..96c7778d3f9 --- /dev/null +++ b/docs/api_docs/python/vision_results.md @@ -0,0 +1,68 @@ +# Description of vision model prediction results + +## ClassifyResult +The code of ClassifyResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the classification label result and confidence of the image. + +API: `fastdeploy.vision.ClassifyResult` +The ClassifyResult will return: +- **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the topk passed in when using the classification model. For example, you can return the label results of the top 5 categories. +- **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the topk passed in when using the classification model, e.g. the confidence level of a top 5 classification can be returned. ## SegmentationResult The code of SegmentationResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the segmentation category predicted for each pixel in the image and the probability of the segmentation category.
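To make the ClassifyResult field layout concrete, here is a minimal Python sketch. `MockClassifyResult` and `best_label` are illustrative stand-ins invented for this doc, not part of the FastDeploy API; they only mirror the parallel `label_ids` and `scores` lists described above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MockClassifyResult:
    # Parallel lists whose length equals the topk passed to the model.
    label_ids: List[int] = field(default_factory=list)
    scores: List[float] = field(default_factory=list)

def best_label(result: MockClassifyResult) -> int:
    # The index of the highest confidence selects the corresponding label id.
    best = max(range(len(result.scores)), key=lambda i: result.scores[i])
    return result.label_ids[best]

result = MockClassifyResult(label_ids=[207, 208, 281], scores=[0.82, 0.10, 0.03])
print(best_label(result))  # -> 207
```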
+ +API: `fastdeploy.vision.SegmentationResult` +The SegmentationResult will return: +- **label_ids**(list of int):Member variable indicating the segmentation category for each pixel of a single image +- **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to label_map (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model) +- **shape**(list of int):Member variable indicating the shape of the output image, as H*W. + + +## DetectionResult +The code of DetectionResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target location (detection box), target class and target confidence level detected in the image. + +API: `fastdeploy.vision.DetectionResult` +- **boxes**(list of list(float)):Member variable, represents the coordinates of all target boxes detected in a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners. +- **scores**(list of float):Member variable indicating the confidence of all targets detected in a single image. +- **label_ids**(list of int):Member variable indicating all target categories detected for a single image. +- **masks**:Member variable that represents all instance masks detected from a single image, with the same number of elements and shape size as boxes. +- **contain_masks**:Member variable indicating whether the detection result contains the instance mask, the result of the instance segmentation model is generally set to True. + +API: `fastdeploy.vision.Mask ` +- **data**:Member variable indicating a detected mask.
- **shape**:Member variable representing the shape of the mask, e.g. (h,w). ## FaceDetectionResult The FaceDetectionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target boxes detected by face detection, face landmarks, target confidence and the number of landmarks per face. API: `fastdeploy.vision.FaceDetectionResult` - **boxes**(list of list(float)):Member variables that represent the coordinates of all target boxes detected in a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners - **scores**(list of float):Member variable indicating the confidence of all targets detected in a single image - **landmarks**(list of list(float)): Member variables that represent the key points of all faces detected in a single image - **landmarks_per_face**(int):Member variable indicating the number of key points in each face frame ## FaceRecognitionResult The FaceRecognitionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the embedding of the image features by the face recognition model. API: `fastdeploy.vision.FaceRecognitionResult` - **embedding**(list of float):Member variable indicating the final feature embedding extracted by the face recognition model, which can be used to calculate the feature similarity between faces. ## MattingResult The MattingResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the value of alpha transparency predicted by the model, the predicted foreground, etc. API:`fastdeploy.vision.MattingResult` - **alpha**(list of float):This is a one-dimensional vector of predicted alpha transparency values in the range `[0.,1.]`, with length `h*w`, h,w being the height and width of the input image.
- **foreground**(list of float):This is a one-dimensional vector for the predicted foreground, with values in `[0.,255.]` and length `h*w*c`, where h,w are the height and width of the input image and c is generally 3. The foreground is not always present; this property is valid only if the model itself predicts a foreground - **contain_foreground**(bool):Indicates whether the predicted outcome includes the foreground - **shape**(list of int): When `contain_foreground` is false, the shape only contains (h,w); when `contain_foreground` is true, the shape contains (h,w,c), where c is generally 3 ## OCRResult The OCRResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the text box detected in the image, the text box orientation classification, and the text content recognized inside the text box. API:`fastdeploy.vision.OCRResult ` - **boxes**: Member variable, indicates the coordinates of all target boxes detected in a single image, `boxes.size()` indicates the number of boxes detected in a single image, each box is represented by 8 int values in order of the 4 coordinate points of the box, the order is lower left, lower right, upper right, upper left - **text**:Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()` - **rec_scores**:Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()` - **cls_scores**:Member variable indicating the confidence level of the classification result of the text box, with the same number of elements as `boxes.size()` - **cls_scores**:Member variable indicating the orientation category of the text box, the number of elements is the same as `boxes.size(`) From 9997fed53a17097f40aefe4332ec5e92836e714a Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 12:20:30 +0000 Subject: [PATCH 02/21] Add Readme for vision results ---
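As a hedged sketch of how the OCRResult parallel lists described above fit together, the snippet below filters recognized text by recognition confidence. The data and the helper name are invented for illustration; only the field layout (one `boxes` entry of 8 ints, one `text` string, and one `rec_scores` value per detected box) follows the doc text.

```python
# Mock OCRResult-style parallel lists: one entry per detected text box.
# Each box holds 8 ints, the 4 corner points in order lower left,
# lower right, upper right, upper left, as described above.
boxes = [[10, 90, 200, 90, 200, 60, 10, 60],
         [10, 50, 180, 50, 180, 20, 10, 20]]
text = ["hello", "world"]
rec_scores = [0.98, 0.35]

def confident_text(text, rec_scores, threshold=0.5):
    # Keep only recognized strings whose confidence clears the threshold.
    return [t for t, s in zip(text, rec_scores) if s >= threshold]

print(confident_text(text, rec_scores))  # -> ['hello']
```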
docs/api_docs/python/vision_results.md | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/docs/api_docs/python/vision_results.md b/docs/api_docs/python/vision_results.md index 96c7778d3f9..999188ca9ef 100644 --- a/docs/api_docs/python/vision_results.md +++ b/docs/api_docs/python/vision_results.md @@ -3,8 +3,7 @@ ## ClassifyResult The code of ClassifyResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the classification label result and confidence of the image. -API: `fastdeploy.vision.ClassifyResult` -The ClassifyResult will return: +API: `fastdeploy.vision.ClassifyResult`, The ClassifyResult will return: - **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the topk passed in when using the classification model. For example, you can return the label results of the top 5 categories. - **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the topk passed in when using the classification model, e.g. the confidence level of a top 5 classification can be returned. @@ -12,8 +11,7 @@ The ClassifyResult will return: ## SegmentationResult The code of SegmentationResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the segmentation category predicted for each pixel in the image and the probability of the segmentation category.
-API: `fastdeploy.vision.SegmentationResult` -The SegmentationResult will return: +API: `fastdeploy.vision.SegmentationResult`, The SegmentationResult will return: - **label_ids**(list of int):Member variable indicating the segmentation category for each pixel of a single image - **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to label_map (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model) - **shape**(list of int):Member variable indicating the shape of the output image, as H*W. @@ -22,21 +20,21 @@ The SegmentationResult will return: ## DetectionResult The code of DetectionResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target location (detection box), target class and target confidence level detected in the image. -API: `fastdeploy.vision.DetectionResult` +API: `fastdeploy.vision.DetectionResult`, The DetectionResult will return: - **boxes**(list of list(float)):Member variable, represents the coordinates of all target boxes detected in a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners. - **scores**(list of float):Member variable indicating the confidence of all targets detected in a single image. - **label_ids**(list of int):Member variable indicating all target categories detected for a single image. - **masks**:Member variable that represents all instance masks detected from a single image, with the same number of elements and shape size as boxes.
- **contain_masks**:Member variable indicating whether the detection result contains the instance mask, the result of the instance segmentation model is generally set to True. -API: `fastdeploy.vision.Mask ` +API: `fastdeploy.vision.Mask `, The Mask will return: - **data**:Member variable indicating a detected mask. - **shape**:Member variable representing the shape of the mask, e.g. (h,w). ## FaceDetectionResult The FaceDetectionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target boxes detected by face detection, face landmarks, target confidence and the number of landmarks per face. -API: `fastdeploy.vision.FaceDetectionResult` +API: `fastdeploy.vision.FaceDetectionResult`, The FaceDetectionResult will return: - **boxes**(list of list(float)):Member variables that represent the coordinates of all target boxes detected in a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners - **scores**(list of float):Member variable indicating the confidence of all targets detected in a single image - **landmarks**(list of list(float)): Member variables that represent the key points of all faces detected in a single image - **landmarks_per_face**(int):Member variable indicating the number of key points in each face frame @@ -45,13 +43,13 @@ API: `fastdeploy.vision.FaceDetectionResult` ## FaceRecognitionResult The FaceRecognitionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the embedding of the image features by the face recognition model. -API: `fastdeploy.vision.FaceRecognitionResult` +API: `fastdeploy.vision.FaceRecognitionResult`, The FaceRecognitionResult will return: - **embedding**(list of float):Member variable indicating the final feature embedding extracted by the face recognition model, which can be used to calculate the feature similarity between faces.
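The face-recognition feature embedding described above is typically compared with cosine similarity. The sketch below is plain Python with illustrative names; it does not depend on FastDeploy and only assumes two equal-length lists of floats.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length embedding vectors:
    # dot(a, b) / (|a| * |b|), in [-1, 1]; higher means more similar faces.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical embeddings give similarity 1.0; orthogonal ones give 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # -> 1.0
```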
## MattingResult The MattingResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the value of alpha transparency predicted by the model, the predicted foreground, etc. -API:`fastdeploy.vision.MattingResult` +API:`fastdeploy.vision.MattingResult`, The MattingResult will return: - **alpha**(list of float):This is a one-dimensional vector of predicted alpha transparency values in the range `[0.,1.]`, with length `h*w`, h,w being the height and width of the input image. - **foreground**(list of float):This is a one-dimensional vector for the predicted foreground, with values in `[0.,255.]` and length `h*w*c`, where h,w are the height and width of the input image and c is generally 3. The foreground is not always present; this property is valid only if the model itself predicts a foreground - **contain_foreground**(bool):Indicates whether the predicted outcome includes the foreground - **shape**(list of int): When `contain_foreground` is false, the shape only contains (h,w); when `contain_foreground` is true, the shape contains (h,w,c), where c is generally 3 @@ -60,7 +58,7 @@ API:`fastdeploy.vision.MattingResult` ## OCRResult The OCRResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the text box detected in the image, the text box orientation classification, and the text content recognized inside the text box.
-API:`fastdeploy.vision.OCRResult ` +API:`fastdeploy.vision.OCRResult`, The OCRResult will return: - **boxes**: Member variable, indicates the coordinates of all target boxes detected in a single image, `boxes.size()` indicates the number of boxes detected in a single image, each box is represented by 8 int values in order of the 4 coordinate points of the box, the order is lower left, lower right, upper right, upper left - **text**:Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()` - **rec_scores**:Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()` From 6fd8784b1970b5ab777a33304804e65907d276af Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 12:26:37 +0000 Subject: [PATCH 03/21] Add Readme for vision results --- docs/api_docs/python/vision_results.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/api_docs/python/vision_results.md b/docs/api_docs/python/vision_results.md index 999188ca9ef..80ca8788d1b 100644 --- a/docs/api_docs/python/vision_results.md +++ b/docs/api_docs/python/vision_results.md @@ -1,4 +1,4 @@ -# Description of vision model prediction results +# Description of Vision Results ## ClassifyResult The code of ClassifyResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the classification label result and confidence of the image. @@ -59,8 +59,8 @@ The OCRResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the text box detected in the image, the text box orientation classification, and the text content recognized inside the text box.
API:`fastdeploy.vision.OCRResult`, The OCRResult will return: -- **boxes**: Member variable, indicates the coordinates of all target boxes detected in a single image, `boxes.size()` indicates the number of boxes detected in a single image, each box is represented by 8 int values in order of the 4 coordinate points of the box, the order is lower left, lower right, upper right, upper left -- **text**:Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()` -- **rec_scores**:Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()` -- **cls_scores**:Member variable indicating the confidence level of the classification result of the text box, with the same number of elements as `boxes.size()` -- **cls_scores**:Member variable indicating the orientation category of the text box, the number of elements is the same as `boxes.size(`) +- **boxes**(list of list(int)): Member variable, indicates the coordinates of all target boxes detected in a single image, `boxes.size()` indicates the number of boxes detected in a single image, each box is represented by 8 int values in order of the 4 coordinate points of the box, the order is lower left, lower right, upper right, upper left. 
- **text**(list of string):Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()` +- **rec_scores**(list of float):Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()` +- **cls_scores**(list of float):Member variable indicating the confidence level of the classification result of the text box, with the same number of elements as `boxes.size()` +- **cls_labels**(list of int):Member variable indicating the orientation category of the text box, the number of elements is the same as `boxes.size()` From c34823fdf6e859e186a510e944e148bf8abb1530 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 12:40:16 +0000 Subject: [PATCH 04/21] Add Readme for vision results --- docs/api_docs/python/vision_results_cn.md | 64 +++++++++++++++++++ ...vision_results.md => vision_results_en.md} | 0 2 files changed, 64 insertions(+) create mode 100644 docs/api_docs/python/vision_results_cn.md rename docs/api_docs/python/{vision_results.md => vision_results_en.md} (100%) diff --git a/docs/api_docs/python/vision_results_cn.md b/docs/api_docs/python/vision_results_cn.md new file mode 100644 index 00000000000..a0d2f0edf3e --- /dev/null +++ b/docs/api_docs/python/vision_results_cn.md @@ -0,0 +1,64 @@ +# 视觉模型预测结果说明 + +## ClassifyResult +ClassifyResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像的分类结果和置信度 + +API:`fastdeploy.vision.ClassifyResult`, 该结果返回: +**label_ids**(list of int): 成员变量,表示单张图片的分类结果,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类结果 +**scores**(list of float): 成员变量,表示单张图片在相应分类结果上的置信度,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类置信度 + + +## SegmentationResult +SegmentationResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像中每个像素预测出来的分割类别和分割类别的概率值 + +API:`fastdeploy.vision.SegmentationResul`, 该结果返回: +**label_map**(list of int): 成员变量,表示单张图片每个像素点的分割类别 +**score_map**(list of float):
成员变量,与label_map一一对应的所预测的分割类别概率值(当导出模型时指定`--output_op argmax`)或者经过softmax归一化化后的概率值(当导出模型时指定`--output_op softmax`或者导出模型时指定`--output_op none`同时模型初始化的时候设置模型类成员属性`apply_softmax=true`) +**shape**(list of int): 成员变量,表示输出图片的shape,为H*W + +## DetectionResult +DetectionResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测出来的目标框、目标类别和目标置信度。 + +API:`fastdeploy.vision.DetectionResult` , 该结果返回: +**boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 +**scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 +**label_ids**(list of int): 成员变量,表示单张图片检测出来的所有目标类别 +**masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致 +**contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为True. +fastdeploy.vision.Mask +**data**: 成员变量,表示检测到的一个mask +**shape**: 成员变量,表示mask的shape,如 `(h,w)` + + +## FaceDetectionResult +FaceDetectionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸检测出来的目标框、人脸landmarks,目标置信度和每张人脸的landmark数量。 +API:`fastdeploy.vision.FaceDetectionResult` , 该结果返回: +**boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 +**scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 +**landmarks**(list of list(float)): 成员变量,表示单张图片检测出来的所有人脸的关键点 +**landmarks_per_face**(int): 成员变量,表示每个人脸框中的关键点的数量 + + +## FaceRecognitionResult +FaceRecognitionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸识别模型对图像特征的embedding。 + +API:`fastdeploy.vision.FaceRecognitionResult`, 该结果返回: +**embedding**(list of float): 成员变量,表示人脸识别模型最终提取的特征embedding,可以用来计算人脸之间的特征相似度。 + + +## MattingResult +MattingResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明模型预测的alpha透明度的值,预测的前景等。 +API:`fastdeploy.vision.MattingResult`, 该结果返回: +**alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`h*w`,h,w为输入图像的高和宽 +**foreground(list of float): 是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`h*w*c`,h,w为输入图像的高和宽,c一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效 
+**contain_foreground**(bool): 表示预测的结果是否包含前景 +**shape**(list of int): 表示输出结果的shape,当`contain_foreground`为false,shape只包含`(h,w)`,当`contain_foreground`为true,shape包含`(h,w,c)`, c一般为3 + +## OCRResult +OCRResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测和识别出来的文本框,文本框方向分类,以及文本框内的文本内容 +API:`fastdeploy.vision.OCRResult`, 该结果返回: +**boxes**: 成员变量,表示单张图片检测出来的所有目标框坐标,boxes.size()表示单张图内检测出的框的个数,每个框以8个int数值依次表示框的4个坐标点,顺序为左下,右下,右上,左上 +**text**: 成员变量,表示多个文本框内被识别出来的文本内容,其元素个数与`boxes.size()`一致 +**rec_scores**: 成员变量,表示文本框内识别出来的文本的置信度,其元素个数与`boxes.size()`一致 +**cls_scores**: 成员变量,表示文本框的分类结果的置信度,其元素个数与`boxes.size()`一致 +**cls_labels**: 成员变量,表示文本框的方向分类类别,其元素个数与`boxes.size()`一致 diff --git a/docs/api_docs/python/vision_results.md b/docs/api_docs/python/vision_results_en.md similarity index 100% rename from docs/api_docs/python/vision_results.md rename to docs/api_docs/python/vision_results_en.md From ae11b80dc0f262dbfb0c98120899445f1de4080c Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 12:42:58 +0000 Subject: [PATCH 05/21] Add Readme for vision results --- docs/api_docs/python/vision_results_cn.md | 56 ++++++++++++----------- 1 file changed, 30 insertions(+), 26 deletions(-) diff --git a/docs/api_docs/python/vision_results_cn.md b/docs/api_docs/python/vision_results_cn.md index a0d2f0edf3e..c0d48fb1498 100644 --- a/docs/api_docs/python/vision_results_cn.md +++ b/docs/api_docs/python/vision_results_cn.md @@ -4,61 +4,65 @@ ClassifyResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像的分类结果和置信度 API:`fastdeploy.vision.ClassifyResult`, 该结果返回: -**label_ids**(list of int): 成员变量,表示单张图片的分类结果,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类结果 -**scores**(list of float): 成员变量,表示单张图片在相应分类结果上的置信度,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类置信度 +- **label_ids**(list of int): 成员变量,表示单张图片的分类结果,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类结果 +- **scores**(list of float): 成员变量,表示单张图片在相应分类结果上的置信度,其个数根据在使用分类模型时传入的topk决定,例如可以返回top 5的分类置信度 ## SegmentationResult 
SegmentationResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像中每个像素预测出来的分割类别和分割类别的概率值 API:`fastdeploy.vision.SegmentationResul`, 该结果返回: -**label_map**(list of int): 成员变量,表示单张图片每个像素点的分割类别 -**score_map**(list of float): 成员变量,与label_map一一对应的所预测的分割类别概率值(当导出模型时指定`--output_op argmax`)或者经过softmax归一化化后的概率值(当导出模型时指定`--output_op softmax`或者导出模型时指定`--output_op none`同时模型初始化的时候设置模型类成员属性`apply_softmax=true`) -**shape**(list of int): 成员变量,表示输出图片的shape,为H*W +- **label_map**(list of int): 成员变量,表示单张图片每个像素点的分割类别 +- **score_map**(list of float): 成员变量,与label_map一一对应的所预测的分割类别概率值(当导出模型时指定`--output_op argmax`)或者经过softmax归一化化后的概率值(当导出模型时指定`--output_op softmax`或者导出模型时指定`--output_op none`同时模型初始化的时候设置模型类成员属性`apply_softmax=true`) +- **shape**(list of int): 成员变量,表示输出图片的shape,为H*W ## DetectionResult DetectionResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测出来的目标框、目标类别和目标置信度。 API:`fastdeploy.vision.DetectionResult` , 该结果返回: -**boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 -**scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 -**label_ids**(list of int): 成员变量,表示单张图片检测出来的所有目标类别 -**masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致 -**contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为True. +- **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 +- **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 +- **label_ids**(list of int): 成员变量,表示单张图片检测出来的所有目标类别 +- **masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致 +- **contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为True. 
+ fastdeploy.vision.Mask -**data**: 成员变量,表示检测到的一个mask -**shape**: 成员变量,表示mask的shape,如 `(h,w)` +- **data**: 成员变量,表示检测到的一个mask +- **shape**: 成员变量,表示mask的shape,如 `(h,w)` ## FaceDetectionResult FaceDetectionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸检测出来的目标框、人脸landmarks,目标置信度和每张人脸的landmark数量。 + API:`fastdeploy.vision.FaceDetectionResult` , 该结果返回: -**boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 -**scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 -**landmarks**(list of list(float)): 成员变量,表示单张图片检测出来的所有人脸的关键点 -**landmarks_per_face**(int): 成员变量,表示每个人脸框中的关键点的数量 +- **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 +- **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 +- **landmarks**(list of list(float)): 成员变量,表示单张图片检测出来的所有人脸的关键点 +- **landmarks_per_face**(int): 成员变量,表示每个人脸框中的关键点的数量 ## FaceRecognitionResult FaceRecognitionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸识别模型对图像特征的embedding。 API:`fastdeploy.vision.FaceRecognitionResult`, 该结果返回: -**embedding**(list of float): 成员变量,表示人脸识别模型最终提取的特征embedding,可以用来计算人脸之间的特征相似度。 +- **embedding**(list of float): 成员变量,表示人脸识别模型最终提取的特征embedding,可以用来计算人脸之间的特征相似度。 ## MattingResult MattingResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明模型预测的alpha透明度的值,预测的前景等。 + API:`fastdeploy.vision.MattingResult`, 该结果返回: -**alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`h*w`,h,w为输入图像的高和宽 -**foreground(list of float): 是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`h*w*c`,h,w为输入图像的高和宽,c一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效 -**contain_foreground**(bool): 表示预测的结果是否包含前景 -**shape**(list of int): 表示输出结果的shape,当`contain_foreground`为false,shape只包含`(h,w)`,当`contain_foreground`为true,shape包含`(h,w,c)`, c一般为3 +- **alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`h*w`,h,w为输入图像的高和宽 +- **foreground**(list of float): 
是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`h*w*c`,h,w为输入图像的高和宽,c一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效 +- **contain_foreground**(bool): 表示预测的结果是否包含前景 +- **shape**(list of int): 表示输出结果的shape,当`contain_foreground`为false,shape只包含`(h,w)`,当`contain_foreground`为true,shape包含`(h,w,c)`, c一般为3 ## OCRResult OCRResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测和识别出来的文本框,文本框方向分类,以及文本框内的文本内容 + API:`fastdeploy.vision.OCRResult`, 该结果返回: -**boxes**: 成员变量,表示单张图片检测出来的所有目标框坐标,boxes.size()表示单张图内检测出的框的个数,每个框以8个int数值依次表示框的4个坐标点,顺序为左下,右下,右上,左上 -**text**: 成员变量,表示多个文本框内被识别出来的文本内容,其元素个数与`boxes.size()`一致 -**rec_scores**: 成员变量,表示文本框内识别出来的文本的置信度,其元素个数与`boxes.size()`一致 -**cls_scores**: 成员变量,表示文本框的分类结果的置信度,其元素个数与`boxes.size()`一致 -**cls_labels**: 成员变量,表示文本框的方向分类类别,其元素个数与`boxes.size()`一致 +- **boxes**(list of list(int)): 成员变量,表示单张图片检测出来的所有目标框坐标,boxes.size()表示单张图内检测出的框的个数,每个框以8个int数值依次表示框的4个坐标点,顺序为左下,右下,右上,左上 +- **text**(list of string): 成员变量,表示多个文本框内被识别出来的文本内容,其元素个数与`boxes.size()`一致 +- **rec_scores**(list of float): 成员变量,表示文本框内识别出来的文本的置信度,其元素个数与`boxes.size()`一致 +- **cls_scores**(list of float): 成员变量,表示文本框的分类结果的置信度,其元素个数与`boxes.size()`一致 +- **cls_labels**(list of int): 成员变量,表示文本框的方向分类类别,其元素个数与`boxes.size()`一致 From cef34150bce9965cba614252dc00c3e612bd4ae1 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 13:01:37 +0000 Subject: [PATCH 06/21] Add Readme for vision results --- docs/api_docs/python/vision_results_cn.md | 68 +++++++++++------------ docs/api_docs/python/vision_results_en.md | 36 ++++++------ 2 files changed, 52 insertions(+), 52 deletions(-) diff --git a/docs/api_docs/python/vision_results_cn.md b/docs/api_docs/python/vision_results_cn.md index c0d48fb1498..cbae4cd99e0 100644 --- a/docs/api_docs/python/vision_results_cn.md +++ b/docs/api_docs/python/vision_results_cn.md @@ -1,68 +1,68 @@ # 视觉模型预测结果说明 ## ClassifyResult -ClassifyResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像的分类结果和置信度 +ClassifyResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像的分类结果和置信度. 
API:`fastdeploy.vision.ClassifyResult`, 该结果返回: - **label_ids**(list of int): 成员变量,表示单张图片的分类结果,其个数根据在使用分类模型时传入的`topk`决定,例如可以返回`top5`的分类结果. - **scores**(list of float): 成员变量,表示单张图片在相应分类结果上的置信度,其个数根据在使用分类模型时传入的`topk`决定,例如可以返回`top5`的分类置信度. ## SegmentationResult SegmentationResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像中每个像素预测出来的分割类别和分割类别的概率值. API:`fastdeploy.vision.SegmentationResult`, 该结果返回: - **label_map**(list of int): 成员变量,表示单张图片每个像素点的分割类别. - **score_map**(list of float): 成员变量,与label_map一一对应的所预测的分割类别概率值(当导出模型时指定`--output_op argmax`)或者经过softmax归一化后的概率值(当导出模型时指定`--output_op softmax`或者导出模型时指定`--output_op none`同时模型初始化的时候设置模型类成员属性`apply_softmax=true`). - **shape**(list of int): 成员变量,表示输出图片的尺寸,为`H*W`. ## DetectionResult DetectionResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测出来的目标框、目标类别和目标置信度. API:`fastdeploy.vision.DetectionResult` , 该结果返回: - **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标. boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标. - **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度. - **label_ids**(list of int): 成员变量,表示单张图片检测出来的所有目标类别. - **masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致. - **contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为`True`.
+- **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标. boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标. +- **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度. +- **label_ids**(list of int): 成员变量,表示单张图片检测出来的所有目标类别. +- **masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致. +- **contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为`True`. fastdeploy.vision.Mask -- **data**: 成员变量,表示检测到的一个mask -- **shape**: 成员变量,表示mask的shape,如 `(h,w)` +- **data**: 成员变量,表示检测到的一个mask. +- **shape**: 成员变量,表示mask的尺寸,如 `H*W`. ## FaceDetectionResult -FaceDetectionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸检测出来的目标框、人脸landmarks,目标置信度和每张人脸的landmark数量。 +FaceDetectionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸检测出来的目标框、人脸landmarks,目标置信度和每张人脸的landmark数量. API:`fastdeploy.vision.FaceDetectionResult` , 该结果返回: -- **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标 -- **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度 -- **landmarks**(list of list(float)): 成员变量,表示单张图片检测出来的所有人脸的关键点 -- **landmarks_per_face**(int): 成员变量,表示每个人脸框中的关键点的数量 +- **boxes**(list of list(float)): 成员变量,表示单张图片检测出来的所有目标框坐标。boxes是一个list,其每个元素为一个长度为4的list, 表示为一个框,每个框以4个float数值依次表示xmin, ymin, xmax, ymax, 即左上角和右下角坐标. +- **scores**(list of float): 成员变量,表示单张图片检测出来的所有目标置信度. +- **landmarks**(list of list(float)): 成员变量,表示单张图片检测出来的所有人脸的关键点. +- **landmarks_per_face**(int): 成员变量,表示每个人脸框中的关键点的数量. ## FaceRecognitionResult -FaceRecognitionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸识别模型对图像特征的embedding。 +FaceRecognitionResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明人脸识别模型对图像特征的embedding. API:`fastdeploy.vision.FaceRecognitionResult`, 该结果返回: -- **embedding**(list of float): 成员变量,表示人脸识别模型最终提取的特征embedding,可以用来计算人脸之间的特征相似度。 +- **embedding**(list of float): 成员变量,表示人脸识别模型最终提取的特征embedding,可以用来计算人脸之间的特征相似度. 
## MattingResult -MattingResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明模型预测的alpha透明度的值,预测的前景等。 +MattingResult 代码定义在`fastdeploy/vision/common/result.h`中,用于表明模型预测的alpha透明度的值,预测的前景等. API:`fastdeploy.vision.MattingResult`, 该结果返回: -- **alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`h*w`,h,w为输入图像的高和宽 -- **foreground**(list of float): 是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`h*w*c`,h,w为输入图像的高和宽,c一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效 -- **contain_foreground**(bool): 表示预测的结果是否包含前景 -- **shape**(list of int): 表示输出结果的shape,当`contain_foreground`为false,shape只包含`(h,w)`,当`contain_foreground`为true,shape包含`(h,w,c)`, c一般为3 +- **alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`H*W`,H,W为输入图像的高和宽. +- **foreground**(list of float): 是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`H*W*C`,H,W为输入图像的高和宽,C一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效. +- **contain_foreground**(bool): 表示预测的结果是否包含前景. +- **shape**(list of int): 表示输出结果的shape,当`contain_foreground`为`false`,shape只包含`(H,W)`,当`contain_foreground`为true,shape包含`(H,W,C)`, C一般为3. ## OCRResult -OCRResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测和识别出来的文本框,文本框方向分类,以及文本框内的文本内容 +OCRResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测和识别出来的文本框,文本框方向分类,以及文本框内的文本内容. API:`fastdeploy.vision.OCRResult`, 该结果返回: -- **boxes**(list of list(int)): 成员变量,表示单张图片检测出来的所有目标框坐标,boxes.size()表示单张图内检测出的框的个数,每个框以8个int数值依次表示框的4个坐标点,顺序为左下,右下,右上,左上 -- **text**(list of string): 成员变量,表示多个文本框内被识别出来的文本内容,其元素个数与`boxes.size()`一致 -- **rec_scores**(list of float): 成员变量,表示文本框内识别出来的文本的置信度,其元素个数与`boxes.size()`一致 -- **cls_scores**(list of float): 成员变量,表示文本框的分类结果的置信度,其元素个数与`boxes.size()`一致 -- **cls_labels**(list of int): 成员变量,表示文本框的方向分类类别,其元素个数与`boxes.size()`一致 +- **boxes**(list of list(int)): 成员变量,表示单张图片检测出来的所有目标框坐标,boxes.size()表示单张图内检测出的框的个数,每个框以8个int数值依次表示框的4个坐标点,顺序为左下,右下,右上,左上. +- **text**(list of string): 成员变量,表示多个文本框内被识别出来的文本内容,其元素个数与`boxes.size()`一致. +- **rec_scores**(list of float): 成员变量,表示文本框内识别出来的文本的置信度,其元素个数与`boxes.size()`一致. 
+- **cls_scores**(list of float): 成员变量,表示文本框的分类结果的置信度,其元素个数与`boxes.size()`一致. +- **cls_labels**(list of int): 成员变量,表示文本框的方向分类类别,其元素个数与`boxes.size()`一致. diff --git a/docs/api_docs/python/vision_results_en.md b/docs/api_docs/python/vision_results_en.md index 80ca8788d1b..a1561497a69 100644 --- a/docs/api_docs/python/vision_results_en.md +++ b/docs/api_docs/python/vision_results_en.md @@ -4,17 +4,17 @@ The code of ClassifyResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the classification label result and confidence the image. API: `fastdeploy.vision.ClassifyResult`, The ClassifyResult will return: -- **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the topk passed in when using the classification model. For example, you can return the label results of the top 5 categories. +- **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the `topk ` passed in when using the classification model. For example, you can return the label results of the Top 5 categories. -- **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the topk passed in when using the classification model, e.g. the confidence level of a top 5 classification can be returned. +- **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the `topk ` passed in when using the classification model, e.g. the confidence level of a Top 5 classification can be returned. 
## SegmentationResult The code of SegmentationResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the segmentation category predicted for each pixel in the image and the probability of the segmentation category. API: `fastdeploy.vision.SegmentationResult`, The SegmentationResult will return: -- **label_ids**(list of int):Member variable indicating the segmentation category for each pixel of a single image -- **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to label_map (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model) -- **shape**(list of int):Member variable indicating the shape of the output image, as H*W. +- **label_ids**(list of int):Member variable indicating the segmentation category for each pixel of a single image. +- **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to `label_map ` (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model). +- **shape**(list of int):Member variable indicating the shape of the output image, as `H*W `. ## DetectionResult @@ -29,16 +29,16 @@ API: `fastdeploy.vision.DetectionResult`, The DetectionResult will return: API: `fastdeploy.vision.Mask `, The Mask will return: - **data**:Member variable indicating a detected mask. -- **shape**:Member variable representing the shape of the mask, e.g. (h,w). +- **shape**:Member variable representing the shape of the mask, e.g. 
`(H,W) `.

 ## FaceDetectionResult
 The FaceDetectionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target frames detected by face detection, face landmarks, target confidence and the number of landmarks per face.

 API: `fastdeploy.vision.FaceDetectionResult`, The FaceDetectionResult will return:
-- **data**(list of list(float)):Member variables that represent the coordinates of all target boxes detected by a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners
-- **scores**(list of float):Member variable indicating the confidence of all targets detected by a single image
-- **landmarks**(list of list(float)): Member variables that represent the key points of all faces detected by a single image
-- **landmarks_per_face**(int):Member variable indicating the number of key points in each face frame
+- **boxes**(list of list(float)):Member variables that represent the coordinates of all target boxes detected by a single image. boxes is a list, each element of which is a list of length 4, representing a box with 4 float values in order of xmin, ymin, xmax, ymax, i.e. the coordinates of the top left and bottom right corners.
+- **scores**(list of float):Member variable indicating the confidence of all targets detected by a single image.
+- **landmarks**(list of list(float)): Member variables that represent the key points of all faces detected by a single image.
+- **landmarks_per_face**(int):Member variable indicating the number of key points in each face box.

 ## FaceRecognitionResult
 The FaceRecognitionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the embedding of the image features by the face recognition model.
@@ -50,17 +50,17 @@ API: `fastdeploy.vision.FaceRecognitionResult`, The FaceRecognitionResult will r
 The MattingResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the value of alpha transparency predicted by the model, the predicted foreground, etc.

 API:`fastdeploy.vision.MattingResult`, The MattingResult will return:
-- **alpha**(list of float):This is a one-dimensional vector of predicted alpha transparency values in the range `[0.,1.]`, with length `h*w`, h,w being the height and width of the input image.
-- **foreground**(list of float):This is a one-dimensional vector for the predicted foreground, the value domain is `[0.,255.]`, the length is `h*w*c`, h,w is the height and width of the input image, c is generally 3, foreground is not necessarily there, only if the model itself predicts the foreground, this property will be valid
-- **contain_foreground**(bool):Indicates whether the predicted outcome includes the foreground
-- **shape**(list of int): When `contain_foreground is false, the shape only contains (h,w), when contain_foreground is true, the shape contains (h,w,c), c is generally 3
+- **alpha**(list of float):This is a one-dimensional vector of predicted alpha transparency values in the range `[0.,1.]`, with length `H*W`, H,W being the height and width of the input image.
+- **foreground**(list of float):This is a one-dimensional vector for the predicted foreground, the value domain is `[0.,255.]`, the length is `H*W*C`, H,W is the height and width of the input image, C is generally 3, foreground is not necessarily there, only if the model itself predicts the foreground, this property will be valid.
+- **contain_foreground**(bool):Indicates whether the predicted outcome includes the foreground.
+- **shape**(list of int): When `contain_foreground` is `false`, the shape only contains `(H,W)`, when `contain_foreground` is `true`, the shape contains `(H,W,C)`, C is generally 3.
## OCRResult The OCRResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the text box detected in the image, the text box orientation classification, and the text content recognized inside the text box. API:`fastdeploy.vision.OCRResult`, The OCRResult will return: - **boxes**(list of list(int)): Member variable, indicates the coordinates of all target boxes detected in a single image, `boxes.size()` indicates the number of boxes detected in a single image, each box is represented by 8 int values in order of the 4 coordinate points of the box, the order is lower left, lower right, upper right, upper left. -- **text**(list of string):Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()` -- **rec_scores**(list of float):Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()` -- **cls_scores**(list of float):Member variable indicating the confidence level of the classification result of the text box, with the same number of elements as `boxes.size()` -- **cls_labels**(list if int):Member variable indicating the orientation category of the text box, the number of elements is the same as `boxes.size(`) +- **text**(list of string):Member variable indicating the content of the recognized text in multiple text boxes, with the same number of elements as `boxes.size()`. +- **rec_scores**(list of float):Member variable indicating the confidence level of the text identified in the box, the number of elements is the same as `boxes.size()`. +- **cls_scores**(list of float):Member variable indicating the confidence level of the classification result of the text box, with the same number of elements as `boxes.size()`. +- **cls_labels**(list of int):Member variable indicating the orientation category of the text box, the number of elements is the same as `boxes.size()`. 
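Since the patch above documents how `OCRResult`'s per-box lists align, a short runnable sketch may help reviewers sanity-check the layout. The `OCRResultLike` dataclass below is a hypothetical stand-in for `fastdeploy.vision.OCRResult` (the real object is returned by a model's `predict` call); it only mirrors the field shapes described in these docs:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OCRResultLike:
    # Hypothetical stand-in mirroring the documented fastdeploy.vision.OCRResult layout.
    boxes: List[List[int]] = field(default_factory=list)    # 8 ints per box: 4 corner points
    text: List[str] = field(default_factory=list)           # one recognized string per box
    rec_scores: List[float] = field(default_factory=list)   # recognition confidence per box
    cls_scores: List[float] = field(default_factory=list)   # orientation-cls confidence per box
    cls_labels: List[int] = field(default_factory=list)     # orientation class per box

result = OCRResultLike(
    boxes=[[10, 80, 120, 80, 120, 20, 10, 20]],
    text=["hello"],
    rec_scores=[0.98],
    cls_scores=[0.99],
    cls_labels=[0],
)

# As documented, every per-box list has the same number of elements as `boxes`.
for name in ("text", "rec_scores", "cls_scores", "cls_labels"):
    assert len(getattr(result, name)) == len(result.boxes)

for box, txt, score in zip(result.boxes, result.text, result.rec_scores):
    print(f"text={txt!r} score={score:.2f} box={box}")
```

Iterating the zipped lists, as above, is the usual way to consume the result: one text string and one confidence per detected box.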
From 532f18fda136d4dc8c62920f2b9ea18ed06043a7 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 13:04:27 +0000 Subject: [PATCH 07/21] Add Readme for vision results --- docs/api_docs/python/vision_results_cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/api_docs/python/vision_results_cn.md b/docs/api_docs/python/vision_results_cn.md index cbae4cd99e0..53d4e255d87 100644 --- a/docs/api_docs/python/vision_results_cn.md +++ b/docs/api_docs/python/vision_results_cn.md @@ -55,7 +55,7 @@ API:`fastdeploy.vision.MattingResult`, 该结果返回: - **alpha**(list of float): 是一维向量,为预测的alpha透明度的值,值域为`[0.,1.]`,长度为`H*W`,H,W为输入图像的高和宽. - **foreground**(list of float): 是一维向量,为预测的前景,值域为`[0.,255.]`,长度为`H*W*C`,H,W为输入图像的高和宽,C一般为3,`foreground`不是一定有的,只有模型本身预测了前景,这个属性才会有效. - **contain_foreground**(bool): 表示预测的结果是否包含前景. -- **shape**(list of int): 表示输出结果的shape,当`contain_foreground`为`false`,shape只包含`(H,W)`,当`contain_foreground`为true,shape包含`(H,W,C)`, C一般为3. +- **shape**(list of int): 表示输出结果的shape,当`contain_foreground`为`false`,shape只包含`(H,W)`,当`contain_foreground`为`true`,shape包含`(H,W,C)`, C一般为3. ## OCRResult OCRResult代码定义在`fastdeploy/vision/common/result.h`中,用于表明图像检测和识别出来的文本框,文本框方向分类,以及文本框内的文本内容. From 302743c92c63b27dbaf393ffd9d4a0314516f27c Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 13:06:15 +0000 Subject: [PATCH 08/21] Add Readme for vision results --- docs/api_docs/python/vision_results_en.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/api_docs/python/vision_results_en.md b/docs/api_docs/python/vision_results_en.md index a1561497a69..3fa559c38bf 100644 --- a/docs/api_docs/python/vision_results_en.md +++ b/docs/api_docs/python/vision_results_en.md @@ -4,17 +4,17 @@ The code of ClassifyResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the classification label result and confidence the image. 
API: `fastdeploy.vision.ClassifyResult`, The ClassifyResult will return: -- **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the `topk ` passed in when using the classification model. For example, you can return the label results of the Top 5 categories. +- **label_ids**(list of int):Member variables that represent the classification label results of a single image, the number of which is determined by the `topk` passed in when using the classification model. For example, you can return the label results of the Top 5 categories. -- **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the `topk ` passed in when using the classification model, e.g. the confidence level of a Top 5 classification can be returned. +- **scores**(list of float):Member variables that indicate the confidence level of a single image on the corresponding classification result, the number of which is determined by the `topk` passed in when using the classification model, e.g. the confidence level of a Top 5 classification can be returned. ## SegmentationResult The code of SegmentationResult is defined in `fastdeploy/vision/common/result.h` and is used to indicate the segmentation category predicted for each pixel in the image and the probability of the segmentation category. API: `fastdeploy.vision.SegmentationResult`, The SegmentationResult will return: - **label_ids**(list of int):Member variable indicating the segmentation category for each pixel of a single image. 
-- **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to `label_map ` (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model). -- **shape**(list of int):Member variable indicating the shape of the output image, as `H*W `. +- **score_map**(list of float):Member variable, the predicted probability value of the segmentation category corresponding to `label_map` (specified when exporting the model `--output_op argmax`) or the probability value normalized by softmax (specified when exporting the model `--output_op softmax` or when exporting the model `--output_op none` and set the model class member attribute `apply_softmax=true` when initializing the model). +- **shape**(list of int):Member variable indicating the shape of the output image, as `H*W`. ## DetectionResult @@ -27,9 +27,9 @@ API: `fastdeploy.vision.DetectionResult`, The DetectionResult will return: - **masks**:Member variable that represents all instances of mask detected from a single image, with the same number of elements and shape size as boxes. - **contain_masks**:Member variable indicating whether the detection result contains the instance mask, the result of the instance segmentation model is generally set to True. -API: `fastdeploy.vision.Mask `, The Mask will return: +API: `fastdeploy.vision.Mask`, The Mask will return: - **data**:Member variable indicating a detected mask. -- **shape**:Member variable representing the shape of the mask, e.g. `(H,W) `. +- **shape**:Member variable representing the shape of the mask, e.g. `(H,W)`. 
## FaceDetectionResult The FaceDetectionResult code is defined in `fastdeploy/vision/common/result.h` and is used to indicate the target frames detected by face detection, face landmarks, target confidence and the number of landmarks per face. From b9968f62adaed191ea41376086d78628f4ac1ef7 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 13:14:25 +0000 Subject: [PATCH 09/21] Add Readme for vision results --- docs/api_docs/python/vision_results_cn.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/api_docs/python/vision_results_cn.md b/docs/api_docs/python/vision_results_cn.md index 53d4e255d87..586464a0671 100644 --- a/docs/api_docs/python/vision_results_cn.md +++ b/docs/api_docs/python/vision_results_cn.md @@ -26,7 +26,7 @@ API:`fastdeploy.vision.DetectionResult` , 该结果返回: - **masks**: 成员变量,表示单张图片检测出来的所有实例mask,其元素个数及shape大小与boxes一致. - **contain_masks**: 成员变量,表示检测结果中是否包含实例mask,实例分割模型的结果此项一般为`True`. -fastdeploy.vision.Mask +`fastdeploy.vision.Mask` , 该结果返回: - **data**: 成员变量,表示检测到的一个mask. - **shape**: 成员变量,表示mask的尺寸,如 `H*W`. From 5be415e6ee1d9a8c6a06c227915ef24fffb9e937 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 17 Oct 2022 13:16:57 +0000 Subject: [PATCH 10/21] Add Readme for vision results --- docs/api_docs/python/vision_results_en.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/api_docs/python/vision_results_en.md b/docs/api_docs/python/vision_results_en.md index 3fa559c38bf..1e97b2e9dc4 100644 --- a/docs/api_docs/python/vision_results_en.md +++ b/docs/api_docs/python/vision_results_en.md @@ -25,7 +25,7 @@ API: `fastdeploy.vision.DetectionResult`, The DetectionResult will return: - **socres**(list of float):Member variable indicating the confidence of all targets detected by a single image. - **label_ids**(list of int):Member variable indicating all target categories detected for a single image. 
- **masks**:Member variable that represents all instances of mask detected from a single image, with the same number of elements and shape size as boxes. -- **contain_masks**:Member variable indicating whether the detection result contains the instance mask, the result of the instance segmentation model is generally set to True. +- **contain_masks**:Member variable indicating whether the detection result contains the instance mask, the result of the instance segmentation model is generally set to `True`. API: `fastdeploy.vision.Mask`, The Mask will return: - **data**:Member variable indicating a detected mask. From d04eaf982b8b71ceb510c8eca5657467c5f6456d Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Tue, 18 Oct 2022 08:56:32 +0000 Subject: [PATCH 11/21] Add comments to create API docs --- docs/api_docs/python/ocr.md | 40 +++++++++++- docs/api_docs/python/requirements.txt | 1 + fastdeploy/vision/ocr/ppocr/classifier.h | 28 +++++++-- fastdeploy/vision/ocr/ppocr/dbdetector.h | 30 ++++++--- fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h | 27 +++++++- fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h | 25 ++++++-- fastdeploy/vision/ocr/ppocr/recognizer.h | 30 +++++++-- .../fastdeploy/vision/ocr/ppocr/__init__.py | 63 +++++++++++++++---- 8 files changed, 205 insertions(+), 39 deletions(-) diff --git a/docs/api_docs/python/ocr.md b/docs/api_docs/python/ocr.md index 4174694af27..552eabdc5fb 100644 --- a/docs/api_docs/python/ocr.md +++ b/docs/api_docs/python/ocr.md @@ -1,3 +1,41 @@ # OCR API -comming soon... +## fastdeploy.vision.ocr.DBDetector + +```{eval-rst} +.. autoclass:: fastdeploy.vision.ocr.DBDetector + :members: + :inherited-members: +``` + +## fastdeploy.vision.ocr.Classifier + +```{eval-rst} +.. autoclass:: fastdeploy.vision.ocr.Classifier + :members: + :inherited-members: +``` + +## fastdeploy.vision.ocr.Recognizer + +```{eval-rst} +.. 
autoclass:: fastdeploy.vision.ocr.Recognizer + :members: + :inherited-members: +``` + +## fastdeploy.vision.ocr.PPOCRSystemv2 + +```{eval-rst} +.. autoclass:: fastdeploy.vision.ocr.PPOCRSystemv2 + :members: + :inherited-members: +``` + +## fastdeploy.vision.ocr.PPOCRSystemv3 + +```{eval-rst} +.. autoclass:: fastdeploy.vision.ocr.PPOCRSystemv3 + :members: + :inherited-members: +``` diff --git a/docs/api_docs/python/requirements.txt b/docs/api_docs/python/requirements.txt index 73b4a140f62..4f8fa23fedf 100644 --- a/docs/api_docs/python/requirements.txt +++ b/docs/api_docs/python/requirements.txt @@ -3,3 +3,4 @@ recommonmark sphinx_markdown_tables sphinx_rtd_theme furo +myst_parser diff --git a/fastdeploy/vision/ocr/ppocr/classifier.h b/fastdeploy/vision/ocr/ppocr/classifier.h index 110ef7f370b..f810f98a376 100644 --- a/fastdeploy/vision/ocr/ppocr/classifier.h +++ b/fastdeploy/vision/ocr/ppocr/classifier.h @@ -20,20 +20,36 @@ namespace fastdeploy { namespace vision { +/** \brief All OCR series model APIs are defined inside this namespace + * + */ namespace ocr { - +/*! @brief Classifier object is used to load the classification model provided by PaddleOCR. + */ class FASTDEPLOY_DECL Classifier : public FastDeployModel { public: Classifier(); + /** \brief Set path of model file, and the configuration of runtime + * + * \param[in] model_file Path of model file, e.g ./ch_ppocr_mobile_v2.0_cls_infer/model.pdmodel. + * \param[in] params_file Path of parameter file, e.g ./ch_ppocr_mobile_v2.0_cls_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored. + * \param[in] custom_option RuntimeOption for inference, the default will use cpu, and choose the backend defined in `valid_cpu_backends`. + * \param[in] model_format Model format of the loaded model, default is Paddle format. 
+   */
  Classifier(const std::string& model_file, const std::string& params_file = "",
             const RuntimeOption& custom_option = RuntimeOption(),
             const ModelFormat& model_format = ModelFormat::PADDLE);
-
+  /// Get model's name
  std::string ModelName() const { return "ppocr/ocr_cls"; }
-
+  /** \brief Predict the input image and get OCR classification model result.
+   *
+   * \param[in] img The input image data, comes from cv::imread().
+   * \param[in] result The output of OCR classification model result will be written to this structure.
+   * \return true if the prediction succeeds, otherwise false.
+   */
  virtual bool Predict(cv::Mat* img, std::tuple<int, float>* result);

-  // pre & post parameters
+  // Pre & Post parameters
  float cls_thresh;
  std::vector<int> cls_image_shape;
  int cls_batch_num;
@@ -44,9 +60,9 @@ class FASTDEPLOY_DECL Classifier : public FastDeployModel {
 private:
  bool Initialize();
-
+  /// Preprocess the input data, and set the preprocessed results to `output`
  bool Preprocess(Mat* img, FDTensor* output);
-
+  /// Postprocess the inference results, and set the final result to `result`
  bool Postprocess(FDTensor& infer_result, std::tuple<int, float>* result);
 };

diff --git a/fastdeploy/vision/ocr/ppocr/dbdetector.h b/fastdeploy/vision/ocr/ppocr/dbdetector.h
index ad80c132967..53bf3aceec6 100644
--- a/fastdeploy/vision/ocr/ppocr/dbdetector.h
+++ b/fastdeploy/vision/ocr/ppocr/dbdetector.h
@@ -20,22 +20,38 @@
 namespace fastdeploy {
 namespace vision {
+/** \brief All OCR series model APIs are defined inside this namespace
+ *
+ */
 namespace ocr {
+/*! @brief DBDetector object is used to load the detection model provided by PaddleOCR.
+ */
 class FASTDEPLOY_DECL DBDetector : public FastDeployModel {
  public:
  DBDetector();
-
+  /** \brief Set path of model file, and the configuration of runtime
+   *
+   * \param[in] model_file Path of model file, e.g ./ch_PP-OCRv3_det_infer/model.pdmodel.
+   * \param[in] params_file Path of parameter file, e.g ./ch_PP-OCRv3_det_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored.
+   * \param[in] custom_option RuntimeOption for inference, the default will use cpu, and choose the backend defined in `valid_cpu_backends`.
+   * \param[in] model_format Model format of the loaded model, default is Paddle format.
+   */
  DBDetector(const std::string& model_file, const std::string& params_file = "",
             const RuntimeOption& custom_option = RuntimeOption(),
             const ModelFormat& model_format = ModelFormat::PADDLE);
-
+  /// Get model's name
  std::string ModelName() const { return "ppocr/ocr_det"; }
-
+  /** \brief Predict the input image and get OCR detection model result.
+   *
+   * \param[in] im The input image data, comes from cv::imread().
+   * \param[in] boxes_result The output of OCR detection model result will be written to this structure.
+   * \return true if the prediction succeeds, otherwise false.
+   */
  virtual bool Predict(cv::Mat* im,
                       std::vector<std::array<int, 8>>* boxes_result);

-  // pre&post process parameters
+  // Pre & Post process parameters
  int max_side_len;

  float ratio_h{};
@@ -53,14 +69,14 @@ class FASTDEPLOY_DECL DBDetector : public FastDeployModel {

 private:
  bool Initialize();
-
+  /// Preprocess the input data, and set the preprocessed results to `outputs`
  bool Preprocess(Mat* mat, FDTensor* outputs,
                  std::map<std::string, std::array<float, 2>>* im_info);
-
+  /*!
@brief Postprocess the inference results, and set the final result to `boxes_result`
+   */
  bool Postprocess(FDTensor& infer_result,
                   std::vector<std::array<int, 8>>* boxes_result,
                   const std::map<std::string, std::array<float, 2>>& im_info);
-
  PostProcessor post_processor_;
 };

diff --git a/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h b/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h
index f2a8ccbed8c..1b70adb5fd2 100644
--- a/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h
+++ b/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h
@@ -27,17 +27,38 @@
 namespace fastdeploy {
 namespace application {
+/** \brief OCR system can launch detection model, classification model and recognition model sequentially. All OCR system APIs are defined inside this namespace.
+ *
+ */
 namespace ocrsystem {
-
+/*! @brief PPOCRSystemv2 is used to load PP-OCRv2 series models provided by PaddleOCR.
+ */
 class FASTDEPLOY_DECL PPOCRSystemv2 : public FastDeployModel {
  public:
+  /** \brief Set up the detection model, classification model and recognition model respectively.
+   *
+   * \param[in] det_model The initialized DBDetector model, e.g loaded from ./ch_PP-OCRv2_det_infer
+   * \param[in] cls_model The initialized Classifier model, e.g loaded from ./ch_ppocr_mobile_v2.0_cls_infer
+   * \param[in] rec_model The initialized Recognizer model, e.g loaded from ./ch_PP-OCRv2_rec_infer
+   */
  PPOCRSystemv2(fastdeploy::vision::ocr::DBDetector* det_model,
                fastdeploy::vision::ocr::Classifier* cls_model,
                fastdeploy::vision::ocr::Recognizer* rec_model);
+  /** \brief The classification model is optional, so this constructor sets up only the detection model and the recognition model.
+   *
+   * \param[in] det_model The initialized DBDetector model, e.g loaded from ./ch_PP-OCRv2_det_infer
+   * \param[in] rec_model The initialized Recognizer model, e.g loaded from ./ch_PP-OCRv2_rec_infer
+   */
  PPOCRSystemv2(fastdeploy::vision::ocr::DBDetector* det_model,
                fastdeploy::vision::ocr::Recognizer* rec_model);
+  /** \brief Predict the input image and get OCR result.
+   *
+   * \param[in] img The input image data, comes from cv::imread().
+   * \param[in] result The output OCR result will be written to this structure.
+   * \return true if the prediction succeeds, otherwise false.
+   */
  virtual bool Predict(cv::Mat* img, fastdeploy::vision::OCRResult* result);

  bool Initialized() const override;
@@ -45,9 +66,11 @@ class FASTDEPLOY_DECL PPOCRSystemv2 : public FastDeployModel {
  fastdeploy::vision::ocr::DBDetector* detector_ = nullptr;
  fastdeploy::vision::ocr::Classifier* classifier_ = nullptr;
  fastdeploy::vision::ocr::Recognizer* recognizer_ = nullptr;
-
+  /// Launch the detection process in OCR.
  virtual bool Detect(cv::Mat* img, fastdeploy::vision::OCRResult* result);
+  /// Launch the recognition process in OCR.
  virtual bool Recognize(cv::Mat* img, fastdeploy::vision::OCRResult* result);
+  /// Launch the classification process in OCR.
  virtual bool Classify(cv::Mat* img, fastdeploy::vision::OCRResult* result);
 };

diff --git a/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h b/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h
index d9e2d4584ae..c88a0aff20e 100644
--- a/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h
+++ b/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h
@@ -18,19 +18,36 @@
 namespace fastdeploy {
 namespace application {
+/** \brief OCR system can launch detection model, classification model and recognition model sequentially. All OCR system APIs are defined inside this namespace.
+ *
+ */
 namespace ocrsystem {
-
+/*! @brief PPOCRSystemv3 is used to load PP-OCRv3 series models provided by PaddleOCR.
+ */
 class FASTDEPLOY_DECL PPOCRSystemv3 : public PPOCRSystemv2 {
  public:
+  /** \brief Set up the detection model, classification model and recognition model respectively.
+   *
+   * \param[in] det_model The initialized DBDetector model, e.g loaded from ./ch_PP-OCRv3_det_infer
+   * \param[in] cls_model The initialized Classifier model, e.g loaded from ./ch_ppocr_mobile_v2.0_cls_infer
+   * \param[in] rec_model The initialized Recognizer model, e.g loaded from ./ch_PP-OCRv3_rec_infer
+   */
  PPOCRSystemv3(fastdeploy::vision::ocr::DBDetector* det_model,
                fastdeploy::vision::ocr::Classifier* cls_model,
-                fastdeploy::vision::ocr::Recognizer* rec_model) : PPOCRSystemv2(det_model, cls_model, rec_model) {
+                fastdeploy::vision::ocr::Recognizer* rec_model)
+      : PPOCRSystemv2(det_model, cls_model, rec_model) {
    // The only difference between v2 and v3
    recognizer_->rec_image_shape[1] = 48;
  }
-
+  /** \brief The classification model is optional, so this constructor sets up only the detection model and the recognition model.
+   *
+   * \param[in] det_model The initialized DBDetector model, e.g loaded from ./ch_PP-OCRv3_det_infer
+   * \param[in] rec_model The initialized Recognizer model, e.g loaded from ./ch_PP-OCRv3_rec_infer
+   */
  PPOCRSystemv3(fastdeploy::vision::ocr::DBDetector* det_model,
-                fastdeploy::vision::ocr::Recognizer* rec_model) : PPOCRSystemv2(det_model, rec_model) {
+                fastdeploy::vision::ocr::Recognizer* rec_model)
+      : PPOCRSystemv2(det_model, rec_model) {
+    // The only difference between v2 and v3
    recognizer_->rec_image_shape[1] = 48;
  }
 };

diff --git a/fastdeploy/vision/ocr/ppocr/recognizer.h b/fastdeploy/vision/ocr/ppocr/recognizer.h
index ebe99d1e86c..3ab6731ba46 100644
--- a/fastdeploy/vision/ocr/ppocr/recognizer.h
+++ b/fastdeploy/vision/ocr/ppocr/recognizer.h
@@ -20,22 +20,39 @@
 namespace fastdeploy {
 namespace vision {
+/** \brief All OCR series model APIs are defined inside this namespace
+ *
+ */
 namespace ocr {
-
+/*! @brief Recognizer object is used to load the recognition model provided by PaddleOCR.
+ */
 class FASTDEPLOY_DECL Recognizer : public FastDeployModel {
  public:
  Recognizer();
+  /** \brief Set path of model file, and the configuration of runtime
+   *
+   * \param[in] model_file Path of model file, e.g ./ch_PP-OCRv3_rec_infer/model.pdmodel.
+   * \param[in] params_file Path of parameter file, e.g ./ch_PP-OCRv3_rec_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored.
+   * \param[in] label_path Path of label file used by OCR recognition model. e.g ./ppocr_keys_v1.txt
+   * \param[in] custom_option RuntimeOption for inference, the default will use cpu, and choose the backend defined in `valid_cpu_backends`.
+   * \param[in] model_format Model format of the loaded model, default is Paddle format.
+   */
  Recognizer(const std::string& model_file, const std::string& params_file = "",
             const std::string& label_path = "",
             const RuntimeOption& custom_option = RuntimeOption(),
             const ModelFormat& model_format = ModelFormat::PADDLE);
-
+  /// Get model's name
  std::string ModelName() const { return "ppocr/ocr_rec"; }
-
+  /** \brief Predict the input image and get OCR recognition model result.
+   *
+   * \param[in] img The input image data, comes from cv::imread().
+   * \param[in] rec_result The output of OCR recognition model result will be written to this structure.
+   * \return true if the prediction succeeds, otherwise false.
+   */
  virtual bool Predict(cv::Mat* img, std::tuple<std::string, float>* rec_result);

-  // pre & post parameters
+  // Pre & Post parameters
  std::vector<std::string> label_list;
  int rec_batch_num;
  int rec_img_h;
@@ -48,10 +65,11 @@ class FASTDEPLOY_DECL Recognizer : public FastDeployModel {

 private:
  bool Initialize();
-
+  /// Preprocess the input data, and set the preprocessed results to `outputs`
  bool Preprocess(Mat* img, FDTensor* outputs,
                  const std::vector<int>& rec_image_shape);
-
+  /*!
@brief Postprocess the inference results, and set the final result to `rec_result` + */ bool Postprocess(FDTensor& infer_result, std::tuple<std::string, float>* rec_result); }; diff --git a/python/fastdeploy/vision/ocr/ppocr/__init__.py b/python/fastdeploy/vision/ocr/ppocr/__init__.py index 53888ba0406..54e1f77ea65 100644 --- a/python/fastdeploy/vision/ocr/ppocr/__init__.py +++ b/python/fastdeploy/vision/ocr/ppocr/__init__.py @@ -24,8 +24,13 @@ def __init__(self, params_file="", runtime_option=None, model_format=ModelFormat.PADDLE): - # 调用基函数进行backend_option的初始化 - # 初始化后的option保存在self._runtime_option + """Load OCR detection model provided by PaddleOCR. + + :param model_file: (str)Path of model file, e.g ./ch_PP-OCRv3_det_infer/model.pdmodel. + :param params_file: (str)Path of parameter file, e.g ./ch_PP-OCRv3_det_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored. + :param runtime_option: (fastdeploy.RuntimeOption)RuntimeOption for inferring this model, if it's None, will use the default backend on CPU. + :param model_format: (fastdeploy.ModelFormat)Model format of the loaded model. + """ super(DBDetector, self).__init__(runtime_option) if (len(model_file) == 0): @@ -33,7 +38,6 @@ def __init__(self, else: self._model = C.vision.ocr.DBDetector( model_file, params_file, self._runtime_option, model_format) - # 通过self.initialized判断整个模型的初始化是否成功 assert self.initialized, "DBDetector initialize failed." # Property wrappers related to the DBDetector model @@ -81,8 +85,8 @@ def det_db_thresh(self, value): @det_db_box_thresh.setter def det_db_box_thresh(self, value): assert isinstance( - value, - float), "The value to set `det_db_box_thresh` must be type of float." + value, float + ), "The value to set `det_db_box_thresh` must be type of float."
self._model.det_db_box_thresh = value @det_db_unclip_ratio.setter @@ -119,8 +123,13 @@ def __init__(self, params_file="", runtime_option=None, model_format=ModelFormat.PADDLE): - # 调用基函数进行backend_option的初始化 - # 初始化后的option保存在self._runtime_option + """Load OCR classification model provided by PaddleOCR. + + :param model_file: (str)Path of model file, e.g ./ch_ppocr_mobile_v2.0_cls_infer/model.pdmodel. + :param params_file: (str)Path of parameter file, e.g ./ch_ppocr_mobile_v2.0_cls_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored. + :param runtime_option: (fastdeploy.RuntimeOption)RuntimeOption for inferring this model, if it's None, will use the default backend on CPU. + :param model_format: (fastdeploy.ModelFormat)Model format of the loaded model. + """ super(Classifier, self).__init__(runtime_option) if (len(model_file) == 0): @@ -128,7 +137,6 @@ def __init__(self, else: self._model = C.vision.ocr.Classifier( model_file, params_file, self._runtime_option, model_format) - # 通过self.initialized判断整个模型的初始化是否成功 assert self.initialized, "Classifier initialize failed." @property @@ -159,7 +167,8 @@ def cls_image_shape(self, value): @cls_batch_num.setter def cls_batch_num(self, value): assert isinstance( - value, int), "The value to set `cls_batch_num` must be type of int." + value, + int), "The value to set `cls_batch_num` must be type of int." self._model.cls_batch_num = value @@ -170,8 +179,14 @@ def __init__(self, label_path="", runtime_option=None, model_format=ModelFormat.PADDLE): - # 调用基函数进行backend_option的初始化 - # 初始化后的option保存在self._runtime_option + """Load OCR recognition model provided by PaddleOCR. + + :param model_file: (str)Path of model file, e.g ./ch_PP-OCRv3_rec_infer/model.pdmodel. + :param params_file: (str)Path of parameter file, e.g ./ch_PP-OCRv3_rec_infer/model.pdiparams, if the model format is ONNX, this parameter will be ignored. + :param label_path: (str)Path of label file used by OCR recognition model.
e.g ./ppocr_keys_v1.txt + :param runtime_option: (fastdeploy.RuntimeOption)RuntimeOption for inferring this model, if it's None, will use the default backend on CPU. + :param model_format: (fastdeploy.ModelFormat)Model format of the loaded model. + """ super(Recognizer, self).__init__(runtime_option) if (len(model_file) == 0): @@ -180,7 +195,6 @@ def __init__(self, self._model = C.vision.ocr.Recognizer( model_file, params_file, label_path, self._runtime_option, model_format) - # 通过self.initialized判断整个模型的初始化是否成功 assert self.initialized, "Recognizer initialize failed." @property @@ -210,12 +224,19 @@ def rec_img_w(self, value): @rec_batch_num.setter def rec_batch_num(self, value): assert isinstance( - value, int), "The value to set `rec_batch_num` must be type of int." + value, + int), "The value to set `rec_batch_num` must be type of int." self._model.rec_batch_num = value class PPOCRSystemv3(FastDeployModel): def __init__(self, det_model=None, cls_model=None, rec_model=None): + """Load detection, classification and recognition models to construct PP-OCRv3 + + :param det_model: (FastDeployModel) The detection model object created by fastdeploy.vision.ocr.DBDetector. + :param cls_model: (FastDeployModel) The classification model object created by fastdeploy.vision.ocr.Classifier. + :param rec_model: (FastDeployModel) The recognition model object created by fastdeploy.vision.ocr.Recognizer. + """ assert det_model is not None and rec_model is not None, "The det_model and rec_model cannot be None."
if cls_model is None: self.system = C.vision.ocr.PPOCRSystemv3(det_model._model, @@ -225,11 +246,22 @@ def __init__(self, det_model=None, cls_model=None, rec_model=None): det_model._model, cls_model._model, rec_model._model) def predict(self, input_image): + """Predict an input image + + :param input_image: (numpy.ndarray)The input image data, 3-D array with layout HWC, BGR format + :return: OCRResult + """ return self.system.predict(input_image) class PPOCRSystemv2(FastDeployModel): def __init__(self, det_model=None, cls_model=None, rec_model=None): + """Load detection, classification and recognition models to construct PP-OCRv2. + + :param det_model: (FastDeployModel) The detection model object created by fastdeploy.vision.ocr.DBDetector. + :param cls_model: (FastDeployModel) The classification model object created by fastdeploy.vision.ocr.Classifier. + :param rec_model: (FastDeployModel) The recognition model object created by fastdeploy.vision.ocr.Recognizer. + """ assert det_model is not None and rec_model is not None, "The det_model and rec_model cannot be None."
if cls_model is None: self.system = C.vision.ocr.PPOCRSystemv2(det_model._model, @@ -239,4 +271,9 @@ def __init__(self, det_model=None, cls_model=None, rec_model=None): det_model._model, cls_model._model, rec_model._model) def predict(self, input_image): + """Predict an input image + + :param input_image: (numpy.ndarray)The input image data, 3-D array with layout HWC, BGR format + :return: OCRResult + """ return self.system.predict(input_image) From 72837577b22d53461a69b2b5fcc65b9f6bade074 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Tue, 18 Oct 2022 11:23:02 +0000 Subject: [PATCH 12/21] Improve OCR comments --- fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h | 6 +++--- python/fastdeploy/vision/ocr/ppocr/__init__.py | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h b/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h index 1b70adb5fd2..04bf26b9f64 100644 --- a/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h +++ b/fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h @@ -66,11 +66,11 @@ class FASTDEPLOY_DECL PPOCRSystemv2 : public FastDeployModel { fastdeploy::vision::ocr::DBDetector* detector_ = nullptr; fastdeploy::vision::ocr::Classifier* classifier_ = nullptr; fastdeploy::vision::ocr::Recognizer* recognizer_ = nullptr; - /// Luanch the detection process in OCR. + /// Launch the detection process in OCR. virtual bool Detect(cv::Mat* img, fastdeploy::vision::OCRResult* result); - /// Luanch the recognition process in OCR. + /// Launch the recognition process in OCR. virtual bool Recognize(cv::Mat* img, fastdeploy::vision::OCRResult* result); - /// Luanch the classification process in OCR. + /// Launch the classification process in OCR. 
virtual bool Classify(cv::Mat* img, fastdeploy::vision::OCRResult* result); }; diff --git a/python/fastdeploy/vision/ocr/ppocr/__init__.py b/python/fastdeploy/vision/ocr/ppocr/__init__.py index 54e1f77ea65..412332e3a65 100644 --- a/python/fastdeploy/vision/ocr/ppocr/__init__.py +++ b/python/fastdeploy/vision/ocr/ppocr/__init__.py @@ -231,7 +231,7 @@ def rec_batch_num(self, value): class PPOCRSystemv3(FastDeployModel): def __init__(self, det_model=None, cls_model=None, rec_model=None): - """Load detetion, classification and recognition models to construct PP-OCRv3 + """Construct a pipeline with text detector, direction classifier and text recognizer models :param det_model: (FastDeployModel) The detection model object created by fastdeploy.vision.ocr.DBDetector. :param cls_model: (FastDeployModel) The classification model object created by fastdeploy.vision.ocr.Classifier. @@ -256,7 +256,7 @@ def predict(self, input_image): class PPOCRSystemv2(FastDeployModel): def __init__(self, det_model=None, cls_model=None, rec_model=None): - """Load detetion, classification and recognition models to construct PP-OCRv2. + """Construct a pipeline with text detector, direction classifier and text recognizer models :param det_model: (FastDeployModel) The detection model object created by fastdeploy.vision.ocr.DBDetector. :param cls_model: (FastDeployModel) The classification model object created by fastdeploy.vision.ocr.Classifier.
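The Python-side setters in this patch series (`det_db_box_thresh`, `cls_batch_num`, `rec_batch_num`, …) all follow the same validate-then-forward pattern around the pybind-wrapped C++ model. A minimal self-contained sketch of that pattern (`_FakeBinding` and `RecognizerSketch` are hypothetical stand-ins for illustration, not part of FastDeploy):

```python
class _FakeBinding:
    """Hypothetical stand-in for the pybind-wrapped C++ model object."""
    rec_batch_num = 6


class RecognizerSketch:
    def __init__(self):
        self._model = _FakeBinding()

    @property
    def rec_batch_num(self):
        # Read straight through to the underlying binding.
        return self._model.rec_batch_num

    @rec_batch_num.setter
    def rec_batch_num(self, value):
        # Validate on the Python side, then forward to the binding.
        assert isinstance(
            value, int), "The value to set `rec_batch_num` must be type of int."
        self._model.rec_batch_num = value


sketch = RecognizerSketch()
sketch.rec_batch_num = 8
print(sketch.rec_batch_num)  # 8
```

Keeping the `isinstance` check on the Python side surfaces a clear error message before a bad value ever reaches the C++ layer.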
From 51ad5629984a543b4becb20c5268cbe37e3be402 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 24 Oct 2022 12:58:49 +0000 Subject: [PATCH 13/21] fix conflict --- fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h | 57 ------------------- 1 file changed, 57 deletions(-) delete mode 100644 fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h diff --git a/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h b/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h deleted file mode 100644 index c88a0aff20e..00000000000 --- a/fastdeploy/vision/ocr/ppocr/ppocr_system_v3.h +++ /dev/null @@ -1,57 +0,0 @@ -// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. - -#pragma once - -#include "fastdeploy/vision/ocr/ppocr/ppocr_system_v2.h" - -namespace fastdeploy { -namespace application { -/** \brief OCR system can launch detection model, classification model and recognition model sequentially. All OCR system APIs are defined inside this namespace. - * - */ -namespace ocrsystem { -/*! @brief PPOCRSystemv3 is used to load PP-OCRv3 series models provided by PaddleOCR. - */ -class FASTDEPLOY_DECL PPOCRSystemv3 : public PPOCRSystemv2 { - public: - /** \brief Set up the detection model path, classification model path and recognition model path respectively. 
- * - * \param[in] det_model Path of detection model, e.g ./ch_PP-OCRv3_det_infer - * \param[in] cls_model Path of classification model, e.g ./ch_ppocr_mobile_v2.0_cls_infer - * \param[in] rec_model Path of recognition model, e.g ./ch_PP-OCRv3_rec_infer - */ - PPOCRSystemv3(fastdeploy::vision::ocr::DBDetector* det_model, - fastdeploy::vision::ocr::Classifier* cls_model, - fastdeploy::vision::ocr::Recognizer* rec_model) - : PPOCRSystemv2(det_model, cls_model, rec_model) { - // The only difference between v2 and v3 - recognizer_->rec_image_shape[1] = 48; - } - /** \brief Classification model is optional, so this function is set up the detection model path and recognition model path respectively. - * - * \param[in] det_model Path of detection model, e.g ./ch_PP-OCRv3_det_infer - * \param[in] rec_model Path of recognition model, e.g ./ch_PP-OCRv3_rec_infer - */ - PPOCRSystemv3(fastdeploy::vision::ocr::DBDetector* det_model, - fastdeploy::vision::ocr::Recognizer* rec_model) - : PPOCRSystemv2(det_model, rec_model) { - // The only difference between v2 and v3 - recognizer_->rec_image_shape[1] = 48; - } -}; - -} // namespace ocrsystem -} // namespace application -} // namespace fastdeploy From aecbf0058d07cbe14488e2eda4ab7573fe0bdf4e Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Mon, 24 Oct 2022 13:01:26 +0000 Subject: [PATCH 14/21] Fix OCR Readme --- examples/vision/ocr/PP-OCRv2/cpp/README.md | 2 +- examples/vision/ocr/PP-OCRv2/python/README.md | 2 +- examples/vision/ocr/PP-OCRv3/cpp/README.md | 2 +- examples/vision/ocr/PP-OCRv3/python/README.md | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/examples/vision/ocr/PP-OCRv2/cpp/README.md b/examples/vision/ocr/PP-OCRv2/cpp/README.md index 65478725647..f5c02011d4c 100644 --- a/examples/vision/ocr/PP-OCRv2/cpp/README.md +++ b/examples/vision/ocr/PP-OCRv2/cpp/README.md @@ -22,7 +22,7 @@ make -j wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar tar -xvf ch_PP-OCRv2_det_infer.tar 
-https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar +wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar tar -xvf ch_ppocr_mobile_v2.0_cls_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar diff --git a/examples/vision/ocr/PP-OCRv2/python/README.md b/examples/vision/ocr/PP-OCRv2/python/README.md index c51f8781fd9..a846f19c0f5 100644 --- a/examples/vision/ocr/PP-OCRv2/python/README.md +++ b/examples/vision/ocr/PP-OCRv2/python/README.md @@ -13,7 +13,7 @@ wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar tar -xvf ch_PP-OCRv2_det_infer.tar -https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar +wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar tar -xvf ch_ppocr_mobile_v2.0_cls_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar diff --git a/examples/vision/ocr/PP-OCRv3/cpp/README.md b/examples/vision/ocr/PP-OCRv3/cpp/README.md index 16a62887677..ace4d9b1287 100644 --- a/examples/vision/ocr/PP-OCRv3/cpp/README.md +++ b/examples/vision/ocr/PP-OCRv3/cpp/README.md @@ -22,7 +22,7 @@ make -j wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar tar -xvf ch_PP-OCRv3_det_infer.tar -https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar +wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar tar -xvf ch_ppocr_mobile_v2.0_cls_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar diff --git a/examples/vision/ocr/PP-OCRv3/python/README.md b/examples/vision/ocr/PP-OCRv3/python/README.md index 0fda05e281d..a51abba98ac 100644 --- a/examples/vision/ocr/PP-OCRv3/python/README.md +++ b/examples/vision/ocr/PP-OCRv3/python/README.md @@ -13,7 +13,7 @@ wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_infer.tar tar
xvf ch_PP-OCRv3_det_infer.tar -https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar +wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar tar -xvf ch_ppocr_mobile_v2.0_cls_infer.tar wget https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_infer.tar From 10e6107cbdbe218b80f0789db3c45b41b2a946e6 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Wed, 9 Nov 2022 04:56:16 +0000 Subject: [PATCH 15/21] Fix PPOCR readme --- examples/vision/ocr/PP-OCRv2/cpp/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/vision/ocr/PP-OCRv2/cpp/README.md b/examples/vision/ocr/PP-OCRv2/cpp/README.md index b4d254d4832..42769806ab6 100644 --- a/examples/vision/ocr/PP-OCRv2/cpp/README.md +++ b/examples/vision/ocr/PP-OCRv2/cpp/README.md @@ -13,7 +13,7 @@ mkdir build cd build wget https://bj.bcebos.com/paddlehub/fastdeploy/cpp/fastdeploy-linux-x64-gpu-0.4.0.tgz -tar xvf fastdeploy-linux-x64-0.4.0.tgz +tar xvf fastdeploy-linux-x64-gpu-0.4.0.tgz cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-0.4.0 make -j From 94a4d8a38a64d0de52f22ee3381206f5cf594c59 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Wed, 9 Nov 2022 04:58:08 +0000 Subject: [PATCH 16/21] Fix PPOCR readme --- examples/vision/ocr/PP-OCRv3/cpp/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/vision/ocr/PP-OCRv3/cpp/README.md b/examples/vision/ocr/PP-OCRv3/cpp/README.md index d8b28701e67..e96b1b362c7 100644 --- a/examples/vision/ocr/PP-OCRv3/cpp/README.md +++ b/examples/vision/ocr/PP-OCRv3/cpp/README.md @@ -13,7 +13,7 @@ mkdir build cd build wget https://bj.bcebos.com/paddlehub/fastdeploy/cpp/fastdeploy-linux-x64-gpu-0.4.0.tgz -tar xvf fastdeploy-linux-x64-0.4.0.tgz +tar xvf fastdeploy-linux-x64-gpu-0.4.0.tgz cmake ..
-DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-0.4.0 make -j From 9d904ce58e454142dbdeb101b2590dcf2b24c38f Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Tue, 31 Jan 2023 06:40:03 +0000 Subject: [PATCH 17/21] fix conflict --- examples/vision/ocr/PP-OCRv2/cpp/README_CN.md | 162 ++++++++++++++++++ .../ocr/PP-OCRv2/cpp/infer_static_shape.cc | 120 +++++++++++++ 2 files changed, 282 insertions(+) create mode 100644 examples/vision/ocr/PP-OCRv2/cpp/README_CN.md create mode 100644 examples/vision/ocr/PP-OCRv2/cpp/infer_static_shape.cc diff --git a/examples/vision/ocr/PP-OCRv2/cpp/README_CN.md b/examples/vision/ocr/PP-OCRv2/cpp/README_CN.md new file mode 100644 index 00000000000..ec8b0c16b3a --- /dev/null +++ b/examples/vision/ocr/PP-OCRv2/cpp/README_CN.md @@ -0,0 +1,162 @@ +[English](README.md) | 简体中文 +# PPOCRv2 C++部署示例 + +本目录下提供`infer.cc`快速完成PPOCRv2在CPU/GPU,以及GPU上通过TensorRT加速部署的示例。 + +在部署前,需确认以下两个步骤 + +- 1. 软硬件环境满足要求,参考[FastDeploy环境要求](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md) +- 2. 根据开发环境,下载预编译部署库和samples代码,参考[FastDeploy预编译库](../../../../../docs/cn/build_and_install/download_prebuilt_libraries.md) + +以Linux上CPU推理为例,在本目录执行如下命令即可完成编译测试,支持此模型需保证FastDeploy版本0.7.0以上(x.x.x>=0.7.0) + +``` +mkdir build +cd build +# 下载FastDeploy预编译库,用户可在上文提到的`FastDeploy预编译库`中自行选择合适的版本使用 +wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz +tar xvf fastdeploy-linux-x64-x.x.x.tgz +cmake .. 
-DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x +make -j + + +# 下载模型,图片和字典文件 +wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_det_infer.tar +tar -xvf ch_PP-OCRv2_det_infer.tar + +wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar +tar -xvf ch_ppocr_mobile_v2.0_cls_infer.tar + +wget https://paddleocr.bj.bcebos.com/PP-OCRv2/chinese/ch_PP-OCRv2_rec_infer.tar +tar -xvf ch_PP-OCRv2_rec_infer.tar + +wget https://gitee.com/paddlepaddle/PaddleOCR/raw/release/2.6/doc/imgs/12.jpg + +wget https://gitee.com/paddlepaddle/PaddleOCR/raw/release/2.6/ppocr/utils/ppocr_keys_v1.txt + +# CPU推理 +./infer_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 0 +# GPU推理 +./infer_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 1 +# GPU上TensorRT推理 +./infer_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 2 +# GPU上Paddle-TRT推理 +./infer_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 3 +# 昆仑芯XPU推理 +./infer_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 4 +# 华为昇腾推理, 需要使用静态shape的demo, 若用户需要连续地预测图片, 输入图片尺寸需要准备为统一尺寸 +./infer_static_shape_demo ./ch_PP-OCRv2_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer ./ppocr_keys_v1.txt ./12.jpg 1 +``` + +以上命令只适用于Linux或MacOS, Windows下SDK的使用方式请参考: +- [如何在Windows中使用FastDeploy C++ SDK](../../../../../docs/cn/faq/use_sdk_on_windows.md) + +如果用户使用华为昇腾NPU部署, 请参考以下方式在部署前初始化部署环境: +- [如何使用华为昇腾NPU部署](../../../../../docs/cn/faq/use_sdk_on_ascend.md) + +运行完成可视化结果如下图所示 + + + + +## PPOCRv2 C++接口 + +### PPOCRv2类 + +``` +fastdeploy::pipeline::PPOCRv2(fastdeploy::vision::ocr::DBDetector* det_model, + fastdeploy::vision::ocr::Classifier* cls_model, + 
fastdeploy::vision::ocr::Recognizer* rec_model); +``` + +PPOCRv2 的初始化,由检测,分类和识别模型串联构成 + +**参数** + +> * **DBDetector**(model): OCR中的检测模型 +> * **Classifier**(model): OCR中的分类模型 +> * **Recognizer**(model): OCR中的识别模型 + +``` +fastdeploy::pipeline::PPOCRv2(fastdeploy::vision::ocr::DBDetector* det_model, + fastdeploy::vision::ocr::Recognizer* rec_model); +``` +PPOCRv2 的初始化,由检测,识别模型串联构成(无分类器) + +**参数** + +> * **DBDetector**(model): OCR中的检测模型 +> * **Recognizer**(model): OCR中的识别模型 + +#### Predict函数 + +> ``` +> bool Predict(cv::Mat* img, fastdeploy::vision::OCRResult* result); +> bool Predict(const cv::Mat& img, fastdeploy::vision::OCRResult* result); +> ``` +> +> 模型预测接口,输入一张图片,返回OCR预测结果 +> +> **参数** +> +> > * **img**: 输入图像,注意需为HWC,BGR格式 +> > * **result**: OCR预测结果,包括由检测模型输出的检测框位置,分类模型输出的方向分类,以及识别模型输出的识别结果, OCRResult说明参考[视觉模型预测结果](../../../../../docs/api/vision_results/) + + +## DBDetector C++接口 + +### DBDetector类 + +``` +fastdeploy::vision::ocr::DBDetector(const std::string& model_file, const std::string& params_file = "", + const RuntimeOption& custom_option = RuntimeOption(), + const ModelFormat& model_format = ModelFormat::PADDLE); +``` + +DBDetector模型加载和初始化,其中模型为paddle模型格式。 + +**参数** + +> * **model_file**(str): 模型文件路径 +> * **params_file**(str): 参数文件路径,当模型格式为ONNX时,此参数传入空字符串即可 +> * **runtime_option**(RuntimeOption): 后端推理配置,默认为None,即采用默认配置 +> * **model_format**(ModelFormat): 模型格式,默认为Paddle格式 + +### Classifier类与DBDetector类相同 + +### Recognizer类 +``` + Recognizer(const std::string& model_file, + const std::string& params_file = "", + const std::string& label_path = "", + const RuntimeOption& custom_option = RuntimeOption(), + const ModelFormat& model_format = ModelFormat::PADDLE); +``` +Recognizer类初始化时,需要在label_path参数中,输入识别模型所需的label文件,其他参数均与DBDetector类相同 + +**参数** +> * **label_path**(str): 识别模型的label文件路径 + + +### 类成员变量 +#### DBDetector预处理参数 +用户可按照自己的实际需求,修改下列预处理参数,从而影响最终的推理和部署效果 + +> > * **max_side_len**(int): 检测算法前向时图片长边的最大尺寸,当长边超出这个值时会将长边resize到这个大小,短边等比例缩放,默认为960 +> > * 
**det_db_thresh**(double): DB模型输出预测图的二值化阈值,默认为0.3 +> > * **det_db_box_thresh**(double): DB模型输出框的阈值,低于此值的预测框会被丢弃,默认为0.6 +> > * **det_db_unclip_ratio**(double): DB模型输出框扩大的比例,默认为1.5 +> > * **det_db_score_mode**(string):DB后处理中计算文本框平均得分的方式,默认为slow,即求polygon区域的平均分数的方式 +> > * **use_dilation**(bool):是否对检测输出的feature map做膨胀处理,默认为False + +#### Classifier预处理参数 +用户可按照自己的实际需求,修改下列预处理参数,从而影响最终的推理和部署效果 + +> > * **cls_thresh**(double): 当分类模型输出的得分超过此阈值,输入的图片将被翻转,默认为0.9 + +## 其它文档 + +- [PPOCR 系列模型介绍](../../) +- [PPOCRv2 Python部署](../python) +- [模型预测结果说明](../../../../../docs/api/vision_results/) +- [如何切换模型推理后端引擎](../../../../../docs/cn/faq/how_to_change_backend.md) diff --git a/examples/vision/ocr/PP-OCRv2/cpp/infer_static_shape.cc b/examples/vision/ocr/PP-OCRv2/cpp/infer_static_shape.cc new file mode 100644 index 00000000000..7a48ba879e6 --- /dev/null +++ b/examples/vision/ocr/PP-OCRv2/cpp/infer_static_shape.cc @@ -0,0 +1,120 @@ +// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License.
+ +#include "fastdeploy/vision.h" +#ifdef WIN32 +const char sep = '\\'; +#else +const char sep = '/'; +#endif + +void InitAndInfer(const std::string& det_model_dir, + const std::string& cls_model_dir, + const std::string& rec_model_dir, + const std::string& rec_label_file, + const std::string& image_file, + const fastdeploy::RuntimeOption& option) { + auto det_model_file = det_model_dir + sep + "inference.pdmodel"; + auto det_params_file = det_model_dir + sep + "inference.pdiparams"; + + auto cls_model_file = cls_model_dir + sep + "inference.pdmodel"; + auto cls_params_file = cls_model_dir + sep + "inference.pdiparams"; + + auto rec_model_file = rec_model_dir + sep + "inference.pdmodel"; + auto rec_params_file = rec_model_dir + sep + "inference.pdiparams"; + + auto det_option = option; + auto cls_option = option; + auto rec_option = option; + + auto det_model = fastdeploy::vision::ocr::DBDetector( + det_model_file, det_params_file, det_option); + auto cls_model = fastdeploy::vision::ocr::Classifier( + cls_model_file, cls_params_file, cls_option); + auto rec_model = fastdeploy::vision::ocr::Recognizer( + rec_model_file, rec_params_file, rec_label_file, rec_option); + + // Users can enable static shape infer for the rec model when deploying + // PP-OCR on hardware which cannot support dynamic shape infer well, + // like the Huawei Ascend series. + rec_model.GetPreprocessor().SetStaticShapeInfer(true); + + assert(det_model.Initialized()); + assert(cls_model.Initialized()); + assert(rec_model.Initialized()); + + // The classification model is optional, so the PP-OCR can also be connected + // in series as follows + // auto ppocr_v2 = fastdeploy::pipeline::PPOCRv2(&det_model, &rec_model); + auto ppocr_v2 = + fastdeploy::pipeline::PPOCRv2(&det_model, &cls_model, &rec_model); + + // When users enable static shape infer for the rec model, the batch size of + // the cls and rec models must be set to 1.
+ ppocr_v2.SetClsBatchSize(1); + ppocr_v2.SetRecBatchSize(1); + + if (!ppocr_v2.Initialized()) { + std::cerr << "Failed to initialize PP-OCR." << std::endl; + return; + } + + auto im = cv::imread(image_file); + + fastdeploy::vision::OCRResult result; + if (!ppocr_v2.Predict(im, &result)) { + std::cerr << "Failed to predict." << std::endl; + return; + } + + std::cout << result.Str() << std::endl; + + auto vis_im = fastdeploy::vision::VisOcr(im, result); + cv::imwrite("vis_result.jpg", vis_im); + std::cout << "Visualized result saved in ./vis_result.jpg" << std::endl; +} + +int main(int argc, char* argv[]) { + if (argc < 7) { + std::cout << "Usage: infer_demo path/to/det_model path/to/cls_model " + "path/to/rec_model path/to/rec_label_file path/to/image " + "run_option, " + "e.g ./infer_demo ./ch_PP-OCRv2_det_infer " + "./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv2_rec_infer " + "./ppocr_keys_v1.txt ./12.jpg 0" + << std::endl; + std::cout << "The data type of run_option is int, 0: run with cpu; 1: run " + "with ascend." 
+ << std::endl; + return -1; + } + + fastdeploy::RuntimeOption option; + int flag = std::atoi(argv[6]); + + if (flag == 0) { + option.UseCpu(); + } else if (flag == 1) { + option.UseAscend(); + } + + std::string det_model_dir = argv[1]; + std::string cls_model_dir = argv[2]; + std::string rec_model_dir = argv[3]; + std::string rec_label_file = argv[4]; + std::string test_image = argv[5]; + InitAndInfer(det_model_dir, cls_model_dir, rec_model_dir, rec_label_file, + test_image, option); + return 0; +} From 4c9ea4749f665624d70d5cc4202ad1e446cff620 Mon Sep 17 00:00:00 2001 From: yunyaoXYY Date: Thu, 2 Feb 2023 12:40:41 +0000 Subject: [PATCH 18/21] Improve ascend readme --- docs/cn/build_and_install/huawei_ascend.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/docs/cn/build_and_install/huawei_ascend.md b/docs/cn/build_and_install/huawei_ascend.md index 3741027e278..a4e04c0d9c8 100644 --- a/docs/cn/build_and_install/huawei_ascend.md +++ b/docs/cn/build_and_install/huawei_ascend.md @@ -118,5 +118,13 @@ FastDeploy现在已经集成FlyCV, 用户可以在支持的硬件平台上使用 ## 六.昇腾部署Demo参考 -- 华为昇腾NPU 上使用C++部署 PaddleClas 分类模型请参考:[PaddleClas 华为升腾NPU C++ 部署示例](../../../examples/vision/classification/paddleclas/cpp/README.md) -- 华为昇腾NPU 上使用Python部署 PaddleClas 分类模型请参考:[PaddleClas 华为升腾NPU Python 部署示例](../../../examples/vision/classification/paddleclas/python/README.md) + +| 模型系列 | C++ 部署示例 | Python 部署示例 | +| :-----------| :-------- | :--------------- | +| PaddleClas | [昇腾NPU C++ 部署示例](../../../examples/vision/classification/paddleclas/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/classification/paddleclas/python/README_CN.md) | +| PaddleDetection | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/paddledetection/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/paddledetection/python/README_CN.md) | +| PaddleSeg | [昇腾NPU C++ 部署示例](../../../examples/vision/segmentation/paddleseg/cpp/README_CN.md) | [昇腾NPU Python 
部署示例](../../../examples//vision/segmentation/paddleseg/python/README_CN.md) |
+| PaddleOCR | [昇腾NPU C++ 部署示例](../../../examples/vision/ocr/PP-OCRv3/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision//ocr/PP-OCRv3/python/README_CN.md) |
+| Yolov5 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov5/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/sdetection/yolov5/python/README_CN.md) |
+| Yolov6 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov6/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/yolov6/python/README_CN.md) |
+| Yolov7 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov7/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/yolov7/python/README_CN.md) |

From e313f99c9ee6523e34539d62ea8a6d232563ae51 Mon Sep 17 00:00:00 2001
From: yunyaoXYY
Date: Thu, 2 Feb 2023 12:42:23 +0000
Subject: [PATCH 19/21] Improve ascend readme

---
 docs/cn/build_and_install/huawei_ascend.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/cn/build_and_install/huawei_ascend.md b/docs/cn/build_and_install/huawei_ascend.md
index a4e04c0d9c8..520b23eab32 100644
--- a/docs/cn/build_and_install/huawei_ascend.md
+++ b/docs/cn/build_and_install/huawei_ascend.md
@@ -125,6 +125,6 @@ FastDeploy现在已经集成FlyCV, 用户可以在支持的硬件平台上使用
 | PaddleDetection | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/paddledetection/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/paddledetection/python/README_CN.md) |
 | PaddleSeg | [昇腾NPU C++ 部署示例](../../../examples/vision/segmentation/paddleseg/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples//vision/segmentation/paddleseg/python/README_CN.md) |
 | PaddleOCR | [昇腾NPU C++ 部署示例](../../../examples/vision/ocr/PP-OCRv3/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision//ocr/PP-OCRv3/python/README_CN.md) |
-| Yolov5 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov5/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/sdetection/yolov5/python/README_CN.md) |
+| Yolov5 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov5/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/yolov5/python/README_CN.md) |
 | Yolov6 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov6/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/yolov6/python/README_CN.md) |
 | Yolov7 | [昇腾NPU C++ 部署示例](../../../examples/vision/detection/yolov7/cpp/README_CN.md) | [昇腾NPU Python 部署示例](../../../examples/vision/detection/yolov7/python/README_CN.md) |

From d12b7c440ee0a041ef28605b314c07c3c829b9eb Mon Sep 17 00:00:00 2001
From: yunyaoXYY
Date: Thu, 2 Feb 2023 13:11:39 +0000
Subject: [PATCH 20/21] Improve ascend readme

---
 docs/en/build_and_install/huawei_ascend.md | 12 +++++++++---
 .../detection/paddledetection/cpp/README.md | 16 ++++++++++------
 .../detection/paddledetection/python/README.md | 12 ++++++++----
 examples/vision/detection/yolov5/cpp/README.md | 12 +++++++-----
 .../vision/detection/yolov5/python/README.md | 8 +++++---
 .../vision/detection/yolov6/python/README.md | 9 ++++++---
 examples/vision/detection/yolov7/cpp/README.md | 14 ++++++++------
 examples/vision/ocr/PP-OCRv3/cpp/README.md | 2 ++
 .../vision/segmentation/paddleseg/cpp/README.md | 10 ++++++----
 9 files changed, 61 insertions(+), 34 deletions(-)

diff --git a/docs/en/build_and_install/huawei_ascend.md b/docs/en/build_and_install/huawei_ascend.md
index 55743ca1c01..ce0e38c1539 100644
--- a/docs/en/build_and_install/huawei_ascend.md
+++ b/docs/en/build_and_install/huawei_ascend.md
@@ -117,6 +117,12 @@ In end-to-end model inference, the pre-processing and post-processing phases are
 ## Deployment demo reference
-- Deploying PaddleClas Classification Model on Huawei Ascend NPU using C++ please refer to: [PaddleClas Huawei Ascend NPU C++ Deployment Example](../../../examples/vision/classification/paddleclas/cpp/README.md)
-
-- Deploying PaddleClas classification model on Huawei Ascend NPU using Python please refer to: [PaddleClas Huawei Ascend NPU Python Deployment Example](../../../examples/vision/classification/paddleclas/python/README.md)
+| Model | C++ Example | Python Example |
+| :-----------| :-------- | :--------------- |
+| PaddleClas | [Ascend NPU C++ Example](../../../examples/vision/classification/paddleclas/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/classification/paddleclas/python/README_CN.md) |
+| PaddleDetection | [Ascend NPU C++ Example](../../../examples/vision/detection/paddledetection/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/paddledetection/python/README_CN.md) |
+| PaddleSeg | [Ascend NPU C++ Example](../../../examples/vision/segmentation/paddleseg/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples//vision/segmentation/paddleseg/python/README_CN.md) |
+| PaddleOCR | [Ascend NPU C++ Example](../../../examples/vision/ocr/PP-OCRv3/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision//ocr/PP-OCRv3/python/README_CN.md) |
+| Yolov5 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov5/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov5/python/README_CN.md) |
+| Yolov6 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov6/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov6/python/README_CN.md) |
+| Yolov7 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov7/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov7/python/README_CN.md) |

diff --git a/examples/vision/detection/paddledetection/cpp/README.md b/examples/vision/detection/paddledetection/cpp/README.md
index b53d8ae4840..94e73fd458b 100755
--- a/examples/vision/detection/paddledetection/cpp/README.md
+++ b/examples/vision/detection/paddledetection/cpp/README.md
@@ -1,7 +1,7 @@
 English | [简体中文](README_CN.md)
 # PaddleDetection C++ Deployment Example
-This directory provides examples that `infer_xxx.cc` fast finishes the deployment of PaddleDetection models, including PPYOLOE/PicoDet/YOLOX/YOLOv3/PPYOLO/FasterRCNN/YOLOv5/YOLOv6/YOLOv7/RTMDet on CPU/GPU and GPU accelerated by TensorRT. 
+This directory provides examples in `infer_xxx.cc` for quickly deploying PaddleDetection models, including PPYOLOE/PicoDet/YOLOX/YOLOv3/PPYOLO/FasterRCNN/YOLOv5/YOLOv6/YOLOv7/RTMDet, on CPU/GPU, and on GPU accelerated by TensorRT.
 Before deployment, two steps require confirmation
@@ -15,13 +15,13 @@ ppyoloe is taken as an example for inference deployment
 mkdir build
 cd build
-# Download the FastDeploy precompiled library. Users can choose your appropriate version in the `FastDeploy Precompiled Library` mentioned above 
+# Download the FastDeploy precompiled library. Users can choose the appropriate version from the `FastDeploy Precompiled Library` mentioned above
 wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
 tar xvf fastdeploy-linux-x64-x.x.x.tgz
 cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
 make -j
-# Download the PPYOLOE model file and test images 
+# Download the PPYOLOE model file and test images
 wget https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_crn_l_300e_coco.tgz
 wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
 tar xvf ppyoloe_crn_l_300e_coco.tgz
@@ -33,12 +33,16 @@ tar xvf ppyoloe_crn_l_300e_coco.tgz
 ./infer_ppyoloe_demo ./ppyoloe_crn_l_300e_coco 000000014439.jpg 1
 # TensorRT Inference on GPU
 ./infer_ppyoloe_demo ./ppyoloe_crn_l_300e_coco 000000014439.jpg 2
+# Kunlunxin XPU Inference
+./infer_ppyoloe_demo ./ppyoloe_crn_l_300e_coco 000000014439.jpg 3
+# Huawei Ascend Inference
+./infer_ppyoloe_demo ./ppyoloe_crn_l_300e_coco 000000014439.jpg 4
 ```
 The above command works for Linux or MacOS. For SDK use-pattern in Windows, refer to:
 - [How to use FastDeploy C++ SDK in Windows](../../../../../docs/en/faq/use_sdk_on_windows.md)
-## PaddleDetection C++ Interface 
+## PaddleDetection C++ Interface
 ### Model Class
@@ -56,7 +60,7 @@ Loading and initializing PaddleDetection PPYOLOE model, where the format of mode
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path
 > * **config_file**(str): Configuration file path, which is the deployment yaml file exported by PaddleDetection
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration
@@ -73,7 +77,7 @@ Loading and initializing PaddleDetection PPYOLOE model, where the format of mode
 > **Parameter**
 >
 > > * **im**: Input images in HWC or BGR format
-> > * **result**: Detection result, including detection box and confidence of each box. Refer to [Vision Model Prediction Result](../../../../../docs/api/vision_results/) for DetectionResult 
+> > * **result**: Detection result, including detection box and confidence of each box. Refer to [Vision Model Prediction Result](../../../../../docs/api/vision_results/) for DetectionResult
 - [Model Description](../../)
 - [Python Deployment](../python)

diff --git a/examples/vision/detection/paddledetection/python/README.md b/examples/vision/detection/paddledetection/python/README.md
index baec5fe06eb..d0aa3e3012b 100755
--- a/examples/vision/detection/paddledetection/python/README.md
+++ b/examples/vision/detection/paddledetection/python/README.md
@@ -9,11 +9,11 @@ Before deployment, two steps require confirmation.
 This directory provides examples that `infer_xxx.py` fast finishes the deployment of PPYOLOE/PicoDet models on CPU/GPU and GPU accelerated by TensorRT. The script is as follows
 ```bash
-# Download deployment example code 
+# Download deployment example code
 git clone https://github.com/PaddlePaddle/FastDeploy.git
 cd FastDeploy/examples/vision/detection/paddledetection/python/
-# Download the PPYOLOE model file and test images 
+# Download the PPYOLOE model file and test images
 wget https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_crn_l_300e_coco.tgz
 wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
 tar xvf ppyoloe_crn_l_300e_coco.tgz
@@ -24,6 +24,10 @@ python infer_ppyoloe.py --model_dir ppyoloe_crn_l_300e_coco --image 000000014439
 python infer_ppyoloe.py --model_dir ppyoloe_crn_l_300e_coco --image 000000014439.jpg --device gpu
 # TensorRT inference on GPU (Attention: It is somewhat time-consuming for the operation of model serialization when running TensorRT inference for the first time. Please be patient.)
 python infer_ppyoloe.py --model_dir ppyoloe_crn_l_300e_coco --image 000000014439.jpg --device gpu --use_trt True
+# Kunlunxin XPU Inference
+python infer_ppyoloe.py --model_dir ppyoloe_crn_l_300e_coco --image 000000014439.jpg --device kunlunxin
+# Huawei Ascend Inference
+python infer_ppyoloe.py --model_dir ppyoloe_crn_l_300e_coco --image 000000014439.jpg --device ascend
 ```
 The visualized result after running is as follows
@@ -31,7 +35,7 @@ The visualized result after running is as follows
-## PaddleDetection Python Interface 
+## PaddleDetection Python Interface
 ```python
 fastdeploy.vision.detection.PPYOLOE(model_file, params_file, config_file, runtime_option=None, model_format=ModelFormat.PADDLE)
 ```
@@ -52,7 +56,7 @@ PaddleDetection model loading and initialization, among which model_file and par
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path
 > * **config_file**(str): Inference configuration yaml file path
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default. (use the default configuration)

diff --git a/examples/vision/detection/yolov5/cpp/README.md b/examples/vision/detection/yolov5/cpp/README.md
index 1b5e9ad8681..74f18208836 100755
--- a/examples/vision/detection/yolov5/cpp/README.md
+++ b/examples/vision/detection/yolov5/cpp/README.md
@@ -12,12 +12,12 @@ Taking the CPU inference on Linux as an example, the compilation test can be com
 ```bash
 mkdir build
 cd build
-# Download the FastDeploy precompiled library. Users can choose your appropriate version in the `FastDeploy Precompiled Library` mentioned above 
+# Download the FastDeploy precompiled library. Users can choose the appropriate version from the `FastDeploy Precompiled Library` mentioned above
 wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
 tar xvf fastdeploy-linux-x64-x.x.x.tgz
 cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
 make -j
-# Download the official converted yolov5 Paddle model files and test images 
+# Download the official converted yolov5 Paddle model files and test images
 wget https://bj.bcebos.com/paddlehub/fastdeploy/yolov5s_infer.tar
 tar -xvf yolov5s_infer.tar
 wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
@@ -31,11 +31,13 @@ wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/0000000
 ./infer_paddle_demo yolov5s_infer 000000014439.jpg 2
 # KunlunXin XPU inference
 ./infer_paddle_demo yolov5s_infer 000000014439.jpg 3
+# Huawei Ascend Inference
+./infer_paddle_demo yolov5s_infer 000000014439.jpg 4
 ```
 The above steps apply to the inference of Paddle models. If you want to conduct the inference of ONNX models, follow these steps:
 ```bash
-# 1. Download the official converted yolov5 ONNX model files and test images 
+# 1. Download the official converted yolov5 ONNX model files and test images
 wget https://bj.bcebos.com/paddlehub/fastdeploy/yolov5s.onnx
 wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
@@ -53,7 +55,7 @@ The visualized result after running is as follows
 The above command works for Linux or MacOS. For SDK use-pattern in Windows, refer to:
 - [How to use FastDeploy C++ SDK in Windows](../../../../../docs/cn/faq/use_sdk_on_windows.md)
-## YOLOv5 C++ Interface 
+## YOLOv5 C++ Interface
 ### YOLOv5 Class
@@ -69,7 +71,7 @@ YOLOv5 model loading and initialization, among which model_file is the exported
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path. Merely passing an empty string when the model is in ONNX format
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration
 > * **model_format**(ModelFormat): Model format. ONNX format by default

diff --git a/examples/vision/detection/yolov5/python/README.md b/examples/vision/detection/yolov5/python/README.md
index 0e815dd091d..23b6665c795 100755
--- a/examples/vision/detection/yolov5/python/README.md
+++ b/examples/vision/detection/yolov5/python/README.md
@@ -22,17 +22,19 @@ wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/0000000
 python infer.py --model yolov5s_infer --image 000000014439.jpg --device cpu
 # GPU inference
 python infer.py --model yolov5s_infer --image 000000014439.jpg --device gpu
-# TensorRT inference on GPU 
+# TensorRT inference on GPU
 python infer.py --model yolov5s_infer --image 000000014439.jpg --device gpu --use_trt True
 # KunlunXin XPU inference
 python infer.py --model yolov5s_infer --image 000000014439.jpg --device kunlunxin
+# Huawei Ascend Inference
+python infer.py --model yolov5s_infer --image 000000014439.jpg --device ascend
 ```
 The visualized result after running is as follows
-## YOLOv5 Python Interface 
+## YOLOv5 Python Interface
 ```python
 fastdeploy.vision.detection.YOLOv5(model_file, params_file=None, runtime_option=None, model_format=ModelFormat.ONNX)
 ```
@@ -42,7 +44,7 @@ YOLOv5 model loading and initialization, among which model_file is the exported
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path. No need to set when the model is in ONNX format
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration
 > * **model_format**(ModelFormat): Model format. ONNX format by default

diff --git a/examples/vision/detection/yolov6/python/README.md b/examples/vision/detection/yolov6/python/README.md
index 789df97474e..04bc9f34518 100755
--- a/examples/vision/detection/yolov6/python/README.md
+++ b/examples/vision/detection/yolov6/python/README.md
@@ -23,6 +23,9 @@ python infer_paddle_model.py --model yolov6s_infer --image 000000014439.jpg --d
 python infer_paddle_model.py --model yolov6s_infer --image 000000014439.jpg --device gpu
 # KunlunXin XPU inference
 python infer_paddle_model.py --model yolov6s_infer --image 000000014439.jpg --device kunlunxin
+# Huawei Ascend Inference
+python infer_paddle_model.py --model yolov6s_infer --image 000000014439.jpg --device ascend
+
 ```
 If you want to verify the inference of ONNX models, refer to the following command:
 ```bash
@@ -34,7 +37,7 @@ wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/0000000
 python infer.py --model yolov6s.onnx --image 000000014439.jpg --device cpu
 # GPU inference
 python infer.py --model yolov6s.onnx --image 000000014439.jpg --device gpu
-# TensorRT inference on GPU 
+# TensorRT inference on GPU
 python infer.py --model yolov6s.onnx --image 000000014439.jpg --device gpu --use_trt True
 ```
@@ -42,7 +45,7 @@ The visualized result after running is as follows
-## YOLOv6 Python Interface 
+## YOLOv6 Python Interface
 ```python
 fastdeploy.vision.detection.YOLOv6(model_file, params_file=None, runtime_option=None, model_format=ModelFormat.ONNX)
 ```
@@ -52,7 +55,7 @@ YOLOv6 model loading and initialization, among which model_file is the exported
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path. No need to set when the model is in ONNX format
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration
 > * **model_format**(ModelFormat): Model format. ONNX format by default

diff --git a/examples/vision/detection/yolov7/cpp/README.md b/examples/vision/detection/yolov7/cpp/README.md
index e36875e0cd2..a3abd6d19aa 100755
--- a/examples/vision/detection/yolov7/cpp/README.md
+++ b/examples/vision/detection/yolov7/cpp/README.md
@@ -1,7 +1,7 @@
 English | [简体中文](README_CN.md)
 # YOLOv7 C++ Deployment Example
-This directory provides examples that `infer.cc` fast finishes the deployment of YOLOv7 on CPU/GPU and GPU accelerated by TensorRT. 
+This directory provides examples in `infer.cc` for quickly deploying YOLOv7 on CPU/GPU, and on GPU accelerated by TensorRT.
 Before deployment, two steps require confirmation
@@ -13,7 +13,7 @@ Taking the CPU inference on Linux as an example, the compilation test can be com
 ```bash
 mkdir build
 cd build
-# Download the FastDeploy precompiled library. Users can choose your appropriate version in the `FastDeploy Precompiled Library` mentioned above 
+# Download the FastDeploy precompiled library. Users can choose the appropriate version from the `FastDeploy Precompiled Library` mentioned above
 wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
 tar xvf fastdeploy-linux-x64-x.x.x.tgz
 cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
@@ -29,10 +29,12 @@ wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/0000000
 ./infer_paddle_model_demo yolov7_infer 000000014439.jpg 1
 # KunlunXin XPU inference
 ./infer_paddle_model_demo yolov7_infer 000000014439.jpg 2
+# Huawei Ascend inference
+./infer_paddle_model_demo yolov7_infer 000000014439.jpg 3
 ```
 If you want to verify the inference of ONNX models, refer to the following command:
 ```bash
-# Download the official converted yolov7 ONNX model files and test images 
+# Download the official converted yolov7 ONNX model files and test images
 wget https://bj.bcebos.com/paddlehub/fastdeploy/yolov7.onnx
 wget https://gitee.com/paddlepaddle/PaddleDetection/raw/release/2.4/demo/000000014439.jpg
@@ -52,7 +54,7 @@ The visualized result after running is as follows
 The above command works for Linux or MacOS. For SDK use-pattern in Windows, refer to:
 - [How to use FastDeploy C++ SDK in Windows](../../../../../docs/en/faq/use_sdk_on_windows.md)
-## YOLOv7 C++ Interface 
+## YOLOv7 C++ Interface
 ### YOLOv7 Class
@@ -68,7 +70,7 @@ YOLOv7 model loading and initialization, among which model_file is the exported
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path. Merely passing an empty string when the model is in ONNX format
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration
 > * **model_format**(ModelFormat): Model format. ONNX format by default
@@ -86,7 +88,7 @@ YOLOv7 model loading and initialization, among which model_file is the exported
 > **Parameter**
 >
 > > * **im**: Input images in HWC or BGR format
-> > * **result**: Detection results, including detection box and confidence of each box. Refer to [Vision Model Prediction Results](../../../../../docs/api/vision_results/) for DetectionResult 
+> > * **result**: Detection results, including detection box and confidence of each box. Refer to [Vision Model Prediction Results](../../../../../docs/api/vision_results/) for DetectionResult
 > > * **conf_threshold**: Filtering threshold of detection box confidence
 > > * **nms_iou_threshold**: iou threshold during NMS processing

diff --git a/examples/vision/ocr/PP-OCRv3/cpp/README.md b/examples/vision/ocr/PP-OCRv3/cpp/README.md
index b0ac61359a9..923bda51303 100755
--- a/examples/vision/ocr/PP-OCRv3/cpp/README.md
+++ b/examples/vision/ocr/PP-OCRv3/cpp/README.md
@@ -44,6 +44,8 @@ wget https://gitee.com/paddlepaddle/PaddleOCR/raw/release/2.6/ppocr/utils/ppocr_
 ./infer_demo ./ch_PP-OCRv3_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv3_rec_infer ./ppocr_keys_v1.txt ./12.jpg 3
 # KunlunXin XPU inference
 ./infer_demo ./ch_PP-OCRv3_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv3_rec_infer ./ppocr_keys_v1.txt ./12.jpg 4
+# Huawei Ascend inference. It requires infer_static_shape_demo; to predict images continuously, resize the input images to a uniform size in advance.
+./infer_static_shape_demo ./ch_PP-OCRv3_det_infer ./ch_ppocr_mobile_v2.0_cls_infer ./ch_PP-OCRv3_rec_infer ./ppocr_keys_v1.txt ./12.jpg 1
 ```
 The above command works for Linux or MacOS. For SDK in Windows, refer to:

diff --git a/examples/vision/segmentation/paddleseg/cpp/README.md b/examples/vision/segmentation/paddleseg/cpp/README.md
index 4c5be9f6c12..572e3807881 100755
--- a/examples/vision/segmentation/paddleseg/cpp/README.md
+++ b/examples/vision/segmentation/paddleseg/cpp/README.md
@@ -1,7 +1,7 @@
 English | [简体中文](README_CN.md)
 # PaddleSeg C++ Deployment Example
-This directory provides examples that `infer.cc` fast finishes the deployment of Unet on CPU/GPU and GPU accelerated by TensorRT. 
+This directory provides examples in `infer.cc` for quickly deploying Unet on CPU/GPU, and on GPU accelerated by TensorRT.
 Before deployment, two steps require confirmation
@@ -15,7 +15,7 @@ Taking the inference on Linux as an example, the compilation test can be complet
 ```bash
 mkdir build
 cd build
-# Download the FastDeploy precompiled library. Users can choose your appropriate version in the `FastDeploy Precompiled Library` mentioned above 
+# Download the FastDeploy precompiled library. Users can choose the appropriate version from the `FastDeploy Precompiled Library` mentioned above
 wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-x.x.x.tgz
 tar xvf fastdeploy-linux-x64-x.x.x.tgz
 cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/fastdeploy-linux-x64-x.x.x
@@ -35,6 +35,8 @@ wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png
 ./infer_demo Unet_cityscapes_without_argmax_infer cityscapes_demo.png 2
 # kunlunxin XPU inference
 ./infer_demo Unet_cityscapes_without_argmax_infer cityscapes_demo.png 3
+# Huawei Ascend Inference
+./infer_demo Unet_cityscapes_without_argmax_infer cityscapes_demo.png 4
 ```
 The visualized result after running is as follows
@@ -45,7 +47,7 @@ The visualized result after running is as follows
 The above command works for Linux or MacOS. For SDK use-pattern in Windows, refer to:
 - [How to use FastDeploy C++ SDK in Windows](../../../../../docs/cn/faq/use_sdk_on_windows.md)
-## PaddleSeg C++ Interface 
+## PaddleSeg C++ Interface
 ### PaddleSeg Class
@@ -62,7 +64,7 @@ PaddleSegModel model loading and initialization, among which model_file is the e
 **Parameter**
-> * **model_file**(str): Model file path 
+> * **model_file**(str): Model file path
 > * **params_file**(str): Parameter file path
 > * **config_file**(str): Inference deployment configuration file
 > * **runtime_option**(RuntimeOption): Backend inference configuration. None by default, which is the default configuration

From 18c4fa8616e897f1251e0bd49cce4feb43144b7c Mon Sep 17 00:00:00 2001
From: yunyaoXYY
Date: Thu, 2 Feb 2023 13:13:31 +0000
Subject: [PATCH 21/21] Improve ascend readme

---
 docs/en/build_and_install/huawei_ascend.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/en/build_and_install/huawei_ascend.md b/docs/en/build_and_install/huawei_ascend.md
index ce0e38c1539..c648e2ea371 100644
--- a/docs/en/build_and_install/huawei_ascend.md
+++ b/docs/en/build_and_install/huawei_ascend.md
@@ -119,10 +119,10 @@ In end-to-end model inference, the pre-processing and post-processing phases are
 ## Deployment demo reference
 | Model | C++ Example | Python Example |
 | :-----------| :-------- | :--------------- |
-| PaddleClas | [Ascend NPU C++ Example](../../../examples/vision/classification/paddleclas/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/classification/paddleclas/python/README_CN.md) |
-| PaddleDetection | [Ascend NPU C++ Example](../../../examples/vision/detection/paddledetection/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/paddledetection/python/README_CN.md) |
-| PaddleSeg | [Ascend NPU C++ Example](../../../examples/vision/segmentation/paddleseg/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples//vision/segmentation/paddleseg/python/README_CN.md) |
-| PaddleOCR | [Ascend NPU C++ Example](../../../examples/vision/ocr/PP-OCRv3/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision//ocr/PP-OCRv3/python/README_CN.md) |
-| Yolov5 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov5/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov5/python/README_CN.md) |
-| Yolov6 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov6/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov6/python/README_CN.md) |
-| Yolov7 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov7/cpp/README_CN.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov7/python/README_CN.md) |
+| PaddleClas | [Ascend NPU C++ Example](../../../examples/vision/classification/paddleclas/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/classification/paddleclas/python/README.md) |
+| PaddleDetection | [Ascend NPU C++ Example](../../../examples/vision/detection/paddledetection/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/detection/paddledetection/python/README.md) |
+| PaddleSeg | [Ascend NPU C++ Example](../../../examples/vision/segmentation/paddleseg/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/segmentation/paddleseg/python/README.md) |
+| PaddleOCR | [Ascend NPU C++ Example](../../../examples/vision/ocr/PP-OCRv3/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/ocr/PP-OCRv3/python/README.md) |
+| YOLOv5 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov5/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov5/python/README.md) |
+| YOLOv6 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov6/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov6/python/README.md) |
+| YOLOv7 | [Ascend NPU C++ Example](../../../examples/vision/detection/yolov7/cpp/README.md) | [Ascend NPU Python Example](../../../examples/vision/detection/yolov7/python/README.md) |