Pose detection in Python with CUDA support #2041
Hello @gmontamat, I ran into the same issue and solved it by following #1651 for the holistic landmark module. It is very similar to pose detection, so it's worth reading.
Hi @thamquocdung. I'll try that and report the results here. Do you happen to have a diff for your changes?

I've modified the graph as follows:

diff --git a/mediapipe/modules/pose_detection/pose_detection_gpu.pbtxt b/mediapipe/modules/pose_detection/pose_detection_gpu.pbtxt
index 98917d9..f4c1c0e 100644
--- a/mediapipe/modules/pose_detection/pose_detection_gpu.pbtxt
+++ b/mediapipe/modules/pose_detection/pose_detection_gpu.pbtxt
@@ -14,7 +14,7 @@
type: "PoseDetectionGpu"
-# GPU image. (GpuBuffer)
+# CPU image.
input_stream: "IMAGE:image"
# Detected poses. (std::vector<Detection>)
@@ -36,12 +36,24 @@ input_stream: "IMAGE:image"
# this packet so that they don't wait for it unnecessarily.
output_stream: "DETECTIONS:detections"
+node: {
+ calculator: "ColorConvertCalculator"
+ input_stream: "RGB_IN:image"
+ output_stream: "RGBA_OUT:image_rgba"
+}
+
+node: {
+ calculator: "ImageFrameToGpuBufferCalculator"
+ input_stream: "image_rgba"
+ output_stream: "image_gpu"
+}
+
# Transforms the input image into a 224x224 one while keeping the aspect ratio
# (what is expected by the corresponding model), resulting in potential
# letterboxing in the transformed image.
node: {
calculator: "ImageToTensorCalculator"
- input_stream: "IMAGE_GPU:image"
+ input_stream: "IMAGE:image_gpu"
output_stream: "TENSORS:input_tensors"
output_stream: "LETTERBOX_PADDING:letterbox_padding"
options: {

And now I get an error when I initialize the solution.
@gmontamat, you don't need to make any change in that file. Here is my diff:

diff --git a/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt b/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
index c439737..c84c958 100644
--- a/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
+++ b/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
@@ -88,6 +88,20 @@ output_stream: "ROI_FROM_LANDMARKS:pose_rect_from_landmarks"
# Regions of interest calculated based on pose detections. (NormalizedRect)
output_stream: "ROI_FROM_DETECTION:pose_rect_from_detection"
+
+node: {
+ calculator: "ColorConvertCalculator"
+ input_stream: "RGB_IN:image"
+ output_stream: "RGBA_OUT:image_rgba"
+}
+
+node: {
+ calculator: "ImageFrameToGpuBufferCalculator"
+ input_stream: "image_rgba"
+ output_stream: "image_gpu"
+}
+
+
# Defines whether landmarks on the previous image should be used to help
# localize landmarks on the current image.
node {
@@ -117,7 +131,7 @@ node: {
# Calculates size of the image.
node {
calculator: "ImagePropertiesCalculator"
- input_stream: "IMAGE_GPU:image"
+ input_stream: "IMAGE_GPU:image_gpu"
output_stream: "SIZE:image_size"
}
@@ -126,7 +140,7 @@ node {
# round of pose detection.
node {
calculator: "GateCalculator"
- input_stream: "image"
+ input_stream: "image_gpu"
input_stream: "image_size"
input_stream: "DISALLOW:prev_pose_rect_from_landmarks_is_present"
output_stream: "image_for_pose_detection"
@@ -181,7 +195,7 @@ node {
node {
calculator: "PoseLandmarkByRoiGpu"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
- input_stream: "IMAGE:image"
+ input_stream: "IMAGE:image_gpu"
input_stream: "ROI:pose_rect"
output_stream: "LANDMARKS:unfiltered_pose_landmarks"
output_stream: "AUXILIARY_LANDMARKS:unfiltered_auxiliary_landmarks"
@@ -214,7 +228,7 @@ node {
# timestamp bound update occurs to jump start the feedback loop.
node {
calculator: "PreviousLoopbackCalculator"
- input_stream: "MAIN:image"
+ input_stream: "MAIN:image_gpu"
input_stream: "LOOP:pose_rect_from_landmarks"
input_stream_info: {
tag_index: "LOOP"
diff --git a/mediapipe/python/BUILD b/mediapipe/python/BUILD
index 08a2995..a61cff2 100644
--- a/mediapipe/python/BUILD
+++ b/mediapipe/python/BUILD
@@ -72,5 +72,10 @@ cc_library(
"//mediapipe/modules/pose_detection:pose_detection_cpu",
"//mediapipe/modules/pose_landmark:pose_landmark_by_roi_cpu",
"//mediapipe/modules/pose_landmark:pose_landmark_cpu",
+ "//mediapipe/modules/pose_landmark:pose_landmark_gpu",
+ "//mediapipe/gpu:image_frame_to_gpu_buffer_calculator",
+ "//mediapipe/calculators/image:color_convert_calculator",
+
],
)
diff --git a/mediapipe/python/solutions/pose.py b/mediapipe/python/solutions/pose.py
index e25fe62..16c0346 100644
--- a/mediapipe/python/solutions/pose.py
+++ b/mediapipe/python/solutions/pose.py
@@ -82,7 +82,7 @@ class PoseLandmark(enum.IntEnum):
LEFT_FOOT_INDEX = 31
RIGHT_FOOT_INDEX = 32
-BINARYPB_FILE_PATH = 'mediapipe/modules/pose_landmark/pose_landmark_cpu.binarypb'
+BINARYPB_FILE_PATH = 'mediapipe/modules/pose_landmark/pose_landmark_gpu.binarypb'
POSE_CONNECTIONS = frozenset([
(PoseLandmark.NOSE, PoseLandmark.RIGHT_EYE_INNER),
(PoseLandmark.RIGHT_EYE_INNER, PoseLandmark.RIGHT_EYE),
@@ -180,9 +180,9 @@ class Pose(SolutionBase):
.ConstantSidePacketCalculatorOptions.ConstantSidePacket(
bool_value=not static_image_mode)
],
- 'poselandmarkcpu__posedetectioncpu__TensorsToDetectionsCalculator.min_score_thresh':
+ 'poselandmarkgpu__posedetectiongpu__TensorsToDetectionsCalculator.min_score_thresh':
min_detection_confidence,
- 'poselandmarkcpu__poselandmarkbyroicpu__ThresholdingCalculator.threshold':
+ 'poselandmarkgpu__poselandmarkbyroigpu__ThresholdingCalculator.threshold':
min_tracking_confidence,
},
outputs=['pose_landmarks'])
diff --git a/setup.py b/setup.py
index 81569b3..4b15862 100644
--- a/setup.py
+++ b/setup.py
@@ -225,8 +225,9 @@ class BuildBinaryGraphs(build.build):
'face_detection/face_detection_front_cpu',
'face_landmark/face_landmark_front_cpu',
'hand_landmark/hand_landmark_tracking_cpu',
'holistic_landmark/holistic_landmark_cpu', 'objectron/objectron_cpu',
- 'pose_landmark/pose_landmark_cpu'
+ 'pose_landmark/pose_landmark_gpu',
]
for binary_graph in binary_graphs:
sys.stderr.write('generating binarypb: %s\n' %
@@ -240,7 +241,8 @@ class BuildBinaryGraphs(build.build):
'bazel',
'build',
'--compilation_mode=opt',
- '--define=MEDIAPIPE_DISABLE_GPU=1',
+ '--copt=-DMESA_EGL_NO_X11_HEADERS',
+ '--copt=-DEGL_NO_X11',
'--action_env=PYTHON_BIN_PATH=' + _normalize_path(sys.executable),
os.path.join('mediapipe/modules/', graph_path),
]
@@ -296,7 +298,8 @@ class BuildBazelExtension(build_ext.build_ext):
'bazel',
'build',
'--compilation_mode=opt',
- '--define=MEDIAPIPE_DISABLE_GPU=1',
+ '--copt=-DMESA_EGL_NO_X11_HEADERS',
+ '--copt=-DEGL_NO_X11',
'--action_env=PYTHON_BIN_PATH=' + _normalize_path(sys.executable),
str(ext.bazel_target + '.so'),
]
@thamquocdung, very useful. The build takes about 2 minutes, and the Python-based pose-detection code now runs on an RTX 3080; inference is much faster. Thank you!
@ykk648 Yeah, it took me a lot of time to configure and understand the parameters, so I wanted to share. Hope it's helpful ^^
@thamquocdung Thank you for sharing, but when I run it I get an error. Have you encountered such a problem?
@ZhiyuXu0124 I think you missed a required package.
@ZhiyuXu0124 you'll notice in my first diff that I added that package to the Dockerfile used to build the library in the container.
@thamquocdung thank you! Your changes worked. I'm leaving my diff below, since I've also compiled for CUDA and used the provided Dockerfile so as not to mess up dependencies on my system (apply the diff to https://github.com/google/mediapipe/tree/ae05ad04b3ae43d475ccb2868e23f1418fea8746):

diff --git a/.bazelrc b/.bazelrc
index 37a0bc1..0e18020 100644
--- a/.bazelrc
+++ b/.bazelrc
@@ -87,6 +87,16 @@ build:darwin_x86_64 --apple_platform_type=macos
build:darwin_x86_64 --macos_minimum_os=10.12
build:darwin_x86_64 --cpu=darwin_x86_64
+# This config refers to building with CUDA available. It does not necessarily
+# mean that we build CUDA op kernels.
+build:using_cuda --define=using_cuda=true
+build:using_cuda --action_env TF_NEED_CUDA=1
+build:using_cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
+
+# This config refers to building CUDA op kernels with nvcc.
+build:cuda --config=using_cuda
+build:cuda --define=using_cuda_nvcc=true
+
# This bazelrc file is meant to be written by a setup script.
try-import %workspace%/.configure.bazelrc
diff --git a/Dockerfile b/Dockerfile
index c4c4df3..24b2d81 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-FROM ubuntu:18.04
+FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
MAINTAINER <mediapipe@google.com>
@@ -39,7 +39,13 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
libopencv-video-dev \
libopencv-calib3d-dev \
libopencv-features2d-dev \
- software-properties-common && \
+ software-properties-common \
+ python3-venv libprotobuf-dev protobuf-compiler cmake libgtk2.0-dev \
+ mesa-common-dev libegl1-mesa-dev libgles2-mesa-dev mesa-utils \
+ pkg-config libgtk-3-dev libavcodec-dev libavformat-dev libswscale-dev libv4l-dev \
+ libxvidcore-dev libx264-dev libjpeg-dev libpng-dev libtiff-dev \
+ gfortran openexr libatlas-base-dev python3-dev python3-numpy \
+ libtbb2 libtbb-dev libdc1394-22-dev && \
add-apt-repository -y ppa:openjdk-r/ppa && \
apt-get update && apt-get install -y openjdk-8-jdk && \
apt-get clean && \
@@ -69,3 +75,4 @@ COPY . /mediapipe/
# If we want the docker image to contain the pre-built object_detection_offline_demo binary, do the following
# RUN bazel build -c opt --define MEDIAPIPE_DISABLE_GPU=1 mediapipe/examples/desktop/demo:object_detection_tensorflow_demo
+ENV TF_CUDA_PATHS=/usr/local/cuda-10.1,/usr/lib/x86_64-linux-gnu,/usr/include
diff --git a/mediapipe/framework/tool/BUILD b/mediapipe/framework/tool/BUILD
index 890889a..fe3ebfe 100644
--- a/mediapipe/framework/tool/BUILD
+++ b/mediapipe/framework/tool/BUILD
@@ -97,6 +97,7 @@ cc_binary(
deps = [
"@com_google_absl//absl/strings",
],
+ linkopts = ["-lm"],
)
cc_library(
diff --git a/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt b/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
index c439737..c84c958 100644
--- a/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
+++ b/mediapipe/modules/pose_landmark/pose_landmark_gpu.pbtxt
@@ -88,6 +88,20 @@ output_stream: "ROI_FROM_LANDMARKS:pose_rect_from_landmarks"
# Regions of interest calculated based on pose detections. (NormalizedRect)
output_stream: "ROI_FROM_DETECTION:pose_rect_from_detection"
+
+node: {
+ calculator: "ColorConvertCalculator"
+ input_stream: "RGB_IN:image"
+ output_stream: "RGBA_OUT:image_rgba"
+}
+
+node: {
+ calculator: "ImageFrameToGpuBufferCalculator"
+ input_stream: "image_rgba"
+ output_stream: "image_gpu"
+}
+
+
# Defines whether landmarks on the previous image should be used to help
# localize landmarks on the current image.
node {
@@ -117,7 +131,7 @@ node: {
# Calculates size of the image.
node {
calculator: "ImagePropertiesCalculator"
- input_stream: "IMAGE_GPU:image"
+ input_stream: "IMAGE_GPU:image_gpu"
output_stream: "SIZE:image_size"
}
@@ -126,7 +140,7 @@ node {
# round of pose detection.
node {
calculator: "GateCalculator"
- input_stream: "image"
+ input_stream: "image_gpu"
input_stream: "image_size"
input_stream: "DISALLOW:prev_pose_rect_from_landmarks_is_present"
output_stream: "image_for_pose_detection"
@@ -181,7 +195,7 @@ node {
node {
calculator: "PoseLandmarkByRoiGpu"
input_side_packet: "MODEL_COMPLEXITY:model_complexity"
- input_stream: "IMAGE:image"
+ input_stream: "IMAGE:image_gpu"
input_stream: "ROI:pose_rect"
output_stream: "LANDMARKS:unfiltered_pose_landmarks"
output_stream: "AUXILIARY_LANDMARKS:unfiltered_auxiliary_landmarks"
@@ -214,7 +228,7 @@ node {
# timestamp bound update occurs to jump start the feedback loop.
node {
calculator: "PreviousLoopbackCalculator"
- input_stream: "MAIN:image"
+ input_stream: "MAIN:image_gpu"
input_stream: "LOOP:pose_rect_from_landmarks"
input_stream_info: {
tag_index: "LOOP"
diff --git a/mediapipe/python/BUILD b/mediapipe/python/BUILD
index 08a2995..dc05f34 100644
--- a/mediapipe/python/BUILD
+++ b/mediapipe/python/BUILD
@@ -72,5 +72,8 @@ cc_library(
"//mediapipe/modules/pose_detection:pose_detection_cpu",
"//mediapipe/modules/pose_landmark:pose_landmark_by_roi_cpu",
"//mediapipe/modules/pose_landmark:pose_landmark_cpu",
+ "//mediapipe/modules/pose_landmark:pose_landmark_gpu",
+ "//mediapipe/gpu:image_frame_to_gpu_buffer_calculator",
+ "//mediapipe/calculators/image:color_convert_calculator",
],
)
diff --git a/mediapipe/python/solutions/pose.py b/mediapipe/python/solutions/pose.py
index e25fe62..16c0346 100644
--- a/mediapipe/python/solutions/pose.py
+++ b/mediapipe/python/solutions/pose.py
@@ -82,7 +82,7 @@ class PoseLandmark(enum.IntEnum):
LEFT_FOOT_INDEX = 31
RIGHT_FOOT_INDEX = 32
-BINARYPB_FILE_PATH = 'mediapipe/modules/pose_landmark/pose_landmark_cpu.binarypb'
+BINARYPB_FILE_PATH = 'mediapipe/modules/pose_landmark/pose_landmark_gpu.binarypb'
POSE_CONNECTIONS = frozenset([
(PoseLandmark.NOSE, PoseLandmark.RIGHT_EYE_INNER),
(PoseLandmark.RIGHT_EYE_INNER, PoseLandmark.RIGHT_EYE),
@@ -180,9 +180,9 @@ class Pose(SolutionBase):
.ConstantSidePacketCalculatorOptions.ConstantSidePacket(
bool_value=not static_image_mode)
],
- 'poselandmarkcpu__posedetectioncpu__TensorsToDetectionsCalculator.min_score_thresh':
+ 'poselandmarkgpu__posedetectiongpu__TensorsToDetectionsCalculator.min_score_thresh':
min_detection_confidence,
- 'poselandmarkcpu__poselandmarkbyroicpu__ThresholdingCalculator.threshold':
+ 'poselandmarkgpu__poselandmarkbyroigpu__ThresholdingCalculator.threshold':
min_tracking_confidence,
},
outputs=['pose_landmarks'])
diff --git a/setup.py b/setup.py
index 81569b3..8e9dd93 100644
--- a/setup.py
+++ b/setup.py
@@ -33,7 +33,7 @@ from distutils import spawn
import distutils.command.build as build
import distutils.command.clean as clean
-__version__ = '0.8'
+__version__ = '0.8.4-cuda10.1'
IS_WINDOWS = (platform.system() == 'Windows')
MP_ROOT_PATH = os.path.dirname(os.path.abspath(__file__))
ROOT_INIT_PY = os.path.join(MP_ROOT_PATH, '__init__.py')
@@ -226,7 +226,7 @@ class BuildBinaryGraphs(build.build):
'face_landmark/face_landmark_front_cpu',
'hand_landmark/hand_landmark_tracking_cpu',
'holistic_landmark/holistic_landmark_cpu', 'objectron/objectron_cpu',
- 'pose_landmark/pose_landmark_cpu'
+ 'pose_landmark/pose_landmark_gpu'
]
for binary_graph in binary_graphs:
sys.stderr.write('generating binarypb: %s\n' %
@@ -240,7 +240,10 @@ class BuildBinaryGraphs(build.build):
'bazel',
'build',
'--compilation_mode=opt',
- '--define=MEDIAPIPE_DISABLE_GPU=1',
+ # '--define=MEDIAPIPE_DISABLE_GPU=1',
+ '--config=cuda',
+ '--spawn_strategy=local',
+ '--copt=-DMESA_EGL_NO_X11_HEADERS',
'--action_env=PYTHON_BIN_PATH=' + _normalize_path(sys.executable),
os.path.join('mediapipe/modules/', graph_path),
]
@@ -296,7 +299,10 @@ class BuildBazelExtension(build_ext.build_ext):
'bazel',
'build',
'--compilation_mode=opt',
- '--define=MEDIAPIPE_DISABLE_GPU=1',
+ # '--define=MEDIAPIPE_DISABLE_GPU=1',
+ '--config=cuda',
+ '--spawn_strategy=local',
+ '--copt=-DMESA_EGL_NO_X11_HEADERS',
'--action_env=PYTHON_BIN_PATH=' + _normalize_path(sys.executable),
str(ext.bazel_target + '.so'),
]
diff --git a/third_party/BUILD b/third_party/BUILD
index 5800098..384dcb2 100644
--- a/third_party/BUILD
+++ b/third_party/BUILD
@@ -113,6 +113,8 @@ cmake_external(
"WITH_PNG": "ON",
"WITH_TIFF": "ON",
"WITH_WEBP": "OFF",
+ "WITH_OPENEXR": "OFF",
+ "WITH_IPP": "OFF",
# Optimization flags
"CV_ENABLE_INTRINSICS": "ON",
  "WITH_EIGEN": "ON",

The steps to build the Python wheel are:

$ docker build -t mediapipe .
$ docker run -it --rm -v $(realpath ..):/host --name mediapipe mediapipe
# python3 setup.py gen_protos
# python3 setup.py bdist_wheel
# cp dist/*.whl /host

Thanks again for the help. You can mark this issue as solved.
@ykk648 do you mind letting me know how you set up OpenGL?
@gmontamat Actually I don't know; you may have an AMD GPU while I'm using Nvidia. My environment is Ubuntu 20.04 + RTX 3080 + CUDA 11.1. Besides following opengl-es-setup-on-linux-desktop, I never installed OpenGL myself.
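A quick way to check which OpenGL implementation an environment is actually using (a hedged sketch; it requires a working X display, and `glxinfo` comes from the `mesa-utils` package that the Dockerfile diff above already installs):

```shell
# The vendor/renderer strings reveal whether the Nvidia driver or a
# Mesa/software stack (e.g. llvmpipe) is handling OpenGL.
glxinfo | grep -E "OpenGL (vendor|renderer) string"
```

An Nvidia-driven stack typically reports "NVIDIA Corporation" as the vendor, while a software fallback shows llvmpipe or similar.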
I think I closed this issue too early :(
Which means that:
Any help is very much appreciated!

edit: I think there are 2 possible alternatives then:
@thamquocdung I have followed your instructions but encountered the following error:

gcc failed: error executing command /usr/bin/gcc @bazel-out/k8-opt/bin/mediapipe/python/_framework_bindings.so-2.params
Use --sandbox_debug to see verbose messages from the sandbox
@danial880 Sorry, I have not encountered your issue before. Which architecture did you build on? Make sure you have all of the required packages; you should follow these instructions (install.md and python.md).
@gmontamat have you tried compiling it with only OpenGL support and not CUDA? If so, does it use the Nvidia driver in that situation?
@txf- the OpenGL build from #2041 (comment) worked for me (it used to be slow running Mesa OpenGL, but I found the issue)! I discovered why Docker wasn't able to use the Nvidia OpenGL runtime on my server: it turns out it only runs on a specific display (:0 on my machine), and I was sharing VNC's display (:3 instead of :0) with the Docker container running mediapipe. Once I figured that out, I got alternative 1 from #2041 (comment) working (running the OpenGL mediapipe package on the GPU with Docker). That said, I'd really like to know why the CUDA package I'm building with this diff #2041 (comment) isn't using CUDA at all (it uses OpenGL), even though I followed these steps: https://google.github.io/mediapipe/getting_started/gpu_support.html#tensorflow-cuda-support-and-setup-on-linux-desktop
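The display fix described above can be sketched roughly as follows (the image name and mount paths are illustrative assumptions; `--gpus all` requires the NVIDIA Container Toolkit):

```shell
# Hypothetical invocation: run the container against the physical display
# (:0), where the Nvidia OpenGL runtime is available, instead of a VNC
# display such as :3.

# Allow local containers to talk to the host X server.
xhost +local:docker

docker run -it --rm \
  --gpus all \
  -e DISPLAY=:0 \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  mediapipe
```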
@thamquocdung @danial880 I encountered a similar issue while compiling; that's why I had to add this change, which is in #2041 (comment). Linking with -lm pulls in the C math library (libm), which some toolchains do not link by default:

diff --git a/mediapipe/framework/tool/BUILD b/mediapipe/framework/tool/BUILD
index 890889a..fe3ebfe 100644
--- a/mediapipe/framework/tool/BUILD
+++ b/mediapipe/framework/tool/BUILD
@@ -97,6 +97,7 @@ cc_binary(
deps = [
"@com_google_absl//absl/strings",
],
+ linkopts = ["-lm"],
)
cc_library(
@gmontamat thanks for sharing. I ran setup_opencv.sh and then installed mediapipe; the error no longer appears. It's running fine on my 1080 Ti, giving 67 fps averaged over 3 seconds.
@gmontamat Thanks for your patch. I can install it in Docker and run hand_tracking with Nvidia OpenGL, but I still have some questions about pose landmark detection:
Error
Segmentation Error
I've tried to consolidate the different patch files in this thread and the linked comments on the master branch, but I'm running into a segmentation fault. Has anyone solved CUDA support in a Docker container for pose? Here is my diff:
@sgowroji, I have followed #2320 to build the mediapipe package in Python. After the python3 setup.py bdist_wheel command, I am facing the following error:

ERROR: Skipping 'mediapipe/modules/pose_landmark/pose_landmark_gpuselfie_segmentation/selfie_segmentation_cpu': no such target '//mediapipe/modules/pose_landmark:pose_landmark_gpuselfie_segmentation/selfie_segmentation_cpu': target 'pose_landmark_gpuselfie_segmentation/s
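The fused target name ('pose_landmark_gpuselfie_segmentation/…') in that error suggests a comma was lost while applying the setup.py patch: Python silently concatenates adjacent string literals, so two entries in the binary_graphs list merge into one bogus target. A small illustration (the list entries mirror the error message; this is not mediapipe code itself):

```python
# Missing comma: Python fuses the two adjacent string literals into one entry.
broken = [
    'pose_landmark/pose_landmark_gpu'              # <- comma missing here
    'selfie_segmentation/selfie_segmentation_cpu',
]

# With the comma restored, the list has two separate targets.
fixed = [
    'pose_landmark/pose_landmark_gpu',
    'selfie_segmentation/selfie_segmentation_cpu',
]

print(len(broken), broken[0])
# → 1 pose_landmark/pose_landmark_gpuselfie_segmentation/selfie_segmentation_cpu
print(len(fixed))
# → 2
```

Restoring the comma after 'pose_landmark/pose_landmark_gpu' in setup.py should make the target resolve again.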
Hi @gmontamat,
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
Hello,
Based on @jiuqiant's comment: #1651 (comment)
I'd like to build the Python module to run pose detection (mediapipe.solutions.pose.Pose) with GPU CUDA support. To simplify things, I've modified the repo's Dockerfile so that I can build the Python package in it. Here's the diff so far:
Once applied, I build the image, start the container, and build the Python wheel:
When using the wheel I generated in Python, I get:
My guess is that I'm missing changes in mediapipe/modules/pose_detection/pose_detection_gpu.pbtxt to support image_frame_to_gpu_buffer_calculator. Has anyone done it before? Any ideas on how to modify pose_detection_gpu.pbtxt?
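For reference, the wheel build boils down to the following commands (as shared elsewhere in this thread; the mount path is illustrative):

```shell
# On the host: build the image and start a container, mounting the parent
# directory so the wheel can be copied out afterwards.
docker build -t mediapipe .
docker run -it --rm -v "$(realpath ..)":/host --name mediapipe mediapipe

# Inside the container: generate protobuf sources, build the wheel, copy it out.
python3 setup.py gen_protos
python3 setup.py bdist_wheel
cp dist/*.whl /host
```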