Improved doc, added AVX header

gineshidalgo99 committed Apr 8, 2019
1 parent fe767a1 commit b00243a71e3cab0311e8d04e611ca7eadf81885d
@@ -9,19 +9,20 @@ OpenPose - Frequently Asked Question (FAQ)
4. [Profiling Speed and Estimating FPS without Display](#profiling-speed-and-estimating-fps-without-display)
5. [Webcam Slower than Images](#webcam-slower-than-images)
6. [Video/Webcam Not Working](#videowebcam-not-working)
7. [Cannot Find OpenPose.dll Error](#cannot-find-openpose.dll-error-windows)
7. [Cannot Find OpenPose.dll Error](#cannot-find-openposedll-error-windows)
8. [Free Invalid Pointer Error](#free-invalid-pointer-error)
9. [Source Directory does not Contain CMakeLists.txt (Windows)](#source-directory-does-not-contain-cmakelists.txt-windows)
9. [Source Directory does not Contain CMakeLists.txt (Windows)](#source-directory-does-not-contain-cmakeliststxt-windows)
10. [How Should I Link my IP Camera?](#how-should-i-link-my-ip-camera)
11. [Difference between BODY_25 vs. COCO vs. MPI](#difference-between-body_25-vs.-coco-vs.-mpi)
11. [Difference between BODY_25 vs. COCO vs. MPI](#difference-between-body_25-vs-coco-vs-mpi)
12. [How to Measure the Latency Time?](#how-to-measure-the-latency-time)
13. [Zero People Detected](#zero-people-detected)
14. [Check Failed for ReadProtoFromBinaryFile (Failed to Parse NetParameter File)](#check-failed-for-readprotofrombinaryfile-failed-to-parse-netparameter-file)
15. [3D OpenPose Returning Wrong Results: 0, NaN, Infinity, etc.](#3d-openpose-returning-wrong-results-0-nan-infinity-etc)
16. [Protobuf Clip Param Caffe Error](#protobuf-clip-param-caffe-error)
17. [The Human Skeleton Looks like Dotted Lines Rather than Solid Lines](#the-human-skeleton-looks-like-dotted-lines-rather-than-solid-lines)
18. [Huge RAM Usage](#huge-ram-usage)
19. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library_not_found)
19. [CUDA_cublas_device_LIBRARY Not Found](#cuda_cublas_device_library-not-found)
20. [CMake-GUI Error While Getting Default Caffe](#cmake-gui-error-while-getting-default-caffe)



@@ -176,3 +177,17 @@ CUDA_cublas_device_LIBRARY (ADVANCED)
```

**A**: Make sure to download and install CMake-GUI following the [doc/prerequisites.md](./prerequisites.md) section. This is a known problem with CMake-GUI versions from 3.8 to 3.11 (unfortunately, default Ubuntu 18 CMake-GUI uses 3.10). You will need a CMake version >= 3.12.



### CMake-GUI Error While Getting Default Caffe
**Q**: It seems to me CMake-GUI does not download Caffe at all. I tried to wipe everything and install OpenPose again, but received the same error. I also checked whether CMake follows the `if` branches in `CMakeLists.txt` correctly and reaches the branch that determines Caffe needs to be downloaded, and it seems to do so.

**A**: There are 2 solutions to try. First, if you were using an old OP version and you just updated it, you should simply completely remove that OpenPose folder, and then re-download and re-compile OpenPose. Second, and only if after re-cloning master and running CMake-GUI the `3rdparty/caffe/` folder stays empty, manually trigger the git submodules to update. So the clone step becomes:
```
git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose
cd openpose
git submodule init
git submodule update
```
@@ -6,16 +6,16 @@ In case of hand camera views at which the hands are visible but not the rest of
## OpenCV-based Face Keypoint Detector
Note that this method will be faster than the current system if there are few people in the image, but it is also much less accurate (the OpenCV face detector only works with big, frontal faces, while OpenPose works with more scales and face rotations).
```
./build/examples/openpose/openpose.bin --body_disable --face --face_detector 1
./build/examples/openpose/openpose.bin --body 0 --face --face_detector 1
```

## Custom Standalone Face or Hand Keypoint Detector
Check the examples in `examples/tutorial_api_cpp/`, in particular [examples/tutorial_api_cpp/06_face_from_image.cpp](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/examples/tutorial_api_cpp/06_face_from_image.cpp) and [examples/tutorial_api_cpp/07_hand_from_image.cpp](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/examples/tutorial_api_cpp/07_hand_from_image.cpp). They provide examples of face and/or hand keypoint detection given a known bounding box or rectangle for the face and/or hand locations. These examples are equivalent to using the following flags:
```
# Face
examples/tutorial_api_cpp/06_face_from_image.cpp --body_disable --face --face_detector 2
examples/tutorial_api_cpp/06_face_from_image.cpp --body 0 --face --face_detector 2
# Hands
examples/tutorial_api_cpp/07_hand_from_image.cpp --body_disable --hand --hand_detector 2
examples/tutorial_api_cpp/07_hand_from_image.cpp --body 0 --hand --hand_detector 2
```

Note: both the `FaceExtractor` and `HandExtractor` classes require **square rectangles** as input.
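A minimal sketch of how a detector box could be turned into the square rectangle these classes expect, assuming that growing the box to its longer side around the same center is acceptable for your data. `RectSketch` and `makeSquare` are hypothetical names used for illustration only; they are not part of the OpenPose API.

```cpp
#include <algorithm>

// Hypothetical stand-in for a rectangle type (NOT the OpenPose class).
struct RectSketch
{
    float x;      // Top-left corner, horizontal coordinate
    float y;      // Top-left corner, vertical coordinate
    float width;
    float height;
};

// Grow the rectangle to a square whose side is max(width, height),
// keeping the original center.
inline RectSketch makeSquare(const RectSketch& r)
{
    const float side = std::max(r.width, r.height);
    const float centerX = r.x + r.width / 2.f;
    const float centerY = r.y + r.height / 2.f;
    return RectSketch{centerX - side / 2.f, centerY - side / 2.f, side, side};
}
```

The resulting square can extend past the image border (for instance, a negative `y`), so the caller would still need to decide how to handle that case.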
@@ -3,7 +3,7 @@
// it includes all the OpenPose configuration flags.
// Input: An image and the face rectangle locations.
// Output: OpenPose face keypoint detection.
// NOTE: This demo is auto-selecting the following flags: `--body_disable --face --face_detector 2`
// NOTE: This demo is auto-selecting the following flags: `--body 0 --face --face_detector 2`

// Command-line user interface
#define OPENPOSE_FLAGS_DISABLE_PRODUCER
@@ -202,7 +202,7 @@ int tutorialApiCpp()

// Info
op::log("NOTE: In addition to the user flags, this demo has auto-selected the following flags:\n"
"\t`--body_disable --face --face_detector 2`", op::Priority::High);
"\t`--body 0 --face --face_detector 2`", op::Priority::High);

// Measuring total time
op::printTime(opTimer, "OpenPose demo successfully finished. Total time: ", " seconds.", op::Priority::High);
@@ -3,7 +3,7 @@
// it includes all the OpenPose configuration flags.
// Input: An image and the hand rectangle locations.
// Output: OpenPose hand keypoint detection.
// NOTE: This demo is auto-selecting the following flags: `--body_disable --hand --hand_detector 2`
// NOTE: This demo is auto-selecting the following flags: `--body 0 --hand --hand_detector 2`

// Command-line user interface
#define OPENPOSE_FLAGS_DISABLE_PRODUCER
@@ -211,7 +211,7 @@ int tutorialApiCpp()

// Info
op::log("NOTE: In addition to the user flags, this demo has auto-selected the following flags:\n"
"\t`--body_disable --hand --hand_detector 2`", op::Priority::High);
"\t`--body 0 --hand --hand_detector 2`", op::Priority::High);

// Measuring total time
op::printTime(opTimer, "OpenPose demo successfully finished. Total time: ", " seconds.", op::Priority::High);
@@ -1,6 +1,11 @@
#ifndef OPENPOSE_GPU_CUDA_HU
#define OPENPOSE_GPU_CUDA_HU

// Note: This header should only be included if CUDA is enabled

#include <cuda.h>
#include <cuda_runtime.h>

namespace op
{
// VERY IMPORTANT: These fast functions do NOT work for negative integer numbers.
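The warning above can be illustrated with a short sketch, assuming the fast functions round by adding 0.5 and truncating (a common trick). `positiveIntRoundSketch` is a hypothetical name and not necessarily how the header implements it.

```cpp
#include <cmath>

// Hypothetical helper: cheap rounding that is only valid for non-negative input.
// The conversion to int truncates toward zero, so negative values end up
// rounded in the wrong direction.
inline int positiveIntRoundSketch(const float a)
{
    return static_cast<int>(a + 0.5f);
}

// Reference rounding that also handles negative values.
inline int referenceRound(const float a)
{
    return static_cast<int>(std::lround(a));
}
```

For `2.6f` both helpers return `3`, while for `-2.6f` the fast version returns `-2` and the reference returns `-3`.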
@@ -180,8 +180,8 @@ namespace op
4,45, 45,46, 46,47, 47,48, 4,49, 49,50, 50,51, 51,52, 4,53, 53,54, 54,55, 55,56, 4,57, 57,58, 58,59, 59,60, 4,61, 61,62, 62,63, 63,64
#define POSE_BODY_65_SCALES_RENDER_GPU \
1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f,1.f,1.f,1.f,1.f,1.f, \
0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, \
0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f
#define POSE_BODY_65_COLORS_RENDER_GPU \
255.f, 0.f, 85.f, \
255.f, 0.f, 0.f, \
@@ -274,10 +274,10 @@ namespace op
1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f, \
1.f,1.f, \
1.f,1.f,1.f,1.f,1.f,1.f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f
#define POSE_BODY_95_COLORS_RENDER_GPU \
255.f, 0.f, 85.f, \
170.f, 0.f, 255.f, \
@@ -396,16 +396,17 @@ namespace op
F135+20,F135+21, F135+22,F135+23, F135+23,F135+24, F135+24,F135+25, F135+25,F135+26, F135+27,F135+28, F135+28,F135+29, F135+29,F135+30, F135+31,F135+32, F135+32,F135+33, F135+33,F135+34, F135+34,F135+35, F135+36,F135+37, F135+37,F135+38, F135+38,F135+39, F135+39,F135+40, F135+40,F135+41, \
F135+41,F135+36, F135+42,F135+43, F135+43,F135+44, F135+44,F135+45, F135+45,F135+46, F135+46,F135+47, F135+47,F135+42, F135+48,F135+49, F135+49,F135+50, F135+50,F135+51, F135+51,F135+52, F135+52,F135+53, F135+53,F135+54, F135+54,F135+55, F135+55,F135+56, F135+56,F135+57, F135+57,F135+58, \
F135+58,F135+59, F135+59,F135+48, F135+60,F135+61, F135+61,F135+62, F135+62,F135+63, F135+63,F135+64, F135+64,F135+65, F135+65,F135+66, F135+66,F135+67, F135+67,F135+60
// Disabled really noisy values
#define POSE_BODY_135_SCALES_RENDER_GPU \
1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f,1.f,1.f,1.f, 1.f,1.f, \
1.f,1.f, \
1.f,0.00f, \
1.f,1.f,1.f,1.f,1.f,1.f, \
0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, \
0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, 0.75f,0.75f,0.75f,0.75f,0.75f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f, \
0.55f,0.55f,0.55f,0.55f,0.55f, 0.55f,0.55f,0.55f,0.55f,0.55f
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, 0.60f,0.60f,0.60f,0.60f,0.60f, \
0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.00f,0.00f,0.00f, 0.00f,0.00f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f, \
0.45f,0.45f,0.45f,0.45f,0.45f, 0.45f,0.45f,0.45f,0.45f,0.45f
#define POSE_BODY_135_COLORS_RENDER_GPU \
255.f, 0.f, 85.f, \
170.f, 0.f, 255.f, \
@@ -0,0 +1,100 @@
#ifndef OPENPOSE_UTILITIES_AVX_HPP
#define OPENPOSE_UTILITIES_AVX_HPP

// Warning:
// This file contains auxiliary functions for AVX.
// This file should only be included from cpp files.
// Default #include <openpose/headers.hpp> does not include it.

#ifdef WITH_AVX
#include <cstdint> // uintptr_t
#include <cstdlib> // malloc, free
#include <memory> // shared_ptr
#include <immintrin.h>
#include <openpose/utilities/errorAndLog.hpp>

namespace op
{
#ifdef __GNUC__
#define ALIGN32(x) x __attribute__((aligned(32)))
#elif defined(_MSC_VER) // defined(_WIN32)
#define ALIGN32(x) __declspec(align(32)) x
#else
#error Unknown environment!
#endif

// Functions
// Sources:
// - https://stackoverflow.com/questions/32612190/how-to-solve-the-32-byte-alignment-issue-for-avx-load-store-operations
// - https://embeddedartistry.com/blog/2017/2/20/implementing-aligned-malloc
// - https://embeddedartistry.com/blog/2017/2/23/c-smart-pointers-with-aligned-mallocfree
typedef unsigned long long offset_t;
#define PTR_OFFSET_SZ sizeof(offset_t)
#ifndef align_up
#define align_up(num, align) \
(((num) + ((align) - 1)) & ~((align) - 1))
#endif
inline void * aligned_malloc(const size_t align, const size_t size)
{
void * ptr = nullptr;

// 2 conditions:
// - We want both align and size to be greater than 0
// - We want it to be a power of two since align_up operates on powers of two
if (align && size && (align & (align - 1)) == 0)
{
// We know we have to fit an offset value
// We also allocate extra bytes to ensure we can meet the alignment
const auto hdr_size = PTR_OFFSET_SZ + (align - 1);
void * p = malloc(size + hdr_size);

if (p)
{
// Add the offset size to malloc's pointer (we will always store that)
// Then align the resulting value to the target alignment
ptr = (void *) align_up(((uintptr_t)p + PTR_OFFSET_SZ), align);

// Calculate the offset and store it behind our aligned pointer
*((offset_t *)ptr - 1) = (offset_t)((uintptr_t)ptr - (uintptr_t)p);

} // else nullptr, could not malloc
} // else nullptr, invalid arguments

if (ptr == nullptr)
{
error("Shared pointer could not be allocated for Array data storage.",
__LINE__, __FUNCTION__, __FILE__);
}

return ptr;
}
inline void aligned_free(void * ptr)
{
if (ptr == nullptr)
error("Received nullptr.", __LINE__, __FUNCTION__, __FILE__);

// Walk backwards from the passed-in pointer to get the pointer offset
// We convert to an offset_t pointer and rely on pointer math to get the data
offset_t offset = *((offset_t *)ptr - 1);

// Once we have the offset, we can get our original pointer and call free
void * p = (void *)((uint8_t *)ptr - offset);
free(p);
}
template<class T>
std::shared_ptr<T> aligned_shared_ptr(const size_t size)
{
try
{
return std::shared_ptr<T>(static_cast<T*>(
aligned_malloc(8*sizeof(T), sizeof(T)*size)), &aligned_free);
}
catch (const std::exception& e)
{
error(e.what(), __LINE__, __FUNCTION__, __FILE__);
return std::shared_ptr<T>{};
}
}
}
#endif

#endif // OPENPOSE_UTILITIES_AVX_HPP
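The offset-header technique used by this header can be sketched in a standalone form as follows. This is an illustration under stated assumptions, not the OpenPose code: all names ending in `Sketch` are hypothetical, the alignment is fixed at 32 bytes for the `float` buffer, and the offset is written with `std::memcpy` (a swapped-in choice) so the store stays valid even when the alignment is smaller than the offset type.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical sketch: allocate `size` bytes aligned to `align` (a power of two).
inline void* alignedMallocSketch(const std::size_t align, const std::size_t size)
{
    // Reject zero arguments and alignments that are not a power of two.
    if (align == 0 || size == 0 || (align & (align - 1)) != 0)
        return nullptr;
    // Extra room: one offset value plus up to (align - 1) padding bytes.
    const std::size_t headerSize = sizeof(std::uintptr_t) + (align - 1);
    void* raw = std::malloc(size + headerSize);
    if (raw == nullptr)
        return nullptr;
    // Skip the offset slot, then round up to the next multiple of `align`.
    const std::uintptr_t rawAddress = reinterpret_cast<std::uintptr_t>(raw);
    const std::uintptr_t start = rawAddress + sizeof(std::uintptr_t);
    const std::uintptr_t aligned =
        (start + (align - 1)) & ~(static_cast<std::uintptr_t>(align) - 1);
    // Store the distance back to `raw` just below the aligned address.
    const std::uintptr_t offset = aligned - rawAddress;
    std::memcpy(reinterpret_cast<unsigned char*>(aligned) - sizeof(offset),
                &offset, sizeof(offset));
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical sketch: recover the original malloc pointer and free it.
inline void alignedFreeSketch(void* ptr)
{
    if (ptr == nullptr)
        return;
    unsigned char* bytes = static_cast<unsigned char*>(ptr);
    std::uintptr_t offset = 0;
    std::memcpy(&offset, bytes - sizeof(offset), sizeof(offset));
    std::free(bytes - offset);
}

// Shared pointer to `count` floats aligned to 32 bytes (one AVX register).
inline std::shared_ptr<float> alignedFloatBufferSketch(const std::size_t count)
{
    return std::shared_ptr<float>(
        static_cast<float*>(alignedMallocSketch(32, sizeof(float) * count)),
        &alignedFreeSketch);
}
```

Passing the custom deleter to `std::shared_ptr` matters here: the aligned pointer is not the one returned by `malloc`, so releasing it with plain `free` or `delete` would be invalid.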
@@ -8,6 +8,11 @@

namespace op
{
// The following functions provide basic time measurement. Usage example:
// const auto timerInit = getTimerInit();
// // [Some code in here]
// const auto timeSeconds = getTimeSeconds(timerInit);
// printTime(timerInit, "Function X took ", " seconds.", Priority::High);
OP_API std::chrono::time_point<std::chrono::high_resolution_clock> getTimerInit();

OP_API double getTimeSeconds(const std::chrono::time_point<std::chrono::high_resolution_clock>& timerInit);
@@ -16,6 +21,47 @@ namespace op
const std::chrono::time_point<std::chrono::high_resolution_clock>& timerInit, const std::string& firstMessage,
const std::string& secondMessage, const Priority priority);

// The following macros run the enclosed code REPS times and return the average time per repetition (in seconds, scaled by `factor`). Usage example:
// const auto REPS = 1000;
// double time = 0.;
// OP_PROFILE_INIT(REPS);
// // [Some code in here]
// OP_PROFILE_END(time, 1e3, REPS); // Time in msec. 1 = sec, 1e3 = msec, 1e6 = usec, 1e9 = nsec, etc.
// log("Function X took " + std::to_string(time) + " milliseconds.");
#define OP_PROFILE_INIT(REPS) \
{ \
const auto timerInit = getTimerInit(); \
for (auto rep = 0 ; rep < (REPS) ; ++rep) \
{
#define OP_PROFILE_END(finalTime, factor, REPS) \
} \
(finalTime) = (factor)/(float)(REPS)*getTimeSeconds(timerInit); \
}

// The following macros run the enclosed code REPS times, wait for the CUDA kernels to finish, and then return the
// average time per repetition (in seconds, scaled by `factor`). Usage example:
// const auto REPS = 1000;
// double time = 0.;
// OP_CUDA_PROFILE_INIT(REPS);
// // [Some code with CUDA calls in here]
// OP_CUDA_PROFILE_END(time, 1e3, REPS); // Time in msec. 1 = sec, 1e3 = msec, 1e6 = usec, 1e9 = nsec, etc.
// log("Function X took " + std::to_string(time) + " milliseconds.");
// Analogous to OP_PROFILE_INIT, but also waits for CUDA kernels to finish their asynchronous operations
// It requires: #include <cuda_runtime.h>
#define OP_CUDA_PROFILE_INIT(REPS) \
{ \
cudaDeviceSynchronize(); \
const auto timerInit = getTimerInit(); \
for (auto rep = 0 ; rep < (REPS) ; ++rep) \
{
// Analogous to OP_PROFILE_END, but also waits for CUDA kernels to finish their asynchronous operations
// It requires: #include <cuda_runtime.h>
#define OP_CUDA_PROFILE_END(finalTime, factor, REPS) \
} \
cudaDeviceSynchronize(); \
(finalTime) = (factor)/(float)(REPS)*getTimeSeconds(timerInit); \
}
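What the profiling macros above compute can be sketched as an ordinary function template. This is a hedged illustration, not the OpenPose implementation: `averageTimeSketch` is a hypothetical name, and it uses `std::chrono::steady_clock` rather than the `high_resolution_clock` that `getTimerInit()` returns.

```cpp
#include <chrono>

// Hypothetical helper: calls work() `reps` times and returns the average time
// per call, scaled by `factor` (1 = sec, 1e3 = msec, 1e6 = usec).
template<typename Work>
double averageTimeSketch(const int reps, const double factor, Work work)
{
    const auto begin = std::chrono::steady_clock::now();
    for (int rep = 0; rep < reps; ++rep)
        work();
    // duration<double> counts seconds as a floating-point value.
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - begin;
    return factor / reps * elapsed.count();
}
```

A call such as `averageTimeSketch(1000, 1e3, []() { /* code to profile */ })` would return milliseconds per repetition. Unlike this function, the macros open a new scope between `*_INIT` and `*_END`, so any variable that must outlive the measurement (such as `time` in the usage examples) has to be declared before `*_INIT`.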

// Enable PROFILER_ENABLED in Makefile.config or CMake in order to use this function. Otherwise, nothing will be printed.
// How to use - example:
// For GPU - It can only be applied in the main.cpp file:
@@ -11,15 +11,19 @@ clear && clear

# Parameters
IMAGE_FOLDER=~/devel/images/val2017/
IMAGE_FOOT_FOLDER=~/devel/images/val2017_foot/
JSON_FOLDER=../evaluation/coco_val_jsons/
# JSON_FOLDER=/media/posefs3b/Users/gines/openpose_train/training_results/2_23_51/best_702k/
OP_BIN=./build/examples/openpose/openpose.bin

# 1 scale
$OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1.json --write_coco_json_variants 3
# 1 scale (body)
$OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1.json --write_coco_json_variants 1
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_max.json --write_coco_json_variants 3 \
# --maximize_positives

# 1 scale (foot)
$OP_BIN --image_dir $IMAGE_FOOT_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1.json --write_coco_json_variants 2

# # 4 scales
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_4.json --write_coco_json_variants 3 \
# --scale_number 4 --scale_gap 0.25 --net_resolution "1312x736"
