TBB can be replaced with cv::parallel_for_ #241

Closed
vinjn opened this issue Aug 29, 2017 · 11 comments

Comments

@vinjn

vinjn commented Aug 29, 2017

The first precondition is to have OpenCV built with a parallel framework. In OpenCV 3.2, the following parallel frameworks are available in that order:

  • Intel Threading Building Blocks (3rdparty library, should be explicitly enabled)
  • C= Parallel C/C++ Programming Language Extension (3rdparty library, should be explicitly enabled)
  • OpenMP (integrated to compiler, should be explicitly enabled)
  • APPLE GCD (system wide, used automatically (APPLE only))
  • Windows RT concurrency (system wide, used automatically (Windows RT only))
  • Windows concurrency (part of runtime, used automatically (Windows only - MSVC++ >= 10))
  • Pthreads (if available)

http://docs.opencv.org/trunk/d7/dff/tutorial_how_to_use_OpenCV_parallel_for_.html

@TadasBaltrusaitis
Owner

Thanks for the suggestion. When I update the OpenCV version used, I will consider using cv::parallel_for_, as that could be a bit more portable.

@AntonLinderer

Any plans to implement this yet?

@TadasBaltrusaitis
Owner

Not anytime soon. I will be moving to OpenCV 3.3 eventually and will explore using the OpenCV parallel option then.

@AntonLinderer

It seems OpenCV 3.4 is now supported. Will you consider using cv::parallel_for_ now?

@TadasBaltrusaitis
Owner

Definitely considering it. It's on my list of things to explore, but I'm focusing on other features at the moment.

@xindongzhang

For simplicity, you can directly replace the TBB loop in Patch_experts.cpp with the following code.

	cv::parallel_for_(cv::Range(0, n), [&](const cv::Range& range){
		for(int i = range.start; i < range.end; i++)
		{
	
			if(visibilities[scale][view_id].rows == n)
			{
				if(visibilities[scale][view_id].at<int>(i,0) != 0)
				{
	
					// Work out how big the area of interest has to be to get a response of window size
					int area_of_interest_width;
					int area_of_interest_height;
	
					if(use_ccnf)
					{
						area_of_interest_width = window_size + ccnf_expert_intensity[scale][view_id][i].width - 1;
						area_of_interest_height = window_size + ccnf_expert_intensity[scale][view_id][i].height - 1;
					}
					else
					{
						area_of_interest_width = window_size + svr_expert_intensity[scale][view_id][i].width - 1;
						area_of_interest_height = window_size + svr_expert_intensity[scale][view_id][i].height - 1;
					}
	
					// scale and rotate to mean shape to reference frame
					cv::Mat sim = (cv::Mat_<float>(2,3) << a1, -b1, landmark_locations.at<double>(i,0), b1, a1, landmark_locations.at<double>(i+n,0));
	
					// Extract the region of interest around the current landmark location
					cv::Mat_<float> area_of_interest(area_of_interest_height, area_of_interest_width);
	
					// Using C style openCV as it does what we need
					CvMat area_of_interest_o = area_of_interest;
					CvMat sim_o = sim;
					IplImage im_o = grayscale_image;
					cvGetQuadrangleSubPix(&im_o, &area_of_interest_o, &sim_o);
	
					// get the correct size response window
					patch_expert_responses[i] = cv::Mat_<float>(window_size, window_size);
	
					// Get intensity response either from the SVR or CCNF patch experts (prefer CCNF)
					if(!ccnf_expert_intensity.empty())
					{
	
						ccnf_expert_intensity[scale][view_id][i].Response(area_of_interest, patch_expert_responses[i]);
					}
					else
					{
						svr_expert_intensity[scale][view_id][i].Response(area_of_interest, patch_expert_responses[i]);
					}
	
					// if we have a corresponding depth patch and it is visible
					if(!svr_expert_depth.empty() && !depth_image.empty() && visibilities[scale][view_id].at<int>(i,0))
					{
	
						cv::Mat_<float> dProb = patch_expert_responses[i].clone();
						cv::Mat_<float> depthWindow(area_of_interest_height, area_of_interest_width);
	
	
						CvMat dimg_o = depthWindow;
						cv::Mat maskWindow(area_of_interest_height, area_of_interest_width, CV_32F);
						CvMat mimg_o = maskWindow;
	
						IplImage d_o = depth_image;
						IplImage m_o = mask;
	
						cvGetQuadrangleSubPix(&d_o,&dimg_o,&sim_o);
	
						cvGetQuadrangleSubPix(&m_o,&mimg_o,&sim_o);
	
						depthWindow.setTo(0, maskWindow < 1);
	
						svr_expert_depth[scale][view_id][i].ResponseDepth(depthWindow, dProb);
	
						// Sum to one
						double sum = cv::sum(patch_expert_responses[i])[0];
	
						// To avoid division by 0 issues
						if(sum == 0)
						{
							sum = 1;
						}
	
						patch_expert_responses[i] /= sum;
	
						// Sum to one
						sum = cv::sum(dProb)[0];
						// To avoid division by 0 issues
						if(sum == 0)
						{
							sum = 1;
						}
	
						dProb /= sum;
	
						patch_expert_responses[i] = patch_expert_responses[i] + dProb;
	
					}
				}
			}
		}
	});
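
To make the calling shape of the snippet above easier to follow, here is a minimal standard-library sketch of the same pattern: a range is split into contiguous stripes and a lambda receives one sub-range per call. `IndexRange`, `parallel_for_range` and `squares_in_parallel` are hypothetical names invented for this illustration. They are not part of OpenCV, and OpenCV's real scheduling depends on the parallel backend it was built with.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::Range: a half-open interval [start, end).
struct IndexRange {
    int start;
    int end;
};

// Hypothetical helper that mimics the calling shape of cv::parallel_for_.
// It splits the range into contiguous stripes and runs `body` on each
// stripe in its own thread. This is an illustration only, not OpenCV code.
inline void parallel_for_range(const IndexRange& whole,
                               const std::function<void(const IndexRange&)>& body,
                               int num_stripes = 4) {
    assert(whole.end >= whole.start && num_stripes > 0);
    const int total = whole.end - whole.start;
    const int stripe = (total + num_stripes - 1) / num_stripes;  // ceiling division
    std::vector<std::thread> workers;
    for (int begin = whole.start; begin < whole.end; begin += stripe) {
        const IndexRange sub{begin, std::min(begin + stripe, whole.end)};
        workers.emplace_back([&body, sub] { body(sub); });
    }
    for (auto& w : workers) w.join();
}

// Example body: iteration i writes only out[i], just as the loop above
// writes only patch_expert_responses[i].
inline std::vector<int> squares_in_parallel(int n) {
    std::vector<int> out(n, 0);  // sized before the parallel region starts
    parallel_for_range({0, n}, [&](const IndexRange& r) {
        for (int i = r.start; i < r.end; ++i) out[i] = i * i;
    });
    return out;
}
```

The point of the sketch is that the body must handle a whole sub-range, not a single index, which is why the original snippet wraps its work in a `for` loop from `range.start` to `range.end`.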

@AntonLinderer

Great, thanks for the code, but I don't think it is enough. Other files also use TBB, such as LandmarkDetectionValidator.cpp and FaceDetectorMTCNN.cpp.

@xindongzhang

@AntonLinderer Of course, and you can easily translate that code in the same way using cv::parallel_for_.

@TadasBaltrusaitis
Owner

Explicit requirement on TBB has been removed, but it can still be used if OpenCV is compiled with it.

@tangjie77wd

Inside the cv::parallel_for_ body, multiple threads write to a shared variable such as "patch_expert_responses". Each thread writes to a different element, but it is still the same variable. Is that allowed?

@tangjie77wd

There is no true sharing, but what about false sharing or a cache-line ping-pong situation?
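
As a rough illustration of the question, here is a standard-library sketch, not OpenFace code. It assumes the shared container is a std::vector that is sized before the parallel region and that each thread touches only its own element. Under that assumption, the C++ standard does not treat concurrent writes to different elements as a data race (std::vector<bool> is the exception). False sharing is then only a performance concern, and one common mitigation is to pad each slot to a cache line. `PaddedSlot`, `count_per_thread` and the 64-byte line size are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed cache-line size. 64 bytes is common on x86-64 but not guaranteed.
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical padded slot: sizeof(PaddedSlot) is a multiple of 64, so the
// `value` fields of neighbouring vector elements sit 64 bytes apart and
// cannot share one 64-byte line. (C++17 is needed for std::vector to also
// honour the over-alignment of the storage itself.)
struct alignas(kAssumedCacheLine) PaddedSlot {
    long value = 0;
};

// Each thread t writes only slots[t]. Different elements of one vector are
// distinct objects, so these writes do not race with each other.
inline std::vector<long> count_per_thread(int num_threads, long iterations) {
    assert(num_threads > 0 && iterations >= 0);
    std::vector<PaddedSlot> slots(num_threads);  // sized before threads start
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&slots, t, iterations] {
            for (long k = 0; k < iterations; ++k) slots[t].value += 1;
        });
    }
    for (auto& w : workers) w.join();

    std::vector<long> result;
    for (const auto& s : slots) result.push_back(s.value);
    return result;
}
```

Whether padding is worth it depends on how much work each iteration does. In the loop discussed in this thread, each iteration computes a full patch response, so writes to neighbouring elements are probably rare compared with the computation, but that would need to be measured.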
