Making OpenCL radial profile evaluation additive

In af8c045 we changed the way profiles put values into the target Image object from being a straight value copying to be additive. This removed the need for an extra memory allocation in the Model, plus it simplified the process of combining results from different profiles in an effective way. An oversight to that change was the OpenCL implementation of the radial profiles, which were still overwriting values when writing to the target image. This is fixed in this commit. To simplify the solution we are performing an extra memory allocation here to give the OpenCL kernels a clean slate, as usual; otherwise we would need to change all OpenCL kernels to make *them* additive, and we would always have to transfer the current contents of the image first into the OpenCL device before the kernel is launched (which we currently avoid with OpenCL 1.2 devices by doing a fill operation with 0s, when possible). On top of that, for floating-point kernels we would still require an extra allocation anyway. A new unit test has been added to double-check that when evaluating a Model with two radial profiles via OpenCL we still get a sensible image. The test adds two diametrically opposed profiles and then checks that their centers have the same flux, which wouldn't be the case if only one of them makes it into the final image. When doing this work I also realized that when the 'rough' parameter is set, we were not scaling the resulting image by the pixel scale, so I'm fixing that as well. Signed-off-by: Rodrigo Tobar <rtobar@icrar.org>
ICRAR · Jul 12, 2019 · a3499c6 · a3499c6
1 parent cc4c1ef
commit a3499c6
Show file tree

Hide file tree

Showing 3 changed files with 51 additions and 19 deletions.
diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -5,6 +5,25 @@ Changelog
 .. highlight:: cpp
 .. namespace:: profit
 
+.. rubric:: Development version
+
+* A bug in the OpenCL implementation of the radial profiles
+  prevented Models with multiple profiles
+  from displaying correctly,
+  as the output image would contain
+  only the values of last profile.
+  This was a problem introduced
+  only in the last version of *libprofit*,
+  and not an ongoing issue.
+* When using OpenCL,
+  any radial profile specifying ``rough=true``
+  caused the output image not to be scaled properly,
+  with values not taking into account the profile's magnitude
+  or pixel scale.
+  This seems to have been an issue for a long time,
+  but since ``rough=true`` is not a common option
+  it had gone under the radar for some time.
+
 .. rubric:: 1.9.2
 
 * All profile evaluation has been changed

diff --git a/src/radial.cpp b/src/radial.cpp
@@ -493,19 +493,15 @@ void RadialProfile::evaluate_opencl(Image &image, const Mask & /*mask*/, const P
 	// If FT is double we directly store the result in the profile image
 	// Otherwise we have to copy element by element to convert from float to double
 	cl::vector<cl::Event> read_waiting_evts{kernel_evt};
-	if( float_traits<FT>::is_double ) {
-		read_evt = env->queue_read(image_buffer, image.data(), &read_waiting_evts);
-		read_evt.wait();
-		t_opencl = system_clock::now();
-	}
-	else {
-		std::vector<FT> image_from_kernel(image.size());
-		read_evt = env->queue_read(image_buffer, image_from_kernel.data(), &read_waiting_evts);
-		read_evt.wait();
-		t_opencl = system_clock::now();
-		std::copy(image_from_kernel.begin(), image_from_kernel.end(), image.begin());
-		stats->final_image += to_nsecs(system_clock::now() - t_opencl);
-	}
+	std::vector<FT> image_from_kernel(image.size());
+	read_evt = env->queue_read(image_buffer, image_from_kernel.data(), &read_waiting_evts);
+	read_evt.wait();
+	t_opencl = system_clock::now();
+	FT flux_scale = FT(this->get_pixel_scale(scale));
+	std::transform(image.begin(), image.end(), image_from_kernel.begin(), image.begin(), [flux_scale](double ipixel, double kpixel) {
+		return ipixel + kpixel * flux_scale;
+	});
+	stats->final_image += to_nsecs(system_clock::now() - t_opencl);
 
 	/* These are the OpenCL-related timings so far */
 	cl_times0.kernel_prep = to_nsecs(t_kprep - t0);
@@ -678,18 +674,14 @@ void RadialProfile::evaluate_opencl(Image &image, const Mask & /*mask*/, const P
 
 	t_loopend = system_clock::now();
 
-	std::for_each(subimages_results.begin(), subimages_results.end(), [&image, &scale](const im_result_t &res) {
+	std::for_each(subimages_results.begin(), subimages_results.end(), [&image, &scale, &flux_scale](const im_result_t &res) {
 		FT x = FT(res.point.x / scale.first);
 		FT y = FT(res.point.y / scale.second);
 		unsigned int idx = static_cast<unsigned int>(floor(x)) + static_cast<unsigned int>(floor(y)) * image.getWidth();
-		image[idx] += res.value;
+		image[idx] += res.value * flux_scale;
 	});
 
 	// the image needs to be multiplied by the pixel scale
-	double flux_scale = this->get_pixel_scale(scale);
-	std::transform(image.begin(), image.end(), image.begin(), [flux_scale](double pixel) {
-		return pixel * flux_scale;
-	});
 	t_imgtrans = system_clock::now();
 
 	stats->subsampling.pre_subsampling = to_nsecs(t_loopstart - t_opencl);

diff --git a/tests/test_opencl.h b/tests/test_opencl.h
@@ -342,4 +342,25 @@ class TestOpenCL : public CxxTest::TestSuite {
 		_check_convolver(create_convolver(ConvolverType::OPENCL, prefs));
 	}
 
+	void test_opencl_addition()
+	{
+		_check_opencl_support();
+		// Make sure a Model with two profiles displays both of them
+		// We position two profiles diametrically opposed at positions
+		// 29.5,29.5 (pixel 29,29) and 70.5,70.5 (pixel 70,70)
+		// so their centers should look fairly identical
+		Model m{100, 100};
+		m.set_opencl_env(openCLParameters->opencl_env);
+		for (auto pos: {29.5, 70.5}) {
+			auto sersic = m.add_profile("sersic");
+			sersic->parameter("xcen", pos);
+			sersic->parameter("ycen", pos);
+			sersic->parameter("mag", 13.);
+			sersic->parameter("re", 40.);
+		}
+		auto image = m.evaluate();
+		auto diff = relative_diff(image[Point{29, 29}], image[Point{70, 70}]);
+		TS_ASSERT_LESS_THAN(diff, 1e-9);
+	}
+
 };