autodiff gaussian width parameter #4782

Closed
Wants to merge 9 commits

src/shogun/kernel/GaussianKernel.cpp: 18 changes (14 additions, 4 deletions)
@@ -6,12 +6,15 @@
 * Tonmoy Saikia, Sergey Lisitsyn, Matt Aasted, Sanuj Sharma
 */

+#include <Eigen/Core>
+#include <unsupported/Eigen/AutoDiff>
 #include <shogun/lib/common.h>
 #include <shogun/kernel/GaussianKernel.h>
 #include <shogun/features/DotFeatures.h>
 #include <shogun/distance/EuclideanDistance.h>
 #include <shogun/mathematics/Math.h>
+

 using namespace shogun;

 CGaussianKernel::CGaussianKernel() : CShiftInvariantKernel()
@@ -90,19 +93,26 @@ void CGaussianKernel::set_width(float64_t w)

 SGMatrix<float64_t> CGaussianKernel::get_parameter_gradient(const TParameter* param, index_t index)
 {
+    using std::exp;
+
     require(lhs, "Left hand side features must be set!");
-    require(rhs, "Rightt hand side features must be set!");
+    require(rhs, "Right hand side features must be set!");

     if (!strcmp(param->m_name, "log_width"))
Member:
So I guess a next step could be to start thinking about getting rid of this explicit code, and instead offering this derivative automatically through registering something in the ctors.

Member Author:
I think for that we should maybe have a base class for classes that have parameters we can take the derivative with respect to. This class registers the gradient parameters in some vector, and then we can get the index from there. Basically, when we do watch_param(...), this would add the variable to such a vector if it has the GRADIENT flag.
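A minimal standalone sketch of that idea. Everything here is a hypothetical illustration of the comment above, not existing shogun API: the GRADIENT flag, the watch_param signature, and the gradient_index helper are all assumptions.

```cpp
// Hypothetical sketch only: record which watched parameters carry a
// GRADIENT flag, so gradient indices need not be hard-coded per kernel.
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum ParamFlag { PARAM_NONE = 0, GRADIENT = 1 };

class DifferentiableParams
{
public:
    // Mimics watch_param(...): parameters flagged GRADIENT are also
    // appended to a dedicated vector of differentiable parameters.
    void watch_param(const std::string& name, double* value, int flags)
    {
        if (flags & GRADIENT)
            m_gradient_params.emplace_back(name, value);
    }

    // The gradient index is simply the position in that vector, so
    // get_parameter_gradient would no longer need strcmp chains.
    int gradient_index(const std::string& name) const
    {
        for (std::size_t i = 0; i < m_gradient_params.size(); ++i)
            if (m_gradient_params[i].first == name)
                return static_cast<int>(i);
        return -1; // not a differentiable parameter
    }

private:
    std::vector<std::pair<std::string, double*>> m_gradient_params;
};
```

Under this scheme a kernel ctor would call watch_param("log_width", &m_log_width, GRADIENT) once, and the gradient machinery could iterate over the registered parameters instead of string-matching in every subclass.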

     {
         SGMatrix<float64_t> derivative=SGMatrix<float64_t>(num_lhs, num_rhs);
+        using EigenScalar = Eigen::Matrix<float64_t, 1, 1>;
+        Eigen::AutoDiffScalar<EigenScalar> eigen_log_width = m_log_width;
+
         for (int k=0; k<num_rhs; k++)
         {
 #pragma omp parallel for
             for (int j=0; j<num_lhs; j++)
             {
-                float64_t element=distance(j, k);
-                derivative(j, k) = std::exp(-element) * element * 2.0;
+                eigen_log_width.derivatives() = EigenScalar::Unit(1,0);
+                auto el = CShiftInvariantKernel::distance(j, k);
+                Eigen::AutoDiffScalar<EigenScalar> kernel = exp(-el / (exp(eigen_log_width * 2.0) * 2.0));
Member:
It kinda sucks that the code for the kernel needs to be in here as well as in the kernel itself, don't you think?

+                derivative(j, k) = kernel.derivatives()(0);
Member:
This seems pretty automatable to me (at least, say, for scalar-valued kernels with scalar parameters).
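One way that automation could look, sketched with plain Eigen. The parameter_gradient helper and its functor signatures are assumptions for illustration, not shogun API; the point is that the kernel formula is written exactly once, in the functor, instead of both in compute() and in the gradient code.

```cpp
#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>

using EigenScalar = Eigen::Matrix<double, 1, 1>;
using ADScalar = Eigen::AutoDiffScalar<EigenScalar>;

// Hypothetical generic gradient for a scalar-valued kernel with one scalar
// parameter: differentiate kernel(distance, param) w.r.t. param via autodiff.
template <typename KernelFn, typename DistFn>
Eigen::MatrixXd parameter_gradient(KernelFn kernel, DistFn distance,
                                   double param, int num_lhs, int num_rhs)
{
    ADScalar p = param;
    p.derivatives() = EigenScalar::Unit(1, 0); // seed d(param)/d(param) = 1

    Eigen::MatrixXd grad(num_lhs, num_rhs);
    for (int k = 0; k < num_rhs; ++k)
        for (int j = 0; j < num_lhs; ++j)
            grad(j, k) = kernel(distance(j, k), p).derivatives()(0);
    return grad;
}

// The Gaussian kernel from this patch would then be stated once:
// auto gauss = [](double el, const ADScalar& w) -> ADScalar {
//     return exp(-el / (exp(w * 2.0) * 2.0));
// };
// Eigen::MatrixXd g = parameter_gradient(gauss, dist, log_width, n_lhs, n_rhs);
```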

             }
         }
         return derivative;
@@ -118,7 +128,7 @@ float64_t CGaussianKernel::compute(int32_t idx_a, int32_t idx_b)
 {
     float64_t result=distance(idx_a, idx_b);
     return std::exp(-result);
-}
+}

 void CGaussianKernel::load_serializable_post() noexcept(false)
 {
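As a sanity check on the gradient hunk above, the autodiff value can be compared against the analytic expression computed by the removed lines. A standalone sketch; el and log_width are arbitrary test values, and the kernel line is copied from the patch.

```cpp
#include <cmath>
#include <cstdio>
#include <Eigen/Core>
#include <unsupported/Eigen/AutoDiff>

int main()
{
    using EigenScalar = Eigen::Matrix<double, 1, 1>;
    using ADScalar = Eigen::AutoDiffScalar<EigenScalar>;

    const double el = 1.7;        // squared distance ||x - y||^2 (test value)
    const double log_width = 0.3; // m_log_width (test value)

    // Seed the one-entry derivative vector so d/d(log_width) is tracked.
    ADScalar w = log_width;
    w.derivatives() = EigenScalar::Unit(1, 0);

    // The kernel exactly as in the patch: k = exp(-el / (2 * exp(2 * log_width))).
    ADScalar kernel = exp(-el / (exp(w * 2.0) * 2.0));

    // Analytic form from the removed lines: with
    // element = el / (2 * exp(2 * log_width)), the chain rule gives
    // dk/d(log_width) = exp(-element) * element * 2.
    const double element = el / (2.0 * std::exp(2.0 * log_width));
    const double analytic = std::exp(-element) * element * 2.0;

    // The two values should agree to machine precision.
    std::printf("autodiff: %.12f\nanalytic: %.12f\n",
                kernel.derivatives()(0), analytic);
    return 0;
}
```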