
Perhaps avoid validating input dims on AbstractVector{<:AbstractVector{<:Real}} #512

Open
grero opened this issue May 30, 2023 · 7 comments


@grero

grero commented May 30, 2023

I would suggest either changing this or, perhaps better, skipping it altogether.

dim(x::AbstractVector{<:AbstractVector{<:Real}}) = length(first(x))

I'm developing kernels that work on pairs of Vector{Float64}, where the kernel essentially sums over all pairwise distances in x and y. For this kernel the check above does not really make sense, although it does pass if I construct the kernel matrix like this:

K = kernelmatrix(kernel, x, x)

However, it fails if I instead do

K = kernelmatrix(kernel, x, y)

unless length(first(x)) == length(first(y)). I realise that I can just create my own type that essentially wraps Vector{Vector{Float64}}, but that seems like unnecessary work, especially since the kernel machinery works beautifully as it is, as long as the inputs pass the validation.
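
For reference, here is a minimal sketch of the failure mode. The PairwiseSumKernel below is only an illustrative stand-in for the kind of pairwise-distance kernel described above (the actual kernel is not shown in this issue), and the data are made up:

using KernelFunctions

# Toy kernel on variable-length vectors: sums |xi - yj| over all element pairs.
struct PairwiseSumKernel <: KernelFunctions.Kernel end
(k::PairwiseSumKernel)(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) =
    sum(abs(xi - yj) for xi in x, yj in y)

k = PairwiseSumKernel()
x = [rand(3), rand(5), rand(2)]  # inner vectors with different lengths
y = [rand(4), rand(2)]

kernelmatrix(k, x, x)  # passes: dim(x) == dim(x) trivially
kernelmatrix(k, x, y)  # errors, because length(first(x)) != length(first(y))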

@willtebbutt
Member

Ah, interesting. So you've got a case where the dimensionality of the input changes from element to element?

@willtebbutt
Member

Are you hitting the checks in the various methods in https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/blob/master/src/matrix/kernelmatrix.jl ? I agree that they're making overly strong assumptions about input dimensionality, but I would like to check that they're actually the source of your problem.

@grero
Author

grero commented May 30, 2023

Yes, that's right. For context, I'm working on using kernel methods for analysing spike trains, which are basically sets of event timestamps. The obvious way to represent these is as a vector of vectors, where each inner vector represents the spike train for one trial. Since the number of events changes from trial to trial, the lengths of the inner vectors also change.
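
For concreteness, such data might look like this (timestamps made up; each inner vector holds the spike times for one trial, so the lengths differ):

# Made-up spike-train data: one inner vector of spike times per trial.
spike_trains = [
    [0.012, 0.148, 0.530, 0.911],  # trial 1: 4 spikes
    [0.087, 0.642],                # trial 2: 2 spikes
    [0.003, 0.251, 0.498],         # trial 3: 3 spikes
]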

@willtebbutt
Member

willtebbutt commented May 30, 2023

Excellent. This makes complete sense. Would you mind opening a PR which removes the checks, and adds tests which fail without the removal of the checks? I imagine we can have this sorted by the end of the day.
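
For concreteness, a regression test along these lines could accompany the change; the ToyKernel and the inputs below are placeholders, not the contents of the eventual PR:

using KernelFunctions, Test

# Placeholder kernel that accepts inner vectors of any length.
struct ToyKernel <: KernelFunctions.Kernel end
(k::ToyKernel)(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) =
    sum(abs(xi - yj) for xi in x, yj in y)

x = [rand(3), rand(5)]
y = [rand(4), rand(2), rand(6)]

# With the dimensionality check removed, this should just return a 2×3 matrix.
@test size(kernelmatrix(ToyKernel(), x, y)) == (2, 3)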

@grero
Author

grero commented May 30, 2023

> Are you hitting the checks in the various methods in https://github.com/JuliaGaussianProcesses/KernelFunctions.jl/blob/master/src/matrix/kernelmatrix.jl ? I agree that they're making overly strong assumptions about input dimensionality, but I would like to check that they're actually the source of your problem.

Yes, as I indicated at the top, the validate_inputs method fails on my inputs unless the first elements of x and y happen to have the same number of items.

@willtebbutt
Member

Great -- I just wasn't sure if that was the only place where we use validate_inputs erroneously, or if there was something else.

@grero
Author

grero commented May 31, 2023

PR submitted.
