Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapt centroids' initialisation in BallTree #20582

Open
jjerphan opened this issue Jul 21, 2021 · 0 comments
Open

Adapt centroids' initialisation in BallTree #20582

jjerphan opened this issue Jul 21, 2021 · 0 comments
Labels
Moderate Anything that requires some knowledge of conventions and best practices module:neighbors

Comments

@jjerphan
Copy link
Member

As reported by @Witiko in #20097 (reply in thread), the centroids' initialization in BallTree is tight to the case of the euclidean distance regardless of the DistanceMetric set indirectly by the user provided metric argument.

# determine Node centroid
for j in range(n_features):
centroid[j] = 0
if with_sample_weight:
sum_weight_node = 0
for i in range(idx_start, idx_end):
sum_weight_node += sample_weight[idx_array[i]]
this_pt = data + n_features * idx_array[i]
for j from 0 <= j < n_features:
centroid[j] += this_pt[j] * sample_weight[idx_array[i]]
for j in range(n_features):
centroid[j] /= sum_weight_node
else:
for i in range(idx_start, idx_end):
this_pt = data + n_features * idx_array[i]
for j from 0 <= j < n_features:
centroid[j] += this_pt[j]
for j in range(n_features):
centroid[j] /= n_points
# determine Node radius
radius = 0
for i in range(idx_start, idx_end):
radius = fmax(radius,
tree.rdist(centroid,
data + n_features * idx_array[i],
n_features))
tree.node_data[i_node].radius = tree.dist_metric._rdist_to_dist(radius)
tree.node_data[i_node].idx_start = idx_start
tree.node_data[i_node].idx_end = idx_end

The initialisation of those centroids should depends on the effective DistanceMetric. Apart from the euclidean case, are there closed form solutions for computing them?

@jjerphan jjerphan added Moderate Anything that requires some knowledge of conventions and best practices module:neighbors labels Jul 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Moderate Anything that requires some knowledge of conventions and best practices module:neighbors
Projects
None yet
Development

No branches or pull requests

1 participant