optimize_radius not using information of clusters #154

Closed
marco-2023 opened this issue Aug 9, 2023 · 3 comments · Fixed by #155

@marco-2023
Collaborator

marco-2023 commented Aug 9, 2023

The function optimize_radius in utils.py does not use its cluster_ids parameter. The select_from_cluster method of DirectedSphereExclusion does not use cluster_ids either. As a result, this selection algorithm fails when used with cluster labels.

@xychem
Collaborator

xychem commented Aug 11, 2023

cluster_ids is not used in utils.py. It seems to work when you insert the following code into the original code.

    if cluster_ids is not None:
        # the feature matrix(n_samples,n_features) of different cluster
        X = X[cluster_ids]
    if X.shape[0] < size:
        raise RuntimeError(
            f"Size of samples to be selected is greater than existing the number of samples; "
            f"{size} > {X.shape[0]}."
        )
    # set the limits on # of selected points according to the tolerance percentage

    if cluster_ids is not None:
        # the feature matrix(n_samples,n_features) of different cluster
        X = X[cluster_ids]

    error = size * obj.tol
    lower_size = round(size - error)
    upper_size = round(size + error)
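
As a rough, self-contained illustration of the idea (not the library's actual optimize_radius), the sketch below restricts the feature matrix to one cluster and then computes the size limits. The helper name `subset_and_limits` and the toy data are hypothetical. It also assumes the subset should be taken once, before the size check, so that the check compares `size` against the cluster rather than the full data set; the snippet above performs the check first.

```python
import numpy as np


def subset_and_limits(X, size, tol, cluster_ids=None):
    """Hypothetical helper: restrict X to one cluster, then compute size limits."""
    if cluster_ids is not None:
        # keep only the rows (samples) that belong to this cluster
        X = X[cluster_ids]
    if X.shape[0] < size:
        raise RuntimeError(
            f"Requested {size} samples but only {X.shape[0]} are available."
        )
    # same arithmetic as in the snippet above
    error = size * tol
    lower_size = round(size - error)
    upper_size = round(size + error)
    return X, lower_size, upper_size


# toy feature matrix: 10 samples, 2 features
X = np.arange(20, dtype=float).reshape(10, 2)
# indices of the samples assumed to belong to one cluster
cluster_ids = np.array([0, 2, 4, 6])

X_sub, lower_size, upper_size = subset_and_limits(
    X, size=3, tol=0.05, cluster_ids=cluster_ids
)
print(X_sub.shape, lower_size, upper_size)
```

With this ordering, asking for more points than the cluster contains raises the error even when the full data set would have been large enough.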

Another issue is that the default tol of the optisim class in dissimilarity.py looks wrong; I think it should be 0.05. With the current value, the iteration in utils.py never executes because the tolerance is so large.

def __init__(self, r=None, k=10, tol=5.0, eps=0, p=2, start_id=0, random_seed=42, n_iter=10):
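
To make the effect of the tolerance concrete, here is a small sketch that reuses the `size * tol` arithmetic quoted earlier. The function names `size_limits` and `within_limits` are hypothetical, and the sketch assumes the radius search only iterates while the number of selected points lies outside the limits.

```python
def size_limits(size, tol):
    """Lower and upper limits on the number of selected points."""
    error = size * tol
    return round(size - error), round(size + error)


def within_limits(n_selected, size, tol):
    """True if n_selected would be accepted without further iteration."""
    lower_size, upper_size = size_limits(size, tol)
    return lower_size <= n_selected <= upper_size


# tol=5.0: the window spans from a negative count to six times the target
print(size_limits(100, 5.0))    # (-400, 600)
# tol=0.05: the window is 5% either side of the target
print(size_limits(100, 0.05))   # (95, 105)

# 13 points instead of 100 is accepted with tol=5.0 but rejected with tol=0.05
print(within_limits(13, 100, 5.0), within_limits(13, 100, 0.05))
```

With tol=5.0 the lower limit is negative, so any first guess for the radius yields an acceptable number of points and no refinement takes place.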

I get the following plots after making these changes.

optisim

image
image

DirectedSphereExclusion

image

It is strange that I get thirteen points in the second picture, although the points do appear to be selected according to their cluster_labels.

image

@FarnazH
Member

FarnazH commented Nov 12, 2023

Thanks @xychem, commit 183e253 fixes one of the issues you reported in #154 (comment). I also ran into it when putting together the quick_start.ipynb notebook in PR #186. I don't know why this item was not addressed before the issue was closed.

@FanwangM
Collaborator

This issue was closed because #155 attempted to fix the problem. According to @xychem's code, it seemed to be working. I will take a closer look at quick_start.ipynb.
