Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Setting weight coefficients in OPTICS algorithm. #16128

Closed
wants to merge 2 commits into from
Closed

[WIP] Setting weight coefficients in OPTICS algorithm. #16128

wants to merge 2 commits into from

Conversation

trabbart
Copy link

@trabbart trabbart commented Jan 15, 2020

Reference Issues/PRs

See issue #12394

What does this implement/fix? Explain your changes.

This implement settings weight coefficients in OPTICS algorithm. Changes are similar to solution in DBSCAN.

Any other comments?

@adrinjalali
Copy link
Member

Thanks for your PR @trabbart

there are a few issues which need to be addressed here:

  • sample_weight should be passed to fit, and not init
  • you need to add tests for the functionality you add
  • I'm not sure how to handle min_samples when we have sample_weight.
    Should min_samples be the weighted number of samples?

@trabbart
Copy link
Author

trabbart commented Jan 21, 2020

Thanks for your review @adrinjalali

I will add test and correct mistake about passing sample_weight.
About third dot I think is good to make it consistent with implementation in DBSCAN.
In DBSCAN min_samples is required minimal sum of weight of samples in cluster (Is this what you call “weighted number of samples”?).
There is one difference between OPTICS and DBSCAN – in OPTICS min_samples can also be float. (Then it is interpreted as fraction of number of all samples). I propose to treat in similar way. So minimal sum of weight will be equal to fraction of sum of weights of all samples. I am not sure if this is intuitive. What do you think?

@adrinjalali
Copy link
Member

There is one difference between OPTICS and DBSCAN – in OPTICS min_samples can also be float. (Then it is interpreted as fraction of number of all samples). I propose to treat in similar way. So minimal sum of weight will be equal to fraction of sum of weights of all samples. I am not sure if this is intuitive. What do you think?

yeah that makes sense

@armenabnousi
Copy link

Hi,
This issue is closed but it's not on the dev version yet. Would appreciate if you could give a timeline of when it will be integrated.

Thanks

@adrinjalali
Copy link
Member

I don't think anybody's actively working on it, so there's no ETA at the moment.

@MitalPattani
Copy link

MitalPattani commented Mar 19, 2021

@adrinjalali @trabbart has used radiusNeighbors instead of simply kneighbors for sample weights what is the purpose behind that? and won't that be inconsistent?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants