KernelDensity docstring #3270
Comments
Just realized that Jake had written a nice blog post answering most of my questions. Would be nice to improve the documentation based on this blog post.
Yes, I agree! Part of the reason this info is not in the doc string is that I hadn't explored it in detail yet: the blog post was the result of me trying to figure that out 😀
Hi, is this issue open for resolving?
@Hasil-Sharma All issues are open for resolving :)
Working on improving the docs based on the blog post mentioned above.
Awesome - thanks @Winterflower
One thing I'd thought of here: the default parameter value asks for exact results, which is basically the slowest possible algorithm. Most users are unlikely to dig into the doc strings to figure this out... perhaps we should change it to use some reasonable error threshold as the default?
🤔 I'm working on this
Interestingly enough, the BinaryTree.kernel_density function has a default rtol of 1E-8 but documents 0...
I think we actually might need @jakevdp on this
Note: this file (https://github.com/scikit-learn/scikit-learn/blob/506b12b2761ad88039114dec1c6c4fcec4a7a021/sklearn/neighbors/_binary_tree.pxi) has

    rtol : float, default=1e-8
        Specify the desired relative tolerance of the result.
        If the true result is `K_true`, then the returned result `K_ret`
        satisfies ``abs(K_true - K_ret) < atol + rtol * K_ret``
        The default is `1e-8` (i.e. machine precision).

and
Docstrings have evolved and improved. Closing this, happy to have a new issue if something's still unclear.
I had trouble understanding how parameters related to KD-tree and Ball tree affect DensityEstimation.
In my understanding, KDE uses the average evaluation of kernels centered on every training point, so I didn't quite understand how KD-tree and Ball-tree are useful (I would understand if it used, say, the average of the k nearest neighbors only).
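To make the "average of kernels centered on every training point" reading concrete, here is a minimal brute-force sketch in plain Python. It assumes a 1-D Gaussian kernel and a made-up sample; the function names are hypothetical and this is not scikit-learn's implementation.

```python
import math


def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_brute_force(x, train, bandwidth):
    """Average of kernels centered on every training point (1-D)."""
    total = sum(gaussian_kernel((x - t) / bandwidth) for t in train)
    return total / (len(train) * bandwidth)


train = [-1.0, 0.0, 0.5, 2.0]  # made-up training sample
density = kde_brute_force(0.25, train, bandwidth=0.5)
```

Each query touches all training points, so the cost is proportional to the number of training points times the number of queries. That is the cost a tree can reduce when an approximate answer is acceptable.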
The docstring says that we can specify tolerance options `rtol` and `atol`. With respect to what stopping criterion are these constants used?
Regarding the breadth_first option, the docstring says "use a breadth-first approach to the problem". I didn't understand what "problem" it refers to.
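On the stopping-criterion question, one way to picture a tolerance in a tree-based kernel sum is the simplified 1-D sketch below. It is an illustration under stated assumptions, not scikit-learn's code: the "tree" is just a recursive split of a sorted list, only an absolute tolerance `atol` is modeled (a relative tolerance would need the running total as well), and all names are hypothetical. The idea is that a node's nearest and farthest possible distances to the query bracket the kernel value of every point inside it; once the bracket is narrow enough, the node is summarized without visiting its points.

```python
import math


def kernel(d, h):
    """Unnormalized Gaussian kernel as a function of distance d >= 0."""
    return math.exp(-0.5 * (d / h) ** 2)


def node_sum(x, pts, h, atol, leaf_size=2):
    """Approximate sum of kernel(|x - p|) over the non-empty sorted list pts.

    The error is at most atol per point, i.e. at most atol on the average.
    """
    lo, hi = pts[0], pts[-1]
    d_min = 0.0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))
    d_max = max(abs(x - lo), abs(x - hi))
    k_hi, k_lo = kernel(d_min, h), kernel(d_max, h)
    n = len(pts)
    # Stopping criterion: every point contributes a value in [k_lo, k_hi],
    # so using the midpoint is off by at most (k_hi - k_lo) / 2 per point.
    if k_hi - k_lo <= 2.0 * atol:
        return n * 0.5 * (k_lo + k_hi)
    if n <= leaf_size:
        # Leaf: evaluate the kernel at each point exactly.
        return sum(kernel(abs(x - p), h) for p in pts)
    mid = n // 2
    return (node_sum(x, pts[:mid], h, atol, leaf_size)
            + node_sum(x, pts[mid:], h, atol, leaf_size))


train = sorted([-3.0, -1.0, 0.0, 0.5, 2.0, 2.1, 8.0, 8.2, 8.3])  # made-up data
h, atol = 0.5, 1e-3
exact = sum(kernel(abs(0.25 - p), h) for p in train) / len(train)
approx = node_sum(0.25, train, h, atol) / len(train)
```

With `atol=0` the bracket only closes when the bounds coincide, so the recursion falls through to exact leaf evaluation. In this sketch, a zero tolerance means an exact but slowest result.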
What is the practical impact of the tree-related parameters? Do they only affect speed or can they also affect quality of estimation?
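One way to probe this empirically is the sketch below, which assumes scikit-learn and NumPy are installed and uses synthetic data. The bandwidth, sample sizes and the `densities` helper are arbitrary choices for illustration. With `atol=0` and `rtol=0` the choice of `algorithm`, `leaf_size` and `breadth_first` would be expected to change only the running time, while a nonzero `rtol` trades a bounded error for speed.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X = rng.normal(size=(2000, 2))       # synthetic training points
queries = rng.normal(size=(200, 2))  # synthetic query points


def densities(**kwargs):
    """Fit a Gaussian KDE on X and return densities at the query points."""
    kde = KernelDensity(kernel="gaussian", bandwidth=0.5, **kwargs).fit(X)
    return np.exp(kde.score_samples(queries))  # score_samples returns log-density


# Exact results with two different tree configurations.
exact_kd = densities(algorithm="kd_tree", atol=0, rtol=0)
exact_ball = densities(algorithm="ball_tree", atol=0, rtol=0,
                       breadth_first=False, leaf_size=10)

# Approximate result with a relative tolerance of 1%.
approx = densities(algorithm="kd_tree", atol=0, rtol=1e-2)

# Largest relative disagreement between the two exact runs.
tree_gap = np.max(np.abs(exact_kd - exact_ball) / exact_kd)
# Largest relative error of the approximate run, measured against K_ret.
rel_err = np.max(np.abs(exact_kd - approx) / approx)
```

If the tolerances behave as documented, `tree_gap` should be at floating-point noise level and `rel_err` should stay within the requested `rtol`.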
Sorry for the naive questions but it would help my understanding if we could add a couple of words to clarify the docstring.
BTW, thanks @jakevdp for this great module!