Better documentation for LocationScale #1482
Instead one should use |
| Whereas this package already provides a large collection of common distributions out of box, there are still occasions where you want to create new distributions (*e.g* your application requires a special kind of distributions, or you want to contribute to this package). | ||
| **Note:** if you only want to change the location and scale of a univariate distribution, see [`LocationScale`](@ref). |
| **Note:** if you only want to change the location and scale of a univariate distribution, see [`LocationScale`](@ref). | |
| !!! note | |
| to change the location and scale of a univariate distribution, use `+` and `*`, see [`AffineDistribution`](@ref) for details. |
Please check the syntax locally to verify that everything looks good
This remark is too specific. One can define new distributions, but there are several ways to create “derived distributions”, such as truncated, location-scale, and mixture distributions, so there is no need to single one out here.
So perhaps a new subsection in "Create New Samplers and Distributions", called "Derived distributions", with an overview of all the ways to create new derived distributions?
I like that more, linked from https://github.com/JuliaStats/Distributions.jl/pull/1470/files#diff-36c852523b6a7513a10b8a6bae21f1037eaa27484efc9042f75bbda46d9f529bR10 where we also started using that language
Sounds like a good solution to me. I hate to simply leave the work to others, but I am unfortunately not the one to write such a section.
Feel free to use any part of my suggested section on scaling and shifting, posted in a comment below.
> So perhaps a new subsection in "Create New Samplers and Distributions"
This section is (at least currently) for documenting how one can implement completely new distributions and samplers according to the interface.
I think a separate section that talks about such derived distributions would be better.
There are already separate sections for some derived distributions, e.g. reshaped distributions, mixture models, and convolutions, and some are part of other sections such as truncated and product distributions. These could all be moved to or at least linked from such a section of derived distributions.
Yes, it shouldn't be a subsection of "Create New Samplers and Distributions"
| dof(d) # Get the degrees of freedom, i.e. ν | ||
| ``` | ||
| To create a TDist with a different location and scale, see `LocationScale`. |
| To create a TDist with a different location and scale, see `LocationScale`. | |
| To create a TDist with a different location and scale, use `+` and `*`, see `AffineDistribution` for details. |
People who read the source code probably have a terminal open and can do ?AffineDistribution
Would not help me much...
julia> using Distributions
(@v1.7) pkg> st Distributions
Status `C:\Users\densb\.julia\environments\v1.7\Project.toml`
[31c24e10] Distributions v0.25.37
help?> AffineDistribution
search:
Couldn't find AffineDistribution
Perhaps you meant MatrixDistribution
No documentation found.
Binding AffineDistribution does not exist.
AffineDistribution is not exported because you're not supposed to use the constructor. It's just a fallback, similar to Truncated or Product which one also should not use.
Right. So referring to it for documentation is a bad idea, right?
Cool! That is pretty neat. I am not technically well versed in the details, but here is my stab at writing such a section:

Scaling and shifting a distribution

Some distributions have parameters that allow scaling and shifting them. Examples include the mean and standard deviation of a […]

julia> d = TDist(10);
julia> scaled_d = d * 2;
julia> shifted_and_scaled_d = scaled_d + 2;
julia> shifted_and_scaled_d == d * 2 + 2
true

Note that the scaling occurs along the x-axis, just like the shift. This means that the scaling factor and shift play exactly the same role as the standard deviation and mean for a normal distribution, respectively.

Additionally, distributions that have scaling and shifting parameters can still be scaled and shifted:

julia> d1 = Normal(10, 10);
julia> d2 = Normal()*10 + 10; # Normal() defaults to mean=0 and stddev=1
julia> d1 == d2
true
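For reference, the standard location-scale relations behind "scaling along the x-axis" can be written out as follows. The symbols $X$, $Y$, $\mu$, and $\sigma$ are notation introduced here for illustration and do not come from the package docs; the sketch assumes a continuous $X$ with density $f_X$ and $\sigma > 0$.

```latex
Y = \mu + \sigma X, \quad \sigma > 0
\;\Longrightarrow\;
F_Y(y) = F_X\!\left(\frac{y - \mu}{\sigma}\right),
\qquad
f_Y(y) = \frac{1}{\sigma}\, f_X\!\left(\frac{y - \mu}{\sigma}\right),
\qquad
\mathbb{E}[Y] = \mu + \sigma\,\mathbb{E}[X],
\qquad
\operatorname{sd}(Y) = \sigma\,\operatorname{sd}(X).
```

In the `TDist` example above, `d * 2 + 2` corresponds to $\sigma = 2$ and $\mu = 2$. The moment identities hold whenever the corresponding moments of $X$ exist.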
julia> d1 = Normal(10, 10);
julia> d2 = Normal()*10 + 10;
julia> d1 == d2
true

This requires the reader to know the default arguments for the Normal distribution by heart, which is a bit weird. Overall, I'm not sure whether such a section adds much. The whole idea seems pretty intuitive.

EDIT: I do very much agree that adding a few more references would be a good idea. I've read large parts of the documentation about scaling recently and never spotted the […]
I added a comment, which should make things clear. The point is that you can simply scale and shift "standardized distributions", which is why writing it this way makes sense. If you disagree, I am fine with setting the parameters explicitly.
The idea is very intuitive, and it is really cool functionality! However, how should a user know that it is possible, and what should they look for to discover that functionality? Also, since I am suggesting a new section, it is very easy to skip it entirely if you already understand it. Overdocumentation is IMHO much better than underdocumentation, because missing docs hurt so much more than redundant docs. As I see it, the docs are for the dullest and least competent users imaginable. I do not feel that I am that user, but I would still very much appreciate such a section. I therefore see a need for it.
Where do you feel it would be best to add such references? The alternative to a new section, as I see it, would be to add instructions on how to scale and shift every univariate distribution in their docstring. This strikes me as a less elegant solution...
I second changing the documentation. The use of * and + for scaling and location changes is compact, but far from obvious.
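To make the `*` and `+` usage concrete, here is a minimal sketch, assuming a Distributions.jl version in which these operators are defined for univariate continuous distributions (as in the REPL snippets earlier in this thread). The scale `2` and shift `3` are arbitrary illustrative values, and `g` is a hypothetical variable name.

```julia
using Distributions

d = TDist(10)      # standard Student's t with ν = 10 degrees of freedom
g = 2 * d + 3      # scale by 2, then shift by 3 (illustrative values)

# Location and spread follow the usual affine rules.
@show mean(g) ≈ 3 + 2 * mean(d)
@show std(g) ≈ 2 * std(d)

# The density is stretched along the x-axis and rescaled by 1/scale.
@show pdf(g, 4.0) ≈ pdf(d, (4.0 - 3) / 2) / 2

# Quantiles transform the same way as the random variable itself.
@show quantile(g, 0.975) ≈ 3 + 2 * quantile(d, 0.975)
```

Each comparison uses `≈` rather than `==` so that the sketch does not depend on exact floating-point results or on the concrete type the package returns for `g`.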

In the statistics course I took last semester, I found early on that I have an issue with normalizing distributions and computing normalized test statistics, only to rescale the test statistic afterwards.
To that end, I looked for a way to change the location and scale of a TDist. I even went as far as trying to define my own generalized `GTDist`, based on https://juliastats.org/Distributions.jl/stable/extends/. But I never succeeded.

Lo and behold, there is a function that does exactly what I wanted all along. Or, more precisely, a type: `LocationScale`.

This PR aims to make `LocationScale` more discoverable by referencing it

1. in the "Create New Samplers and Distributions" section of the docs.
2. in the docstring for TDist.

I feel that 2) might be overkill. But at the same time, it was the only distribution that I found the need to scale, and I quickly encountered that need. I also think that overdocumenting is better than underdocumenting, as one cannot expect every user to read the entire documentation.

I do not know which (if any) other distributions `LocationScale` is particularly relevant for, but the same pointer should be added if there are.

These are my suggestions on how to make `LocationScale` more discoverable, but others are of course very welcome.