move stats functions from Base to StatsBase, move StatsBase to stdlib? #27140

Closed
JeffBezanson opened this issue May 17, 2018 · 6 comments
Labels
kind:excision Removal of code from Base or the repository stdlib Julia's standard library

Comments

@JeffBezanson
Sponsor Member

It doesn't seem ideal to have a tiny number of stats functions in Base, and also have a StatsBase package. These could be consolidated and moved to the stdlib, possibly as Statistics.jl.

@JeffBezanson JeffBezanson added kind:excision Removal of code from Base or the repository stdlib Julia's standard library labels May 17, 2018
@nalimilan
Member

nalimilan commented May 18, 2018

I've made that proposal before, so +1. A few things to discuss, some of which make the situation more complex:

  • Which functions should be moved to stdlib? I guess cor, cov, var, std and quantile, but probably not mean nor middle?
  • StatsBase currently depends on SortingAlgorithms (for radix sort), which would also have to be moved to the stdlib. I think that's a good idea anyway. There's also a dependency on DataStructures, but it's only used in one function (for heaps), so maybe we can work around it.
  • We plan to have (actually, update) a Stats.jl meta-package which will load all packages providing what people generally expect to have available by default for a statistical environment. So StatsBase is a better name than Statistics to avoid confusion; or we should call the meta-package AllStats, StatsEnvironment or something like that.
  • I'm not sure we are completely happy with all the APIs provided in StatsBase. I know stdlib isn't guaranteed to remain completely stable, but we could also move only parts of StatsBase to the stdlib, and keep the StatsBase package in parallel for some time.

EDIT: see also #27152 (comment)

@JeffBezanson
Sponsor Member Author

I think we should move everything listed under statistics in export.jl, which includes mean. Anything else seems too arbitrary.

@ViralBShah
Member

I would think that mean is more generally used and it could perhaps continue to be in Base as a reducer.

@fredrikekre
Member

Seems a bit iffy to have a *Base library as a stdlib. Why can't we just move them out completely to a package, like we did with e.g. FFTW, SpecialFunctions, QuadGK etc.?

fredrikekre added a commit that referenced this issue May 20, 2018
this commit removes cor, cov, median, median!,
middle, quantile, quantile!, std, stdm, var,
varm and linreg and moves them to StatsBase

fix #25571 (comment) (included in StatsBase.jl/#379)
fix #23769 (included in StatsBase.jl/#379)
fix #27140
fredrikekre added a commit that referenced this issue May 20, 2018
this commit removes cor, cov, median, median!,
middle, quantile, quantile!, std, stdm, var,
varm and linreg and moves them to StatsBase

fix #25571 (comment)
    (included in JuliaStats/StatsBase.jl#379)
fix #23769 (included in JuliaStats/StatsBase.jl#379)
fix #27140
@alanedelman
Contributor

Alright, I'm obviously 6 months late to this discussion, but I really miss mean. Can we tell users where to look for some of the most common functions instead of showing the standard error message? I'm thinking of undergrad users of Julia in classes, and new Julia users elsewhere.

@StefanKarpinski
Sponsor Member

using Statistics
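
For readers landing here later, a minimal sketch of what that looks like in practice. It assumes Julia 1.0 or later, where Statistics ships as a standard library; the sample vector `x` is made up for illustration.

```julia
# Assumes Julia 1.0 or later, where Statistics is a standard library.
using Statistics

# Hypothetical sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]

# `mean` is not defined in Base itself, so it has to come from Statistics.
@assert !isdefined(Base, :mean)

m  = mean(x)           # arithmetic mean
md = median(x)         # median
v  = var(x)            # sample variance (corrected, divides by n - 1)
s  = std(x)            # sample standard deviation, sqrt of the variance above
q  = quantile(x, 0.5)  # 0.5 quantile, which equals the median here

println((m, md, v, s, q))
```

Without the `using Statistics` line, calling `mean(x)` in a fresh session raises an `UndefVarError`, which is the standard error message mentioned above.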
