If rng is set, then shuffle=true ought to be implicit #258

Closed
ablaom opened this issue Oct 7, 2019 · 1 comment
Labels
enhancement New feature or request

Comments


ablaom commented Oct 7, 2019

In the resampling strategies, in partition, and perhaps elsewhere, there is an option to set shuffle=true and also an option to set rng to an integer or a specific RNG object. If the latter is set, then shuffle=true ought to be forced. Currently shuffle defaults to false, so if only rng is set, rng is ignored and no shuffling happens.

ablaom added the enhancement label Oct 7, 2019

ablaom commented Nov 27, 2019

Here, specifically, is the proposed logic for the partition function:


partition(rows::AbstractVector{Int}, fractions...; 
          shuffle=nothing, rng=Random.GLOBAL_RNG)

Splits the vector rows into a tuple of vectors whose lengths are
given by the corresponding fractions of length(rows). The last
fraction is not provided, as it is inferred from the preceding
ones. So, for example,

julia> partition(1:1000, 0.2, 0.7)
(1:200, 201:900, 901:1000)

Pre-shuffling of rows always occurs if rng is specified, unless
shuffle=false is also specified. If rng is an integer, then
MersenneTwister(rng) is used as the random number generator; otherwise
some AbstractRNG object is expected.

To use the global random generator it suffices to specify
shuffle=true.
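
For illustration only, here is a minimal sketch of how the rule described above could be written. It is not the MLJBase implementation. The name `partition_sketch` is hypothetical, and so is the choice of `rng=nothing` as the default: the signature above shows `rng=Random.GLOBAL_RNG`, but the sketch assumes a sentinel default so that "rng was specified" can be detected. It also assumes `Random.default_rng()` (Julia 1.3 or later) as the global generator, and does not validate the fractions.

```julia
using Random

# Hypothetical sketch of the proposed logic; not the MLJBase code.
function partition_sketch(rows::AbstractVector{Int}, fractions::Real...;
                          shuffle::Union{Nothing,Bool}=nothing,
                          rng::Union{Nothing,Integer,AbstractRNG}=nothing)
    # An explicit `shuffle` always wins; otherwise shuffle exactly
    # when `rng` was specified.
    do_shuffle = shuffle === nothing ? rng !== nothing : shuffle

    # Resolve `rng`: an integer seeds a MersenneTwister, and `nothing`
    # falls back to the global generator.
    generator = rng isa Integer ? MersenneTwister(rng) :
                rng === nothing ? Random.default_rng() : rng

    # `Random.shuffle` is qualified because the keyword `shuffle`
    # shadows the function name inside this body.
    rows = do_shuffle ? Random.shuffle(generator, rows) : rows

    # Cut `rows` at the cumulative fractions; the last part is whatever
    # remains.
    n = length(rows)
    parts = []
    start = 1
    cumulative = 0.0
    for f in fractions
        cumulative += f
        stop = round(Int, cumulative * n)
        push!(parts, rows[start:stop])
        start = stop + 1
    end
    push!(parts, rows[start:n])
    return Tuple(parts)
end

partition_sketch(1:1000, 0.2, 0.7)                          # no shuffling
partition_sketch(1:1000, 0.2, 0.7; rng=123)                 # shuffled, reproducible
partition_sketch(1:1000, 0.2, 0.7; shuffle=true)            # shuffled, global RNG
partition_sketch(1:1000, 0.2, 0.7; rng=123, shuffle=false)  # rng given, shuffle suppressed
```

With `shuffle=nothing` as the default, the function can tell "not specified" apart from an explicit `shuffle=false`, which is what lets `rng` imply shuffling without removing the opt-out.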


Todo:

  • partition (in MLJBase)
  • CV resampling strategy
  • Holdout resampling strategy
  • StratifiedCV
