changed random state definition as a temp fix... issue stemming from … #192

ZainNasrullah · 2020-05-18T22:12:59Z

As per our discussion, minor fix which resolves failure of running compare_all_models.py

The true issue seems to be in utility.py, where the sample_without_replacement function (called in generate_indices) produces inconsistent results with np.random.RandomState(42). Take a deeper look at this interaction to solve permanently.

…interaction between np.Random.RandomState and sample_without_replacement()

coveralls · 2020-05-18T22:26:20Z

Pull Request Test Coverage Report for Build 1495

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.02%) to 95.621%

Totals
Change from base Build 1476:	-0.02%
Covered Lines:	4411
Relevant Lines:	4613

💛 - Coveralls

Shihab-Shahriar · 2020-05-19T16:24:15Z

If the problem is in sample_without_replacement function, can't it be simply removed from generate_indices? Something like below:

def generate_indices(random_state, bootstrap, n_population, n_samples):
    return random_state.choice(n_population,size=n_samples,replace=bootstrap)

Couldn't test this, so sorry if I'm missing something.

ZainNasrullah · 2020-05-19T16:34:05Z

If the problem is in sample_without_replacement function, can't it be simply removed from generate_indices? Something like below:
def generate_indices(random_state, bootstrap, n_population, n_samples):
    return random_state.choice(n_population,size=n_samples,replace=bootstrap)    
Couldn't test this, so sorry if I'm missing something.

Hey Shihab, as you mentioned, this is a potential fix. However, it is a good idea for us to understand what the underlying problem is between sample_without_replacement and RandomState since it can have implications in other contexts.

Additionally, an issue with the proposed fix is that the API for these methods allows random_state to be an integer or a RandomState object; if the user passes in an integer, the proposed code will eventually hit an error. Although we can manually force the int to RandomState, I'll defer to @yzhao062 on design decisions which can have impacts beyond the scope of this model.

Shihab-Shahriar · 2020-05-19T16:48:36Z

If I understood correctly, current implementation forces random_state to be a RandomState object. This can be, I think, relaxed by adding random_state = check_random_state(random_state) line at the beginning.

I agree that investigating first why this error is thrown is a good idea.

ZainNasrullah · 2020-05-19T18:43:18Z

Taking a look at the implementation, both the fit and generate_bagging_indices methods call check_random_state(self.random_state) and return a RandomState object. For reasons I don't understand yet, this RandomState object seems to behave differently within sample_without_replacement when it is defined in compare_all_models.py versus when it is created by check_random_state.

You are correct though, your proposed fix would work but we would need to ensure that, in all uses of generate_indices, random_state is always input as a RandomState object or it gets converted into one.

changed random state definition as a temp fix... issue stemming from …

0aa0c38

…interaction between np.Random.RandomState and sample_without_replacement()

ZainNasrullah requested a review from yzhao062 May 18, 2020 22:12

yzhao062 mentioned this pull request May 18, 2020

Locally Selective Combination (LSCP) ValueError: x and y must have length at least 2 #180

Open

yzhao062 merged commit 5bcd52c into development May 23, 2020

yzhao062 deleted the issue-180-temp-fix branch May 23, 2020 20:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

changed random state definition as a temp fix... issue stemming from … #192

changed random state definition as a temp fix... issue stemming from … #192

ZainNasrullah commented May 18, 2020

coveralls commented May 18, 2020 •

edited

Loading

Shihab-Shahriar commented May 19, 2020 •

edited

Loading

ZainNasrullah commented May 19, 2020

Shihab-Shahriar commented May 19, 2020

ZainNasrullah commented May 19, 2020 •

edited

Loading

changed random state definition as a temp fix... issue stemming from … #192

changed random state definition as a temp fix... issue stemming from … #192

Conversation

ZainNasrullah commented May 18, 2020

coveralls commented May 18, 2020 • edited Loading

Pull Request Test Coverage Report for Build 1495

💛 - Coveralls

Shihab-Shahriar commented May 19, 2020 • edited Loading

ZainNasrullah commented May 19, 2020

Shihab-Shahriar commented May 19, 2020

ZainNasrullah commented May 19, 2020 • edited Loading

coveralls commented May 18, 2020 •

edited

Loading

Shihab-Shahriar commented May 19, 2020 •

edited

Loading

ZainNasrullah commented May 19, 2020 •

edited

Loading