Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to supply a random seed #153

Closed
oyamad opened this issue May 20, 2015 · 15 comments
Closed

Option to supply a random seed #153

oyamad opened this issue May 20, 2015 · 15 comments

Comments

@oyamad
Copy link
Member

oyamad commented May 20, 2015

Given the discussion in #147, I would like to discuss the way how to give the user the option to supply a random seed; we better fix a common interface and build a module that provides that interface.

A simple form would be something like this:

def some_function(main_arguments, random_state=None):
    """
    Parameters
    ----------
    ...

    random_state : scalar(int) or np.random.RandomState

    """
    if random_state is None or isinstance(random_state, int):
        random_state_ = np.random.RandomState(random_state)
    elif isinstance(random_state, np.random.RandomState):
        random_state_ = random_state
    else:
        raise ValueError('some message')

    # Generate a random number
    u = random_state_.random_sample()

For example, scikit-learn has a guideline on random numbers and a common routine check_random_state that converts the input into a np.random.RandomState instance.

@mmcky
Copy link
Contributor

mmcky commented May 22, 2015

I think this is a good approach. I like the idea of using random state object as in a recent test I had to ensure numpy's internal state was consistent before running two different functions for comparison which required setting the seed twice. This would avoid this problem. We should wrap the instance checks into a utility or support function as skikit-learn has done.

@mmcky
Copy link
Contributor

mmcky commented May 29, 2015

@oyamad I have pushed a branch called reorg-utilities within which I have included a copy of the scikit-learn function for checking random seeds as part of a broader reorg of the quantecon utilities. I think for future growth it is probably better they are contained in a subpackage which allows for categorization of tools while retaining a common api.

from quantecon.util import check_random_state

oyamad added a commit that referenced this issue May 29, 2015
@oyamad
Copy link
Member Author

oyamad commented May 29, 2015

@mmcky Thanks! I have incorporated it into the mc_random branch (#154).

@oyamad
Copy link
Member Author

oyamad commented May 29, 2015

We need to rewrite from quantecon.utilities import searchsorted in test_utilities.py to import util.array.searchsorted. We may rename the file to test_array.py. In which directory should that file go?

@mmcky
Copy link
Contributor

mmcky commented May 30, 2015

@oyamad Thanks for pointing that out.

  1. Migrated test_utilities.py to util\tests\test_array.py
  2. Also re: Reference you can use from quantecon.util import searchsorted as the __init__.py controls the api and I have imported it to the util namespace.

I have migrated the tests to a util/tests to keep the tests within the utilities subpackage. Since nosetests does a recursive search I think this is the best location.

Note: I know this hasn't been done with test_models as yet but it also probably should be moved to the models subpackge.

@oyamad
Copy link
Member Author

oyamad commented May 31, 2015

@mmcky Thanks!

It may be helpful if we have a template and provide it in a developer documentation.
Would it be like this?

from .util import check_random_state

def some_function(main_arguments, random_state=None):
    """
    Parameters
    ----------
    ...

    random_state : int or np.random.RandomState, optional
        Random seed (integer) or np.random.RandomState instance to set
        the initial state of the random number generator for
        reproducibility. If None, a randomly initialized RandomState is
        used.

    """
    random_state_ = check_random_state(random_state)  # Or `rng` or something else?

    # Generate a random number
    u = random_state_.random_sample()

(EDIT: according to #287 (comment))

@mmcky
Copy link
Contributor

mmcky commented May 31, 2015

@oyamad Good Idea - Agreed. I have added an action item in QuantEcon.site repo for addition of this to the developer documentation.

@oyamad
Copy link
Member Author

oyamad commented Jun 1, 2015

@mmcky Great, thanks.

Will you delete the comment "Or rng or something else?" after picking a "best" name for the np.random.RandomState instance? Would random_state_ be ok (is there any python convention regarding underscore at the end of a string?)? I guessrandom_state may be not good, for the case where one wants to use the input random_state more than once. Or should it be rng or prng? (I know there will be no single "best" solution, but it may become a focal point, and I am not good at choosing a good name..) Thanks!

@mmcky
Copy link
Contributor

mmcky commented Jun 1, 2015

@oyamad I am not sure I follow why you would add a _ to the name. I think random_state is a good descriptive name. As a rule I actually don't mind over writing the original input to produce a consistent internal reference object. Do you think there will be cases when there will be name conflicts within a function with random_state?

I would propose:

random_state = check_random_state(random_state)

If random_state is an int then random_state will be RandomState instance internally otherwise it will be the instance of RandomState that is passed in.

An alternative name would be just to use seed or random_seed - but I think random_state is more descriptive.

@oyamad
Copy link
Member Author

oyamad commented Jun 1, 2015

@mmcky You are right. random_state = check_random_state(random_state) is sufficient and appropriate.

I may have mixed it up with the case of a class; in a function there will be no case where one wants to initialize RandomState more than once. But even with a class, self.random_state/random_state would be just fine.

What about a possible template for a class? Would it be like this?

class SomeClass(object):
    """
    Parameters
    ----------
    ...

    random_state : int or np.random.RandomState, optional
        Random seed (integer) or np.random.RandomState instance to set
        the initial state of the random number generator for
        reproducibility. If None, a randomly initialized RandomState is
        used.

    """
    def __init__(self, main_inputs, random_state=None):
        self.random_state = random_state

    def simulate(self, main_arguments):
        random_state = check_random_state(self.random_state)

        # Generate a random number
        u = random_state.random_sample()

(EDIT: according to #287 (comment))

An issue is what function(s) should accept a random_state argument:

  • only __init__;
  • each method (and not __init__); or
  • both __init__ and each method.

@mmcky
Copy link
Contributor

mmcky commented Jun 2, 2015

@oyamad Good question - I think I would be more in favour of setting the seed once for an object. Therefore I would probably support including it in the __init__ function. I would probably also let it be an accessible and resettable property as you have done in the example above which allows:

SomeClassInstance.random_state = RandomState

I also wouldn't be against leaving this up to the developer to implement in which way they see fit re: __init__ and/or methods. But I think it is very important to have a consistent parameter name random_state.

Do you know of a case where an object would require two or more RandomStates within a class?

@oyamad
Copy link
Member Author

oyamad commented Jun 5, 2015

@mmcky Right, this shouldn't be considered as a "rule", and it will be up to the developer. This is just for discussion about what to write in a template on the developer documentation. For the moment let's just leave it as is, including the seed option only in __init__.

One may want to give different seeds for different simulation trials, but in that case, he/she can do SomeClassInstance.random_state = RandomState each time as you mention, or create new instances with different seeds across trials.

@mmcky
Copy link
Contributor

mmcky commented Jun 5, 2015

@oyamad Sounds good. I have added your example to an Issue on the QuantEcon.site() repository.
https://github.com/QuantEcon/QuantEcon.site/issues/25. I don't quite have time to update this today - but I will add your excellent examples to a section Style guide for using Random States on the following page: http://quantecon.org/developers/

oyamad added a commit that referenced this issue Aug 4, 2015
@oyamad
Copy link
Member Author

oyamad commented Aug 27, 2015

Option to supply random_state has been incorporated: examples MarkovChain.simulate, random.probvec.

I don't know if we want to add random_state to every function/method that involves randomness, but just to make a list:

  • arma.py
  • discrete_rv.py
  • lqcontrol.py
  • lqnash.py
  • lss.py
  • quad.py
  • markov/core.py
  • markov/random.py
  • random/utilities.py

@mmcky
Copy link
Contributor

mmcky commented Mar 14, 2016

See: https://github.com/QuantEcon/QuantEcon.site/issues/25 for action related to QuantEcon.site

@mmcky mmcky closed this as completed in #346 Oct 5, 2017
mmcky pushed a commit that referenced this issue Oct 5, 2017
* FEAT: Add random_state option to discrete_rv.py

* TEST: Add test for discrete_rv.py random state option

* FEAT: Add random_state option to lqcontrol.py

* TEST: Add test for lqcontrol.py random state option

* FEAT: Add random_state option to lqnash.py

* FIX: Make formatting of test_lqnash.py consistent with other tests

* FEAT: Add random_state option to lss.py

* TEST: Add test for lss.py random state option

* FEAT: Add random_state option to quad.py

* TEST: Add test for quad.py random state option

* FIX: Correct typo

* FIX: Correct the docstring for the random_state options
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants