Actions exploration #38

Open
methenol opened this issue Aug 10, 2018 · 4 comments
@methenol (Collaborator) commented Aug 10, 2018

I'm working outside of hypersearch right now, so these are probably not ideal parameters. The model seems to become a bit more tolerant of less-than-perfect parameters (and of the randomness in its initial state) when actions exploration is defined.

https://reinforce.io/blog/introduction-to-tensorforce/
actions_exploration=dict(
type='ornstein_uhlenbeck',
sigma=0.1,
mu=0.0,
theta=0.1
),
These parameters are from the example at the link above and are not optimized.
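For context, a rough sketch of where this dict would plug into a TensorForce agent spec (the states/actions/network values below are placeholders, not the trader's actual config, and keyword names varied across TensorForce 0.x releases):

from tensorforce.agents import PPOAgent

agent = PPOAgent(
    states=dict(type='float', shape=(10,)),   # placeholder state spec
    actions=dict(type='float', shape=(), min_value=-1., max_value=1.),  # placeholder action spec
    network=[dict(type='dense', size=64)],    # placeholder network
    actions_exploration=dict(
        type='ornstein_uhlenbeck',
        sigma=0.1,
        mu=0.0,
        theta=0.1
    )
)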

Any benefit to adding parameters for actions exploration to hypersearch?

@methenol (Collaborator, Author) commented Aug 12, 2018

Testing this modification to hypersearch.py. I had to clear the runs database, so it will be a while before I can tell whether it affected anything.

hypers['agent'] = {
    # 'states_preprocessing': None,
    # 'actions_exploration': None,

    'actions_exploration.type': 'ornstein_uhlenbeck',
    'actions_exploration.sigma': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.mu': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.theta': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    # 'reward_preprocessing': None,

    # I'm pretty sure we don't want to experiment any lower than .99 for non-terminal reward-types (which are 1.0).
    # .99^500 ~= .6%, so it loses value sooner than makes sense for our trading horizon. A trade now could affect
    # something 2-5k steps later. So .999 is more like it (5k steps ~= .6%)
    'discount': 1.,  # {
    #     'type': 'bounded',
    #     'vals': [.9, .99],
    #     'guess': .97
    # },
}

This is my first time tweaking the hypers; if there's a better way, let me know.

UPDATE 08/14/18: The above code is not compatible with v0.2 as-is. The ranges to be searched are valid, but the syntax does not match the hyperopt implementation in v0.2.
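For reference, hyperopt consumes a search space built from hp.* expressions rather than the 'bounded'/'vals'/'guess' dicts above; roughly (a sketch only, with run_trial standing in as a hypothetical objective that trains an agent and returns a loss):

from hyperopt import Trials, fmin, hp, tpe

space = {
    'actions_exploration.sigma': hp.uniform('exploration.sigma', 0., 1.),
    'actions_exploration.mu': hp.uniform('exploration.mu', 0., 1.),
    'actions_exploration.theta': hp.uniform('exploration.theta', 0., 1.),
}

# run_trial is hypothetical here, not the actual hypersearch.py wiring
best = fmin(fn=run_trial, space=space, algo=tpe.suggest, max_evals=100, trials=Trials())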

@methenol (Collaborator, Author) commented Aug 15, 2018

This is able to run for v0.2. I'd like it to toggle on/off like the baseline section; still working on that (one possible shape is sketched after the update below).

    'actions_exploration': {
        'type': 'ornstein_uhlenbeck',
        'sigma': hp.quniform('exploration.sigma', 0, 1, 0.05),
        'mu': hp.quniform('exploration.mu', 0, 1, 0.05),
        'theta': hp.quniform('exploration.theta', 0, 1, 0.05)
    },

Updated 08/19/18 to use quniform.
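One possibility for the on/off toggle mentioned above is hyperopt's hp.choice, letting the search pick between no exploration and the OU block (an illustration only, not necessarily how the baseline section is wired):

    'actions_exploration': hp.choice('exploration.enabled', [
        None,  # exploration off
        {
            'type': 'ornstein_uhlenbeck',
            'sigma': hp.quniform('exploration.sigma', 0, 1, 0.05),
            'mu': hp.quniform('exploration.mu', 0, 1, 0.05),
            'theta': hp.quniform('exploration.theta', 0, 1, 0.05)
        }
    ]),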

A brief explanation of the parameters, from here:
https://www.maplesoft.com/support/help/maple/view.aspx?path=Finance%2FOrnsteinUhlenbeckProcess
The parameter theta is the speed of mean reversion, mu is the long-running mean, and sigma is the volatility.
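As a quick illustration of how the three parameters interact, a toy discretized simulation of the process (values here are illustrative, not the ones used in the search):

import numpy as np

def ou_path(theta=0.1, mu=0.0, sigma=0.1, dt=1.0, n_steps=500, x0=0.0, seed=0):
    # x_{t+1} = x_t + theta * (mu - x_t) * dt + sigma * sqrt(dt) * N(0, 1)
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = x[t - 1] + theta * (mu - x[t - 1]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

noise = ou_path()  # larger theta pulls back toward mu faster; larger sigma widens the noise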

@lefnire (Owner) commented Aug 16, 2018

Feel free to add it in a pull request, or even just commit to master if you feel confident about it.

@methenol (Collaborator, Author) commented:

Going to try to get the values a little more realistic before submitting a PR for it. Letting the hypersearch run for a bit so it does its thing.

methenol self-assigned this Aug 17, 2018