Actions exploration #38

Open
methenol opened this issue Aug 10, 2018 · 4 comments
@methenol (Collaborator) commented Aug 10, 2018

I'm working outside of hypersearch right now, so these are probably not ideal parameters. The model seems to become a bit more tolerant of less-than-perfect parameters (and of the randomness in its initial state) when actions exploration is defined.

https://reinforce.io/blog/introduction-to-tensorforce/
actions_exploration=dict(
type='ornstein_uhlenbeck',
sigma=0.1,
mu=0.0,
theta=0.1
),
These parameters are from the example at the link above and are not optimized.
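For context, a rough sketch of where this dict would plug into a TensorForce agent spec (the states/actions/network values below are placeholders, not the trader's actual config, and keyword names varied across TensorForce 0.x releases):

from tensorforce.agents import PPOAgent

agent = PPOAgent(
    states=dict(type='float', shape=(10,)),   # placeholder state spec
    actions=dict(type='float', shape=(), min_value=-1., max_value=1.),  # placeholder action spec
    network=[dict(type='dense', size=64)],    # placeholder network
    actions_exploration=dict(
        type='ornstein_uhlenbeck',
        sigma=0.1,
        mu=0.0,
        theta=0.1
    )
)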

Any benefit to adding parameters for actions exploration to hypersearch?

@methenol (Collaborator, Author) commented Aug 12, 2018

Testing this modification to hypersearch.py. I had to clear the runs database, so it will be a while before I can tell whether it affected anything.

hypers['agent'] = {
    # 'states_preprocessing': None,
    # 'actions_exploration': None,

    'actions_exploration.type': 'ornstein_uhlenbeck',
    'actions_exploration.sigma': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.mu': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    'actions_exploration.theta': {
        'type': 'bounded',
        'vals': [0., 1.],
        'guess': .2,
        'hydrate': min_threshold(.05, None)
    },
    # 'reward_preprocessing': None,

    # I'm pretty sure we don't want to experiment any lower than .99 for non-terminal reward-types (which are 1.0).
    # .99^500 ~= .6%, so it loses value sooner than makes sense for our trading horizon. A trade now could affect
    # something 2-5k steps later. So .999 is more like it (5k steps ~= .6%)
    'discount': 1.,  # {
    #     'type': 'bounded',
    #     'vals': [.9, .99],
    #     'guess': .97
    # },
}

This is my first time tweaking the hypers; if there's a better way, let me know.

UPDATE 08/14/18: The above code is not compatible with v0.2 as-is. The ranges to be searched are valid, but the syntax does not match the hyperopt implementation in v0.2.
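For reference, hyperopt consumes a search space built from hp.* expressions rather than the 'bounded'/'vals'/'guess' dicts above; roughly (a sketch only, with run_trial standing in as a hypothetical objective that trains an agent and returns a loss):

from hyperopt import Trials, fmin, hp, tpe

space = {
    'actions_exploration.sigma': hp.uniform('exploration.sigma', 0., 1.),
    'actions_exploration.mu': hp.uniform('exploration.mu', 0., 1.),
    'actions_exploration.theta': hp.uniform('exploration.theta', 0., 1.),
}

# run_trial is hypothetical here, not the actual hypersearch.py wiring
best = fmin(fn=run_trial, space=space, algo=tpe.suggest, max_evals=100, trials=Trials())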

@methenol (Collaborator, Author) commented Aug 15, 2018

This is able to run for v0.2. I'd like it to toggle on/off like the baseline section; still working on that (one possible shape is sketched after the update below).

    'actions_exploration': {
        'type': 'ornstein_uhlenbeck',
        'sigma': hp.quniform('exploration.sigma', 0, 1, 0.05),
        'mu': hp.quniform('exploration.mu', 0, 1, 0.05),
        'theta': hp.quniform('exploration.theta', 0, 1, 0.05)
    },

Updated 08/19/18 to use quniform.
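One possibility for the on/off toggle mentioned above is hyperopt's hp.choice, letting the search pick between no exploration and the OU block (an illustration only, not necessarily how the baseline section is wired):

    'actions_exploration': hp.choice('exploration.enabled', [
        None,  # exploration off
        {
            'type': 'ornstein_uhlenbeck',
            'sigma': hp.quniform('exploration.sigma', 0, 1, 0.05),
            'mu': hp.quniform('exploration.mu', 0, 1, 0.05),
            'theta': hp.quniform('exploration.theta', 0, 1, 0.05)
        }
    ]),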

A brief explanation of the parameters, from here:
https://www.maplesoft.com/support/help/maple/view.aspx?path=Finance%2FOrnsteinUhlenbeckProcess
The parameter theta is the speed of mean reversion, mu is the long-running mean, and sigma is the volatility.
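As a quick illustration of how the three parameters interact, a toy discretized simulation of the process (values here are illustrative, not the ones used in the search):

import numpy as np

def ou_path(theta=0.1, mu=0.0, sigma=0.1, dt=1.0, n_steps=500, x0=0.0, seed=0):
    # x_{t+1} = x_t + theta * (mu - x_t) * dt + sigma * sqrt(dt) * N(0, 1)
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(1, n_steps):
        x[t] = x[t - 1] + theta * (mu - x[t - 1]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

noise = ou_path()  # larger theta pulls back toward mu faster; larger sigma widens the noise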

@lefnire (Owner) commented Aug 16, 2018

Feel free to add it in a pull request, or even just commit to master if you feel confident about it.

@methenol (Collaborator, Author) commented:

Going to try to get the values a little more realistic before submitting a PR for it. Letting the hypersearch run for a bit so it does its thing.

methenol self-assigned this Aug 17, 2018