Async n-step q-learning and one step sarsa #1084

Merged: 4 commits, Aug 23, 2017

@ShangtongZhang
Member

ShangtongZhang commented Aug 8, 2017

No description provided.

@zoq

I see what you meant with "share many common code snippets", and I think it's fine to duplicate some code here, to improve the readability.

ShangtongZhang added some commits Aug 10, 2017

ShangtongZhang commented Aug 18, 2017

Are there any more comments on this PR?

...lpack/methods/reinforcement_learning/worker/n_step_q_learning_worker.hpp
using ActionType = typename EnvironmentType::Action;
using TransitionType = std::tuple<StateType, ActionType, double, StateType>;
/**

@zoq

zoq Aug 18, 2017

Member

Would you mind adding a method description here? Something like this should do:

Construct N step Q-Learning worker with the given parameters and environment.
network.Backward(actionValue, gradients);
// Accumulate gradients.
totalGradients += gradients;

@zoq

zoq Aug 18, 2017

Member

We should initialize totalGradients with zero.
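
To illustrate why the starting value matters for the `+=` accumulation quoted above, here is a minimal sketch that uses only the standard library. `Gradient` and `AccumulateGradients` are hypothetical names made up for this illustration and are not part of mlpack; the Armadillo calls named in the comment (`zeros()` and `arma::fill::zeros`) are the ones I would expect to be relevant, though the exact fix chosen in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a gradient matrix, stored as a flat buffer.
using Gradient = std::vector<double>;

// Sum per-transition gradients into one accumulator.
// The accumulator starts at exactly zero, so the result depends only on the
// gradients that are added, never on whatever the storage held before.
Gradient AccumulateGradients(const std::vector<Gradient>& perStepGradients,
                             const std::size_t numParameters)
{
  // Explicit zero fill. With Armadillo, the analogous step would be calling
  // totalGradients.zeros() or constructing with arma::fill::zeros.
  Gradient totalGradients(numParameters, 0.0);

  for (const Gradient& gradients : perStepGradients)
  {
    // Every per-step gradient must match the parameter count.
    assert(gradients.size() == numParameters);
    for (std::size_t i = 0; i < numParameters; ++i)
      totalGradients[i] += gradients[i];
  }

  return totalGradients;
}
```

If the accumulator were left unfilled, the first `+=` would add the gradient to arbitrary memory contents, and every later update would inherit that error.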

src/mlpack/methods/reinforcement_learning/worker/one_step_sarsa_worker.hpp
if (terminal || pendingIndex >= config.UpdateInterval())
{
// Initialize the gradient storage.
arma::mat totalGradients(learningNetwork.Parameters().n_rows,

@zoq

zoq Aug 18, 2017

Member

We should initialize totalGradients with zeros here.
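
For context on what gets summed into `totalGradients` in this worker, the following is a rough sketch of the textbook one-step SARSA target together with the flush condition from the snippet above. `SarsaTransition`, `SarsaTarget`, and `ShouldUpdate` are hypothetical helpers written for this illustration; the worker in the PR uses its own transition type and member functions, so treat this as a simplified model rather than the actual implementation.

```cpp
#include <cstddef>

// Hypothetical, simplified transition record (not the PR's tuple type).
struct SarsaTransition
{
  double reward;          // Reward received after taking the action.
  double nextActionValue; // Q(s', a') for the action actually chosen in s'.
  bool terminal;          // True if s' ended the episode.
};

// One-step SARSA target: r + discount * Q(s', a'), with no bootstrap term
// when s' is terminal.
double SarsaTarget(const SarsaTransition& t, const double discount)
{
  return t.terminal ? t.reward : t.reward + discount * t.nextActionValue;
}

// Mirrors the quoted condition: flush the pending transitions when the
// episode ends or once enough of them have been collected.
bool ShouldUpdate(const bool terminal,
                  const std::size_t pendingIndex,
                  const std::size_t updateInterval)
{
  return terminal || pendingIndex >= updateInterval;
}
```

Each pending transition contributes one gradient computed against its target, which is why the accumulator has to start from zero before the loop over pending transitions begins.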


ShangtongZhang commented Aug 19, 2017

Thanks for your feedback. Hope it's ready to merge now.

@zoq

zoq approved these changes Aug 20, 2017

I think this is ready to go; let's go ahead and wait two more days before merging it in, in case anyone else has comments.

@zoq zoq merged commit c8110ab into mlpack:master Aug 23, 2017

4 checks passed

Static Code Analysis Checks: Build finished.
Style Checks: Build finished.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed


zoq commented Aug 23, 2017

Thanks for another great contribution.
