
Merge pull request #140 from pockerman/ros_examples
Add new example
pockerman committed Apr 1, 2022
2 parents 110b486 + f5a627a commit 34455fb
Showing 3 changed files with 91 additions and 2 deletions.
88 changes: 88 additions & 0 deletions docs/source/ExamplesCpp/rl/rl_example_10.rst
@@ -0,0 +1,88 @@
Q-learning on ``CliffWalking-v0`` (C++)
=======================================

Overview
--------

This example trains a tabular Q-learning agent on ``CliffWalking-v0`` through the ``gymfcpp`` bindings to OpenAI Gym. The environment is a 4x12 grid world with 48 states and 4 actions; every move yields a reward of -1, and stepping into the cliff yields -100 and returns the agent to the start.
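The update rule driving the example is standard tabular Q-learning: after each transition :math:`(s, a, r, s')` the action-value table is adjusted using the learning rate :math:`\eta` (``eta`` in the configuration below) and the discount factor :math:`\gamma` (``gamma``):

.. math::

   Q(s, a) \leftarrow Q(s, a) + \eta\left[r + \gamma \max_{a'} Q(s', a') - Q(s, a)\right]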

Code
----

.. code-block:: cpp

   #include "cubeai/base/cubeai_types.h"
   #include "cubeai/rl/algorithms/td/q_learning.h"
   #include "cubeai/rl/policies/epsilon_greedy_policy.h"
   #include "cubeai/rl/trainers/rl_serial_agent_trainer.h"
   #include "gymfcpp/gymfcpp_types.h"
   #include "gymfcpp/cliff_world_env.h"
   #include "gymfcpp/time_step.h"

   #include <boost/python.hpp> // Py_Initialize and boost::python::import used in main()
   #include <deque>
   #include <iostream>
.. code-block:: cpp

   namespace rl_example_10{

   using cubeai::real_t;
   using cubeai::uint_t;
   using cubeai::rl::policies::EpsilonGreedyPolicy;
   using cubeai::rl::algos::td::QLearning;
   using cubeai::rl::algos::td::QLearningConfig;
   using cubeai::rl::policies::EpsilonDecayOption;
   using cubeai::rl::RLSerialAgentTrainer;
   using cubeai::rl::RLSerialTrainerConfig;

   }
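``EpsilonDecayOption::INVERSE_STEP`` is taken here to mean that the exploration rate shrinks roughly as :math:`\epsilon_0 / k` with the episode index :math:`k`. The sketch below illustrates that idea together with generic :math:`\epsilon`-greedy action selection; it is an independent illustration under those assumptions, not the cubeai implementation, and ``inverse_step_epsilon`` and ``epsilon_greedy`` are hypothetical helpers.

.. code-block:: cpp

   // Illustration only: generic epsilon-greedy selection with an
   // inverse-step schedule. The actual cubeai EpsilonGreedyPolicy
   // may differ in its exact decay formula.
   #include <algorithm>
   #include <cstddef>
   #include <random>
   #include <vector>

   double inverse_step_epsilon(double eps0, std::size_t episode,
                               double eps_min = 0.01){
       // epsilon_k = eps0 / (k + 1), clipped from below at eps_min
       return std::max(eps_min, eps0 / static_cast<double>(episode + 1));
   }

   std::size_t epsilon_greedy(const std::vector<double>& q_row,
                              double eps, std::mt19937& rng){
       // q_row holds the tabulated action values for one state (non-empty)
       std::uniform_real_distribution<double> coin(0.0, 1.0);
       if(coin(rng) < eps){
           std::uniform_int_distribution<std::size_t> pick(0, q_row.size() - 1);
           return pick(rng); // explore: uniform random action
       }
       // exploit: arg max over the action values
       return static_cast<std::size_t>(
           std::distance(q_row.begin(),
                         std::max_element(q_row.begin(), q_row.end())));
   }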
.. code-block:: cpp

   int main(){

       using namespace rl_example_10;

       try{

           // start the embedded Python interpreter used by the gymfcpp bindings
           Py_Initialize();
           auto gym_module = boost::python::import("__main__");
           auto gym_namespace = gym_module.attr("__dict__");

           gymfcpp::CliffWorld env("v0", gym_namespace);
           env.make();

           std::cout<<"Number of states="<<env.n_states()<<std::endl;
           std::cout<<"Number of actions="<<env.n_actions()<<std::endl;

           // fully exploratory start (epsilon = 1.0) with inverse-step decay
           EpsilonGreedyPolicy policy(1.0, env.n_actions(), EpsilonDecayOption::INVERSE_STEP);

           QLearningConfig qlearn_config;
           qlearn_config.gamma = 1.0;   // undiscounted episodic task
           qlearn_config.eta = 0.01;    // learning rate
           qlearn_config.tolerance = 1.0e-8;
           qlearn_config.max_num_iterations_per_episode = 1000;
           qlearn_config.path = "qlearning_cliff_walking_v0.csv";

           QLearning<gymfcpp::CliffWorld, EpsilonGreedyPolicy> algorithm(qlearn_config, policy);

           // aggregate initialization; see RLSerialTrainerConfig for the field meanings
           RLSerialTrainerConfig trainer_config = {10, 10000, 1.0e-8};

           RLSerialAgentTrainer<gymfcpp::CliffWorld,
                                QLearning<gymfcpp::CliffWorld, EpsilonGreedyPolicy>> trainer(trainer_config, algorithm);

           auto info = trainer.train(env);
           std::cout<<info<<std::endl;
       }
       catch(std::exception& e){
           std::cout<<e.what()<<std::endl;
       }
       catch(...){
           std::cout<<"Unknown exception occurred"<<std::endl;
       }

       return 0;
   }
Results
-------
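No training output is recorded in this commit. For context only: the optimal ``CliffWalking-v0`` policy reaches the goal in 13 moves along the cliff edge, so under the undiscounted setting above (``gamma = 1.0``) the best achievable episode return is

.. math::

   G = \sum_{t=1}^{13}(-1) = -13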
2 changes: 1 addition & 1 deletion docs/source/ExamplesCpp/rl/rl_example_9.rst
@@ -1,5 +1,5 @@
SARSA on ``CliffWalking-v0`` (C++)
-==========================================
+==================================

Overview
--------
3 changes: 2 additions & 1 deletion docs/source/rl_examples.rst
@@ -15,8 +15,9 @@ The following is a list of reinforcement learning examples a user can go through
ExamplesCpp/rl/rl_example_8
Examples/Dp/value_iteration_frozen_lake_v0
ExamplesCpp/rl/rl_example_9
Examples/Td/q_learning_frozen_lake_v0
ExamplesCpp/rl/rl_example_10
Examples/Td/q_learning_cliff_walking_v0
Examples/Td/q_learning_frozen_lake_v0
Examples/Td/q_learning_cart_pole_v0
Examples/Td/double_q_learning_cart_pole_v0
Examples/Td/sarsa_cart_pole_v0
