Q-learning + Arduino Uno based robot
The aim of this project was to create a model, using reinforcement learning, that is able to keep a uniaxial robot balanced.
The project uses a communication interface cloned from another repository.
The robot (figure below) is a physical uniaxial scooter with an Arduino Uno microcontroller and a shield. Communication is carried over a serial port emulated with a USB adapter. The robot periodically sends readings from its onboard gyroscope and accelerometer; in response, the agent can set the rotation speed of the wheels through the serial port.
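The exchange described above might look roughly like the following sketch, assuming a pyserial connection; the port, baud rate, and message format are hypothetical placeholders, since the actual framing is defined by the cloned firmware and the `RobotInterface` class.

```python
# Hypothetical sketch of the serial exchange (pyserial). Port, baud rate, and
# message format are assumptions, not the firmware's actual protocol.
import serial

ser = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)

# Read one periodic sensor report, e.g. comma-separated gyro/accelerometer values.
line = ser.readline().decode().strip()
readings = [float(v) for v in line.split(",")] if line else []

# Respond by setting the wheel rotation speed chosen by the agent.
wheel_speed = 42
ser.write(f"{wheel_speed}\n".encode())
```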
Fig. 1: Robot (credits: github.com/kwanty)

The project consists of three major parts:
- ML
- firmware (cloned)
- communication interface (cloned)
The firmware (`/firmware`) and the communication interface (`/sbr-qt` and `sbr-py/util`) are part of another repository, created along with the robot.
The machine-learning part can be found in `/sbr-py/ml`. All ML classes are located inside the `/sbr-py/ml/source` directory.
The agent class `RobotModel` used by the ML code is declared in `TF_interface.py`. It communicates with the robot using the `RobotInterface` class defined in `robot_interface.py`. `QModel` defines the policy and other methods used in Q-learning.
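Pieced together from the layout above, the wiring between these classes might look like the following sketch; only the class and file names come from the repository, while the import paths and constructor signatures are assumptions.

```python
# Hypothetical wiring of the classes; import paths and constructor arguments
# are assumptions based on the directory layout described above.
from robot_interface import RobotInterface
from TF_interface import RobotModel

interface = RobotInterface()   # manages the serial link to the physical robot
agent = RobotModel(interface)  # the agent talks to the robot through the interface
```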
The core methods used to train the model are defined in `/sbr-py/ml/tests/test_robot_model.py` (a schematic sketch of this train-then-resume pattern follows the list):

- `test_model_qlearn_real()` trains for `n_episodes` and saves the results inside `./tmp`.
- `test_load_qlearn_real()` loads the saved results and continues learning until `n_episodes` is reached.
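A schematic of what these two tests implement might look like this; the episode logic, file name, and serialization format below are simplified assumptions, not the repository's actual code.

```python
# Schematic train-then-resume loop; run_episode(), the file name, and the
# pickle format are simplified assumptions.
import pickle

def run_episode(q_table):
    # Placeholder: one balancing episode interacting with the robot.
    return 0.0  # episode reward

def train(q_table, n_episodes, path="./tmp/qtable.pkl"):
    for _ in range(n_episodes):
        run_episode(q_table)
    with open(path, "wb") as f:       # results land inside ./tmp
        pickle.dump(q_table, f)

def resume(n_episodes, path="./tmp/qtable.pkl"):
    with open(path, "rb") as f:       # load previously saved results...
        q_table = pickle.load(f)
    train(q_table, n_episodes, path)  # ...and continue learning
```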
We selected Q-learning as the learning technique for this project. The main benefits of this approach were:
- simplicity of use
- relatively short learning time
Depending on the reward and policy used, very different results were obtained. The reward used to obtain the final results was a function of `s(n)`, the swing as a function of step; `n`, the current step; and `Δt`, the time measured from the start of the episode.
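For reference, the core of tabular Q-learning is the update rule Q(s,a) ← Q(s,a) + α(r + γ·max over a' of Q(s',a') − Q(s,a)). Below is a minimal self-contained sketch of the generic algorithm with illustrative hyperparameters; it is not the project's actual `QModel` implementation.

```python
# Minimal tabular Q-learning sketch; state/action sizes and hyperparameters
# are illustrative, not the project's actual settings.
import random

n_states, n_actions = 10, 3
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount factor, exploration
Q = [[0.0] * n_actions for _ in range(n_states)]

def choose_action(s):
    # Epsilon-greedy policy over the Q-table.
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[s][a])

def q_update(s, a, r, s_next):
    # Q(s,a) += alpha * (r + gamma * max_a' Q(s_next, a') - Q(s,a))
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
```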
The chart below presents the reward over episodes; the bigger the reward, the longer the robot balanced on its own.
The code was tested on Python 3.8.
After installing all necessary Python modules, navigate to the tests directory:

```
cd ./sbr-py/ml/tests
```

and run the `test_model_qlearn_real()` method defined inside `test_robot_model.py`:

```
python test_robot_model.py TestRobotInterface.test_model_qlearn_real
```
Learning results are automatically saved inside the `./tmp` directory.
In order to load the results and continue learning:

- set an appropriate `n_episodes`
- change the input filename
- run `python test_robot_model.py TestRobotInterface.test_load_qlearn_real`
To plot the obtained results, navigate to `/sbr-py/util`, change the input filename, and run:

```
python plot.py test_plot_reward
```
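If you prefer to plot without `plot.py`, a generic matplotlib sketch along these lines should work, assuming the per-episode rewards can be loaded into a list (the loading step is omitted here):

```python
# Generic reward-per-episode plot; the rewards list below is placeholder data,
# in practice it would be loaded from the files saved in ./tmp.
import matplotlib.pyplot as plt

rewards = [1.0, 2.5, 2.1, 4.0, 5.3]
plt.plot(range(len(rewards)), rewards)
plt.xlabel("episode")
plt.ylabel("reward")
plt.title("Reward over episodes")
plt.show()
```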