Predictive-Model Delay-Correction Reinforcement Learning

Abstract

Local-remote systems enable robots to perform complex tasks in hazardous environments such as space and nuclear power stations. However, mapping positions between local and remote devices can be challenging because time delays compromise system performance and stability. Improving the synchronicity and stability of local-remote systems is crucial to enabling robots to interact with environments at greater distances and under highly challenging network conditions (e.g. time delay). We propose an adaptive control method that uses reinforcement learning to address the problem of time-delayed control. By adjusting controller parameters in real time, this adaptive controller compensates for stochastic delays and improves synchronicity between local and remote robotic manipulators.
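As a rough illustration of what "adjusting controller parameters in real time" can look like, the sketch below shows a PD controller whose gains are overwritten between control steps. It is a minimal sketch under stated assumptions, not the controller from this repository: the class `AdaptivePD`, its method names and the example gain values are all hypothetical, and in an RL setting the new gains would be supplied by the policy's action.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePD:
    """PD controller whose gains can be changed at every step (hypothetical)."""
    kp: float = 1.0
    kd: float = 0.1
    prev_error: float = 0.0

    def set_gains(self, kp: float, kd: float) -> None:
        # In an RL setting these values would come from the policy's action.
        self.kp, self.kd = kp, kd

    def control(self, target: float, measured: float, dt: float) -> float:
        # Proportional term on the error, derivative term on its finite difference.
        error = target - measured
        d_error = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.kd * d_error


pd = AdaptivePD()
u1 = pd.control(target=1.0, measured=0.0, dt=0.1)  # uses the initial gains
pd.set_gains(kp=2.0, kd=0.0)                       # gains adapted between steps
u2 = pd.control(target=1.0, measured=0.5, dt=0.1)  # uses the new gains
```

Keeping the gains as plain attributes means the controller itself stays stateless apart from the previous error, so an external learner can retune it at any step.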

To increase the performance of the adaptive PD controller, we develop a model-based reinforcement learning technique that efficiently incorporates multi-step delays into the learning framework. Using the proposed technique, the local-remote system is stabilised for stochastic communication time delays of up to 290 ms. The results show that the proposed model-based reinforcement learning method outperforms both Soft Actor-Critic and augmented-state Soft Actor-Critic.
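One common way to use a predictive model against a multi-step action delay is to take the most recent (stale) observation and roll a one-step dynamics model forward through the actions that have been sent but not yet applied. The sketch below illustrates that general idea only; it is an assumption about the approach, not the code in `PMDC_wrapper.py`. The function `predict_current_state`, the additive `toy_model` and the numbers used are all hypothetical stand-ins for a learned model and real data.

```python
from collections import deque
from typing import Callable, Sequence


def predict_current_state(
    delayed_state: float,
    pending_actions: Sequence[float],
    model: Callable[[float, float], float],
) -> float:
    """Roll a one-step model forward through actions not yet reflected in the state."""
    state = delayed_state
    for action in pending_actions:
        state = model(state, action)
    return state


def toy_model(state: float, action: float) -> float:
    # Hypothetical stand-in for a learned dynamics model: s' = s + a.
    return state + action


DELAY = 3  # hypothetical delay, in environment steps
pending = deque([0.0] * DELAY, maxlen=DELAY)  # actions still "in flight"
for a in (1.0, 2.0, 3.0):
    pending.append(a)  # the oldest entry drops out automatically

estimate = predict_current_state(10.0, pending, toy_model)
print(estimate)  # 16.0
```

The agent can then act on `estimate` rather than on the stale observation, which is what makes the delay look shorter from the policy's point of view.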

Setup

pip install -r ./requirements.txt

Run

python delay_correcting_training.py

To use PMDC on a custom environment, wrap the environment in the PMDC wrapper, which trains on and corrects for constant action delays. To handle stochastic delays, also apply the Augmented wrapper with the desired stochastic delay range.

import gym
from wrappers_rd import AugmentedRandomDelayWrapper, UnseenRandomDelayWrapper
from PMDC_wrapper import PMDC

# Impose the action delay, then wrap in PMDC to correct for it
delayed_env = UnseenRandomDelayWrapper(gym.make(env_id), act_delay_range=range(ACT_DELAY - 1, ACT_DELAY))
env = PMDC(delayed_env, delay=ACT_DELAY, env_id=env_id, n_models=n_models)
# Add stochastic observation delays on top
env = AugmentedRandomDelayWrapper(env, obs_delay_range=range(OBS_DELAY, OBS_DELAY + DELAY_RANGE))

Here, ACT_DELAY, OBS_DELAY and DELAY_RANGE set the degree of delay imposed on the environment.
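To see which delay values the two `range(...)` expressions actually contain, it can help to evaluate them on their own. The values chosen below for `ACT_DELAY`, `OBS_DELAY` and `DELAY_RANGE` are hypothetical and are not defaults from this repository; how the wrappers interpret each value is defined in `wrappers_rd.py`.

```python
# Hypothetical example values; pick your own for a real experiment.
ACT_DELAY = 3
OBS_DELAY = 1
DELAY_RANGE = 2

# Same expressions as in the wrapper calls.
act_delay_range = range(ACT_DELAY - 1, ACT_DELAY)
obs_delay_range = range(OBS_DELAY, OBS_DELAY + DELAY_RANGE)

print(list(act_delay_range))  # [2]
print(list(obs_delay_range))  # [1, 2]
```

Because Python ranges exclude their upper bound, the action range always holds the single value `ACT_DELAY - 1`, which is consistent with a constant action delay, while the observation range holds `DELAY_RANGE` consecutive values starting at `OBS_DELAY`.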

The random delay wrapper used to impose random action and observation delays is adapted from Bouteiller et al., "Reinforcement Learning with Random Delays" (arXiv, GitHub).
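For readers new to delayed environments, the following sketch shows the basic mechanism behind an action delay: actions enter a FIFO queue and only reach the environment a fixed number of steps later. This is a simplified, constant-delay illustration and not the implementation in `wrappers_rd.py` or in Bouteiller et al.; the class `ActionDelayBuffer` and its default action of `0.0` are hypothetical.

```python
from collections import deque


class ActionDelayBuffer:
    """FIFO buffer that releases each action `delay` steps after it is pushed."""

    def __init__(self, delay: int, default_action: float = 0.0):
        # Pre-fill so that the first `delay` steps apply the default action.
        self.queue = deque([default_action] * delay)

    def push(self, action: float) -> float:
        """Enqueue the new action and return the one that is applied now."""
        self.queue.append(action)
        return self.queue.popleft()


buf = ActionDelayBuffer(delay=2)
applied = [buf.push(a) for a in (1.0, 2.0, 3.0, 4.0)]
print(applied)  # [0.0, 0.0, 1.0, 2.0]
```

A random-delay variant would additionally sample, at each step, how far back in the queue to read, which is the part handled by the wrappers credited above.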

Repository: CAV-Research-Lab/Predictive-Model-Delay-Correction