
Predictive-Model Delay-Correction Reinforcement Learning

Abstract

Local-remote systems enable robots to perform complex tasks in hazardous environments such as space and nuclear power stations. However, mapping positions between local and remote devices is challenging because time delays compromise system performance and stability. Improving the synchronicity and stability of local-remote systems is crucial to enabling robots to interact with environments at greater distances and under highly challenging network conditions (e.g. time delay). We propose an adaptive control method using reinforcement learning to address the problem of time-delayed control. By adjusting controller parameters in real time, this adaptive controller compensates for stochastic delays and improves synchronicity between local and remote robotic manipulators.

To increase the performance of the adaptive PD controller, we develop a model-based reinforcement learning technique that efficiently incorporates multi-step delays into the learning framework. Using the proposed technique, the performance of the local-remote system is stabilised for stochastic communication time delays of up to 290 ms. The results show that the proposed model-based reinforcement learning method outperforms Soft Actor-Critic and augmented-state Soft Actor-Critic methods.
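For intuition, here is a minimal sketch of the adaptive-gain idea described above; it is not the repository's implementation. A learned policy outputs PD gains at every step, and the PD law drives the remote manipulator towards the (possibly delayed) local target. The names policy, target_pos, remote_pos, and remote_vel are illustrative assumptions, not identifiers from this codebase.

import numpy as np

def adaptive_pd_step(policy, obs, target_pos, remote_pos, remote_vel):
    # The learned policy maps the current (delayed) observation to PD gains.
    kp, kd = policy(obs)
    # Standard PD law: proportional term on the position error between the
    # local target and the remote manipulator, derivative term damping the
    # remote velocity.
    error = np.asarray(target_pos) - np.asarray(remote_pos)
    return kp * error - kd * np.asarray(remote_vel)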

Setup

pip install -r ./requirements.txt

Run

python delay_correcting_training.py

To use PMDC on a custom environment, simply wrap the environment in the PMDC wrapper, which trains a predictive model and corrects for constant action delays. To handle stochastic delays, additionally wrap the result in the augmented wrapper with the desired stochastic delay range.

import gym
from wrappers_rd import AugmentedRandomDelayWrapper, UnseenRandomDelayWrapper
from PMDC_wrapper import PMDC

# Constant action delay, trained on and corrected by PMDC.
env = PMDC(UnseenRandomDelayWrapper(gym.make(env_id),
                                    act_delay_range=range(ACT_DELAY - 1, ACT_DELAY)),
           delay=ACT_DELAY, env_id=env_id, n_models=n_models)
# Stochastic observation delays, handled by the augmented wrapper.
env = AugmentedRandomDelayWrapper(env, obs_delay_range=range(OBS_DELAY, OBS_DELAY + DELAY_RANGE))

Here, ACT_DELAY, OBS_DELAY, and DELAY_RANGE adjust the degree of delay imposed on the environment.
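As a quick sanity check, the wrapped environment can be driven with random actions. This is a sketch that assumes the classic gym step API returning a 4-tuple; adapt it if your gym version returns a 5-tuple.

# Drive the delayed environment with random actions to verify the wrappers.
obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()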

The random delay wrapper used to impose random action and observation delays was adapted from Bouteiller et al., "Reinforcement Learning with Random Delays" (arXiv, GitHub).
