GitHub - shukla-yash/Few-Shot-Policy-Transfer: This repository contains the code base for the paper 'Zero-Shot Policy Transfer through Observation mapping and Behavior Cloning'

Readme file for the source code of 'A Framework for Few-Shot Policy Transfer through Observation Mapping and Behavior Cloning'

Overview of the paper:

Despite recent progress in Reinforcement Learning for robotics applications, many tasks remain prohibitively difficult to solve because interaction with the environment is expensive. Transfer learning mitigates this problem by reusing knowledge learned in a source task domain, which reduces training time in the target domain. Sim2Real transfer applies this idea with a simulated robotic domain as the source and a physical domain as the target, reducing the time required to train a task in the physical world, where interactions are costly. However, most existing approaches assume exact correspondence between the task structure and the physical properties of the two domains. In this work, we propose a framework for Few-Shot Policy Transfer between two domains through Observation Mapping and Behavior Cloning. We use Generative Adversarial Networks (GANs) with a cycle-consistency loss to map observations between the source and target domains, and then use the learned mapping to clone the successful source task behavior policy to the target domain. We observe successful policy transfer with a limited number of target task interactions, including cases where the source and target tasks are semantically dissimilar.
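To make the cycle-consistency idea in the overview concrete, here is a minimal sketch in plain Python. It is not the repository's training code: the two mappings `g` (source to target) and `f_exact` / `f_off` (target to source) are hypothetical toy functions standing in for learned generators, and the L1 form of the loss is assumed from the standard CycleGAN formulation.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
ObservationMap = Callable[[Vector], List[float]]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    source_batch: Sequence[Vector],
    target_batch: Sequence[Vector],
    g: ObservationMap,
    f: ObservationMap,
) -> float:
    """Penalize source -> target -> source and target -> source -> target round trips."""
    forward = sum(l1_distance(f(g(x)), x) for x in source_batch) / len(source_batch)
    backward = sum(l1_distance(g(f(y)), y) for y in target_batch) / len(target_batch)
    return forward + backward


# Hypothetical toy mappings (assumptions for illustration, not learned networks).
def g(x: Vector) -> List[float]:
    return [2.0 * v for v in x]  # source -> target


def f_exact(y: Vector) -> List[float]:
    return [0.5 * v for v in y]  # exact inverse of g


def f_off(y: Vector) -> List[float]:
    return [0.4 * v for v in y]  # imperfect inverse of g


source = [[1.0, 2.0], [3.0, 4.0]]
target = [[2.0, 4.0], [6.0, 8.0]]

print(cycle_consistency_loss(source, target, g, f_exact))  # round trips are exact
print(cycle_consistency_loss(source, target, g, f_off))    # round trips drift
```

The loss is zero only when each mapping undoes the other, which is what allows observations to be translated between domains without paired samples.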

First, to train a source policy:

$ python PPO_source/train.py

To test the learned policy on the source task:

$ python PPO_source/test.py

To train CycleGAN for image translation:

$ cd Real_robot_CycleGAN/pytorch-CycleGAN-and-pix2pix

$ python train.py --dataroot ./datasets/turtlebot --name turtlebot_cyclegan --model cycle_gan

The image datasets need to be in the folder ./datasets/turtlebot
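Before starting a long training run, it can help to check that the dataset folder is laid out as expected. The sketch below assumes the usual unaligned-dataset convention of the upstream pytorch-CycleGAN-and-pix2pix project, with `trainA` and `trainB` subfolders for the two image domains. This README does not state that layout, so treat the folder names as an assumption; `missing_subfolders` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed layout (upstream CycleGAN convention, not stated in this README).
EXPECTED_SUBFOLDERS = ("trainA", "trainB")


def missing_subfolders(
    dataroot: Path, expected: Sequence[str] = EXPECTED_SUBFOLDERS
) -> List[str]:
    """Return the expected subfolder names that do not exist under dataroot."""
    return [name for name in expected if not (dataroot / name).is_dir()]


if __name__ == "__main__":
    # Run from Real_robot_CycleGAN/pytorch-CycleGAN-and-pix2pix.
    missing = missing_subfolders(Path("./datasets/turtlebot"))
    print("missing subfolders:", missing if missing else "none")
```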

Then, generate trajectories for behavior cloning:

Copy the CycleGAN output from Real_robot_CycleGAN/pytorch-CycleGAN-and-pix2pix/checkpoints to data_gen/checkpoints

$ cd data_gen

$ python main.py --collect --name turtlebot_cyclegan --dataroot ./datasets/turtlebot_test --dataset_mode unaligned --model cycle_gan
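The checkpoint copy in this step can also be scripted. The following is a minimal sketch, assuming the script is run from the repository root and that the destination is the `checkpoints` folder inside `data_gen`; `copy_checkpoints` is a hypothetical helper, not part of the repository.

```python
import shutil
from pathlib import Path
from typing import List


def copy_checkpoints(src: Path, dst: Path) -> List[Path]:
    """Copy the checkpoint tree from src into dst; return the files now present in dst."""
    if not src.is_dir():
        raise FileNotFoundError(f"checkpoint directory not found: {src}")
    # dirs_exist_ok requires Python 3.8+; existing files with the same name are overwritten.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst) for p in dst.rglob("*") if p.is_file())


if __name__ == "__main__":
    repo = Path(".")  # assumed to be the repository root
    src = repo / "Real_robot_CycleGAN" / "pytorch-CycleGAN-and-pix2pix" / "checkpoints"
    dst = repo / "data_gen" / "checkpoints"  # assumed destination
    if src.is_dir():
        for path in copy_checkpoints(src, dst):
            print(path)
    else:
        print(f"nothing to copy: {src} does not exist")
```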

Finally, clone the policy given the trajectories:

Copy the trajectory pickle file from data_gen/trajectories to SB_BC/trajectories

$ python SB_BC/BC2.py
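For readers new to behavior cloning, the idea is to treat the collected trajectories as a supervised dataset of (observation, action) pairs and fit a policy that imitates them. The sketch below illustrates this with a nearest-neighbor policy in plain Python. This is a deliberately simple stand-in chosen for illustration: it is not what `SB_BC/BC2.py` does, and the trajectory format and all names here are assumptions.

```python
from typing import List, Sequence, Tuple

Observation = Sequence[float]
Trajectory = List[Tuple[Observation, int]]  # (observation, expert action) pairs


def squared_distance(a: Observation, b: Observation) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestNeighborPolicy:
    """Imitates the expert by replaying the action of the closest stored observation."""

    def __init__(self) -> None:
        self.pairs: List[Tuple[Observation, int]] = []

    def fit(self, trajectories: Sequence[Trajectory]) -> "NearestNeighborPolicy":
        # Flatten all demonstrations into one supervised dataset.
        for trajectory in trajectories:
            self.pairs.extend(trajectory)
        return self

    def act(self, observation: Observation) -> int:
        if not self.pairs:
            raise ValueError("policy has not been fitted on any demonstrations")
        _, action = min(
            self.pairs, key=lambda pair: squared_distance(pair[0], observation)
        )
        return action


# Hypothetical demonstrations: action 0 on the left half-plane, action 1 on the right.
demonstrations = [
    [([-1.0, 0.0], 0), ([-0.5, 0.2], 0)],
    [([0.6, 0.1], 1), ([1.0, -0.3], 1)],
]

policy = NearestNeighborPolicy().fit(demonstrations)
print(policy.act([-0.8, 0.1]))
print(policy.act([0.9, 0.0]))
```

In the framework described above, the demonstrations would come from the source policy acting on observations translated by the learned mapping, rather than from hand-written pairs.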

And then test the cloned policy on the target task:

$ python SB_BC/test.py
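The commands above can be collected into a small driver script. This is a sketch under the assumption that every command is run relative to the repository root unless a `cd` is shown; `run_pipeline` and `STEPS` are hypothetical names. It defaults to a dry run that only lists the commands.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List, Tuple

CYCLEGAN_DIR = "Real_robot_CycleGAN/pytorch-CycleGAN-and-pix2pix"

# (working directory relative to the repository root, command) in README order.
STEPS: List[Tuple[str, List[str]]] = [
    (".", ["python", "PPO_source/train.py"]),
    (".", ["python", "PPO_source/test.py"]),
    (CYCLEGAN_DIR, ["python", "train.py", "--dataroot", "./datasets/turtlebot",
                    "--name", "turtlebot_cyclegan", "--model", "cycle_gan"]),
    ("data_gen", ["python", "main.py", "--collect", "--name", "turtlebot_cyclegan",
                  "--dataroot", "./datasets/turtlebot_test",
                  "--dataset_mode", "unaligned", "--model", "cycle_gan"]),
    (".", ["python", "SB_BC/BC2.py"]),
    (".", ["python", "SB_BC/test.py"]),
]


def run_pipeline(repo_root: Path, dry_run: bool = True) -> List[str]:
    """Describe each step and, unless dry_run is set, execute it in order."""
    described = []
    for cwd, argv in STEPS:
        described.append(f"(cd {cwd} && {shlex.join(argv)})")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, cwd=repo_root / cwd, check=True)
    return described


if __name__ == "__main__":
    for line in run_pipeline(Path("."), dry_run=True):
        print(line)
```

The two manual copy steps (CycleGAN checkpoints and the trajectory pickle file) are not included, so a non-dry run would still need them performed between the corresponding stages.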
