This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Evaluating policy on real-world setup #9

Closed
AntonBock opened this issue May 2, 2022 · 4 comments
Comments

@AntonBock
AntonBock commented May 2, 2022

Hello,

We have trained a policy that we would like to test on a real-world setup. Does SKRL have any built-in support for this, or do you have any recommended method of doing this?

-Anton

@Toni-SM
Owner

Toni-SM commented May 2, 2022

Hi @AntonBock

To test the policy, in simulation or in the real world, you only need to compose the observation/state space from information coming from the environment (sensors, etc.) and apply the actions taken by the policy back to the environment (controllers)...

So, it all depends on the environment you have...
Could you please provide more information about your simulated environment and the real one? Are you planning to use a middleware, such as ROS, to control the robot in your environment?
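To illustrate the idea of composing observations from sensors and forwarding actions to controllers, here is a minimal sketch of a real-world environment with the same reset/step interface a trained policy expects. The `read_sensors` and `apply_action` callables are hypothetical placeholders for whatever ROS subscribers/publishers the actual setup provides:

```python
class RealWorldEnv:
    """Sketch of a Gym-style wrapper around a physical setup.

    read_sensors: callable returning the current observation (list of floats),
                  e.g. built from ROS joint-state and sensor topics (placeholder)
    apply_action: callable sending an action to the robot's controllers (placeholder)
    """

    def __init__(self, read_sensors, apply_action, num_observations=23):
        self.read_sensors = read_sensors
        self.apply_action = apply_action
        self.num_observations = num_observations

    def reset(self):
        # A real setup would first move the robot to its home pose here
        obs = self.read_sensors()
        assert len(obs) == self.num_observations
        return obs

    def step(self, action):
        self.apply_action(action)
        obs = self.read_sensors()
        reward = 0.0   # optionally computed from real-world state, for logging
        done = False   # e.g. task success or a safety stop
        return obs, reward, done, {}
```

For pure evaluation the reward and done flags can be dummies, since only the observations and actions matter to the policy.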

@AntonBock
Author

Hi again,

Thanks for the quick response!

We use ROS to control a Franka Panda arm.
Using your ppo_Franka_Cabinet.py example, we have trained a Franka robot in Isaac Gym.

How could we modify the franka_cabinet example to use observations we get from ROS, instead of wrapping Gym and getting the observations from there?

@Toni-SM
Owner

Toni-SM commented May 2, 2022

Hi @AntonBock

I think the following questions are relevant for testing in real-world:

  • Are you able to get, using ROS, all the data required to build the observation space (according to the FrankaCabinet task, it is self.cfg["env"]["numObservations"] = 23)?
  • How do you plan to control the robot: by setting the joints directly via a ROS topic, or with MoveIt?
  • How important is it to return the instantaneous reward at each time step in the real-world test environment? Are you able to compute it from real-world information?
  • Is your real-world system capable of working (retrieving sensor information and controlling the robot via ROS) at the frequency at which the agent was trained in simulation (dt: 0.0166 # 1/60 seconds)?

Well... I think the best solution is to create a separate environment, for real-world testing, that uses ROS to build the observation space and to control the robot... I am currently building such a testing environment (in the real world) using ROS... We can discuss its implementation here the day after tomorrow...
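On the control-frequency point above, a fixed-rate stepping loop in plain Python could look like the sketch below (similar in spirit to what rospy.Rate provides). The `policy` and `env` names are hypothetical stand-ins for the trained agent and the real-world environment:

```python
import time

class Rate:
    """Fixed-frequency loop helper, sketched after rospy.Rate."""

    def __init__(self, hz):
        self.period = 1.0 / hz
        self._last = time.monotonic()

    def sleep(self):
        # Sleep for whatever remains of the current period, then rearm
        elapsed = time.monotonic() - self._last
        if elapsed < self.period:
            time.sleep(self.period - elapsed)
        self._last = time.monotonic()

# Usage sketch: step the policy at the training frequency (1/0.0166 ≈ 60 Hz)
# rate = Rate(60)
# while not done:
#     action = policy(obs)                       # hypothetical trained policy
#     obs, reward, done, info = env.step(action) # hypothetical real-world env
#     rate.sleep()
```

If the real system cannot sustain this rate, the mismatch between training and deployment dynamics is something to account for before evaluating the policy.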

@AntonBock
Author

Hi @Toni-SM
Those are some of the same considerations we have been making ourselves, and we should be able to get all the required information and rewards at the correct frequency.

We look forward to hearing about your ROS environment tomorrow

Repository owner locked and limited conversation to collaborators May 5, 2022
@Toni-SM Toni-SM converted this issue into discussion #10 May 5, 2022

