feat: add more scenarios #452

Open
wants to merge 62 commits into
base: development

@jmatejcz jmatejcz commented Mar 10, 2025

Purpose

#433

  • Make creation of new tasks faster
  • Add more tasks for o3deTestBenchmark
  • Add more simulation configs for o3deTestBenchmark

Proposed Changes

  • ManipulationTask for common logic
  • Made tasks more generic and parametrized them so they can operate on different types of objects. This makes many more variants of a task available. For example, instead of "move carrot to the left side" there is "move {object types} to the left side", where the object types are passed as an argument.
  • Refactored tasks -> MoveObjectToLeftTask, GroupObjectsTask
  • New types of tasks -> BuildCubeTowerTask, PlaceObjectAtCoordTask, RotateObjectTask (not applicable right now)
  • New scene configs, renamed to specify which objects are present.
  • Unit tests for helper functions and tasks to ensure proper score calculation
  • Packets of scenarios, subjectively grouped into 5 levels of difficulty: trivial, easy, medium, hard, very_hard (can and probably will be adjusted). For now there are 10 trivial, 42 easy, 23 medium, 38 hard and 47 very hard scenarios.
  • Resetting arm to base position
  • Launching new binaries
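
To illustrate the parametrization idea from the list above, here is a minimal sketch. It is not the rai_bench implementation: the `Entity` type, the class body, and the convention that "left side" means `y > 0` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical stand-in for a simulated entity: a type name plus the
# (x, y, z) coordinates of its center.
@dataclass(frozen=True)
class Entity:
    object_type: str
    position: Tuple[float, float, float]


class MoveObjectsToLeftTask:
    """Illustrative only: one task class covers every object type."""

    def __init__(self, object_types: List[str]) -> None:
        self.object_types = object_types

    def get_prompt(self) -> str:
        types = ", ".join(self.object_types)
        return f"Move every object of type: {types} to the left side of the table."

    def calculate_score(self, entities: List[Entity]) -> float:
        # Assumption: "left side" means y > 0 in the table frame.
        relevant = [e for e in entities if e.object_type in self.object_types]
        if not relevant:
            return 0.0
        on_left = sum(1 for e in relevant if e.position[1] > 0.0)
        return on_left / len(relevant)


scene = [
    Entity("carrot", (0.3, 0.2, 0.05)),
    Entity("carrot", (0.3, -0.2, 0.05)),
    Entity("apple", (0.4, -0.1, 0.05)),
]
task = MoveObjectsToLeftTask(["carrot"])
print(task.get_prompt())
print(task.calculate_score(scene))  # prints 0.5
```

Passing `["carrot", "apple"]` instead would produce a different variant of the same task without any new class.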

Issues

  • Not able to test manually whether scores are calculated properly in different scenarios, as the number of possible scenarios grows significantly. We depend on unit tests to ensure that scores are calculated properly, so these tests must cover all cases.

  • Not enough trivial scenarios (like moving an object to given coordinates when there are few objects). Harder scenarios, like creating structures from many objects, can be created in many ways. That is not the case for easy scenarios, which should involve 1, maybe 2 moves and an uncomplicated scene setup. If you have ideas for other easy tasks, let me know in the comments!

  • For now the ManipulatorMoveTo tool does not allow changing the orientation of the gripper, which makes some tasks, like RotateObjectTask, and scenes with rotated objects unusable.

  • Corn entities are too small for the gripper and always fall out, so they are avoided when defining tasks.

  • When the gripper is holding an object, it can place it inside another object.

Testing

Setup

  1. Set up the repository
  2. Install dependencies listed in:
    https://github.com/RobotecAI/rai/blob/main/docs/demos/manipulation.md

and:

poetry install --with openset
vcs import < demos.repos
rosdep install --from-paths src/examples/rai-manipulation-demo/ros2_ws/src --ignore-src -r -y
colcon build --symlink-install
source setup_shell.sh
  3. Download GameLauncher binary: humble or jazzy

  4. Populate src/rai_bench/rai_bench/o3de_test_bench/configs/o3de_config.yaml with:

binary_path: /path/to/your/GameLauncher
level: RoboticManipulationBenchmark
robotic_stack_command: ros2 launch examples/manipulation-demo-no-binary.launch.py
required_simulation_ros2_interfaces:
  services:
    - /spawn_entity
    - /delete_entity
  topics:
    - /color_image5
    - /depth_image5
    - /color_camera_info5
  actions: []
required_robotic_ros2_interfaces:
  services:
    - /grounding_dino_classify
    - /grounded_sam_segment
    - /manipulator_move_to
  topics: []
  actions: []

Run examples

Run the benchmark with example scenarios. I do not recommend running full packets, as this will take some time.
Prepared packets of scenarios are in src/rai_bench/rai_bench/examples/o3de_test_benchmark.py.
Instead of running all_scenarios you can run, for example, 3 trivial scenarios (t_scenarios[:3]) by changing the scenarios argument here:

    benchmark = Benchmark(
        simulation_bridge=o3de,
        scenarios=all_scenarios,
        logger=bench_logger,
        results_filename=results_filename,
    )

then:

python src/rai_bench/rai_bench/examples/o3de_test_benchmark.py

What to look for:

  • Check if all packets are running
  • Check how the packets differ. If you think the grading should be changed or some scenarios do not suit their level, let me know in a comment
  • Logs and results can be found in src/rai_bench/rai_bench/experiments/
  • Check if the arm resets to the base position properly after every scenario

Tests

Run unit tests:

pytest tests/rai_bench/tasks/
  • Check if they all pass
  • Check the tests. If you can think of cases that are not covered, please let me know in a comment.

@jmatejcz jmatejcz force-pushed the jm/feat/o3de-bench-more-tasks branch from cab3f76 to bf77f7c Compare March 10, 2025 11:31
@jmatejcz jmatejcz changed the title from "Jm/feat/o3de bench more tasks" to "feat: more tasks" Mar 10, 2025
@jmatejcz jmatejcz force-pushed the jm/feat/o3de-bench-more-tasks branch 10 times, most recently from ce51009 to 63a45ad Compare March 17, 2025 13:47
@jmatejcz jmatejcz changed the title from "feat: more tasks" to "feat: add more scenarios" Mar 17, 2025
@jmatejcz jmatejcz marked this pull request as ready for review March 17, 2025 16:46
@boczekbartek
Member

@jmatejcz Thank you for the PR
I can see that "This branch is out-of-date with the base branch" - could you please "Update with rebase"?

@boczekbartek boczekbartek self-requested a review March 18, 2025 07:19
@jmatejcz jmatejcz force-pushed the jm/feat/o3de-bench-more-tasks branch from 6472625 to 1302a41 Compare March 18, 2025 08:04
@jmatejcz
Author

@jmatejcz Thank you for the PR I can see that "This branch is out-of-date with the base branch" - could you please "Update with rebase"?

done

@boczekbartek
Member

@jmatejcz
Thank you for this PR. I tried example commands, but didn't manage to run the benchmark. I think some nodes might not start correctly.

Quick note: In step 2, before running colcon build I did vcs import < demos.repos; rosdep ... to download rai-manipulation-demo.

I configured the path to the demo binary as described. When I run this command:

python src/rai_bench/rai_bench/examples/o3de_test_benchmark.py

I can see that simulator started, but some ros2 nodes seem to be missing:

2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] WARNING Waiting for missing services ['/grounding_dino_classify', '/grounded_sam_segment'] out of required services: ['/grounding_dino_classify', '/grounded_sam_segment', '/manipulator_move_to']
2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] INFO required services: ['/grounding_dino_classify', '/grounded_sam_segment', '/manipulator_move_to']
2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] INFO required topics: []
2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] INFO required actions: []
2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] INFO available actions: {'/move_action', '/panda_arm_controller/follow_joint_trajectory', '/panda_hand_controller/gripper_cmd', '/execute_trajectory'}
2025-03-18 22:41:00 robo-pc-005 Agent logger[234244] WARNING Waiting for missing services ['/grounding_dino_classify', '/grounded_sam_segment'] out of required services: ['/grounding_dino_classify', '/grounded_sam_segment', '/manipulator_move_to']

Here is a full log:
log.log

I think one of the reasons is that I get this error. Did you also encounter it?

$ ros2 launch examples/manipulation-demo-no-binary.launch.py
...
[grounded_sam-2] [ERROR] [1742334299.934854130] [grounded_sam]: Could not load model

Also, in the logs of the benchmark I can see a lot (402 in total) of logs like:

Could not create Scenario from task: Manipulate objects, so that ........

Are they expected? If so, does the user need to see all of them?

Member

@boczekbartek boczekbartek left a comment

@jmatejcz added some high-level comments/questions to the code

TypeError
If any of the provided object types is not allowed.
"""
# TODO (jm) what if allowable_displament is greater then the size of object?
Member

How about this TODO?

Author

I'm not sure how to treat and calculate cases where the displacement is so large that it allows objects that are not stacked properly. This issue emerges because the Task does not know the size of an object, only the coordinates of its center. I had an idea to check if the next object's 'z' coordinate is greater, but by how much? I would need the size of the object to estimate that. If we assume any greater 'z' is valid, what about cases like:
(image: buildtowertask1)
where the cube on top has a greater 'z' than the right cube, so it can be treated as a valid tower on top of the right cube, because the allowable displacement permits it,
or:
(image: buildtowertask2)
where the middle cube has a greater 'z' but doesn't form a proper tower.

So I postponed this issue, as maybe in the future we will have access to object sizes via SimulationBridge?
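
The problem described above can be reproduced with a small sketch. This is not the BuildCubeTowerTask code: `is_naive_tower` is a hypothetical helper that accepts any stack where 'z' increases and neighbouring centers stay within the allowable displacement in the x/y plane, and the 5 cm cube size is an assumed value.

```python
from math import hypot
from typing import List, Tuple

Position = Tuple[float, float, float]  # (x, y, z) of an object's center


def is_naive_tower(centers: List[Position], allowable_displacement: float) -> bool:
    """Naive check: z strictly increases and every center lies within
    allowable_displacement (x/y plane) of the center below it."""
    ordered = sorted(centers, key=lambda p: p[2])
    for lower, upper in zip(ordered, ordered[1:]):
        if upper[2] <= lower[2]:
            return False
        if hypot(upper[0] - lower[0], upper[1] - lower[1]) > allowable_displacement:
            return False
    return True


# Assumed 5 cm cubes. The second cube sits beside the first one and is only
# slightly higher, so it is not stacked at all.
side_by_side = [(0.0, 0.0, 0.025), (0.06, 0.0, 0.030)]
properly_stacked = [(0.0, 0.0, 0.025), (0.005, 0.0, 0.075)]

print(is_naive_tower(side_by_side, allowable_displacement=0.08))      # prints True
print(is_naive_tower(side_by_side, allowable_displacement=0.02))      # prints False
print(is_naive_tower(properly_stacked, allowable_displacement=0.02))  # prints True
```

With a displacement larger than the cube itself, the side-by-side layout passes as a tower. Without the object size there is no principled way to tell how large is too large.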

Author

Also, an allowable displacement is required, as cubes won't be perfectly stacked.

Member

@boczekbartek boczekbartek Mar 19, 2025

Thanks for the explanation! But in case 2 I think the center would be shifted on the x/y axis and the entities will not be grouped (code), right?

Also, for O3DE Manipulation benchmark we can easily check the size of the available 3d models of cubes and set sensible allowable_displacement in this task. It's manual checking in the O3DE Editor, but we can have a config for this task, that is well tuned to current o3de setup, even without support for object sizes by the SimulationBridge.

Author

Thanks for the explanation! But in case 2 I think the center would be shifted on the x/y axis and the entities will not be grouped (code), right?

It depends on the allowable displacement. If it allows such a big shift, they will be assigned to one group, as in the image, where the center is within the allowable displacement.

Author

@jmatejcz jmatejcz Mar 19, 2025

Also, for O3DE Manipulation benchmark we can easily check the size of the available 3d models of cubes and set sensible allowable_displacement in this task. It's manual checking in the O3DE Editor, but we can have a config for this task, that is well tuned to current o3de setup, even without support for object sizes by the SimulationBridge.

Yes we can do it like this, but it would require providing this value separately for every object type and would limit the possibility of scaling the difficulty, as smaller allowable_displacement requires more precise stacking.

Though we could enforce an upper limit like this, which makes sense to me.

Author

@jmatejcz jmatejcz Mar 19, 2025

The ideal solution would be getting the object size directly from the simulation and scaling this value accordingly. I'm not sure if this is possible.

Author

As we discussed in person, I added a maximum allowable displacement to the object type.
applied here:
60a7577
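
A rough sketch of what such an upper limit could look like follows. It does not reproduce the linked commit: the table `MAX_ALLOWABLE_DISPLACEMENT`, its values, and `validated_displacement` are hypothetical, with limits that would in practice come from measuring the models (for example in the O3DE Editor).

```python
from typing import Dict

# Hypothetical per-type upper bounds in meters (assumed values).
MAX_ALLOWABLE_DISPLACEMENT: Dict[str, float] = {
    "red_cube": 0.04,
    "carrot": 0.02,
}


def validated_displacement(object_type: str, allowable_displacement: float) -> float:
    """Return the displacement unchanged if it is positive and does not
    exceed the limit for the object type; raise ValueError otherwise."""
    if object_type not in MAX_ALLOWABLE_DISPLACEMENT:
        raise ValueError(f"Unknown object type: {object_type}")
    limit = MAX_ALLOWABLE_DISPLACEMENT[object_type]
    if not 0.0 < allowable_displacement <= limit:
        raise ValueError(
            f"allowable_displacement for {object_type} must be in (0, {limit}], "
            f"got {allowable_displacement}"
        )
    return allowable_displacement


print(validated_displacement("red_cube", 0.03))  # prints 0.03
```

Smaller values remain valid, so the difficulty can still be scaled by requiring more precise stacking.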

@jmatejcz
Author

Yes, it seems that the Grounding DINO and Grounded SAM models fail to load, which results in:
WARNING Waiting for missing services ['/grounding_dino_classify', '/grounded_sam_segment'] out of required services: ['/grounding_dino_classify', '/grounded_sam_segment', '/manipulator_move_to']

Try removing any existing weights:

rm -rf build/ install/ log/

then once again:

colcon build --symlink-install
source setup_shell.sh
python src/rai_bench/rai_bench/examples/o3de_test_benchmark.py
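
For readers unfamiliar with the warning above: it is essentially a difference between the required and the available interfaces. A minimal sketch, assuming a hypothetical helper `missing_interfaces` (the real check lives in the benchmark code and may differ):

```python
from typing import Iterable, List


def missing_interfaces(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return required names that are not available, keeping the required order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


required_services = [
    "/grounding_dino_classify",
    "/grounded_sam_segment",
    "/manipulator_move_to",
]
# Only the manipulator service is up, as in the log from the comment above.
available_services = ["/manipulator_move_to", "/spawn_entity", "/delete_entity"]

print(missing_interfaces(required_services, available_services))
# prints ['/grounding_dino_classify', '/grounded_sam_segment']
```

In a running ROS 2 system the available names would come from the node graph, for example via `ros2 service list`.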

@jmatejcz
Author

As for the second part of the question: the logs about Could not create Scenario from task: are expected, but they should be at debug level. I will adjust that. Thank you for the note.

@boczekbartek
Member

boczekbartek commented Mar 19, 2025

@jmatejcz

rm -rf build/ install/ log/

I tested with a fresh install. Probably the gdino weights got corrupted for some reason... but it worked now. Thank you!

to the second part of the question, the logs about Could not create Scenario from task: are expected but should be in debug level, I will adjust that. Thank you for the note.

Could you tell me a bit more about what this log message means?

@jmatejcz
Author

jmatejcz commented Mar 19, 2025

@jmatejcz

rm -rf build/ install/ log/

I tested with a fresh install. Probably the gdino weights got corrupted for some reason... but it worked now. Thank you!

to the second part of the question, the logs about Could not create Scenario from task: are expected but should be in debug level, I will adjust that. Thank you for the note.

Could you tell me a bit more about what this log message means?

Every task validates whether a given simulation config is suitable by checking that the required objects are present and that none of them is placed incorrectly:
https://github.com/RobotecAI/rai/blob/jm/feat/o3de-bench-more-tasks/src/rai_bench/rai_bench/o3de_test_bench/tasks/manipulation_task.py#L81-L104
If the config is not suitable, no scenario is created from that task and config pair.

This is especially useful when scenarios are created automatically, as described in the README: you can pass a list of tasks and a list of simulation configs and get every valid combination of task x sim_config as a scenario.
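
The mechanism can be sketched roughly as below. The `Task` and `SimConfig` classes and `create_scenarios` are simplified stand-ins invented for this example, not the rai_bench types, and validation here only checks that the required object types are present.

```python
import itertools
import logging
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

logger = logging.getLogger("bench_sketch")


@dataclass(frozen=True)
class SimConfig:
    name: str
    object_types: FrozenSet[str]


@dataclass(frozen=True)
class Task:
    prompt: str
    required_types: FrozenSet[str]

    def validate_config(self, config: SimConfig) -> bool:
        # Simplified: a config is suitable if it contains every required type.
        return self.required_types <= config.object_types


def create_scenarios(
    tasks: List[Task], configs: List[SimConfig]
) -> List[Tuple[Task, SimConfig]]:
    scenarios = []
    for task, config in itertools.product(tasks, configs):
        if task.validate_config(config):
            scenarios.append((task, config))
        else:
            # Expected for many pairs, hence debug level rather than warning.
            logger.debug(
                "Could not create Scenario from task: %s and config: %s",
                task.prompt,
                config.name,
            )
    return scenarios


tasks = [
    Task("Move carrots to the left side", frozenset({"carrot"})),
    Task("Build a tower from cubes", frozenset({"red_cube"})),
]
configs = [
    SimConfig("scene_carrots", frozenset({"carrot"})),
    SimConfig("scene_cubes_and_carrots", frozenset({"carrot", "red_cube"})),
]
print(len(create_scenarios(tasks, configs)))  # prints 3
```

Of the 4 task x config pairs, the cube tower task is rejected for the carrot-only scene, which is the situation that produces the log message.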

@jmatejcz jmatejcz force-pushed the jm/feat/o3de-bench-more-tasks branch from 2a92e0d to c9133f5 Compare March 19, 2025 12:05
logger=bench_logger,
results_filename=results_filename,
)
for i, s in enumerate(scenarios):
# custom request to arm
base_arm_pose = Pose(
Member

I think we don't ensure that base_arm_pose has been reached after each task.
I noticed that sometimes the arm is in a blocked state and MoveIt can't fix it.
I would check whether base_arm_pose is reached and restart the game launcher if there are problems.
Otherwise the benchmark task will fail, because MoveIt can't figure out the trajectory.
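
One possible shape for such a check, as a sketch only: the callables `move_to_base`, `read_position`, and `restart_simulation` are hypothetical hooks standing in for whatever the benchmark and simulation bridge actually expose, and the 1 cm tolerance is an assumed value.

```python
from math import dist
from typing import Callable, Sequence


def pose_reached(
    current: Sequence[float], target: Sequence[float], tolerance: float = 0.01
) -> bool:
    """True if the Euclidean distance between positions is within tolerance (meters)."""
    return dist(current, target) <= tolerance


def reset_arm(
    move_to_base: Callable[[], None],
    read_position: Callable[[], Sequence[float]],
    base_position: Sequence[float],
    restart_simulation: Callable[[], None],
    retries: int = 2,
) -> bool:
    """Try to reach the base position; restart the simulation if every attempt fails."""
    for _ in range(retries):
        move_to_base()
        if pose_reached(read_position(), base_position):
            return True
    restart_simulation()
    return False


# Simulated blocked arm: the position never changes, so a restart is triggered.
events = []
ok = reset_arm(
    move_to_base=lambda: events.append("move"),
    read_position=lambda: (0.5, 0.2, 0.3),
    base_position=(0.3, 0.0, 0.5),
    restart_simulation=lambda: events.append("restart"),
)
print(ok, events)  # prints False ['move', 'move', 'restart']
```

A full check would also compare orientation and joint states; position alone is used here to keep the example short.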

Author

@knicked will create a feature that resets the arm and all joints to the base position, so I'm waiting for that.

Author

@jmatejcz jmatejcz Mar 21, 2025

applied here:
999d0e1

@boczekbartek
Member

@jmatejcz I didn't run all the tasks myself, but I think the next step after this PR is to select the final set of tasks. Besides the tasks and the LLM/RAI capabilities they check, we should take execution time into account.
Could you share the execution times from your experiments here?

@boczekbartek
Member

boczekbartek commented Mar 19, 2025

@jmatejcz Response to this message.
Yes, debug log should be enough for this information.

@jmatejcz
Author

jmatejcz commented Mar 19, 2025

@jmatejcz I didn't run all the tasks myself, but I think next step after this PR is to select the final set of tasks. Besides tasks and llm/rai capabilities that they check we should take into account execution time. Could you share here execution times from your experiments?

I didn't run all of them myself either, as this would take a very long time.
From my experiments I can tell that easier scenarios take less time, ~30 seconds to 1 minute, while harder scenarios can take up to 3 minutes. Sometimes the agent won't do the task at all, which takes much less time.

For example, running 29 trivial and easy tasks took only ~15 minutes, ~30 seconds per task:
results.csv

and 30 very hard tasks took ~28 minutes, ~1 minute per task, but this will certainly grow, as in the majority of scenarios the agent didn't perform the whole task.
results.csv
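
The per-task figures above follow directly from the totals. The sketch below redoes that arithmetic and adds a rough bound for running every scenario, assuming the scenario counts from the PR description and the 30 second to 3 minute range quoted above. The bound is an extrapolation, not a measurement.

```python
def seconds_per_task(total_minutes: float, task_count: int) -> float:
    """Average wall-clock seconds spent on one task."""
    return total_minutes * 60 / task_count


easy_avg = seconds_per_task(15, 29)       # trivial + easy run
very_hard_avg = seconds_per_task(28, 30)  # very hard run
print(f"{easy_avg:.0f} s, {very_hard_avg:.0f} s")  # prints "31 s, 56 s"

# Scenario counts from the PR description: trivial, easy, medium, hard, very hard.
total_scenarios = 10 + 42 + 23 + 38 + 47
low_minutes = total_scenarios * 30 / 60    # every scenario at ~30 seconds
high_minutes = total_scenarios * 180 / 60  # every scenario at ~3 minutes
print(total_scenarios, low_minutes, high_minutes)  # prints "160 80.0 480.0"
```

So a full run of all packets would land somewhere between roughly 80 minutes and 8 hours under these assumptions.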

@jmatejcz jmatejcz requested a review from boczekbartek March 19, 2025 14:50
@boczekbartek
Member

boczekbartek commented Mar 19, 2025

@jmatejcz Thanks! That's very useful info!

From my experiments i can tell that easier scenarios require less time ~30second-1minute, harder scenarios can take even
up to 3 minutes. Sometimes agent won't do task at all which will take much less time.

Also, I think a hard task timeout should be added. I don't see it in the code right now.

O3DE Test Benchmark (`src/rai_bench/rai_bench/o3de_test_bench/`), contains 2 Tasks(`tasks/`) - GrabCarrotTask and PlaceCubesTask (these tasks implement calculating scores) and 4 scene_configs(`configs/`) for O3DE robotic arm simulation.

Both tasks calculate score, taking into consideration 4 values:
The O3DE Test Benchmark [o3de_test_benchmark_module](./rai_bench/o3de_test_bench/) provides tasks and scene configurations for robotic arm manipulation task. The tasks use a common `ManipulationTask` logic and can be parameterized, which allows for many task variants. The current tasks include:
Member

I think O3DE binary release for benchmarks should be mentioned in the readme.

Author

it is not available publicly for now

Author

applied here:
3e3abee

@jmatejcz
Author

jmatejcz commented Mar 19, 2025

@jmatejcz Thanks! Thats very useful info!

From my experiments i can tell that easier scenarios require less time ~30second-1minute, harder scenarios can take even
up to 3 minutes. Sometimes agent won't do task at all which will take much less time.

Also, I think hard task timeout should be added, I don't see it in the code right now.

I can't think of an easy and fast solution right now. It would probably require terminating the whole simulation.
I will add it as a separate issue.

#462
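
As a possible starting point for that issue, here is a sketch of a hard timeout that runs the work in a child process and kills it when the limit is exceeded. It assumes the scenario can be started as a separate command, which is not how the benchmark is structured today. `run_with_timeout` is a hypothetical helper.

```python
import subprocess
import sys
from typing import List


def run_with_timeout(command: List[str], timeout_s: float) -> str:
    """Run a command; return 'ok', 'failed', or 'timeout'.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if completed.returncode == 0 else "failed"


# Stand-in for a scenario that hangs: sleeps longer than the allowed time.
hanging = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(hanging, timeout_s=0.5))  # prints timeout
```

Only the direct child is killed, so processes it spawned (for example a simulator binary or launched ROS 2 nodes) would need separate cleanup, which matches the concern about terminating the whole simulation.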

@boczekbartek
Member

@jmatejcz Please change links from the PR description to public links:

Download GameLauncher binary: humble or jazzy

@jmatejcz
Author

jmatejcz commented Mar 20, 2025

Adjusted launching for new binaries:
bbc6bc2

@MagdalenaKotynia please check this change, as this is part of rai_sim package

I changed o3de_config.yaml, so don't forget to update it. I edited the PR description to include the new field in the config.

@MagdalenaKotynia
Member

@jmatejcz The change bbc6bc2 looks good. When this PR is merged, I will add the example content of o3de_config.yaml with links to appropriate binaries to rai_sim README.

@jmatejcz
Author

added resetting arm to base position which is available here: RobotecAI/rai-manipulation-demo#8
applied here: 999d0e1

Please check once again if everything works with new binaries and resetting arm.

4 participants