[Feature] V-D4RL #1756

Merged
merged 8 commits into from
Dec 21, 2023

vmoens commented Dec 20, 2023

Example usage from the doc

        >>> import torch
        >>> torch.manual_seed(0)
        >>> from torchrl.data.datasets import VD4RLExperienceReplay
        >>> d = VD4RLExperienceReplay("main/walker_walk/random/64px", batch_size=32,
        ...     image_size=50)
        >>> for batch in d:
        ...     break
        >>> print(batch)
        TensorDict(
            fields={
                action: Tensor(shape=torch.Size([32, 6]), device=cpu, dtype=torch.float32, is_shared=False),
                done: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                index: Tensor(shape=torch.Size([32]), device=cpu, dtype=torch.int64, is_shared=False),
                is_init: Tensor(shape=torch.Size([32]), device=cpu, dtype=torch.bool, is_shared=False),
                next: TensorDict(
                    fields={
                        done: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                        observation: TensorDict(
                            fields={
                                height: Tensor(shape=torch.Size([32]), device=cpu, dtype=torch.float32, is_shared=False),
                                orientations: Tensor(shape=torch.Size([32, 14]), device=cpu, dtype=torch.float32, is_shared=False),
                                velocity: Tensor(shape=torch.Size([32, 9]), device=cpu, dtype=torch.float32, is_shared=False)},
                            batch_size=torch.Size([32]),
                            device=cpu,
                            is_shared=False),
                        pixels: Tensor(shape=torch.Size([32, 3, 50, 50]), device=cpu, dtype=torch.float32, is_shared=False),
                        reward: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                        terminated: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                        truncated: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
                    batch_size=torch.Size([32]),
                    device=cpu,
                    is_shared=False),
                observation: TensorDict(
                    fields={
                        height: Tensor(shape=torch.Size([32]), device=cpu, dtype=torch.float32, is_shared=False),
                        orientations: Tensor(shape=torch.Size([32, 14]), device=cpu, dtype=torch.float32, is_shared=False),
                        velocity: Tensor(shape=torch.Size([32, 9]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([32]),
                    device=cpu,
                    is_shared=False),
                pixels: Tensor(shape=torch.Size([32, 3, 50, 50]), device=cpu, dtype=torch.float32, is_shared=False),
                terminated: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                truncated: Tensor(shape=torch.Size([32, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
            batch_size=torch.Size([32]),
            device=cpu,
            is_shared=False)
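
The printed batch nests its fields: next-step values, including the reward, live under `next`, and every leaf keeps the batch dimension of 32 first. A minimal plain-Python sketch of that layout follows. It is a stand-in built from nested dicts of shapes, not the torchrl or tensordict API, and the names `batch_layout` and `flatten` are hypothetical.

```python
# Plain-Python stand-in for the batch printed above: nested dicts of shapes.
# This is NOT the torchrl/tensordict API; it only mirrors the printed layout.
B, IMG = 32, 50  # batch_size and image_size taken from the example

observation = {"height": (B,), "orientations": (B, 14), "velocity": (B, 9)}
flags = {"done": (B, 1), "terminated": (B, 1), "truncated": (B, 1)}

batch_layout = {
    "action": (B, 6),
    "index": (B,),
    "is_init": (B,),
    "observation": dict(observation),
    "pixels": (B, 3, IMG, IMG),
    **flags,
    "next": {
        "observation": dict(observation),
        "pixels": (B, 3, IMG, IMG),
        "reward": (B, 1),
        **flags,
    },
}


def flatten(layout, prefix=()):
    """Yield (key_path, shape) pairs for every leaf of a nested layout."""
    for name, value in layout.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


leaves = dict(flatten(batch_layout))
for path, shape in sorted(leaves.items()):
    print(".".join(path), shape)
```

With the real batch, the same leaves are addressed with tuple keys, for example `batch["next", "reward"]`.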

cc @conglu1997

pytorch-bot bot commented Dec 20, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1756

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (30 Unrelated Failures)

As of commit 63ea342 with merge base d125dd9:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Dec 20, 2023

github-actions bot commented Dec 20, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}4$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 65.4997ms | 65.1035ms | 15.3602 Ops/s | 15.1331 Ops/s | $\color{#35bf28}+1.50\%$ |
| test_sync | 45.8310ms | 37.7170ms | 26.5132 Ops/s | 27.9878 Ops/s | $\textbf{\color{#d91a1a}-5.27\%}$ |
| test_async | 0.1027s | 36.0414ms | 27.7459 Ops/s | 28.6637 Ops/s | $\color{#d91a1a}-3.20\%$ |
| test_simple | 0.5298s | 0.4594s | 2.1769 Ops/s | 2.1338 Ops/s | $\color{#35bf28}+2.02\%$ |
| test_transformed | 0.6955s | 0.6262s | 1.5970 Ops/s | 1.5749 Ops/s | $\color{#35bf28}+1.40\%$ |
| test_serial | 1.4712s | 1.3967s | 0.7160 Ops/s | 0.7068 Ops/s | $\color{#35bf28}+1.30\%$ |
| test_parallel | 1.4185s | 1.3661s | 0.7320 Ops/s | 0.7383 Ops/s | $\color{#d91a1a}-0.85\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1539ms | 21.4745μs | 46.5668 KOps/s | 44.6205 KOps/s | $\color{#35bf28}+4.36\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 40.8660μs | 13.1725μs | 75.9159 KOps/s | 71.0167 KOps/s | $\textbf{\color{#35bf28}+6.90\%}$ |
| test_step_mdp_speed[True-True-True-False-True] | 43.4810μs | 12.7528μs | 78.4143 KOps/s | 75.6367 KOps/s | $\color{#35bf28}+3.67\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 49.0010μs | 7.7743μs | 128.6289 KOps/s | 117.7389 KOps/s | $\textbf{\color{#35bf28}+9.25\%}$ |
| test_step_mdp_speed[True-True-False-True-True] | 49.4430μs | 23.1217μs | 43.2494 KOps/s | 42.0339 KOps/s | $\color{#35bf28}+2.89\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 56.2440μs | 14.4221μs | 69.3378 KOps/s | 65.4304 KOps/s | $\textbf{\color{#35bf28}+5.97\%}$ |
| test_step_mdp_speed[True-True-False-False-True] | 43.2810μs | 13.9810μs | 71.5257 KOps/s | 68.3154 KOps/s | $\color{#35bf28}+4.70\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 35.6270μs | 8.9094μs | 112.2408 KOps/s | 104.9943 KOps/s | $\textbf{\color{#35bf28}+6.90\%}$ |
| test_step_mdp_speed[True-False-True-True-True] | 62.7870μs | 24.3470μs | 41.0728 KOps/s | 39.2329 KOps/s | $\color{#35bf28}+4.69\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 46.1860μs | 15.8152μs | 63.2302 KOps/s | 59.6917 KOps/s | $\textbf{\color{#35bf28}+5.93\%}$ |
| test_step_mdp_speed[True-False-True-False-True] | 64.3600μs | 14.1212μs | 70.8155 KOps/s | 69.1194 KOps/s | $\color{#35bf28}+2.45\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 34.7640μs | 8.9780μs | 111.3829 KOps/s | 104.8197 KOps/s | $\textbf{\color{#35bf28}+6.26\%}$ |
| test_step_mdp_speed[True-False-False-True-True] | 87.4830μs | 25.5851μs | 39.0853 KOps/s | 37.5777 KOps/s | $\color{#35bf28}+4.01\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 42.3790μs | 17.0314μs | 58.7151 KOps/s | 55.7940 KOps/s | $\textbf{\color{#35bf28}+5.24\%}$ |
| test_step_mdp_speed[True-False-False-False-True] | 44.7230μs | 15.2507μs | 65.5708 KOps/s | 63.0442 KOps/s | $\color{#35bf28}+4.01\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 31.6390μs | 10.2972μs | 97.1139 KOps/s | 92.3410 KOps/s | $\textbf{\color{#35bf28}+5.17\%}$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.1000ms | 24.4416μs | 40.9139 KOps/s | 39.4766 KOps/s | $\color{#35bf28}+3.64\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 55.3130μs | 16.0017μs | 62.4932 KOps/s | 59.6503 KOps/s | $\color{#35bf28}+4.77\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 52.3080μs | 16.5473μs | 60.4329 KOps/s | 58.6257 KOps/s | $\color{#35bf28}+3.08\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 52.8280μs | 10.2277μs | 97.7734 KOps/s | 92.7206 KOps/s | $\textbf{\color{#35bf28}+5.45\%}$ |
| test_step_mdp_speed[False-True-False-True-True] | 53.4900μs | 25.2703μs | 39.5722 KOps/s | 37.5837 KOps/s | $\textbf{\color{#35bf28}+5.29\%}$ |
| test_step_mdp_speed[False-True-False-True-False] | 62.6060μs | 17.0463μs | 58.6638 KOps/s | 54.5628 KOps/s | $\textbf{\color{#35bf28}+7.52\%}$ |
| test_step_mdp_speed[False-True-False-False-True] | 48.8310μs | 17.5473μs | 56.9887 KOps/s | 54.9912 KOps/s | $\color{#35bf28}+3.63\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 35.5360μs | 11.4807μs | 87.1028 KOps/s | 82.3509 KOps/s | $\textbf{\color{#35bf28}+5.77\%}$ |
| test_step_mdp_speed[False-False-True-True-True] | 67.2960μs | 27.0350μs | 36.9891 KOps/s | 36.3444 KOps/s | $\color{#35bf28}+1.77\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 49.9030μs | 18.3544μs | 54.4828 KOps/s | 51.8023 KOps/s | $\textbf{\color{#35bf28}+5.17\%}$ |
| test_step_mdp_speed[False-False-True-False-True] | 45.9550μs | 17.7788μs | 56.2469 KOps/s | 55.5045 KOps/s | $\color{#35bf28}+1.34\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 58.8790μs | 11.5534μs | 86.5547 KOps/s | 82.7840 KOps/s | $\color{#35bf28}+4.55\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 78.1560μs | 27.9624μs | 35.7623 KOps/s | 34.9541 KOps/s | $\color{#35bf28}+2.31\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 52.0170μs | 19.4456μs | 51.4256 KOps/s | 48.8830 KOps/s | $\textbf{\color{#35bf28}+5.20\%}$ |
| test_step_mdp_speed[False-False-False-False-True] | 54.5720μs | 18.6471μs | 53.6276 KOps/s | 52.9531 KOps/s | $\color{#35bf28}+1.27\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 38.0400μs | 12.6636μs | 78.9667 KOps/s | 75.7065 KOps/s | $\color{#35bf28}+4.31\%$ |
| test_values[generalized_advantage_estimate-True-True] | 14.4964ms | 12.0981ms | 82.6578 Ops/s | 83.2500 Ops/s | $\color{#d91a1a}-0.71\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 36.9194ms | 28.4552ms | 35.1429 Ops/s | 35.0489 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_values[td0_return_estimate-False-False] | 0.2671ms | 0.2052ms | 4.8722 KOps/s | 4.5671 KOps/s | $\textbf{\color{#35bf28}+6.68\%}$ |
| test_values[td1_return_estimate-False-False] | 27.3112ms | 25.7111ms | 38.8938 Ops/s | 39.8126 Ops/s | $\color{#d91a1a}-2.31\%$ |
| test_values[vec_td1_return_estimate-False-False] | 36.7065ms | 28.4553ms | 35.1428 Ops/s | 35.4936 Ops/s | $\color{#d91a1a}-0.99\%$ |
| test_values[td_lambda_return_estimate-True-False] | 87.1743ms | 36.9753ms | 27.0451 Ops/s | 28.8392 Ops/s | $\textbf{\color{#d91a1a}-6.22\%}$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.8560ms | 28.4105ms | 35.1982 Ops/s | 35.2982 Ops/s | $\color{#d91a1a}-0.28\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.1516ms | 7.9304ms | 126.0964 Ops/s | 128.4470 Ops/s | $\color{#d91a1a}-1.83\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 11.6968ms | 1.9742ms | 506.5259 Ops/s | 543.5289 Ops/s | $\textbf{\color{#d91a1a}-6.81\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5861ms | 0.4299ms | 2.3261 KOps/s | 2.2923 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 51.6245ms | 41.9386ms | 23.8444 Ops/s | 24.1113 Ops/s | $\color{#d91a1a}-1.11\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 13.3532ms | 2.6804ms | 373.0749 Ops/s | 377.8417 Ops/s | $\color{#d91a1a}-1.26\%$ |
| test_dqn_speed | 97.2042ms | 8.5290ms | 117.2466 Ops/s | 116.6807 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_ddpg_speed | 22.8799ms | 15.0538ms | 66.4283 Ops/s | 65.2132 Ops/s | $\color{#35bf28}+1.86\%$ |
| test_sac_speed | 35.7509ms | 30.7407ms | 32.5302 Ops/s | 32.3848 Ops/s | $\color{#35bf28}+0.45\%$ |
| test_redq_speed | 46.3482ms | 37.9934ms | 26.3203 Ops/s | 26.5555 Ops/s | $\color{#d91a1a}-0.89\%$ |
| test_redq_deprec_speed | 28.8914ms | 27.1834ms | 36.7872 Ops/s | 36.3065 Ops/s | $\color{#35bf28}+1.32\%$ |
| test_td3_speed | 31.8039ms | 21.5605ms | 46.3810 Ops/s | 45.7959 Ops/s | $\color{#35bf28}+1.28\%$ |
| test_cql_speed | 99.8982ms | 91.5587ms | 10.9220 Ops/s | 10.7450 Ops/s | $\color{#35bf28}+1.65\%$ |
| test_a2c_speed | 38.2781ms | 28.8117ms | 34.7081 Ops/s | 34.5838 Ops/s | $\color{#35bf28}+0.36\%$ |
| test_ppo_speed | 38.7940ms | 29.2157ms | 34.2281 Ops/s | 34.4064 Ops/s | $\color{#d91a1a}-0.52\%$ |
| test_reinforce_speed | 36.3506ms | 27.5499ms | 36.2978 Ops/s | 36.4348 Ops/s | $\color{#d91a1a}-0.38\%$ |
| test_iql_speed | 71.9427ms | 68.1511ms | 14.6733 Ops/s | 14.8134 Ops/s | $\color{#d91a1a}-0.95\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.7521ms | 2.0884ms | 478.8369 Ops/s | 437.8617 Ops/s | $\textbf{\color{#35bf28}+9.36\%}$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1350s | 2.6037ms | 384.0735 Ops/s | 370.3414 Ops/s | $\color{#35bf28}+3.71\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 5.0789ms | 2.3148ms | 431.9944 Ops/s | 419.2311 Ops/s | $\color{#35bf28}+3.04\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.2495ms | 2.1663ms | 461.6175 Ops/s | 430.4996 Ops/s | $\textbf{\color{#35bf28}+7.23\%}$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1404s | 2.6062ms | 383.6937 Ops/s | 349.9287 Ops/s | $\textbf{\color{#35bf28}+9.65\%}$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.8046ms | 2.2887ms | 436.9254 Ops/s | 434.8959 Ops/s | $\color{#35bf28}+0.47\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.7796ms | 2.1576ms | 463.4859 Ops/s | 452.6115 Ops/s | $\color{#35bf28}+2.40\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1346s | 2.6027ms | 384.2219 Ops/s | 374.9220 Ops/s | $\color{#35bf28}+2.48\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.9222ms | 2.2692ms | 440.6916 Ops/s | 423.2572 Ops/s | $\color{#35bf28}+4.12\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.3570ms | 2.1622ms | 462.5003 Ops/s | 443.2435 Ops/s | $\color{#35bf28}+4.34\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1370s | 2.6715ms | 374.3180 Ops/s | 374.8841 Ops/s | $\color{#d91a1a}-0.15\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.7097ms | 2.3222ms | 430.6247 Ops/s | 415.3185 Ops/s | $\color{#35bf28}+3.69\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1670ms | 2.1290ms | 469.6954 Ops/s | 412.6218 Ops/s | $\textbf{\color{#35bf28}+13.83\%}$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1369s | 2.6148ms | 382.4331 Ops/s | 358.4373 Ops/s | $\textbf{\color{#35bf28}+6.69\%}$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.3393ms | 2.3637ms | 423.0706 Ops/s | 426.0012 Ops/s | $\color{#d91a1a}-0.69\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.7605ms | 2.1185ms | 472.0337 Ops/s | 460.0398 Ops/s | $\color{#35bf28}+2.61\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1370s | 2.6015ms | 384.3971 Ops/s | 384.6374 Ops/s | $\color{#d91a1a}-0.06\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.2743ms | 2.3306ms | 429.0722 Ops/s | 434.3377 Ops/s | $\color{#d91a1a}-1.21\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.2031s | 19.1404ms | 52.2455 Ops/s | 51.7270 Ops/s | $\color{#35bf28}+1.00\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1291s | 17.5680ms | 56.9218 Ops/s | 54.8673 Ops/s | $\color{#35bf28}+3.74\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1311s | 17.5802ms | 56.8823 Ops/s | 55.3742 Ops/s | $\color{#35bf28}+2.72\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1300s | 17.5838ms | 56.8706 Ops/s | 54.2727 Ops/s | $\color{#35bf28}+4.79\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1300s | 17.6499ms | 56.6576 Ops/s | 55.2462 Ops/s | $\color{#35bf28}+2.55\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1315s | 15.3499ms | 65.1470 Ops/s | 55.0640 Ops/s | $\textbf{\color{#35bf28}+18.31\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1304s | 17.6462ms | 56.6694 Ops/s | 63.4197 Ops/s | $\textbf{\color{#d91a1a}-10.64\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1307s | 17.5852ms | 56.8660 Ops/s | 54.0511 Ops/s | $\textbf{\color{#35bf28}+5.21\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1327s | 17.8817ms | 55.9232 Ops/s | 55.2325 Ops/s | $\color{#35bf28}+1.25\%$ |
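
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD. The bold entries are consistent with a cutoff of about 5%, and counting them reproduces the Improved and Worsened totals in the summary line. A minimal sketch of that reading follows, assuming a 5% threshold inferred from the numbers above rather than taken from the benchmark tooling. The function names are hypothetical.

```python
def percent_change(ops, ops_head):
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change, threshold=5.0):
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the first three CPU rows.
rows = {
    "test_single": (15.3602, 15.1331),
    "test_sync": (26.5132, 27.9878),
    "test_async": (27.7459, 28.6637),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, most rows fall inside the threshold and count as neither improved nor worsened.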

github-actions bot commented Dec 20, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}6$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1250s | 0.1234s | 8.1037 Ops/s | 7.6848 Ops/s | $\textbf{\color{#35bf28}+5.45\%}$ |
| test_sync | 0.1825s | 0.1103s | 9.0643 Ops/s | 8.8026 Ops/s | $\color{#35bf28}+2.97\%$ |
| test_async | 0.2095s | 99.9912ms | 10.0009 Ops/s | 10.0262 Ops/s | $\color{#d91a1a}-0.25\%$ |
| test_single_pixels | 0.1357s | 0.1329s | 7.5225 Ops/s | 6.7187 Ops/s | $\textbf{\color{#35bf28}+11.96\%}$ |
| test_sync_pixels | 96.2597ms | 95.1157ms | 10.5135 Ops/s | 10.4488 Ops/s | $\color{#35bf28}+0.62\%$ |
| test_async_pixels | 0.2515s | 91.6877ms | 10.9066 Ops/s | 10.7901 Ops/s | $\color{#35bf28}+1.08\%$ |
| test_simple | 0.9632s | 0.8906s | 1.1228 Ops/s | 1.0831 Ops/s | $\color{#35bf28}+3.66\%$ |
| test_transformed | 1.2063s | 1.1411s | 0.8763 Ops/s | 0.8593 Ops/s | $\color{#35bf28}+1.98\%$ |
| test_serial | 2.5118s | 2.5105s | 0.3983 Ops/s | 0.3783 Ops/s | $\textbf{\color{#35bf28}+5.30\%}$ |
| test_parallel | 2.5861s | 2.5236s | 0.3963 Ops/s | 0.3886 Ops/s | $\color{#35bf28}+1.98\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 99.2420μs | 32.9837μs | 30.3180 KOps/s | 30.5298 KOps/s | $\color{#d91a1a}-0.69\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 37.7200μs | 19.6691μs | 50.8411 KOps/s | 50.3375 KOps/s | $\color{#35bf28}+1.00\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 35.5400μs | 18.6425μs | 53.6408 KOps/s | 50.0910 KOps/s | $\textbf{\color{#35bf28}+7.09\%}$ |
| test_step_mdp_speed[True-True-True-False-False] | 27.1000μs | 11.4621μs | 87.2441 KOps/s | 82.0760 KOps/s | $\textbf{\color{#35bf28}+6.30\%}$ |
| test_step_mdp_speed[True-True-False-True-True] | 54.7610μs | 34.3916μs | 29.0769 KOps/s | 27.6882 KOps/s | $\textbf{\color{#35bf28}+5.02\%}$ |
| test_step_mdp_speed[True-True-False-True-False] | 93.6610μs | 21.4216μs | 46.6819 KOps/s | 45.8582 KOps/s | $\color{#35bf28}+1.80\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 46.9510μs | 20.5254μs | 48.7200 KOps/s | 45.8714 KOps/s | $\textbf{\color{#35bf28}+6.21\%}$ |
| test_step_mdp_speed[True-True-False-False-False] | 47.8300μs | 12.9130μs | 77.4416 KOps/s | 72.2980 KOps/s | $\textbf{\color{#35bf28}+7.11\%}$ |
| test_step_mdp_speed[True-False-True-True-True] | 67.3100μs | 36.7954μs | 27.1773 KOps/s | 27.0295 KOps/s | $\color{#35bf28}+0.55\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 47.8110μs | 23.3865μs | 42.7596 KOps/s | 41.6131 KOps/s | $\color{#35bf28}+2.76\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 44.0400μs | 20.7062μs | 48.2948 KOps/s | 48.3174 KOps/s | $\color{#d91a1a}-0.05\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 81.2620μs | 13.1413μs | 76.0960 KOps/s | 75.4011 KOps/s | $\color{#35bf28}+0.92\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 84.3820μs | 38.6755μs | 25.8561 KOps/s | 26.2823 KOps/s | $\color{#d91a1a}-1.62\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 42.9910μs | 24.9234μs | 40.1229 KOps/s | 38.9519 KOps/s | $\color{#35bf28}+3.01\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 40.1100μs | 22.2636μs | 44.9164 KOps/s | 43.4231 KOps/s | $\color{#35bf28}+3.44\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 29.9610μs | 14.9073μs | 67.0812 KOps/s | 66.5169 KOps/s | $\color{#35bf28}+0.85\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.1169ms | 36.6826μs | 27.2609 KOps/s | 27.6890 KOps/s | $\color{#d91a1a}-1.55\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 48.8110μs | 23.6409μs | 42.2996 KOps/s | 44.0328 KOps/s | $\color{#d91a1a}-3.94\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 50.0410μs | 25.1121μs | 39.8214 KOps/s | 39.8152 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 34.5110μs | 14.8897μs | 67.1607 KOps/s | 67.1581 KOps/s | $+0.00\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 77.0620μs | 38.2908μs | 26.1159 KOps/s | 25.8626 KOps/s | $\color{#35bf28}+0.98\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 88.7410μs | 25.7450μs | 38.8425 KOps/s | 39.4063 KOps/s | $\color{#d91a1a}-1.43\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 0.1054ms | 26.6260μs | 37.5573 KOps/s | 37.3465 KOps/s | $\color{#35bf28}+0.56\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 33.7310μs | 16.8701μs | 59.2766 KOps/s | 59.5939 KOps/s | $\color{#d91a1a}-0.53\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 59.5710μs | 39.9452μs | 25.0343 KOps/s | 24.5114 KOps/s | $\color{#35bf28}+2.13\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 65.8010μs | 27.2803μs | 36.6564 KOps/s | 36.3894 KOps/s | $\color{#35bf28}+0.73\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 42.8800μs | 26.4491μs | 37.8085 KOps/s | 35.9872 KOps/s | $\textbf{\color{#35bf28}+5.06\%}$ |
| test_step_mdp_speed[False-False-True-False-False] | 97.2910μs | 16.6065μs | 60.2174 KOps/s | 58.7592 KOps/s | $\color{#35bf28}+2.48\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 66.7300μs | 41.0980μs | 24.3321 KOps/s | 23.6175 KOps/s | $\color{#35bf28}+3.03\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 51.8810μs | 28.6404μs | 34.9158 KOps/s | 34.3302 KOps/s | $\color{#35bf28}+1.71\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 43.6000μs | 27.8009μs | 35.9700 KOps/s | 34.5581 KOps/s | $\color{#35bf28}+4.09\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 40.6100μs | 18.4650μs | 54.1565 KOps/s | 53.2112 KOps/s | $\color{#35bf28}+1.78\%$ |
| test_values[generalized_advantage_estimate-True-True] | 26.5155ms | 25.1639ms | 39.7395 Ops/s | 38.2349 Ops/s | $\color{#35bf28}+3.94\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 89.5785ms | 3.3664ms | 297.0509 Ops/s | 294.9886 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_values[td0_return_estimate-False-False] | 96.9410μs | 64.2749μs | 15.5582 KOps/s | 15.1866 KOps/s | $\color{#35bf28}+2.45\%$ |
| test_values[td1_return_estimate-False-False] | 54.2396ms | 53.9146ms | 18.5479 Ops/s | 17.8977 Ops/s | $\color{#35bf28}+3.63\%$ |
| test_values[vec_td1_return_estimate-False-False] | 2.1156ms | 1.7733ms | 563.9175 Ops/s | 556.2104 Ops/s | $\color{#35bf28}+1.39\%$ |
| test_values[td_lambda_return_estimate-True-False] | 88.5023ms | 86.6728ms | 11.5376 Ops/s | 11.1813 Ops/s | $\color{#35bf28}+3.19\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 2.1529ms | 1.7722ms | 564.2743 Ops/s | 556.3200 Ops/s | $\color{#35bf28}+1.43\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.0337ms | 23.8603ms | 41.9106 Ops/s | 40.0224 Ops/s | $\color{#35bf28}+4.72\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8735ms | 0.7152ms | 1.3983 KOps/s | 1.3484 KOps/s | $\color{#35bf28}+3.70\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7454ms | 0.6677ms | 1.4977 KOps/s | 1.4540 KOps/s | $\color{#35bf28}+3.00\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5327ms | 1.4675ms | 681.4116 Ops/s | 671.1547 Ops/s | $\color{#35bf28}+1.53\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9643ms | 0.6940ms | 1.4409 KOps/s | 1.4100 KOps/s | $\color{#35bf28}+2.19\%$ |
| test_dqn_speed | 7.7705ms | 7.3621ms | 135.8300 Ops/s | 130.3773 Ops/s | $\color{#35bf28}+4.18\%$ |
| test_ddpg_speed | 15.3955ms | 14.5905ms | 68.5376 Ops/s | 61.2510 Ops/s | $\textbf{\color{#35bf28}+11.90\%}$ |
| test_sac_speed | 30.1899ms | 29.3455ms | 34.0768 Ops/s | 33.0293 Ops/s | $\color{#35bf28}+3.17\%$ |
| test_redq_speed | 36.3468ms | 35.2377ms | 28.3787 Ops/s | 27.7312 Ops/s | $\color{#35bf28}+2.33\%$ |
| test_redq_deprec_speed | 0.1132s | 26.3563ms | 37.9416 Ops/s | 40.1796 Ops/s | $\textbf{\color{#d91a1a}-5.57\%}$ |
| test_td3_speed | 20.0393ms | 19.8810ms | 50.2992 Ops/s | 48.7109 Ops/s | $\color{#35bf28}+3.26\%$ |
| test_cql_speed | 86.3563ms | 84.3217ms | 11.8593 Ops/s | 11.5363 Ops/s | $\color{#35bf28}+2.80\%$ |
| test_a2c_speed | 27.5131ms | 27.0692ms | 36.9424 Ops/s | 36.1185 Ops/s | $\color{#35bf28}+2.28\%$ |
| test_ppo_speed | 0.1251s | 30.1630ms | 33.1532 Ops/s | 35.8021 Ops/s | $\textbf{\color{#d91a1a}-7.40\%}$ |
| test_reinforce_speed | 26.9459ms | 26.2307ms | 38.1233 Ops/s | 37.3180 Ops/s | $\color{#35bf28}+2.16\%$ |
| test_iql_speed | 59.1332ms | 57.9284ms | 17.2627 Ops/s | 16.8314 Ops/s | $\color{#35bf28}+2.56\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 0.1074s | 2.8362ms | 352.5861 Ops/s | 380.1298 Ops/s | $\textbf{\color{#d91a1a}-7.25\%}$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.7660ms | 2.7256ms | 366.8950 Ops/s | 356.4053 Ops/s | $\color{#35bf28}+2.94\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.1511ms | 2.7261ms | 366.8178 Ops/s | 357.3167 Ops/s | $\color{#35bf28}+2.66\%$ |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1238ms | 2.5328ms | 394.8266 Ops/s | 381.4516 Ops/s | $\color{#35bf28}+3.51\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.7382ms | 2.7130ms | 368.6000 Ops/s | 358.3658 Ops/s | $\color{#35bf28}+2.86\%$ |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.7372ms | 2.7220ms | 367.3812 Ops/s | 357.8241 Ops/s | $\color{#35bf28}+2.67\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.2009ms | 2.5347ms | 394.5218 Ops/s | 385.3070 Ops/s | $\color{#35bf28}+2.39\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.8507ms | 2.7235ms | 367.1720 Ops/s | 356.1903 Ops/s | $\color{#35bf28}+3.08\%$ |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.3178ms | 2.7197ms | 367.6881 Ops/s | 358.7957 Ops/s | $\color{#35bf28}+2.48\%$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.4745ms | 2.5532ms | 391.6577 Ops/s | 382.8705 Ops/s | $\color{#35bf28}+2.30\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 4.2131ms | 2.7289ms | 366.4472 Ops/s | 357.1765 Ops/s | $\color{#35bf28}+2.60\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.1206s | 3.0398ms | 328.9661 Ops/s | 357.3631 Ops/s | $\textbf{\color{#d91a1a}-7.95\%}$ |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.7580ms | 2.5291ms | 395.4009 Ops/s | 384.4089 Ops/s | $\color{#35bf28}+2.86\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.6494ms | 2.7338ms | 365.7847 Ops/s | 356.6625 Ops/s | $\color{#35bf28}+2.56\%$ |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.9116ms | 2.7124ms | 368.6736 Ops/s | 354.4144 Ops/s | $\color{#35bf28}+4.02\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.0328ms | 2.5547ms | 391.4399 Ops/s | 382.3109 Ops/s | $\color{#35bf28}+2.39\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.6724ms | 2.7196ms | 367.6952 Ops/s | 356.3412 Ops/s | $\color{#35bf28}+3.19\%$ |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.3417ms | 2.7160ms | 368.1893 Ops/s | 355.7816 Ops/s | $\color{#35bf28}+3.49\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1745s | 18.2222ms | 54.8782 Ops/s | 52.1449 Ops/s | $\textbf{\color{#35bf28}+5.24\%}$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1248s | 17.2032ms | 58.1288 Ops/s | 64.7170 Ops/s | $\textbf{\color{#d91a1a}-10.18\%}$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1242s | 17.1647ms | 58.2593 Ops/s | 56.7383 Ops/s | $\color{#35bf28}+2.68\%$ |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1253s | 17.1878ms | 58.1808 Ops/s | 56.8869 Ops/s | $\color{#35bf28}+2.27\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1247s | 17.1824ms | 58.1990 Ops/s | 57.3695 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1222s | 14.8741ms | 67.2312 Ops/s | 57.0246 Ops/s | $\textbf{\color{#35bf28}+17.90\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1247s | 17.1957ms | 58.1541 Ops/s | 57.0025 Ops/s | $\color{#35bf28}+2.02\%$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1269s | 17.2750ms | 57.8873 Ops/s | 65.2321 Ops/s | $\textbf{\color{#d91a1a}-11.26\%}$ |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1239s | 17.1267ms | 58.3883 Ops/s | 56.5853 Ops/s | $\color{#35bf28}+3.19\%$ |
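
In these results the Ops figure matches the reciprocal of the Mean time once units are converted, up to rounding of the displayed mean. A minimal sketch of that conversion is below, assuming the unit suffixes seen in the table (`s`, `ms`, `μs`). The helper names are hypothetical and are not part of the benchmark tooling.

```python
# Unit suffixes seen in the table; both Unicode spellings of "micro" are accepted.
# "s" comes last so that the longer suffixes are matched first.
UNITS = {"ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "s": 1.0}


def to_seconds(text):
    """Convert a duration such as '7.3621ms' to seconds."""
    for suffix, scale in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * scale
    raise ValueError(f"unrecognised duration: {text!r}")


def ops_per_second(mean):
    """Throughput implied by a mean time per operation."""
    return 1.0 / to_seconds(mean)


# Mean values copied from three GPU rows above.
for name, mean in [
    ("test_single", "0.1234s"),
    ("test_dqn_speed", "7.3621ms"),
    ("test_step_mdp_speed[True-True-True-True-True]", "32.9837\u03bcs"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```

The three printed values agree with the Ops column for those rows to within the rounding of the displayed mean.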

vmoens added the enhancement label Dec 21, 2023
vmoens linked an issue Dec 21, 2023 that may be closed by this pull request
vmoens added the Data label Dec 21, 2023
vmoens marked this pull request as ready for review December 21, 2023 11:26
vmoens merged commit 6d217c6 into main Dec 21, 2023
11 of 26 checks passed
vmoens deleted the vd4rl branch December 21, 2023 13:14

conglu1997 commented Dec 25, 2023

Awesome, thank you so much!

Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Data: Data-related PR, will launch data-related jobs
enhancement: New feature or request
Development

Successfully merging this pull request may close these issues.

[Feature Request] V-D4RL datasets for offline RL.
3 participants