train from cache without dataset needs to be updated #128

Closed
mh0797 opened this issue Sep 15, 2022 · 4 comments

Comments

@mh0797

mh0797 commented Sep 15, 2022

Describe the bug

Training from cache without a dataset (by setting cache.use_cache_without_dataset=true) no longer works. The structure of the cache path was changed so that each scenario type is stored in a separate folder, but the code that loads scenarios from the cache does not yet account for this.
The bug can be fixed by changing this line from
candidate_scenario_dirs = [path for log_dir in cache_dir.iterdir() for path in log_dir.iterdir()]
to
candidate_scenario_dirs = [path for log_dir in cache_dir.iterdir() for type_dir in log_dir.iterdir() for path in type_dir.iterdir()]
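To illustrate why the extra loop level matters, here is a minimal, self-contained sketch that builds a throwaway cache tree in the newer layout and runs both comprehensions over it. The helper names (`list_scenario_dirs_old`, `list_scenario_dirs_new`, `build_demo_cache`) and the log, scenario-type, and token names are hypothetical and exist only for this demo; they are not part of the devkit.

```python
import tempfile
from pathlib import Path
from typing import List


def list_scenario_dirs_old(cache_dir: Path) -> List[Path]:
    # Original comprehension: assumes cache_dir/log_name/token
    return [path for log_dir in cache_dir.iterdir() for path in log_dir.iterdir()]


def list_scenario_dirs_new(cache_dir: Path) -> List[Path]:
    # Proposed comprehension: assumes cache_dir/log_name/scenario_type/token
    return [
        path
        for log_dir in cache_dir.iterdir()
        for type_dir in log_dir.iterdir()
        for path in type_dir.iterdir()
    ]


def build_demo_cache(root: Path) -> Path:
    # Hypothetical names, used only to mimic the newer cache layout.
    token_dir = root / "log_a" / "starting_left_turn" / "token_123"
    token_dir.mkdir(parents=True)
    (token_dir / "feature.gz").touch()
    return token_dir


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    build_demo_cache(cache_dir)
    # The old version stops one level too early and yields scenario-type folders.
    old_names = [p.name for p in list_scenario_dirs_old(cache_dir)]
    # The new version descends one level further and yields token folders.
    new_names = [p.name for p in list_scenario_dirs_new(cache_dir)]
    print("old:", old_names)
    print("new:", new_names)
```

With the newer layout, the old comprehension hands scenario-type folders to whatever consumes the result, which is consistent with no valid scenarios being found.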

Setup

  • devkit-0.6

Steps To Reproduce

Steps to reproduce the behavior:

  1. Train with cache.use_cache_without_dataset=true

Stack Trace

Error executing job with overrides: ['+training=training_vector_model', 'py_func=train', 'cache.cache_path=/path/to/cache', 'scenario_filter.limit_total_scenarios=0.1', 'data_loader.params.batch_size=16', 'data_loader.params.num_workers=8', 'scenario_builder=nuplan_mini', 'scenario_builder.data_root=/path/to/data', 'lightning.trainer.params.max_epochs=25', 'optimizer.lr=1e-4', 'callbacks=train_callbacks_no_visualization', 'experiment_name=baseline_multi_lgcn', 'group=/path/to/exp/', 'cache.use_cache_without_dataset=true']
Traceback (most recent call last):
  File "/home/aah1si/nuplan/nuplan-devkit/nuplan/planning/script/run_training.py", line 57, in main
    engine = build_training_engine(cfg, worker)
  File "/home/aah1si/nuplan/nuplan-devkit/nuplan/planning/training/experiments/training.py", line 56, in build_training_engine
    datamodule = build_lightning_datamodule(cfg, worker, torch_module_wrapper)
  File "/home/aah1si/nuplan/nuplan-devkit/nuplan/planning/script/builders/training_builder.py", line 56, in build_lightning_datamodule
    scenarios = build_scenarios(cfg, worker, model)
  File "/home/aah1si/nuplan/nuplan-devkit/nuplan/planning/script/builders/scenario_builder.py", line 171, in build_scenarios
    assert len(scenarios) > 0, 'No scenarios were retrieved for training, check the scenario_filter parameters!'
AssertionError: No scenarios were retrieved for training, check the scenario_filter parameters!

@mspryn-motional

You are correct, the scenario name was added to the cache during that release. If you regenerate the cache, it should have the proper structure.

At the moment, we do not enforce backward compatibility for cache structures, so if the devkit is updated, caches will have to be regenerated. We may revisit this once the devkit becomes more stable.

@mh0797
Author

mh0797 commented Sep 19, 2022

> You are correct, the scenario name was added to the cache during that release. If you regenerate the cache, it should have the proper structure.
>
> At the moment, we do not enforce backward compatibility for cache structures, so if the devkit is updated, caches will have to be regenerated. We may revisit this once the devkit becomes more stable.

Regenerating the cache does not solve this issue. Since the last release, the cache is structured as cache_folder/log_name/scenario_type/token/feature.gz, whereas the line mentioned above expects cache_folder/log_name/token/feature.gz.
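For anyone who has caches in both layouts, one possible workaround is to locate scenario folders by their contents instead of by a fixed depth. The sketch below is an assumption-laden alternative, not the devkit's approach: `find_scenario_dirs` is a hypothetical helper, and it assumes every scenario folder directly contains at least one `.gz` feature file.

```python
from pathlib import Path
from typing import List


def find_scenario_dirs(cache_dir: Path) -> List[Path]:
    """Return every directory under cache_dir that directly contains a .gz file.

    Works for both cache_dir/log_name/token/feature.gz and
    cache_dir/log_name/scenario_type/token/feature.gz, because it does not
    depend on how many folder levels sit above the feature files.
    """
    # A set removes duplicates when a folder holds several feature files.
    return sorted({feature_file.parent for feature_file in cache_dir.rglob("*.gz")})
```

A recursive walk touches every file in the cache, so on large caches this may be noticeably slower than the fixed-depth comprehension.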

@simmelpatrick

This fix helped me out as well! So I guess it's still relevant.

@HeYDwane3

HeYDwane3 commented Mar 20, 2023

candidate_scenario_dirs = [path for log_dir in cache_dir.iterdir() for type_dir in log_dir.iterdir() for path in type_dir.iterdir()]

This fix helped me on 2023 03 20

Can't believe this issue has been open for half a year
