Closed
Description
I get an AttributeError when computing the normalization stats during the consolidate() step while converting the AGIBot dataset for a single task (327).
Here's the full traceback for the error:
Loading dataset shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 169/169 [00:00<00:00, 12281.05it/s]
Traceback (most recent call last):
  File "/home/ubuntu/AgiBot-World/scripts/convert_to_lerobot.py", line 670, in <module>
    task_id = args.task_id
  File "/home/ubuntu/AgiBot-World/scripts/convert_to_lerobot.py", line 633, in main
    raw_datasets_chunk = None
  File "/home/ubuntu/AgiBot-World/scripts/convert_to_lerobot.py", line 444, in consolidate
    self.meta.stats = compute_stats(self)
  File "/home/ubuntu/AgiBot-World/scripts/convert_to_lerobot.py", line 234, in compute_stats
    stats_patterns = get_stats_einops_patterns(dataset, num_workers)
  File "/home/ubuntu/AgiBot-World/scripts/convert_to_lerobot.py", line 197, in get_stats_einops_patterns
    batch = next(iter(dataloader))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 708, in __next__
    data = self._next_data()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1480, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1505, in _process_data
    data.reraise()
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/_utils.py", line 733, in reraise
    raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
    data = fetcher.fetch(index) # type: ignore[possibly-undefined]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/lerobot/common/datasets/lerobot_dataset.py", line 645, in __getitem__
    item = self.hf_dataset[idx]
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2782, in __getitem__
    return self._getitem(key)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 2767, in _getitem
    formatted_output = format_table(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 658, in format_table
    return formatter(pa_table, query_type=query_type)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 411, in __call__
    return self.format_row(pa_table)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 511, in format_row
    formatted_batch = self.format_batch(pa_table)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 540, in format_batch
    batch = self.python_features_decoder.decode_batch(batch)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/formatting/formatting.py", line 231, in decode_batch
    return self.features.decode_batch(batch) if self.features else batch
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/features/features.py", line 2091, in decode_batch
    [
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/features/features.py", line 2092, in <listcomp>
    decode_nested_example(self[column_name], value, token_per_repo_id=token_per_repo_id)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/features/features.py", line 1407, in decode_nested_example
    return schema.decode_example(obj, token_per_repo_id=token_per_repo_id) if obj is not None else None
  File "/home/ubuntu/.local/lib/python3.10/site-packages/datasets/features/image.py", line 189, in decode_example
    if image.getexif().get(PIL.Image.ExifTags.Base.Orientation) is not None:
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 1360, in getexif
    self._exif.load_from_fp(self.fp, self.tag_v2._offset)
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 3410, in load_from_fp
    self.fp.seek(offset)
AttributeError: 'NoneType' object has no attribute 'seek'
Is this a known issue caused by a version mismatch or a data loading problem? I have added some garbage collection calls to the main script here, which might be related to the problem.
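To help narrow down the version question: in the traceback above, PIL is loaded from /usr/lib/python3/dist-packages, while datasets, torch and lerobot are loaded from ~/.local/lib/python3.10/site-packages, so a system-wide Pillow may be mixed with user-installed packages. Below is a minimal sketch, using only the standard library, that reports the installed version and import location of each package named in the traceback. The helper names (`describe_package`, `environment_report`) and the distribution names in `PACKAGES` are my own assumptions for illustration, not part of the conversion script.

```python
import importlib.metadata
import importlib.util

# Distribution name -> importable module name for the packages seen in the
# traceback. The distribution names here are assumptions; adjust as needed.
PACKAGES = {
    "pillow": "PIL",
    "datasets": "datasets",
    "torch": "torch",
    "lerobot": "lerobot",
}


def describe_package(dist_name, module_name):
    """Return (version, location); either entry is None if it cannot be found."""
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    try:
        # find_spec locates a top-level module without importing it.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    location = spec.origin if spec is not None else None
    return version, location


def environment_report():
    """Collect version and location for every package in PACKAGES."""
    return {dist: describe_package(dist, mod) for dist, mod in PACKAGES.items()}


for name, (version, location) in environment_report().items():
    print(f"{name}: version={version} location={location}")
```

If the reported Pillow location is outside the site-packages directory used by datasets, that would at least be consistent with a version mismatch, though it does not prove that this is the cause of the error.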