Hey,
Unfortunately, I keep getting different errors (all of the following happened multiple times) when downloading different datasets using graphbench:
loader = graphbench.Loader(GRAPHBENCH_ROOT, "socialnetwork")
dataset = loader.load()
[INFO] Downloading https://zenodo.org/records/14669616/files/graphs.tar.gz -> datasets/graphbench/bluesky/bluesky_graphs/raw/graphs.tar.gz
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/loader.py", line 111, in load
    data = self._loader(dataset)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/loader.py", line 135, in _loader
    train_dataset = BlueSkyDataset(root=self.root, name=dataset_name, pre_transform=self.pre_transform, transform=self.transform, split="train", follower_subgraph=False, cleanup_raw=True,load_preprocessed=True)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/datasets/bluesky.py", line 279, in __init__
    self._prepare() # (i) downloads, unpacks, load data + (ii) timestep handle + (e) subgraph + collate
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/datasets/bluesky.py", line 286, in _prepare
    _download_and_unpack(self.source, self._raw_dir, Path(self.processed_dir), logger=logger)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/helpers/download.py", line 26, in _download_and_unpack
    _stream_download(url, local_path, logger)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/helpers/download.py", line 48, in _stream_download
    r.raise_for_status()
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://zenodo.org/records/14669616/files/graphs.tar.gz
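In case it is useful as a stopgap for the 429: a rough sketch of retrying with backoff around the download request. This is only my guess at a workaround, not how graphbench handles downloads; `get_with_backoff` and its parameters are hypothetical names I made up, and it only understands a `Retry-After` header given in seconds.

```python
import time


def get_with_backoff(fetch, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 response or retries run out.

    `fetch` is any zero-argument callable returning an object with
    `status_code` and `headers` attributes (hypothetical helper, not part
    of graphbench).
    """
    attempt = 0
    while True:
        response = fetch()
        # Stop on anything other than "Too Many Requests", or when out of retries.
        if response.status_code != 429 or attempt >= max_retries:
            return response
        # Prefer the server's Retry-After hint when it is a number of seconds;
        # otherwise fall back to exponential backoff (1s, 2s, 4s, ...).
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
        attempt += 1
```

With `requests` this would be called as `get_with_backoff(lambda: requests.get(url, stream=True))`, followed by `raise_for_status()` on the result, so a 429 that persists past the last retry still surfaces as an error.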
The combinatorial optimization dataset either kills the Python process outright (I tried different kernels) or fails with a NotADirectoryError when part of the dataset has already been downloaded:
loader = graphbench.Loader(GRAPHBENCH_ROOT, "co")
ds = loader.load()
[INFO] Downloading https://huggingface.co/datasets/log-rwth-aachen/Graphbench_CO/resolve/main/ba_large_mis_labeled.tar.gz -> datasets/graphbench/co/co_ba_large/raw/ba_large_mis_labeled.tar.gz
Downloaded and unpacked data to datasets/graphbench/co/co_ba_large/raw
[INFO] Saved processed dataset -> datasets/graphbench/co/co_ba_large/processed/data.pt
Done!
[INFO] Cleaning up: datasets/graphbench/co/co_ba_large/raw
Processing...
[INFO] Downloading https://huggingface.co/datasets/log-rwth-aachen/Graphbench_CO/resolve/main/ba_small_mis_labeled.tar.gz -> datasets/graphbench/co/co_ba_small/raw/ba_small_mis_labeled.tar.gz
Downloaded and unpacked data to datasets/graphbench/co/co_ba_small/raw
[INFO] Saved processed dataset -> datasets/graphbench/co/co_ba_small/processed/data.pt
Done!
[INFO] Cleaning up: datasets/graphbench/co/co_ba_small/raw
Processing...
[INFO] Downloading https://huggingface.co/datasets/log-rwth-aachen/Graphbench_CO/resolve/main/er_large_mis_labeled.tar.gz -> datasets/graphbench/co/co_er_large/raw/er_large_mis_labeled.tar.gz
Downloaded and unpacked data to datasets/graphbench/co/co_er_large/raw
Killed
And
loader = graphbench.Loader(GRAPHBENCH_ROOT, "co")
ds = loader.load()
Processing...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/loader.py", line 111, in load
    data = self._loader(dataset)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/loader.py", line 151, in _loader
    dataset = CODataset(root=self.root, name=dataset_name, pre_transform=self.pre_transform, transform=self.transform, split="train", generate=self.generate)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/datasets/combinatorial_optimization.py", line 134, in __init__
    super().__init__(str(self.algoreas_dir), transform, pre_transform)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/torch_geometric/data/in_memory_dataset.py", line 81, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log,
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 115, in __init__
    self._process()
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 265, in _process
    self.process()
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/datasets/combinatorial_optimization.py", line 257, in process
    self._prepare()
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/datasets/combinatorial_optimization.py", line 191, in _prepare
    _download_and_unpack(source=self.source, raw_dir=self._raw_dir, processed_dir=self.processed_path, logger=logger)
  File "miniforge3/envs/gfm/lib/python3.10/site-packages/graphbench/helpers/download.py", line 24, in _download_and_unpack
    if not processed_dir.exists() or not any(Path(processed_dir).iterdir()):
  File "miniforge3/envs/gfm/lib/python3.10/pathlib.py", line 1017, in iterdir
    for name in self._accessor.listdir(self):
NotADirectoryError: [Errno 20] Not a directory: 'datasets/graphbench/co/co_ba_large/processed/data.pt'
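From the last traceback it looks like `processed_dir` is the `data.pt` file itself here rather than a directory, so the `iterdir()` call in the existence check fails. Below is a minimal sketch of a check that tolerates both cases. `needs_download` is a hypothetical helper of mine, not a graphbench function, and treating a non-empty file as "already processed" is my assumption about the intended behavior.

```python
from pathlib import Path


def needs_download(processed_path):
    """Return True if no processed output exists yet at processed_path.

    Hypothetical helper: accepts either a directory of processed files
    or a single processed file such as data.pt.
    """
    p = Path(processed_path)
    if not p.exists():
        return True
    if p.is_dir():
        # An existing but empty directory still needs its data.
        return not any(p.iterdir())
    # A regular file: assume a non-empty file means processing already finished.
    return p.stat().st_size == 0
```

With a check like this, a second `loader.load()` on a partially downloaded "co" dataset would skip `co_ba_large` instead of raising.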