fix remote datasets loading in datasets 2.14
LoicGrobol committed Jul 28, 2023
1 parent d347c5f commit 95d3111
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion zeldarose/datasets/transform.py
@@ -26,7 +26,10 @@ def encode_dataset(
     logger.info(f"Loading data from {text_path}")
     try:
         full_dataset = datasets.load_dataset("text", data_files=str(text_path), split="train")
-    except FileNotFoundError as e:
+    # So far the cleaner way to detect that a dataset is remote???
+    # in datasets < 2.14 this was FileNotFoundError, in 2.14 it's the other one
+    # in the future? Who's to say,,,
+    except (FileNotFoundError, datasets.builder.DatasetGenerationError) as e:
         if isinstance(text_path, str):
             dataset_name, dataset_config, dataset_split = text_path.split(":")
             full_dataset = datasets.load_dataset(
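
The hunk is cut off here, so the rest of the handler is not visible. As a rough illustration of the pattern the change relies on (try the path as a local text file, and on a "not found"-style error reinterpret it as a `name:config:split` spec), here is a minimal standalone sketch. The names `parse_remote_spec`, `load_with_fallback`, `load_local`, `load_remote` and `local_errors` are hypothetical and not part of zeldarose or `datasets`. Re-raising for non-string paths is an assumption, since the original `else` branch is not shown.

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def parse_remote_spec(spec: str) -> Tuple[str, str, str]:
    """Split a "name:config:split" string into its three parts."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 'name:config:split', got {spec!r}")
    name, config, split = parts
    return name, config, split


def load_with_fallback(
    text_path,
    load_local: Callable[[str], T],
    load_remote: Callable[[str, str, str], T],
    local_errors: Tuple[Type[BaseException], ...] = (FileNotFoundError,),
) -> T:
    """Try a local load first; on one of `local_errors`, try a remote spec.

    The loaders are injected (hypothetical hooks) so the sketch runs without
    the `datasets` package installed.
    """
    try:
        return load_local(str(text_path))
    except local_errors:
        # Assumption: only plain strings can be remote specs; anything else
        # (e.g. a pathlib.Path) is a genuinely missing file, so re-raise.
        if not isinstance(text_path, str):
            raise
        name, config, split = parse_remote_spec(text_path)
        return load_remote(name, config, split)
```

With `datasets` 2.14, the equivalent of this commit would presumably be to pass `local_errors=(FileNotFoundError, datasets.builder.DatasetGenerationError)`, with `load_local` wrapping `datasets.load_dataset("text", data_files=..., split="train")`. As the author's comment notes, the exception type has changed between releases, so keeping the tuple configurable makes the next change a one-line edit.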