Add fallback mirrors to dataset API #3168

Merged
abidwael merged 3 commits into master from add-dataset-backup
Mar 2, 2023

Conversation

abidwael
Contributor

@abidwael abidwael commented Mar 1, 2023

To mitigate the flakiness of external APIs and downtime of the main mirrors, this PR adds fallback mirrors to the dataset API. Mirrors are defined in the dataset configs under ludwig/datasets/configs by adding a fallback_mirrors section:

fallback_mirrors:
  - name: mirror_1
    download_paths:
      - path/to/dataset/file_1.json
      - path/to/dataset/file_2.json
  - name: mirror_2
    download_paths:
      - ...

Only filesystem paths are supported as mirror download paths.

When downloading a dataset, the loader first tries the main source (download_urls or kaggle_competition). If any exception is raised, it attempts to load the dataset from the fallback mirrors, in the order specified.
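
The fallback order described above can be sketched roughly as follows. This is an illustration only, not the code merged in this PR: download_with_fallback, primary, and fetch are hypothetical names, and treating a mirror as successful only when every file in its download_paths is retrieved is an assumption of the sketch.

```python
from typing import Callable, Dict, List, Sequence


def download_with_fallback(
    primary: Callable[[], List[str]],
    fallback_mirrors: Sequence[Dict],
    fetch: Callable[[str], str],
) -> List[str]:
    """Try the main source first, then each mirror in the order listed.

    Hypothetical helper: `primary` stands in for the download_urls or
    kaggle_competition path, `fetch` for whatever copies one mirror file.
    """
    errors = []
    try:
        return primary()
    except Exception as exc:  # any exception triggers the fallback path
        errors.append(("primary", exc))

    for mirror in fallback_mirrors:
        try:
            # Assumption: a mirror succeeds only if all of its files are fetched.
            return [fetch(path) for path in mirror["download_paths"]]
        except Exception as exc:
            errors.append((mirror["name"], exc))

    summary = "; ".join(f"{name}: {exc!r}" for name, exc in errors)
    raise RuntimeError(f"All dataset sources failed ({summary})")


def broken_primary() -> List[str]:
    # Simulates an outage of the main source.
    raise ConnectionError("main source unavailable")


def fake_fetch(path: str) -> str:
    # Simulates a mirror whose files are missing.
    if path.startswith("missing/"):
        raise FileNotFoundError(path)
    return "/tmp/cache/" + path.split("/")[-1]


mirrors = [
    {"name": "mirror_1", "download_paths": ["missing/file_1.json"]},
    {"name": "mirror_2", "download_paths": ["ok/file_1.json", "ok/file_2.json"]},
]

downloaded = download_with_fallback(broken_primary, mirrors, fake_fetch)
print(downloaded)
```

Here the main source and mirror_1 both fail, so the files come from mirror_2. Collecting the errors from every attempt keeps the final failure message useful when no source is reachable.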

@github-actions

github-actions bot commented Mar 1, 2023

Unit Test Results

     6 files  ±0        6 suites  ±0    5h 37m 59s ⏱️ - 12m 22s
 3 983 tests  +3    3 939 ✔️ +3        44 💤 ±0    0 ❌ ±0
11 970 runs   +9   11 829 ✔️ +9       141 💤 ±0    0 ❌ ±0

Results for commit 771d420. ± Comparison against base commit 04efab4.

♻️ This comment has been updated with latest results.

@abidwael abidwael merged commit fff8ab3 into master Mar 2, 2023
@abidwael abidwael deleted the add-dataset-backup branch March 2, 2023 11:30