persisting CreateCommitError #2766

Open
severo opened this issue May 2, 2024 · 4 comments
Labels
bug Something isn't working P1 Not as needed as P0, but still important/wanted

Comments

Collaborator

severo commented May 2, 2024

For dataset https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, we get the same error after 30 retries: CreateCommitError. Another one: https://huggingface.co/datasets/celsowm/stack-exchange-paired-mini-1k

Also, two other datasets have had the CreateCommitError for more than a month, so it does not seem to be the same issue:

@severo severo added bug Something isn't working P1 Not as needed as P0, but still important/wanted labels May 2, 2024
Collaborator Author

severo commented May 2, 2024

see #2758 (comment)

Collaborator Author

severo commented May 2, 2024

The same applies to the LockedDatasetTimeoutError error: 6 entries have not been retried for more than a month. I'm not sure why they are not removed during the daily backfill.

Collaborator Author

severo commented May 2, 2024

For https://huggingface.co/datasets/re-align/UnifiedChat, for example, we have entries, including an error with LockedDatasetTimeoutError, for config default. But this config does not exist anymore:

Screenshot 2024-05-02 at 14 59 02

So, the issue seems to be in the backfill process: it does not check whether cache entries that should have been deleted still exist.

It's a different issue, so I opened #2767
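To illustrate the kind of check that seems to be missing, here is a minimal sketch, not the actual backfill code. It assumes cache entries can be listed per dataset and compared against the configs that currently exist. `CacheEntry`, `find_stale_entries`, and the config name `some-other-config` are hypothetical names used only for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional


class CacheEntry(NamedTuple):
    # Hypothetical shape of a cache entry; the real schema may differ.
    kind: str
    dataset: str
    config: Optional[str]
    error_code: Optional[str]


def find_stale_entries(
    entries: Iterable[CacheEntry], existing_configs: Iterable[str]
) -> List[CacheEntry]:
    """Return config-level entries whose config no longer exists."""
    existing = set(existing_configs)
    return [
        entry
        for entry in entries
        # Dataset-level entries have no config and are never considered stale here.
        if entry.config is not None and entry.config not in existing
    ]


entries = [
    CacheEntry("dataset-config-names", "re-align/UnifiedChat", None, None),
    CacheEntry(
        "config-parquet-and-info",
        "re-align/UnifiedChat",
        "default",
        "LockedDatasetTimeoutError",
    ),
]
# Assumption: "default" was removed and only some other config remains.
stale = find_stale_entries(entries, existing_configs=["some-other-config"])
```

With these inputs, `stale` contains only the `default` entry, which is the one a backfill pass would be expected to delete.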

Collaborator Author

severo commented May 3, 2024

For https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, the traceback is:

{
    "error": "Commit 0/1 could not be created on the Hub (after 6 attempts).",
    "cause_exception": "BadRequestError",
    "cause_message": " (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet",
    "cause_traceback": [
        "Traceback (most recent call last):\n",
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status\n response.raise_for_status()\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status\n raise HTTPError(http_error_msg, response=self)\n',
        "requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://huggingface.co/api/datasets/venetis/VMMRdb_make_model_test/commit/refs%2Fconvert%2Fparquet\n",
        "\nThe above exception was the direct cause of the following exception:\n\n",
        "Traceback (most recent call last):\n",
        ' File "/src/libs/libcommon/src/libcommon/utils.py", line 183, in decorator\n return func(*args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn\n return fn(*args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1230, in _inner\n return fn(self, *args, **kwargs)\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 3812, in create_commit\n hf_raise_for_status(commit_resp, endpoint_name="commit")\n',
        ' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status\n raise BadRequestError(message, response=response) from e\n',
        "huggingface_hub.utils._errors.BadRequestError: (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet\n",
        "\nThe above exception was the direct cause of the following exception:\n\n",
        "Traceback (most recent call last):\n",
        ' File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 1003, in create_commits\n commit_info = retry_create_commit(\n',
        ' File "/src/libs/libcommon/src/libcommon/utils.py", line 188, in decorator\n raise RuntimeError(f"Give up after {attempt} attempts. The last one raised {type(last_err)}") from last_err\n',
        "RuntimeError: Give up after 6 attempts. The last one raised <class 'huggingface_hub.utils._errors.BadRequestError'>\n",
    ],
}

Hence, the specific error is:

huggingface_hub.utils._errors.BadRequestError: (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)

Bad request for commit endpoint:
Your push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet
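The chained traceback above suggests a retry wrapper that calls the function repeatedly and re-raises a RuntimeError from the last error. The following is a minimal sketch of that pattern, reconstructed only from the two `libcommon/utils.py` lines visible in the traceback. The names `retry` and `max_attempts` are assumptions, `BadRequestError` is a local stand-in for the `huggingface_hub` class, and the real code probably also sleeps between attempts.

```python
import functools


class BadRequestError(Exception):
    """Local stand-in for huggingface_hub's BadRequestError (HTTP 400)."""


def retry(max_attempts=6):
    # Hypothetical signature; only the body of `decorator` mirrors the traceback.
    def wrapper(func):
        @functools.wraps(func)
        def decorator(*args, **kwargs):
            last_err = None
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as err:
                    last_err = err
            raise RuntimeError(
                f"Give up after {attempt} attempts. The last one raised {type(last_err)}"
            ) from last_err

        return decorator

    return wrapper


calls = {"count": 0}


@retry(max_attempts=6)
def create_commit_stub():
    # Simulates a request the server rejects every time, for the same reason.
    calls["count"] += 1
    raise BadRequestError("an LFS pointer pointed to a file that does not exist")


try:
    create_commit_stub()
except RuntimeError as err:
    message = str(err)
    cause_name = type(err.__cause__).__name__
```

In this sketch the stub is called 6 times and fails identically each time, so retrying a request that is rejected for a deterministic reason yields the same error on every attempt.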

The dataset only has one data file (Parquet): https://huggingface.co/datasets/venetis/VMMRdb_make_model_test/tree/main/data

The current content of the refs/convert/parquet branch is:

Screenshot 2024-05-03 at 12 21 39

Do you have any idea what could be happening, @lhoestq?
