persisting CreateCommitError #2766
Comments
see #2758 (comment)

Same for error
For https://huggingface.co/datasets/re-align/UnifiedChat, for example, we have entries, including an error with [screenshot not preserved]

So, the issue seems to be in the backfill process: we are not checking if cache entries exist when they should be deleted. It's a different issue, so I opened #2767.
For https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, the traceback is:
{
"error": "Commit 0/1 could not be created on the Hub (after 6 attempts).",
"cause_exception": "BadRequestError",
"cause_message": " (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet",
"cause_traceback": [
"Traceback (most recent call last):\n",
' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status\n response.raise_for_status()\n',
' File "/src/services/worker/.venv/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status\n raise HTTPError(http_error_msg, response=self)\n',
"requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://huggingface.co/api/datasets/venetis/VMMRdb_make_model_test/commit/refs%2Fconvert%2Fparquet\n",
"\nThe above exception was the direct cause of the following exception:\n\n",
"Traceback (most recent call last):\n",
' File "/src/libs/libcommon/src/libcommon/utils.py", line 183, in decorator\n return func(*args, **kwargs)\n',
' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn\n return fn(*args, **kwargs)\n',
' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 1230, in _inner\n return fn(self, *args, **kwargs)\n',
' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/hf_api.py", line 3812, in create_commit\n hf_raise_for_status(commit_resp, endpoint_name="commit")\n',
' File "/src/services/worker/.venv/lib/python3.9/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status\n raise BadRequestError(message, response=response) from e\n',
"huggingface_hub.utils._errors.BadRequestError: (Request ID: Root=1-6634b9f0-49d2d36a185cf9530aac8e1f;1742b1b1-75f5-45d7-99da-95a638b09d29)\n\nBad request for commit endpoint:\nYour push was rejected because an LFS pointer pointed to a file that does not exist. For instance, this can happen if you used git push --no-verify to push your changes. Offending file: - default/train/0000.parquet\n",
"\nThe above exception was the direct cause of the following exception:\n\n",
"Traceback (most recent call last):\n",
' File "/src/services/worker/src/worker/job_runners/config/parquet_and_info.py", line 1003, in create_commits\n commit_info = retry_create_commit(\n',
' File "/src/libs/libcommon/src/libcommon/utils.py", line 188, in decorator\n raise RuntimeError(f"Give up after {attempt} attempts. The last one raised {type(last_err)}") from last_err\n',
"RuntimeError: Give up after 6 attempts. The last one raised <class 'huggingface_hub.utils._errors.BadRequestError'>\n",
],
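}
Hence, the specific error is a `BadRequestError` on the commit endpoint: the push was rejected because an LFS pointer pointed to a file that does not exist (offending file: `default/train/0000.parquet`).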
The dataset only has one data file (Parquet): https://huggingface.co/datasets/venetis/VMMRdb_make_model_test/tree/main/data

The current content of the [screenshot not preserved]

Do you have an idea of what can be occurring, @lhoestq?
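For readers following the traceback above: it shows three chained exceptions, with the outer `RuntimeError` raised `from last_err` by a retry decorator in `libcommon/utils.py`. The sketch below is only an illustration of that pattern, not libcommon's actual implementation. The names `retry`, `on`, `sleeps` and `root_cause` are assumptions; only the `RuntimeError` message format is taken from the traceback.

```python
import functools
import time


def retry(on, sleeps):
    """Hypothetical retry decorator: retry on the exception types in `on`,
    sleeping `sleeps[i]` seconds after the i-th failed attempt."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            last_err = None
            while attempt < len(sleeps):
                try:
                    return func(*args, **kwargs)
                except on as err:
                    # Remember the failure so it can be chained if we give up.
                    last_err = err
                    time.sleep(sleeps[attempt])
                    attempt += 1
            # Same message format as the line visible in the traceback.
            raise RuntimeError(
                f"Give up after {attempt} attempts. The last one raised {type(last_err)}"
            ) from last_err

        return wrapper

    return decorator


def root_cause(err):
    """Follow the `raise ... from ...` chain down to the first exception."""
    while err.__cause__ is not None:
        err = err.__cause__
    return err
```

With a decorator of this shape, an error that fails the same way on every attempt (such as the 400 response here) uses up all attempts and then surfaces as the outer `RuntimeError`. Walking `__cause__`, as `root_cause` does, recovers the underlying exception, which in the traceback above is the `requests` `HTTPError`.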
For dataset https://huggingface.co/datasets/venetis/VMMRdb_make_model_test, we get the same error after 30 retries: `CreateCommitError`. Another one: https://huggingface.co/datasets/celsowm/stack-exchange-paired-mini-1k

Also, two other datasets have had the `CreateCommitError` for more than 1 month, so it does not seem to be the same issue: