Describe the bug
Hello,
I always use ClearML to store the latest checkpoint from my model training, and last week I noticed that my experiments were not saving the models correctly. I checked the console logs of my experiments, and I found this error every time an artifact is uploaded:
2024-06-24 11:46:37,590 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object /cho/Brumas_v2-256-allNormal-allPerClass-2-aiCrowd-food201-allWithDrink-fromPretraining.91735e850ed245449e328149647e0a17/artifacts/latest.ckpt/last.ckpt (413): <!doctype html>
<html lang=en>
<title>413 Request Entity Too Large</title>
<h1>Request Entity Too Large</h1>
<p>The data value transmitted exceeds the capacity limit.</p>
2024-06-24 11:46:37,591 - clearml.metrics - WARNING - Failed uploading to https://files.clear.ml (Failed uploading object /cho/Brumas_v2-256-allNormal-allPerClass-2-aiCrowd-food201-allWithDrink-fromPretraining.91735e850ed245449e328149647e0a17/artifacts/latest.ckpt/last.ckpt (413): <!doctype html>
<html lang=en>
<title>413 Request Entity Too Large</title>
<h1>Request Entity Too Large</h1>
<p>The data value transmitted exceeds the capacity limit.</p>
)
2024-06-24 11:46:37,594 - clearml.metrics - ERROR - Not uploading 1/1 events because the data upload failed
I did not change anything in my code, and it looks like big .ckpt files are no longer being uploaded. My .ckpt files are around 400 MB, but I also tested files of 180 MB and they failed as well.
Were there any changes in the ClearML server deployment? I am using the cloud free-tier option and uploading the files to https://files.clear.ml. (I still have 50 GB of free storage.)
To reproduce
from clearml import Task

Task.add_requirements("requirements.txt")
task = Task.init(
    project_name="cho",
    task_name="test",
    output_uri=None,  # set output_uri=True to log all Lightning models in ClearML
)
task.upload_artifact(name="adjustedIds.json", artifact_object="adjustedIds.json")  # works (small file)
task.upload_artifact(name="model.ckpt", artifact_object="model.ckpt")  # fails (big file)
Expected behaviour
The expected behaviour would be to be able to download the models from the ClearML UI. But since the upload fails, I receive a 404 NOT FOUND instead.
Environment
Server type = app.clear.ml
ClearML SDK Version = 1.16.2
Python Version = 3.10
OS (Windows \ Linux \ macOS) = Linux
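In the meantime, I assume I could work around this by redirecting uploads to my own storage via output_uri; a minimal sketch, assuming an S3 bucket (the bucket name below is a placeholder):

from clearml import Task

# Placeholder bucket/path; any storage URI supported by ClearML should work here.
task = Task.init(
    project_name="cho",
    task_name="test",
    output_uri="s3://my-bucket/clearml",  # artifacts/models would bypass files.clear.ml
)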