upload EntityTooSmall #1075

Closed
lackray opened this issue Sep 29, 2022 · 9 comments · Fixed by #1081

Comments

lackray commented Sep 29, 2022

After #1070, I can now upload. During uploading, another exception comes up.
This is my console error:
[screenshot]
So I went to check docker-walrus, and it gives me this:
[screenshot]

There is no minimum size limitation for multipart uploads on the S3 API of the MinIO server:
[screenshot]

How can I change the code inside shared/data_tools_core_s3.py line 153? @PJEstrada

Here is my temporary solution: I commented out the MultipartUpload-related code inside shared/data_tools_core_s3.py, lines 148-154.
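For illustration, this is the kind of fallback I mean, written against a generic boto3-style client. It is only a sketch, not the actual code in shared/data_tools_core_s3.py, and the function and variable names are placeholders: use a plain put_object when the payload is below the 5 MiB multipart minimum, and only go through multipart for larger payloads.

import boto3

S3_MIN_PART_SIZE = 5 * 1024 * 1024  # S3-compatible stores reject non-final parts below 5 MiB

def upload_blob(client, bucket, key, data):
    # Small payload: a single put_object avoids the multipart minimum-part-size check entirely.
    if len(data) < S3_MIN_PART_SIZE:
        client.put_object(Bucket=bucket, Key=key, Body=data)
        return
    # Larger payload: multipart upload where every part except the last is at least 5 MiB.
    mpu = client.create_multipart_upload(Bucket=bucket, Key=key)
    parts = []
    for n, start in enumerate(range(0, len(data), S3_MIN_PART_SIZE), start=1):
        chunk = data[start:start + S3_MIN_PART_SIZE]
        resp = client.upload_part(
            Bucket=bucket, Key=key, PartNumber=n,
            UploadId=mpu["UploadId"], Body=chunk)
        parts.append({"PartNumber": n, "ETag": resp["ETag"]})
    client.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
        MultipartUpload={"Parts": parts})

# Example client (placeholder values):
# client = boto3.client('s3', endpoint_url='http://<minio-host>:9000',
#                       aws_access_key_id='...', aws_secret_access_key='...')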
When I start to run the upload script, a MinIO server connection error occurs. This is weird, and I don't know why. Is this a message queue jam issue? Sometimes when I start the upload script, the message queue doesn't handle it. MQ always does the heartbeat check, but just misses the real job!
[screenshot]

After several tries, I finally got the upload to work. I didn't do anything except restart all the Diffgram docker services.
[screenshot]

Sadly there is always a but; here comes an error.
[screenshot]
Docker-hosted Diffgram view:
[screenshot]
I went to check the MinIO buckets page, and there is no uploaded file.
[screenshot]

Does this mean my file upload failed? I don't know how to edit the code inside shared/data_tools_core_s3.py, and it seems that my temporary solution leads to unexpected errors.

PJEstrada (Contributor)

Hey @lackray, thanks a lot for the thorough description.

I think the solution might be easier than we think. We have an env variable that controls the chunk size for large uploads on the UI:

LARGE_API_CHUNK_SIZE = int(os.getenv('LARGE_API_CHUNK_SIZE', 5))

You can set this env variable to a larger size, and hopefully this error will go away.

lackray (Author) commented Sep 30, 2022


It didn't work. I added 'LARGE_API_CHUNK_SIZE=15' inside .env and got the same EntityTooSmall error. Do we have to use complete_multipart_upload with MultipartUpload? I think that if a file is less than 5 GiB, we should upload it directly without MultipartUpload. Also, I am not sure whether LARGE_API_CHUNK_SIZE=5 means 5x1000x1000 or 5x1024x1024 bytes.
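For reference, the arithmetic behind that last question, given that S3-compatible stores reject any non-final multipart part smaller than 5 MiB:

S3_MIN_PART_SIZE = 5 * 1024 * 1024   # 5,242,880 bytes (5 MiB)

chunk_decimal = 5 * 1000 * 1000      # 5,000,000 bytes
chunk_binary = 5 * 1024 * 1024       # 5,242,880 bytes

print(chunk_decimal >= S3_MIN_PART_SIZE)  # False -> parts this size trigger EntityTooSmall
print(chunk_binary >= S3_MIN_PART_SIZE)   # True  -> exactly at the limit

So if the value is interpreted as 5x1000x1000, every chunk comes in just under the minimum, which would match the EntityTooSmall error.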

PJEstrada (Contributor)

One more question: are you using the UI or the Python SDK for the upload? If it's the SDK, can you share the code snippet? I'll take a look and get back to you with a solution.

lackray (Author) commented Sep 30, 2022


I'm using the Python SDK to upload a point cloud file, which is the only way to upload one right now. It's the same code snippet as in https://diffgram.readme.io/docs/3d-lidar-annotation-guide (upload_3d_file_to_diffgram); I'm using the example code. My Python console just returns 'internal error' for this kind of error, so I have to check the docker-walrus console output, which I mentioned above.

PJEstrada (Contributor)

Thanks for the info!

I think the problem might be with the SDK. The previous version had a default chunk size of 5 MB for 3D uploads.

I have fixed this and added a chunk_size param to the upload() function in version 0.9.5 of the SDK (the default is also now 6 MB).

Can you try running:

pip install diffgram==0.9.5

and then try the upload again by re-running the script?
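For example, something along these lines after upgrading (whether chunk_size is given in bytes or megabytes is not spelled out here, so the value below is an assumption):

from diffgram.file.file_3d import File3D

# `project` is the Project instance from your upload script,
# and the file name is just a placeholder.
diffgram_3d_file = File3D(client=project, name='my_point_cloud.ply')

# ... add_point(...) calls here, as in the 3D lidar guide ...

# Sketch only: assumes chunk_size is in bytes; adjust if it expects megabytes.
diffgram_3d_file.upload(chunk_size=6 * 1024 * 1024)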

lackray (Author) commented Oct 3, 2022


Not working. The sample code runs fine, but when I use my own .ply file, which is larger than 94 MB, it gives me this:
[screenshot]

[screenshot]

My read function:

def read_ply_o3d(file_name):
    ply = o3d.io.read_triangle_mesh(file_name)
    out_arr = np.array(ply.vertices)
    return out_arr

The other code is the same as the sample. I don't know why; it looks like the SDK doesn't expect JSON-format content.
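As a rough sanity check on how big the request body gets, serializing that many points as JSON text ends up much larger than the binary .ply on disk. The per-point structure below is only a guess, not the SDK's actual format; it is just to get an order of magnitude:

import json
import numpy as np

# Stand-in for read_ply_o3d(filename); the real file has far more points.
points_arr = np.random.rand(100_000, 3)

payload = json.dumps([
    {"x": float(p[0]), "y": float(p[1]), "z": float(p[2])}
    for p in points_arr
])
print(f"{len(points_arr)} points -> {len(payload) / 1024 / 1024:.1f} MB of JSON")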

PJEstrada (Contributor) commented Oct 3, 2022

Can you share your full script, please? That way we can fully replicate the error. It would also help if you could check the final JSON to see if there is any issue with the JSON being generated by the client.

lackray (Author) commented Oct 3, 2022

Here is my code:

from diffgram.file.file_3d import File3D
from diffgram.core.core import Project

from diffgram import *
import open3d as o3d
import numpy as np


def read_pcd_o3d(file_name):
    pcd = o3d.io.read_point_cloud(file_name)
    out_arr = np.asarray(pcd.points)
    return out_arr


def read_ply_o3d(file_name):
    ply = o3d.io.read_triangle_mesh(file_name)
    out_arr = np.array(ply.vertices)
    return out_arr


project = Project(
    # host="http://192.168.124.20:8085",
    host="http://192.168.31.174:8085",
    project_string_id="papercrane",
    client_id="LIVE__8v3cfiwr9rapp4ryxchl",
    client_secret="mhzy6aay4jn03awrx9zad15e3y5b4xqz5e54rr72qksn17a137pv6nfni0no"
)
# project = Project(
#     host="http://13.251.106.16:18088",
#     project_string_id="weedbell",
#     client_id="LIVE__180hlfk2k3p3yp0s1m68",
#     client_secret="60977ffnb7ku6jf97wwksaqqb3vig1t2djcun137wm0c7owjo2azjzki9ijf"
# )


# filename = 'Zaghetto.pcd'
filename = 'scan_0016.ply'

# points_arr = read_pcd_o3d(filename)
points_arr = read_ply_o3d(filename)

diffgram_3d_file = File3D(client=project, name=filename)

for i in range(0, len(points_arr)):
    point = points_arr[i]
    diffgram_3d_file.add_point(
        x=point[0],
        y=point[1],
        z=point[2],
    )
print('read point done')
diffgram_3d_file.upload()

PJEstrada linked a pull request Oct 3, 2022 that will close this issue.
PJEstrada (Contributor)


Hey @lackray,

Thanks for all the details on this. I've identified a bug in the chunking process for MinIO specifically, so I deployed v1.8.22. Please try upgrading by changing the tags; the issue should be fixed with the new version.
