Retrieving large files causes unacceptable resource usage on catalog server #376

@jpmcfarland

Description

When using put() to upload files, the resource usage on the catalog server per transfer is minimal, but can add up for large numbers of (large) files (see #375). For get(), the resource usage can be unacceptably high.

Uploading a large file (GBs in size) via put() sends the data to the catalog server, which streams it to a resource server (there is no redirected connection). This operation appears to be as efficient as it can be. When the file is downloaded with get(), the data transfer is reversed. The resource server's resources are not taxed during this operation, but the catalog server's are, severely. When more than one thread is used, both RAM and CPU usage quickly spike to near maximum, and the server will crash completely if the transfer is not ended within a short period of time. With only one thread, the same thing happens, only more slowly. Note also that the transfer speed is strongly affected by the resource usage.
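
For reference, here is a minimal sketch of what a constant-memory relay looks like, i.e. the behaviour one would expect from a server that "streams" data between a client and a resource server. This is not the catalog server's actual code, and nothing here claims to identify the cause of the problem. The function name `relay`, the `CHUNK_SIZE` value, and the use of in-memory file objects in place of network connections are all assumptions made for illustration.

```python
import io

# Hypothetical buffer size; not taken from the server's configuration.
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB


def relay(source, sink, chunk_size=CHUNK_SIZE):
    """Copy everything from source to sink one chunk at a time.

    Only one chunk is held by this function at any moment, so its memory
    footprint is bounded by chunk_size and does not grow with file size.
    Returns the total number of bytes relayed.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory stand-ins for the resource-server and client connections.
    payload = b"x" * (10 * 1024 * 1024)
    source, sink = io.BytesIO(payload), io.BytesIO()
    copied = relay(source, sink)
    print(copied == len(payload))
```

If get() behaved like this on the catalog server, RAM usage per transfer should stay roughly flat regardless of file size, which may be a useful baseline to compare the observed behaviour against.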
