I propose adding a feature where the service has a configured limit on the maximum number of edges. This would prevent the server from dying a slow and horrible death, swapping itself to a halt.
Probably makes sense to have a globally set limit or limits.
As the solver runs fully in memory, the deployed high-memory worker's memory limit imposes an upper bound.
Say the max memory were 16 GiB, and estimating each candidate pair at 40 bytes (an 8-byte score plus 4 × 8-byte indices into the datasets and records):

(16 GiB) / (40 bytes) ≈ 429,496,730
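As a sanity check, here is that arithmetic as a minimal sketch (the 16 GiB worker and 40-byte pair size are the assumptions above, not measured values):

```python
# Back-of-envelope bound on candidate pairs implied by worker memory.
# Assumed figures from the estimate above: 16 GiB worker, 40 bytes/pair.
MAX_MEMORY_BYTES = 16 * 1024**3      # 16 GiB high-memory worker
BYTES_PER_PAIR = 8 + 4 * 8           # 8 B score + four 8 B indices

max_pairs = MAX_MEMORY_BYTES // BYTES_PER_PAIR
print(f"{max_pairs:,}")              # 429,496,729 -> roughly 430M pairs
```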
In the case where a user is downloading raw similarity scores instead of using the solver, the limit could be much higher, but it still makes sense to configure the system to avoid producing arbitrarily large outputs.
I think it could also be useful to allow overriding this limit per run, as long as the per-run value stays below the global limit.
How about these names and defaults?
SOLVER_MAX_CANDIDATE_PAIRS = 100M
SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS = 500M (could instead limit the file size in GiB)
The backend task would discard the data and cause the run to fail if these limits are exceeded.
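For concreteness, a minimal sketch of how the backend task might enforce this, assuming the setting names proposed above; `check_pair_limit` and `RunLimitExceeded` are hypothetical names for illustration, not part of the existing service:

```python
from typing import Optional

# Proposed global defaults (assumed values from this issue).
SOLVER_MAX_CANDIDATE_PAIRS = 100_000_000
SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS = 500_000_000


class RunLimitExceeded(Exception):
    """Raised when a run produces more candidate pairs than allowed."""


def check_pair_limit(num_pairs: int, run_type: str,
                     per_run_limit: Optional[int] = None) -> None:
    """Fail the run so the caller can discard the data if over the limit."""
    global_limit = (SOLVER_MAX_CANDIDATE_PAIRS if run_type == "solver"
                    else SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS)
    # A per-run override may only tighten the global limit, never raise it.
    limit = min(per_run_limit, global_limit) if per_run_limit else global_limit
    if num_pairs > limit:
        raise RunLimitExceeded(
            f"{num_pairs:,} candidate pairs exceeds the {run_type} "
            f"limit of {limit:,}")
```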