Concurrent Crawl Limits #866

Closed · 1 task done

ikreymer opened this issue May 22, 2023 · 4 comments
Labels
feature design: This issue tracks smaller sub-issues that compose a feature

Comments

ikreymer commented May 22, 2023

We need to be able to set a limit for concurrent crawls that can be started within a single organization.
The goal of the limit is to support different tiers of usage and better balance limited resources.

The concurrent crawl limit would be a property on the Org object on the backend. The limit is checked when starting a crawl; if the crawl would exceed the limit, it enters a 'Waiting (Crawl Limit)' state until other crawls complete, at which point it is allowed to start. The limit should be configurable by the superadmin on the Organization settings page.
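
As a rough sketch of the backend check, assuming a `quotas` object on the org with a `maxConcurrentCrawl` field (that field name matches the commits later in this thread; the other names here are illustrative only):

```python
from dataclasses import dataclass, field
from enum import Enum


class CrawlState(str, Enum):
    # hypothetical subset of crawl states relevant to this feature
    WAITING_ORG_LIMIT = "waiting_org_limit"  # shown as 'Waiting (Crawl Limit)' in the UI
    STARTING = "starting"


@dataclass
class OrgQuotas:
    # 0 means "no limit"; field name follows the 'quotas' object added in the commits below
    maxConcurrentCrawl: int = 0


@dataclass
class Organization:
    quotas: OrgQuotas = field(default_factory=OrgQuotas)


def initial_crawl_state(org: Organization, running_crawl_count: int) -> CrawlState:
    """Decide whether a newly started crawl may proceed or must wait on the org limit."""
    limit = org.quotas.maxConcurrentCrawl
    if limit > 0 and running_crawl_count >= limit:
        return CrawlState.WAITING_ORG_LIMIT
    return CrawlState.STARTING
```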

Tasks

ikreymer added the 'feature design' label May 22, 2023
ikreymer (Member, Author) commented:

UI support tracked via #867

ikreymer (Member, Author) commented:

A question: should there be a separate setting for max queued crawls in addition to max concurrent crawls?
Without a queue limit, a user could still queue an unlimited number of crawls to wait, even if only 1 or 2 may run at a time. With such a limit, we could also cap how many crawls can be in the waiting state (beyond that, starting another would be an error). Relatedly, a similar limit could apply to how many crawls may be scheduled (though not necessarily run) at once.
This would give us the following limits:

  • number of concurrent crawls that can run at once
  • number of crawls that could be queued at once
  • number of crawls that could be scheduled at once (should be <= queued count?)

Or is this perhaps too complicated at this stage?
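
If all three limits were adopted, the org quotas object might grow along these lines. This only extends the sketch above; apart from `maxConcurrentCrawl`, the field names are hypothetical and not part of the implementation:

```python
from dataclasses import dataclass


@dataclass
class OrgQuotas:
    # 0 means "no limit" for each field
    maxConcurrentCrawl: int = 0   # crawls that may run at once
    maxQueuedCrawls: int = 0      # crawls that may wait in the queue (hypothetical)
    maxScheduledCrawls: int = 0   # crawls that may be scheduled at once (hypothetical)

    def validate(self) -> None:
        # per the question above, the scheduled limit should probably not exceed the queued limit
        if self.maxQueuedCrawls and self.maxScheduledCrawls > self.maxQueuedCrawls:
            raise ValueError("maxScheduledCrawls should be <= maxQueuedCrawls")
```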

Shrinks99 commented May 23, 2023

Should we wait and see whether we need to limit max queued crawls first? Overall not a bad idea. I imagine this would disable certain options in the workflow actions & config screens, so there would be front-end impacts, but it would be nice to see what average usage looks like before we add more restrictions.

ikreymer added a commit that referenced this issue May 23, 2023
- support limits on concurrent crawls that can be run within a single org
- change 'waiting' state to 'waiting_org_limit' for the concurrent crawl limit and 'waiting_capacity' for capacity-based limits
- frontend: show different waiting states on frontend: 'Waiting (Crawl Limit)' and 'Waiting (At Capacity)'
- operator: add all crawljobs as related; they appear to be returned in creation order
- operator: if a concurrent crawl limit is set, ensure the current job is within the first N crawljobs (as provided via the 'related' list of crawljob objects) before it can proceed to 'starting'; otherwise set it to 'waiting_org_limit'
- api: add org /quotas endpoint for configuring quotas
- remove 'new' state, always start with 'starting'
- crawljob: add 'oid' to crawljob spec and label for easier querying
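
Based on the operator behavior described in this commit, a minimal sketch of the first-N check might look like the following (assuming the 'related' crawljobs arrive in creation order; the function and field names here are illustrative, not the actual operator API):

```python
def can_start(current_job_name: str, related_crawljobs: list[dict], max_concurrent: int) -> bool:
    """Return True if the current crawl job falls within the first N crawl jobs for its org.

    `related_crawljobs` stands in for the operator's 'related' list of CrawlJob objects,
    assumed to already be in creation order. A limit of 0 means "no limit".
    """
    if max_concurrent <= 0:
        return True

    first_n = {job["metadata"]["name"] for job in related_crawljobs[:max_concurrent]}
    return current_job_name in first_n
```

If this check fails, the job would stay in 'waiting_org_limit' rather than moving to 'starting'.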
ikreymer added a commit that referenced this issue May 30, 2023
concurrent crawl limits: (addresses #866)
- support limits on concurrent crawls that can be run within a single org
- change 'waiting' state to 'waiting_org_limit' for the concurrent crawl limit and 'waiting_capacity' for capacity-based limits

orgs:
- add 'maxConcurrentCrawl' to new 'quotas' object on orgs
- add /quotas endpoint for updating quotas object

operator:
- add all crawljobs as related; they appear to be returned in creation order
- operator: if a concurrent crawl limit is set, ensure the current job is within the first N crawljobs (as provided via the 'related' list of crawljob objects) before it can proceed to 'starting'; otherwise set it to 'waiting_org_limit'
- api: add org /quotas endpoint for configuring quotas
- remove 'new' state, always start with 'starting'
- crawljob: add 'oid' to crawljob spec and label for easier querying
- more stringent state transitions: add allowed_from to set_state()
- ensure state transitions only happen from allowed states, while failed/canceled can happen from any state
- ensure 'finished' and state are synced from the db if a transition is not allowed
- add crawl indices by oid and cid

frontend: 
- show different waiting states on frontend: 'Waiting (Crawl Limit)' and 'Waiting (At Capacity)'
- add gear icon on orgs admin page
- add initial popup for setting org quotas, showing all properties from the org 'quotas' object

tests:
- add concurrent crawl limit nightly tests
- fix state waiting -> waiting_capacity
- ci: add logging of operator output on test failure
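
The "more stringent state transitions" item could look roughly like this. This is a sketch only; the real `set_state()` lives in the backend and likely takes different arguments, but the `allowed_from` idea is the same, with 'failed'/'canceled' permitted from any state:

```python
from dataclasses import dataclass

ANY = None  # sentinel: transition allowed from any prior state


@dataclass
class Crawl:
    id: str
    state: str


def set_state(crawl: Crawl, new_state: str, allowed_from=ANY) -> bool:
    """Move a crawl to `new_state` only if its current state is in `allowed_from`.

    Returns False if the transition is not allowed; the caller would then re-sync
    state (and 'finished') from the database, as described in the commit above.
    'failed' / 'canceled' are expected to be called with allowed_from=ANY.
    """
    if allowed_from is not ANY and crawl.state not in allowed_from:
        return False
    crawl.state = new_state
    return True


# e.g. a crawl may only move to 'starting' from one of the waiting states
crawl = Crawl(id="crawl-1", state="waiting_org_limit")
assert set_state(crawl, "starting", allowed_from=("waiting_org_limit", "waiting_capacity"))
```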
Shrinks99 (Member) commented:

This is also done! 🎉

Shrinks99 mentioned this issue Aug 3, 2023