Limit the maximum number of candidate pairs #605
It would be nicer not to start a run at all if it is likely to exceed the available resources. However, I don't have a good idea of how to estimate the number of candidate pairs above the threshold...
What concerns me a bit about your solution is that the server fails quietly. What is the first thing a user will do when they see that their run errored? Try again.
Would it be possible to have some sort of state-info field in the run table that we can include in the output of the run status endpoint?
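To make the suggestion concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: the `RunStatus` class, the `message` field and `status_payload` are illustrative names rather than existing code in this service, and the state names other than 'error' are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunStatus:
    """Hypothetical, simplified view of one row in the run table."""
    run_id: str
    state: str                     # 'error' is used by this PR; other names are assumed
    message: Optional[str] = None  # assumed new column holding failure details


def status_payload(run: RunStatus) -> dict:
    """Build the body a run status endpoint could return for this run."""
    payload = {"state": run.state}
    # Only expose the message for failed runs, so other states are unchanged.
    if run.state == "error" and run.message:
        payload["message"] = run.message
    return payload


failed = RunStatus("run-1", "error", "candidate pair limit exceeded")
print(status_payload(failed))
```

With something like this, a user who sees the 'error' state would also see why the run failed, instead of simply retrying.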
if global_candidates_for_run is not None and global_candidates_for_run > Config.SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS:
    log.warning(f"This run has created more than the global limit of candidate pairs. Setting state to 'error'")
    with DBConn() as conn:
        update_run_mark_failure(conn, run_id)
So the user will never know why the run failed?
if len(candidate_pairs[0]) > config.SOLVER_MAX_CANDIDATE_PAIRS:
    log.warning(f"Attempting to solve with more than the global limit of candidate pairs.")
    with DBConn() as conn:
        update_run_mark_failure(conn, run_id)
Same here. Does the run table have the ability to store an error message, and could that be passed on to the user?
Yeah, that would be nice. A very simple additional protection that can be pre-computed would be to include a limit on the number of comparisons.
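As a rough illustration of such a pre-computed check: assuming every record of one party is compared with every record of each other party (no blocking), the number of comparisons is the sum of the pairwise products of the dataset sizes. The function names and the limit value below are hypothetical and not part of this PR.

```python
from itertools import combinations
from typing import Sequence


def total_comparisons(dataset_sizes: Sequence[int]) -> int:
    """Number of pairwise record comparisons across all pairs of datasets.

    Assumes a full comparison between each pair of parties (no blocking).
    """
    return sum(a * b for a, b in combinations(dataset_sizes, 2))


def exceeds_comparison_limit(dataset_sizes: Sequence[int], max_comparisons: int) -> bool:
    """Return True if a run should be rejected before any work is started."""
    return total_comparisons(dataset_sizes) > max_comparisons


# Two parties with 1,000 and 2,000 records: 2,000,000 comparisons.
print(total_comparisons([1000, 2000]))
print(exceeds_comparison_limit([1000, 2000], max_comparisons=1_000_000))
```

This only needs the dataset sizes, so it could run when the run is created rather than after the similarity scores have been computed.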
It is certainly possible to add a new column to store error details in the
IMO that should be tackled separately though.
* catching NoSuchBucket exception at cleanup (#576)
* fix NoSuchBucket error (#577)
* use the same helper function as the other tests
* more logging on server side
* we need read access to check if bucket exists
* creating bucket if it does not exist
* Bump marshmallow from 3.6.0 to 3.6.1 in /base
  Bumps [marshmallow](https://github.com/marshmallow-code/marshmallow) from 3.6.0 to 3.6.1.
  - [Release notes](https://github.com/marshmallow-code/marshmallow/releases)
  - [Changelog](https://github.com/marshmallow-code/marshmallow/blob/dev/CHANGELOG.rst)
  - [Commits](marshmallow-code/marshmallow@3.6.0...3.6.1)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* Bump pytest from 5.3.5 to 5.4.3 in /base
  Bumps [pytest](https://github.com/pytest-dev/pytest) from 5.3.5 to 5.4.3.
  - [Release notes](https://github.com/pytest-dev/pytest/releases)
  - [Changelog](https://github.com/pytest-dev/pytest/blob/master/CHANGELOG.rst)
  - [Commits](pytest-dev/pytest@5.3.5...5.4.3)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* Bump anonlink-client from 0.1.2 to 0.1.3 in /base
  Bumps [anonlink-client](https://github.com/data61/anonlink-client) from 0.1.2 to 0.1.3.
  - [Release notes](https://github.com/data61/anonlink-client/releases)
  - [Changelog](https://github.com/data61/anonlink-client/blob/master/CHANGELOG.md)
  - [Commits](https://github.com/data61/anonlink-client/commits)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* update python3 version (#582)
* update python3 version
* update anonlink-client, as old version was broken
* that overwrites the version from base for no good reason.
* Bump ijson from 3.0.4 to 3.1.1 in /base
  Bumps [ijson](https://github.com/ICRAR/ijson) from 3.0.4 to 3.1.1.
  - [Release notes](https://github.com/ICRAR/ijson/releases)
  - [Changelog](https://github.com/ICRAR/ijson/blob/master/CHANGELOG.md)
  - [Commits](ICRAR/ijson@v3.0.4...v3.1.1)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* Bump celery from 4.4.2 to 4.4.7 in /base
  Bumps [celery](https://github.com/celery/celery) from 4.4.2 to 4.4.7.
  - [Release notes](https://github.com/celery/celery/releases)
  - [Changelog](https://github.com/celery/celery/blob/master/Changelog.rst)
  - [Commits](celery/celery@4.4.2...v4.4.7)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* Fix case sensitivity in minio metadata
  Field names in HTTP headers are case-insensitive, some networks decide that means that can normalize them however they like. Minio's stat.metadata is a dict of custom HTTP headers. This small change ensures that queries will get the header regardless of the case.
* Migrate off deprecated K8s dependencies (#596)
* Update helm minio chart by several major versions
* Migrate off deprecated redis-ha repository
* Provide a fallback UPLOAD_OBJECT_STORE_SERVER option as an ingress isn't required for minio to work.
* Documents upload object store configuration.
* Update azure pipelines
* Update base image deps
* Pin an older version of bitarray
* Update minio image used with docker compose
* Bump the chart version
* Update ingress to include path
  Remove defaults from values file for ingress settings
  Fixed two typos in the templates.
* Documents ingress configuration
* Updates base and Python dependencies (#601)
* Updates base alpine image
* Updates python requirements
* Use latest release of anonlink and minio
* Fix docker build script and benchmark image
* Adjusts to a new minio.
  Noticed that minio has a bug if the assume role duration is less than an hour.
* Expose similarities via object store (#594)
  Sparse similarity results can be extremely large, this commit adds an option for callers to request the object store path of the similarity results instead of the results themselves.
* Adds a small test ensuring we can pull similarity scores via object store
* Build script now builds the test docker image
* Put common environment variables into a .env file for docker-compose
* Store credentials with environment variable names to avoid confusion and reduce duplication
* The init object store script now creates a readonly user
* Updates documentation on uploading and downloading via object store
* [minor] Update entity-service chart to use helm api v2 (#606)
* Initialize database via alembic
* Delete the raw SQL to create the database
* Update k8s deployment to use alembic
* Update queries to use run_id instead of run for run_results table
* Minio python API now requires a "DeleteObject" I don't know why.
* Base wasn't building
* Bump psycopg2 from 2.8.4 to 2.8.6 in /base (#604)
  Bumps [psycopg2](https://github.com/psycopg/psycopg2) from 2.8.4 to 2.8.6.
  - [Release notes](https://github.com/psycopg/psycopg2/releases)
  - [Changelog](https://github.com/psycopg/psycopg2/blob/master/NEWS)
  - [Commits](https://github.com/psycopg/psycopg2/commits)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
  Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
  Co-authored-by: wilko77 <wilko77@users.noreply.github.com>
* Bump alpine from 3.13.1 to 3.13.2 in /base
  Bumps alpine from 3.13.1 to 3.13.2.
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* Run migration jobs after upgrade as well as after install (#611)
* Connect to object store using TLS if configured to (#614)
* Update env var names in k8s init jobs (#612)
* Update env var name
* Update comment in deployment values
* Update environment variable used in alembic
* Bump iso8601 from 0.1.12 to 0.1.14 in /base
  Bumps [iso8601](https://github.com/micktwomey/pyiso8601) from 0.1.12 to 0.1.14.
  - [Release notes](https://github.com/micktwomey/pyiso8601/releases)
  - [Commits](micktwomey/pyiso8601@0.1.12...0.1.14)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* bump python3 dependency
  Alpine 3.13.2 now needs python3=3.8.7-r1
* Adds recommended k8s labels to deployments and services (#616)
* Limit the maximum number of candidate pairs (#605)
* Cache the number of identified candidates along with the number of comparisons carried out.
* Update cache test
* Add global limits on number of edges
* Handle the case where there are no cached edges
* Add a step in the integration test pipeline validating if a test result file exists, otherwise fails. (#618)
* Add optional pod annotations to init jobs (#619)
* Adds changelog/release notes for v1.14.0 (#620)
* Proposed changelog for v1.14.0
* Update azure-pipelines.yml to fix a name change in a previous PR...
  Co-authored-by: wilko77 <wilko77@users.noreply.github.com>
* Bump pytest-xdist from 1.29.0 to 2.2.1 in /base
  Bumps [pytest-xdist](https://github.com/pytest-dev/pytest-xdist) from 1.29.0 to 2.2.1.
  - [Release notes](https://github.com/pytest-dev/pytest-xdist/releases)
  - [Changelog](https://github.com/pytest-dev/pytest-xdist/blob/master/CHANGELOG.rst)
  - [Commits](pytest-dev/pytest-xdist@v1.29.0...v2.2.1)
  Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* bump version number
* more bumping...

Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
Co-authored-by: Brian Thorne <brian@hardbyte.nz>
Co-authored-by: Brian Thorne <brian@thorne.link>
Co-authored-by: Guillaume Smith <gusmith@users.noreply.github.com>
Adds two global settings to protect the service from running out of memory due to excessive numbers of candidate pairs being processed.
Closes #595
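For readers who want a picture of how two such settings might be wired up, here is a minimal sketch. The setting names come from the diff quoted in the review; reading them from environment variables, the default values and the `over_limit` helper are assumptions made for illustration, not the service's actual configuration code.

```python
import os
from typing import Optional


def _int_from_env(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    return int(os.environ.get(name, str(default)))


class Config:
    # Placeholder defaults; the real defaults are not stated in this PR page.
    SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS = _int_from_env(
        "SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS", 500_000_000)
    SOLVER_MAX_CANDIDATE_PAIRS = _int_from_env(
        "SOLVER_MAX_CANDIDATE_PAIRS", 100_000_000)


def over_limit(candidate_count: Optional[int], limit: int) -> bool:
    """True if a known candidate count exceeds the limit.

    A count of None (nothing cached yet) is treated as within the limit,
    mirroring the `is not None` guard in the quoted diff.
    """
    return candidate_count is not None and candidate_count > limit


print(over_limit(None, Config.SOLVER_MAX_CANDIDATE_PAIRS))
```

Keeping two separate limits lets the similarity score stage and the solver stage be bounded independently, since they are checked at different points in the run.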