419 reduce ops costs #441
…finity to be consistent
@LucaCinquini - There is one pending item. Once any DAGs are executed the
@nikki-t : I tested the PR and it looks good. In steady state, the cluster has 3 nodes; 2 would be even better. Can you try your suggestion of using "WhenEmptyOrUnderutilized" for the celery-workers and see if the 3rd node is shut down? Thanks.
@LucaCinquini - I will work on testing this today. For future reference, here are the details on consolidation: https://karpenter.sh/v1.0/concepts/disruption/#consolidation
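For readers following along, the consolidation setting discussed above might look roughly like the fragment below. This is only a sketch based on the Karpenter v1 NodePool schema, not the actual manifest from this PR: the pool name `celery-workers` and the `consolidateAfter` value are assumptions for illustration, and a real NodePool also needs a `spec.template` section, which is omitted here.

```yaml
# Hypothetical NodePool fragment; name and timing are illustrative assumptions.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: celery-workers            # assumed name, not taken from the PR
spec:
  # spec.template (nodeClassRef, requirements) omitted for brevity
  disruption:
    # WhenEmpty only removes nodes with no workload pods;
    # WhenEmptyOrUnderutilized also lets Karpenter repack pods and remove or replace nodes.
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m          # assumed value; how long to wait before consolidating
```

With `WhenEmptyOrUnderutilized`, Karpenter may move pods off a lightly used node and shut it down, which is the behavior being tested for the third node.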
@LucaCinquini - I pushed some changes that seem to have worked. The main thing we need to decide is whether it is okay to wait for the DAG setup task to run. With these changes there is no celery worker node available to run on, so the task has to wait for a node to be launched and initialized.
@nikki-t : I think your last changes are good. To finalize, can you add a comment to the template file that explains the implications of using 0 vs 1? By default, we should set the value to 0.
@LucaCinquini - I added a comment to the template file. Let me know if you want me to add any further details.
Not sure why the pre-commit GitHub action is failing; it seems to be unable to install a dependency.
Great job :) Approving and merging.
Purpose
Proposed Changes
db_instance_class variable to control the EC2 instance selected for RDS
r5 instance family and 8 CPUs and make consistent for all
Issues
Testing
unity-venue-dev under nikki-3 and ran integration tests: https://github.com/unity-sds/unity-sps/actions/runs/16196483391/job/45733952614
unity_sps_ogc_processes_api_python_client.exceptions.NotFoundException: (404); Reason: Not Found so I ran them on my local laptop and they completed successfully.
nikki-3 deployment on unity-venue-dev.