Various pipeline fixes for CCR reliability #1863

Merged 7 commits on Sep 9, 2021

alexwlchan

Closes wellcomecollection/platform#5295, closes #1862

  • Image inferrers now get dedicated capacity providers, one for TEI on and one for TEI off (h/t @alicefuzier).
  • We now look up the version of Elasticsearch that the API is using and use it for the pipeline cluster, so the pipeline never steps ahead of the API. This means CCR (cross-cluster replication) can't get its versioning in a twist.
  • We now run two smaller Elasticsearch nodes in the pipeline cluster when we're not reindexing: two nodes for high availability, smaller nodes to offset the cost.
  • The pipeline resources no longer all depend on the Elasticsearch cluster. That dependency was a "clever idea" I had when setting this up, meant to stop services being created before the relevant secrets were ready, but:
    1. It creates unreadable Terraform plans.
    2. If those secrets aren't ready, the services just restart until they are, which is not a big issue.
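
For readers who want a picture of the version lookup in the second bullet, here is a minimal sketch. It assumes both clusters are Elastic Cloud deployments managed with the `elastic/ec` Terraform provider; the variable `api_deployment_id`, the local `pipeline_es_version`, and the `elasticsearch[0].version` attribute path are illustrative assumptions, not names taken from this PR.

```hcl
terraform {
  required_providers {
    ec = {
      source = "elastic/ec"
    }
  }
}

# Hypothetical input: the ID of the deployment that the API reads from.
variable "api_deployment_id" {
  type = string
}

# Look up the existing API deployment rather than asking for the
# latest available Elasticsearch release.
data "ec_deployment" "api" {
  id = var.api_deployment_id
}

locals {
  # Assumed attribute path: the version the API cluster is running now.
  pipeline_es_version = data.ec_deployment.api.elasticsearch[0].version
}

# Exposed so the value can be checked in a plan before it is used.
output "pipeline_es_version" {
  value = local.pipeline_es_version
}
```

The pipeline deployment would then take `local.pipeline_es_version` as its version instead of a "latest" lookup, so a new Elasticsearch release only reaches the pipeline once the API cluster has been upgraded.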

@@ -1,12 +1,33 @@
 resource "aws_ecs_cluster" "cluster" {
   name = local.namespace_hyphen
-  capacity_providers = [module.inference_capacity_provider.name]
+  capacity_providers = [module.inference_capacity_provider_tei_off.name, module.inference_capacity_provider_tei_on.name]

Nice.

All @alicefuzier’s work :D

@alexwlchan alexwlchan merged commit d01db1f into main Sep 9, 2021
@alexwlchan alexwlchan deleted the pipeline-2021-09-09 branch September 9, 2021 13:23
Successfully merging this pull request may close these issues.

Always using the latest version of Elasticsearch for the pipeline cluster can break CCR
4 participants