We run our Archivematica instance on EKS, spread across 5 pods (one runs the dashboard, storage service, and MCP server; the others run MCP clients). All of these services need access to a shared filesystem because Archivematica expects them to pass information to each other through files on disk. This is achieved using Kubernetes persistent volume claims, with the underlying volumes backed by EBS storage. However, an EBS volume can only attach to one EC2 instance at a time, which means only one of the three nodes in our EKS cluster can access it. This in turn means all five Archivematica pods must run on that node, which causes a handful of problems:
- If something goes wrong on that node Kubernetes can't gracefully move the pods to another one because it won't have access to the volumes.
- We can't take advantage of horizontal scaling, which is cheaper than vertical scaling. For example, our file conversions sometimes fail because the node they run on runs out of memory; spinning up an additional smaller instance and spreading the pods across it would be cheaper than switching to a larger EC2 instance, but we can't spread the pods out at all.
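The constraint comes from the access mode on the claim. A minimal sketch of a PVC like ours (the claim name, storage class, and size here are illustrative, not our actual values):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: archivematica-shared   # illustrative name
spec:
  accessModes:
    - ReadWriteOnce            # EBS-backed volumes: mountable read-write by a single node only
  storageClassName: gp2        # an EBS-backed storage class on EKS
  resources:
    requests:
      storage: 100Gi           # EBS requires provisioning a fixed size up front
```

Because `ReadWriteOnce` is enforced at the node level, every pod mounting this claim must be scheduled onto the same node.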
This issue could be resolved by using EFS volumes instead of EBS volumes. EFS allows multiple EC2 instances to connect to the same filesystem. It also doesn't require provisioning storage in advance, instead growing and shrinking with what we actually store. This is a tradeoff: our nodes won't crash from running out of disk space, but we may need to be more vigilant about temporary files lingering longer than necessary and costing us money. EFS is also more expensive than EBS ($0.30 per GB vs $0.08 per GB), but currently we only use between a third and a quarter of the provisioned storage on the nodes that run our Archivematica cluster, so the cost of switching to EFS should be minimal, if it increases at all.
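The switch would look roughly like the following, using the AWS EFS CSI driver. This is a sketch, not a tested manifest; the filesystem ID is a placeholder, and the class/claim names are illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc                          # illustrative name
provisioner: efs.csi.aws.com            # AWS EFS CSI driver
parameters:
  provisioningMode: efs-ap              # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0    # placeholder EFS filesystem ID
  directoryPerms: "700"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: archivematica-shared            # illustrative name
spec:
  accessModes:
    - ReadWriteMany                     # EFS can be mounted by many nodes at once
  storageClassName: efs-sc
  resources:
    requests:
      storage: 100Gi                    # required by the API, but EFS is elastic and ignores it
```

With `ReadWriteMany`, the scheduler is free to place the five Archivematica pods on any node, which is what unblocks both graceful rescheduling and horizontal scaling.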