Hardware errors on toxis #51
Comments
@mtelvers I've launched an instance, but what should I call it? Is toxis a replacement for what was formerly
We only need an internal name for it at Scaleway, as it will have many external DNS names. Why not just
OK, I set it up with the internal name opam-repo-ci.sw.ocaml.org, and we can clean up all the other names once you've got it running. We should also move the ocluster scheduler into this namespace at some point...
The new instance is an experimental ARM (Graviton2) based setup that Scaleway says is on a trial basis, so we may have to migrate again in the future. But it's half the price and half the energy usage, so much better than the old toxis!
@avsm I'm nearly ready to make the switch over. I will try to maintain the current state from For reference the complete list is:
@mtelvers I've switched over opam-repo.ci.ocaml.org and opam.ci.ocaml.org to point to opam-repo-ci.sw.ocaml.org now.
@avsm Thank you for your help. The switchover of these services is complete.
Splendid! I'll keep an eye on the new ARM infrastructure VM. It seems like a good addition to their lineup.
@avsm Thank you very much for your help with provisioning a new machine and getting this all switched over.
The machine `toxis` has multiple hardware issues. The following services have been affected:
Issues:
The machine has a spare spinning disk, which has been brought into service with a copy of `/var/lib/docker`, but due to size constraints, the job log output, `var/job`, has not been copied.
The current configuration is 2 x 18 core CPUs giving 72 threads with 512GB RAM and 1.8TB SSD. Historically, `toxis` also performed the solves locally, but this has recently been migrated to the solver-service; therefore, a smaller machine is required. We know that Opam Repo CI requires > 32GB of RAM, which is why it was migrated to `toxis`.
The suggested new specification is: