Standardize Helms Charts #635

Merged
merged 7 commits into from
Nov 17, 2022

Conversation

XciD (Member) commented Nov 10, 2022

  • Extract secrets so the Helm chart can be used without pre-existing secrets
  • Abstract storage behind a PV/PVC
  • Some Helm refactoring

cc: @n1t0

@XciD XciD marked this pull request as draft November 10, 2022 07:58
@XciD XciD marked this pull request as ready for review November 10, 2022 14:45
severo (Collaborator) commented Nov 17, 2022

To fully test (and then deploy to prod), I need access to the PVC. I currently get:

$ kubectl get pvc
Error from server (Forbidden): persistentvolumeclaims is forbidden: User "AWSReservedSSO_EKS-HUB-Tensorboard_855674a9053d4044:sylvain.lesage-huggingface.co" cannot list resource "persistentvolumeclaims" in API group "" in the namespace "datasets-server"

both in the prod and ephemeral clusters.

Could you help me with that @huggingface/infra-team?
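
The Forbidden error above is an RBAC denial for the `list` verb on `persistentvolumeclaims`. As a rough sketch of how to confirm when access has been granted, `kubectl auth can-i` can be wrapped as below. The helper name `check_pvc_access` is hypothetical and not part of this repository; the namespace comes from the error message.

```shell
# Hypothetical helper: report whether the current user may perform the
# given verbs on persistentvolumeclaims in a namespace.
check_pvc_access() {
  namespace="$1"
  shift
  for verb in "$@"; do
    # `kubectl auth can-i` exits non-zero when the action is not allowed.
    if kubectl auth can-i "$verb" persistentvolumeclaims \
        --namespace "$namespace" >/dev/null 2>&1; then
      echo "$verb: allowed"
    else
      echo "$verb: denied"
    fi
  done
}

# Usage (namespace taken from the error message above):
# check_pvc_access datasets-server get list
```

A "denied" line can also mean that kubectl could not reach the cluster, so it is worth rerunning the underlying command without the redirection when the result is surprising.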

severo (Collaborator) commented Nov 17, 2022

Done, thx @co42 :

$ kubectl get pvc
NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
datadir-datasets-server-dev-mongodb-0   Bound    pvc-a2eb0016-4d7b-4125-9f72-52dd790c5cce   8Gi        RWO            ebs-gp2        113d
nfs-datasets-server-pvc                 Bound    nfs-datasets-server-pv                     100Gi      RWX            nfs            7d
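
For readers unfamiliar with statically bound claims, the second row of the listing corresponds to a PersistentVolumeClaim along the following lines. This is only a sketch built from the values shown in the listing (RWX means `ReadWriteMany`); it is not the chart's actual template, and `render_pvc` is a hypothetical helper.

```shell
# Hypothetical helper: print a PersistentVolumeClaim manifest that binds
# to a named, pre-provisioned PersistentVolume.
render_pvc() {
  name="$1"
  volume="$2"
  size="$3"
  class="$4"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: $class
  volumeName: $volume
  resources:
    requests:
      storage: $size
EOF
}

# Usage, validating locally without creating anything:
# render_pvc nfs-datasets-server-pvc nfs-datasets-server-pv 100Gi nfs \
#   | kubectl apply --dry-run=client -f -
```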

severo (Collaborator) commented Nov 17, 2022

Still an issue:

$ k exec -it datasets-server-dev-worker-first-rows-76f94b7f8d-8z45t -- mkdir /assets/test
Defaulted container "datasets-server-worker-first-rows" out of: datasets-server-worker-first-rows, prepare-assets (init), prepare-cache (init), prepare-numba-cache (init)
mkdir: cannot create directory ‘/assets/test’: Read-only file system
command terminated with exit code 1

Edit: It's my fault, due to the code factorization! 7f88e66

The other pods don't need to write to assets.
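
The `mkdir` check above can be turned into a small repeatable probe, sketched below. It assumes the container image ships `sh`, `touch` and `rm`; `probe_writable` is a hypothetical name, and the pod name in the usage line is the one from the transcript.

```shell
# Hypothetical helper: probe whether a path inside a pod accepts writes.
# `kubectl exec` propagates the exit code of the remote command.
probe_writable() {
  pod="$1"
  path="$2"
  if kubectl exec "$pod" -- \
      sh -c 'touch "$1/.write-probe" && rm "$1/.write-probe"' sh "$path" \
      >/dev/null 2>&1; then
    echo "$path: writable"
  else
    # Also reached when the pod does not exist or exec is forbidden.
    echo "$path: not writable"
  fi
}

# Usage:
# probe_writable datasets-server-dev-worker-first-rows-76f94b7f8d-8z45t /assets
```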

severo (Collaborator) commented Nov 17, 2022

OK, tested on ephemeral and deployed in prod. It works as expected: the first-rows workers could write images for the new dataset I created to test: https://huggingface.co/datasets/severo/mnist/viewer/mnist/test

@severo severo merged commit 35a30db into main Nov 17, 2022
@severo severo deleted the helm-standardization branch November 17, 2022 16:53