Use kubernetes #227

Merged
merged 10 commits into from
May 9, 2022

Conversation

severo
Collaborator

@severo severo commented May 3, 2022

See #223

This first PR only installs the API in the Kubernetes cluster. Follow-up PRs will install (1) the workers and (2) the nginx reverse proxy.

@severo severo marked this pull request as ready for review May 6, 2022 14:31
@severo severo requested a review from XciD May 6, 2022 14:33
@severo severo mentioned this pull request May 6, 2022
14 tasks
@XciD
Member

XciD commented May 9, 2022

Neat. Good job @severo

Your Kubernetes documentation should be exported to a Notion page, IMO: it could be reused for other projects (like the hub).

cc: @julien-c

@XciD
Member

XciD commented May 9, 2022

Did you try to install it twice, with another domain? It's a good test to see if your Helm chart works with multiple instances.

@severo severo merged commit ef3c83d into main May 9, 2022
@severo severo deleted the kube branch May 9, 2022 07:30
severo added a commit that referenced this pull request May 9, 2022
this way, two instances of datasets-server can run in the same
namespace, with different domain names. Fixes
#227 (comment)
@severo
Collaborator Author

severo commented May 9, 2022

Did you try to install it twice? with another domain?

You're right: it does not work, because I named the objects after the chart, not the release.

Fixed: 7abd07d. It works:

[Screenshot, 2022-05-09 at 14:17:16]
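
To illustrate the naming issue discussed above, here is a minimal Python sketch (not the chart's actual Helm template) of why names derived only from the chart collide when the chart is installed twice, while names that include the release do not. The function names, the `api` component suffix, the two release names, and the exact name format are all hypothetical; the thread does not show the format the chart really uses.

```python
def name_from_chart(chart_name: str, release_name: str, component: str) -> str:
    """Buggy scheme: the release name is ignored, so every install gets the same name."""
    return f"{chart_name}-{component}"


def name_from_release(chart_name: str, release_name: str, component: str) -> str:
    """Fixed scheme (assumed format): the release name makes the name unique per install."""
    return f"{release_name}-{component}"


# Hypothetical release names for two installs of the same chart in one namespace.
releases = ["datasets-server-dev", "datasets-server-prod"]

buggy = {name_from_chart("datasets-server", r, "api") for r in releases}
fixed = {name_from_release("datasets-server", r, "api") for r in releases}

print(len(buggy))  # 1: both installs would claim the same object name
print(len(fixed))  # 2: each install gets its own object name
```

Installing the chart twice, as suggested in the review, is the practical way to catch this kind of collision.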

severo added a commit that referenced this pull request May 11, 2022
* docs: ✏️ add doc about AWS configure, eks, ecr

* docs: ✏️ add doc about kubernetes

* docs: ✏️ reorganize the files

* docs: ✏️ add tools

* feat: 🎸 add the structure for the datasets-server helm chart

* feat: 🎸 initial helm chart

* fix: 🐛 fix helm chart thanks to @XciD comments

* feat: 🎸 add workers

* feat: 🎸 fix NFS server

* test: 💍 disable a test for now

* fix: 🐛 remove file!

* feat: 🎸 fix images, and remove nfs for now

* feat: 🎸 never stop the worker

* feat: 🎸 upgrade the worker image (loop)

also: fix the dev configuration (typo on the name)

* fix: 🐛 name the objects with the chart + release

this way, two instances of datasets-server can run in the same
namespace, with different domain names. Fixes
#227 (comment)

* Nfs (#242)

* docs: ✏️ clarify doc about Release and Instance name

* fix: 🐛 use the release name to name all the objects

this way, mongo, which prefixes its name with the release, appears in
the same "group" of pods when ordered alphabetically

* fix: 🐛 remove "public," from cache-control

it is not needed (public is only meant to force public caching for
authenticated requests, which is not the case). And it prevented nginx
from correctly caching the responses based on cache-control.

* refactor: 💡 create one template directory per service

Also: use a static name for the containers (see
7abd07d#r73243800)

* feat: 🎸 ignore cpu and ram tests if MAX set to 0

Also: add doc to allow using HF_DATASETS_CACHE and HF_MODULES_CACHE to
set the datasets library cache directories.

* feat: 🎸 add NFS for /assets and dataset library /cache

Also: remove the tests for the load (cpu/ram) in the workers. Note: we
run the containers with user 1000 and group 3000, and use an
initContainer to ensure the mounted volumes from NFS use this user and
group. See #241 for
a possible security improvement on this.

* fix: 🐛 avoid alpine image, and fix the version

see https://www.google.com/search?q=why+not+use+alpine for reasons not to
use alpine

* fix: 🐛 use a sanitized version of .Release.Name

because the length of the k8s names is limited

* refactor: 💡 move the mongodb URL to a dedicated variable

* fix: 🐛 use Recreate strategy for workers

* docs: ✏️ add a one-liner to create the secret
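
The "sanitized version of .Release.Name" item above can be sketched in Python as follows. This is an illustration under stated assumptions, not the chart's template: the function name is hypothetical, and the lowercase-and-replace rule is an assumption. The 63-character limit is the DNS label limit (RFC 1123) that many Kubernetes object names must respect.

```python
import re

MAX_LABEL_LENGTH = 63  # DNS label limit (RFC 1123) that many Kubernetes names must respect


def sanitize_release_name(release_name: str, max_length: int = MAX_LABEL_LENGTH) -> str:
    """Return a lowercase, DNS-label-friendly name of at most max_length characters."""
    name = release_name.lower()
    # Replace any run of characters outside [a-z0-9-] with a single dash (assumed rule).
    name = re.sub(r"[^a-z0-9-]+", "-", name)
    # Truncate first, then strip dashes so the result never starts or ends with one.
    return name[:max_length].strip("-")


print(sanitize_release_name("Datasets_Server.Dev"))  # datasets-server-dev
print(len(sanitize_release_name("a" * 100)))  # 63
```

Truncating before stripping the dashes matters: a cut in the middle of a name can leave a trailing dash, which is not valid at the end of a DNS label.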
mattstern31 added a commit to mattstern31/datasets-server-storage-admin that referenced this pull request Nov 11, 2023