
Query parameter naming scheme? #712

Open
betatim opened this issue Nov 7, 2018 · 8 comments

Comments

@betatim
Member

betatim commented Nov 7, 2018

Should we agree on a "standard" way of extending the BinderHub URL schema?

The particular use case for me is that I'd like to be able to pass extra information to the spawner via the launch URL, for example which of a set of resource constraints to use, or which extra volumes to mount on launch.

Right now I'm thinking of something like:

  • example.com/v2/gh/binder-examples/requirements/master?resources=lots-more&volumes=all-of-them
  • example.com/v2/gh/binder-examples/requirements/master?resources=small-cpu&volumes=none

with the spawner knowing how to convert "lots-more", "small-cpu", etc. into a set of actual constraints/volume specs.

Right now I can pick essentially any name, but I was wondering if we should agree that (say) the x- prefix is reserved for people to do "what they want", and that BinderHub itself will only use non-prefixed query parameters.
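
To illustrate the idea, here is a minimal sketch of what the spawner side could look like, assuming a hypothetical PROFILES mapping and KubeSpawner-style attribute names (mem_limit, cpu_limit); none of this exists in BinderHub today:

    # Hypothetical sketch only: map human-readable profile names from the
    # launch URL to concrete spawner constraints. The PROFILES table and the
    # attribute names (mem_limit, cpu_limit) are assumptions, not existing
    # BinderHub behaviour.
    PROFILES = {
        "lots-more": {"mem_limit": "8G", "cpu_limit": 4.0},
        "small-cpu": {"mem_limit": "1G", "cpu_limit": 0.5},
    }

    def apply_profile(spawner, name):
        """Translate a profile name from the ?resources= query parameter into spawner settings."""
        constraints = PROFILES.get(name)
        if constraints is None:
            raise ValueError(f"unknown resource profile: {name}")
        for attr, value in constraints.items():
            setattr(spawner, attr, value)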

@minrk
Member

minrk commented Nov 8, 2018

I think at least recommending that extensions use x- is a good start.

@jhamman

jhamman commented Jan 4, 2019

Reviving this conversation again as it seems to address most of #731 and #759. I'm personally not very experienced in building REST-like APIs. Maybe someone who has some experience in this area could suggest a path toward evaluating schemas?

@yuvipanda
Collaborator

There's also the option of passing JSON as a 'resourceRequests' parameter.

@yuvipanda
Collaborator

Kubernetes resources are pretty much key-value pairs of string -> integer, with some special casing. If you read https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/, you'll see that:

  1. CPU, RAM and 'ephemeral storage' are special cased
  2. Other plugins can define and use any arbitrary values here - see https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#consuming-extended-resources

I think we should do something similar, and define in the 'binderhub runtime parameters spec' a key-value pair mapping for resources, with some well-known keys as special cases. I like that the values are integers with a small base unit (bytes for memory or microcpu for CPU): this keeps us out of floating-point precision issues, and translates well to how most resource managers see these resources.

These can then be encoded in the URL in many ways. My favorite way is to just repeat the parameter and use get_query_arguments to get it as a list. So you'd end up with ?resourceRequest=cpu:5000&resourceRequest=memory:1024 in the URL, which can be read properly on the server side. This also makes it pretty extensible.
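
A rough sketch of how a Tornado handler could collect that repeated parameter into a string -> integer mapping with get_query_arguments (the resourceRequest name and the colon-separated encoding are just the proposal above, not an implemented API):

    from tornado import web

    class LaunchHandler(web.RequestHandler):
        def get(self):
            # e.g. ?resourceRequest=cpu:5000&resourceRequest=memory:1024
            requests = {}
            for item in self.get_query_arguments("resourceRequest"):
                key, _, value = item.partition(":")
                # values are integers in a small base unit (bytes, microcpu, ...)
                requests[key] = int(value)
            self.write(requests)  # echo the parsed mapping back as JSON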

For the UI, operators can define commonly used combinations of requests with human-readable names that are shown in the UI. This keeps it user friendly, while keeping the launch links themselves consistent across BinderHub installations and over time (as ideas of what 'small-cpu' means change).

Other things we should define are:

  1. What happens when a user asks for more of a resource kind than we want to provide?
  2. What happens when a user asks for a resource kind we don't have?

We should also define a way for repos to 'suggest' runtime parameters (including resources) in a file in the repo, but that's probably a different issue. Thanks to @craig-willis for bringing that up!

@jhamman

jhamman commented Jan 17, 2019

Thanks @yuvipanda - this all makes sense.

  1. What happens when a user asks for more of a resource kind than we want to provide?

In addition to defaults for specific resources, I think we should also have maximum requests.

  2. What happens when a user asks for a resource kind we don't have?

I don't think we want binderhub to know very much about the resources available for user sessions. Instead, we should rely on kubernetes to let us know if specific resources are not available. We should endeavor to help users understand what "Unschedulable" means.

@minrk
Member

minrk commented Jan 18, 2019

I think there are a couple of questions about limits and what to do when they are exceeded in different ways:

  1. propagating informative launch failure messages from the Spawner all the way up through JupyterHub and BinderHub to the UI. This may require some work in all of KubeSpawner, JupyterHub, and BinderHub (it's possible that JupyterHub already propagates Spawner error messages via the API, but I'm not sure).
  2. a BinderHub deployment probably wants to set per-launch limits, e.g. users may not request more than 2G RAM, 1 CPU, etc.
  3. what to do when limits are exceeded (not a failed spawn at the kube level)? Do we fail informatively? Or do we semi-quietly use the lower of the two: mem_request = min(self.max_mem, requested_mem)?
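
For illustration, the two behaviours in (3) could look roughly like this (the max_mem limit and the strict flag are hypothetical, not existing BinderHub or KubeSpawner configuration):

    def resolve_mem_request(requested_mem, max_mem, strict=True):
        """Apply a hypothetical per-launch memory limit to a user request (values in bytes)."""
        if requested_mem <= max_mem:
            return requested_mem
        if strict:
            # option A: fail informatively
            raise ValueError(
                f"requested {requested_mem} bytes, but the per-launch limit is {max_mem}"
            )
        # option B: semi-quietly use the lower of the two
        return min(max_mem, requested_mem)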

@jhamman

jhamman commented Jan 19, 2019

+1 on (1) and (2).

  3. what to do when limits are exceeded (not a failed spawn at the kube level)?

I think binderhub should explicitly fail in this case. It would be confusing to request 4 GB of RAM and then have an application die from a memory error at 2 GB.

@scottyhq

I was recently looking into helping users run BinderHub with a GPU and ran into this issue. I think this functionality would be amazing!

A bit more background on a specific use case: we are already running a BinderHub, which has a single 'user' nodegroup where the resource requests are fixed. I'd like to add a second 'user-gpu' nodegroup that pods get scheduled on only if a GPU is needed, as we do in JupyterHub. Current options seem to be 1) run an entirely separate BinderHub or 2) refer people to Google Colab :(

For JupyterHub this is simply a matter of adding some key-value pairs under kubespawner_override:

  jupyterhub:
    hub:
    singleuser:
      profileList:
        - kubespawner_override:
            environment: {'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility'}
            tolerations: [{'key': 'nvidia.com/gpu','operator': 'Equal','value': 'present','effect': 'NoSchedule'}]
            extra_resource_limits: {"nvidia.com/gpu": "1"}

(https://github.com/pangeo-data/pangeo-cloud-federation/blob/ec5358714c29db574319b26aed952a90dad3f700/deployments/icesat2/config/prod.yaml#L14-L20)

If I understand the flow correctly, I think a workaround could be implemented by adding some custom code to the spawner configuration in the Helm chart, parsing the URL and adding the right kubespawner_override settings here:

    class BinderSpawner(KubeSpawner):

Exposing options like this to users on the landing page and implementing limit checking would be fantastic, but of course more complicated!
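
A very rough sketch of the kind of override described above, assuming the GPU request has somehow already reached the spawner as a hypothetical 'gpu' user option; the plumbing from the launch URL to the spawner is exactly the part this issue is about:

    from kubespawner import KubeSpawner

    class BinderSpawner(KubeSpawner):
        async def start(self):
            # Assumption: a "gpu" flag has already reached self.user_options
            # somehow; that plumbing from the launch URL does not exist yet.
            if self.user_options.get("gpu"):
                self.environment.update(
                    {"NVIDIA_DRIVER_CAPABILITIES": "compute,utility"}
                )
                self.tolerations = self.tolerations + [
                    {
                        "key": "nvidia.com/gpu",
                        "operator": "Equal",
                        "value": "present",
                        "effect": "NoSchedule",
                    }
                ]
                self.extra_resource_limits = {"nvidia.com/gpu": "1"}
            return await super().start()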
