diff --git a/docs/source/guides/configuration.rst b/docs/source/guides/configuration.rst index a8b2d28774f..ab978ca18f9 100644 --- a/docs/source/guides/configuration.rst +++ b/docs/source/guides/configuration.rst @@ -2,36 +2,60 @@ Configuring BentoML =================== -BentoML starts with an out-of-the-box configuration that works for common use cases. For advanced users, many -features can be customized through configuration. Both BentoML CLI and Python APIs can be customized -by the configuration. Configuration is best used for scenarios where the customizations can be specified once -and applied to the entire team. +BentoML provides a configuration interface that allows you to customize the runtime +behaviour of your BentoService. This article highlights and consolidates the configuration +field definitions, as well as some recommendations and best practices for configuring +BentoML. -BentoML configuration is defined by a YAML file placed in a directory specified by the ``BENTOML_CONFIG`` -environment variable. The example below starts the bento server with configuration defined in ``~/bentoml_configuration.yaml``: + Configuration is best used for scenarios where the customizations can be specified once + and applied anywhere across your organization using BentoML. -.. code-block:: shell +BentoML comes with an out-of-the-box configuration that should work for most use cases. - $ BENTOML_CONFIG=~/bentoml_configuration.yaml bentoml serve iris_classifier:latest +However, more advanced users who want to fine-tune the features BentoML has to offer +can configure such runtime variables and settings via a configuration file, often referred to as +``bentoml_configuration.yaml``. -Users only need to specify a partial configuration with only the properties they wish to customize instead -of a full configuration schema. In the example below, the microbatching workers count is overridden to 4. -Remaining properties will take their defaults values. +.. 
note:: + + This is not to be **confused** with ``bentofile.yaml``, which is used to define and + package your :ref:`Bento 🍱 ` + + This configuration file is for BentoML runtime configuration. + +Providing configuration during serve runtime +-------------------------------------------- + +BentoML configuration is a :wiki:`YAML` file whose path is specified via the environment variable ``BENTOML_CONFIG``. + +For example, given the following ``bentoml_configuration.yaml`` that specifies that the +server should only use 4 workers: .. code-block:: yaml :caption: `~/bentoml_configuration.yaml` - api_server: - workers: 4 - timeout: 60 - http: - port: 6000 + version: 2 + api_server: + workers: 4 + +This configuration can then be passed to :ref:`bentoml serve ` as shown +below: + +.. code-block:: bash -Throughout the BentoML documentation, features that are customizable through configuration are demonstrated -like the example above. For a full configuration schema including all customizable properties, refer to -the BentoML configuration template defined in :github:`default_configuration.yml `. + » BENTOML_CONFIG=~/bentoml_configuration.yaml bentoml serve iris_classifier:latest --production + +.. note:: + Users only have to specify a partial configuration with the properties they wish to customize. BentoML + will then fill in the rest of the configuration with the default values. + In the example above, the number of API workers is overridden to 4. + Remaining properties will take their default values. + +.. 
seealso:: + + :ref:`guides/configuration:Configuration fields` Overriding configuration with environment variables @@ -63,25 +87,81 @@ Which the override configuration will be interpreted as: :alt: Configuration override environment variable -Docker Deployment ------------------ +Mounting configuration to containerized Bento +--------------------------------------------- + +To mount a configuration file to a containerized BentoService, users can use the +|volume_mount|_ option to mount the configuration file to the container and the +|env_flag|_ option to set the ``BENTOML_CONFIG`` environment variable: + +.. code-block:: bash + + $ docker run --rm -v /path/to/configuration.yml:/home/bentoml/configuration.yml \ + -e BENTOML_CONFIG=/home/bentoml/configuration.yml \ + iris_classifier:6otbsmxzq6lwbgxi serve --production + +Voila! You have successfully mounted a configuration file to your containerized BentoService. + +.. _env_flag: https://docs.docker.com/engine/reference/commandline/run/#set-environment-variables--e---env---env-file + +.. |env_flag| replace:: ``-e`` + +.. _volume_mount: https://docs.docker.com/storage/volumes/#choose-the--v-or---mount-flag + +.. |volume_mount| replace:: ``-v`` + + +Configuration fields +-------------------- + +This section defines the configuration spec for BentoML. + +BentoML configuration provides a versioned spec, which enables users to easily specify +and upgrade their configuration file as BentoML evolves. One can specify the version of +the configuration file by adding a top-level ``version`` field to ``bentoml_configuration.yaml``: + +.. code-block:: yaml + :caption: `~/bentoml_configuration.yaml` + + version: 2 + +.. epigraph:: + + Note that ``version`` is not a required field, and BentoML will default to version 1 if + it is not specified. This is mainly for backward compatibility with older configurations. + However, we encourage users to always use the latest version of BentoML to ensure the best experience. 
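For illustration, a fuller version-2 partial configuration might set several fields at once. The fields under ``runners`` below are assumptions shown for illustration only, not the authoritative schema; consult the default configuration files for the exact field names:

```yaml
# Hypothetical partial configuration; any field left unspecified
# falls back to the BentoML default value.
version: 2
api_server:
  workers: 4
  timeout: 120
  http:
    port: 3000
runners:          # runner-level fields here are illustrative assumptions
  resources:
    cpu: 2
```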
+ +At the top level, BentoML configuration is split into two sections: + +* ``api_server``: Configuration for the BentoML API server. + +* ``runners``: Configuration for BentoService runners. + +.. tab-set:: + + .. tab-item:: version 2 + :sync: v2 + + .. include:: ./snippets/configuration/v2.rst + + .. tab-item:: version 1 + :sync: v1 + + .. include:: ./snippets/configuration/v1.rst -Configuration file can be mounted to the Docker container using the `-v` option and specified to the BentoML -runtime using the `-e` environment variable option. +.. dropdown:: `Expand for default configuration` + :icon: code -.. code-block:: shell + .. tab-set:: - $ docker run -v /local/path/configuration.yml:/home/bentoml/configuration.yml -e BENTOML_CONFIG=/home/bentoml/configuration.yml + .. tab-item:: version 2 + :sync: v2 + .. literalinclude:: ../../../bentoml/_internal/configuration/v2/defaults.yaml + :language: yaml -.. spelling:: + .. tab-item:: version 1 + :sync: v1 - customizations - microbatching - customizable - multiproc - dir - tls - apiserver - uri - gcs + .. literalinclude:: ../../../bentoml/_internal/configuration/v1/defaults.yaml + :language: yaml diff --git a/docs/source/guides/grpc.rst b/docs/source/guides/grpc.rst index e89a75917b2..0ccc3ffe99b 100644 --- a/docs/source/guides/grpc.rst +++ b/docs/source/guides/grpc.rst @@ -1342,6 +1342,7 @@ A quick overview of the available configuration for gRPC: ``max_concurrent_streams`` ^^^^^^^^^^^^^^^^^^^^^^^^^^ +.. epigraph:: :bdg-info:`Definition:` Maximum number of concurrent incoming streams to allow on a HTTP2 connection. By default we don't set a limit cap. HTTP/2 connections typically has limit of `maximum concurrent streams `_ @@ -1370,6 +1371,7 @@ on a connection at one time. ``maximum_concurrent_rpcs`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^ +.. epigraph:: :bdg-info:`Definition:` The maximum number of concurrent RPCs this server will service before returning ``RESOURCE_EXHAUSTED`` status. 
By default we set to ``None`` to indicate no limit, and let gRPC decide the limit. @@ -1379,6 +1381,7 @@ By default we set to ``None`` to indicate no limit, and let gRPC to decide the l ``max_message_length`` ^^^^^^^^^^^^^^^^^^^^^^ +.. epigraph:: :bdg-info:`Definition:` The maximum message length in bytes allowed to be received on/sent to the server. By default we set to ``-1`` to indicate no limit. diff --git a/docs/source/guides/snippets/configuration/v1.rst new file mode 100644 index 00000000000..e69de29bb2d diff --git a/docs/source/guides/snippets/configuration/v2.rst new file mode 100644 index 00000000000..7c57c6d55c7 --- /dev/null +++ b/docs/source/guides/snippets/configuration/v2.rst @@ -0,0 +1,11 @@ +``api_server`` +^^^^^^^^^^^^^^ + +The following options are available for the ``api_server`` section: + ++-------------+--------------------------------+--------------------------------------------+ +| Option      | Description                    | Default                                    | ++-------------+--------------------------------+--------------------------------------------+ +| ``workers`` | Number of API workers to spawn | None (which will be determined by BentoML) | ++-------------+--------------------------------+--------------------------------------------+ +``timeout`` diff --git a/docs/source/guides/snippets/configuration/v2/api_server.yaml new file mode 100644 index 00000000000..09f90046892 --- /dev/null +++ b/docs/source/guides/snippets/configuration/v2/api_server.yaml @@ -0,0 +1,98 @@ +api_server: + workers: ~ # cpu_count() will be used when null + timeout: 60 + backlog: 2048 + metrics: + enabled: true + namespace: bentoml_api_server + duration: + # https://github.com/prometheus/client_python/blob/f17a8361ad3ed5bc47f193ac03b00911120a8d81/prometheus_client/metrics.py#L544 + buckets: + [ + 0.005, + 0.01, + 0.025, + 0.05, + 0.075, + 0.1, + 0.25, + 0.5, + 0.75, + 1.0, + 2.5, + 5.0, + 7.5, + 
10.0, + ] + min: ~ + max: ~ + factor: ~ + logging: + access: + enabled: true + request_content_length: true + request_content_type: true + response_content_length: true + response_content_type: true + format: + trace_id: 032x + span_id: 016x + ssl: + enabled: false + certfile: ~ + keyfile: ~ + keyfile_password: ~ + ca_certs: ~ + version: 17 # ssl.PROTOCOL_TLS_SERVER + cert_reqs: 0 # ssl.CERT_NONE + ciphers: TLSv1 # default ciphers + http: + host: 0.0.0.0 + port: 3000 + cors: + enabled: false + allow_origin: ~ + allow_credentials: ~ + allow_methods: ~ + allow_headers: ~ + allow_origin_regex: ~ + max_age: ~ + expose_headers: ~ + grpc: + host: 0.0.0.0 + port: 3000 + max_concurrent_streams: ~ + maximum_concurrent_rpcs: ~ + max_message_length: -1 + reflection: + enabled: false + metrics: + host: 0.0.0.0 + port: 3001 + tracing: + exporter_type: ~ + sample_rate: ~ + excluded_urls: ~ + timeout: ~ + max_tag_value_length: ~ + zipkin: + endpoint: ~ + jaeger: + protocol: thrift + collector_endpoint: ~ + thrift: + agent_host_name: ~ + agent_port: ~ + udp_split_oversized_batches: ~ + grpc: + insecure: ~ + otlp: + protocol: ~ + endpoint: ~ + compression: ~ + http: + certificate_file: ~ + headers: ~ + grpc: + headers: ~ + insecure: ~
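The partial-configuration behaviour described earlier, where user-provided properties are overlaid on defaults such as the file above, can be sketched as a recursive dictionary merge. This is a simplified illustration of the idea, not BentoML's actual implementation:

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user-provided overrides onto default settings.

    Nested dicts are merged key by key; any other value in ``overrides``
    replaces the default outright.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Trimmed-down defaults, mirroring a fragment of the defaults file above.
defaults = {"api_server": {"workers": None, "timeout": 60, "backlog": 2048}}

# A partial user configuration: only "workers" is customized.
user_config = {"api_server": {"workers": 4}}

merged = deep_merge(defaults, user_config)
# "workers" is overridden; "timeout" and "backlog" keep their defaults.
print(merged)
```

The key property is that unspecified fields survive the merge, which is why a configuration file only needs to list the properties being customized.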