update configuration docs, fix some docstrings (#1530)
* update configuration docs, fix some docstrings

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* update copy

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

* add config init command

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>

---------

Signed-off-by: Niels Bantilan <niels.bantilan@gmail.com>
cosmicBboy committed Mar 2, 2023
1 parent 6b56fb5 commit 71d436a
Showing 8 changed files with 99 additions and 54 deletions.
1 change: 0 additions & 1 deletion docs/source/conf.py
@@ -56,7 +56,6 @@
"sphinx.ext.graphviz",
"sphinx-prompt",
"sphinx_copybutton",
"sphinx_fontawesome",
"sphinx_panels",
"sphinxcontrib.yt",
"sphinx_tags",
2 changes: 2 additions & 0 deletions docs/source/design/control_plane.rst
@@ -88,6 +88,8 @@ The ``for_endpoint`` method also accepts:
* ``data_config``: can be used to configure how data is downloaded or uploaded to a specific blob storage like S3, GCS, etc.
* ``config_file``: the path to the configuration file to use.

.. _general_initialization:

Generalized Initialization
==========================

5 changes: 3 additions & 2 deletions docs/source/extras.tensorflow.rst
@@ -1,6 +1,7 @@
############
###############
TensorFlow Type
############
###############

.. automodule:: flytekit.extras.tensorflow
:no-members:
:no-inherited-members:
24 changes: 0 additions & 24 deletions docs/source/pyflyte.rst
@@ -5,27 +5,3 @@ Pyflyte CLI
.. click:: flytekit.clis.sdk_in_container.pyflyte:main
:prog: pyflyte
:nested: full

.. click:: flytekit.clis.sdk_in_container.init:init
:prog: pyflyte init
:nested: full

.. click:: flytekit.clis.sdk_in_container.local_cache:local_cache
:prog: pyflyte local-cache
:nested: full

.. click:: flytekit.clis.sdk_in_container.package:package
:prog: pyflyte package
:nested: full

.. click:: flytekit.clis.sdk_in_container.register:register
:prog: pyflyte register
:nested: full

.. click:: flytekit.clis.sdk_in_container.run:run
:prog: pyflyte run
:nested: none

.. click:: flytekit.clis.sdk_in_container.serialize:serialize
:prog: pyflyte serialize
:nested: full
103 changes: 78 additions & 25 deletions flytekit/configuration/__init__.py
@@ -5,28 +5,72 @@
.. currentmodule:: flytekit.configuration
Flytekit Configuration Ecosystem
--------------------------------
Flytekit Configuration Sources
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Where can configuration come from?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are multiple ways to configure flytekit settings:
- Command line arguments. This is the ideal location for settings to go. (See ``pyflyte package --help`` for example.)
- Environment variables. Users can specify these at compile time, but when your task is run, Flyte Propeller will also set configuration to ensure correct interaction with the platform.
- A config file - an INI style configuration file. By default, flytekit will look for a file in two places
1. First, a file named ``flytekit.config`` in the Python interpreter's starting directory
2. A file in ``~/.flyte/config`` in the home directory as detected by Python.
**Command Line Arguments**: This is the recommended way of setting configuration values for many cases.
For example, see `pyflyte package <pyflyte.html#pyflyte-package>`_ command.
**Python Config Object**: A :py:class:`~flytekit.configuration.Config` object can be used directly, e.g. when
initializing a :py:class:`~flytekit.remote.remote.FlyteRemote` object. See :doc:`here <design/control_plane>` for examples on
how to specify a ``Config`` object.
**Environment Variables**: Users can specify these at compile time, but when your task is run, Flyte Propeller will
also set configuration to ensure correct interaction with the platform. The environment variables must be specified
with the format ``FLYTE_{SECTION}_{OPTION}``, all in upper case. For example, to specify the
:py:class:`PlatformConfig.endpoint <flytekit.configuration.PlatformConfig>` setting, the environment variable would
be ``FLYTE_PLATFORM_URL``.
.. note::
Environment variables won't work for image configuration, which needs to be specified with the
`pyflyte package --image ... <pyflyte.html#cmdoption-pyflyte-package-i>`_ option or in a configuration
file.
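As a minimal sketch of the naming rule described above, the environment variable name can be derived from a section and an option. The helper ``flyte_env_var`` is hypothetical and is not part of flytekit; it only illustrates the ``FLYTE_{SECTION}_{OPTION}`` pattern.

```python
def flyte_env_var(section: str, option: str) -> str:
    """Build a name following the FLYTE_{SECTION}_{OPTION} pattern, in upper case.

    Hypothetical helper for illustration only; flytekit does not expose it.
    """
    return f"FLYTE_{section}_{option}".upper()


# The platform endpoint example from the text: section "platform", option "url".
print(flyte_env_var("platform", "url"))
```

The section and option names correspond to the ones used in the configuration file formats described below.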
**YAML Format Configuration File**: A configuration file that contains settings for both
`flytectl <https://docs.flyte.org/projects/flytectl/>`__ and ``flytekit``. This is the recommended configuration
file format. Invoke the :ref:`flytectl config init <flytectl_config_init>` command to create a boilerplate
``~/.flyte/config.yaml`` file, and ``flytectl --help`` to learn about all of the configuration yaml options.
.. dropdown:: See example ``config.yaml`` file
:title: text-muted
:animate: fade-in-slide-down
.. literalinclude:: ../../tests/flytekit/unit/configuration/configs/sample.yaml
:language: yaml
:caption: config.yaml
**INI Format Configuration File**: A configuration file for ``flytekit``. By default, ``flytekit`` will look for a
file in two places:
1. First, a file named ``flytekit.config`` in the Python interpreter's working directory.
2. A file in ``~/.flyte/config`` in the home directory as detected by Python.
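The lookup order above can be sketched with the standard library. This is an illustration of the two listed locations only, not flytekit's actual lookup code; the function names ``candidate_config_paths`` and ``first_existing`` are assumptions made for this example.

```python
from pathlib import Path
from typing import List, Optional


def candidate_config_paths(cwd: Path, home: Path) -> List[Path]:
    # Order matters: the working directory is checked before the home directory.
    return [cwd / "flytekit.config", home / ".flyte" / "config"]


def first_existing(paths: List[Path]) -> Optional[Path]:
    # Return the first candidate that exists as a file, or None if none do.
    for path in paths:
        if path.is_file():
            return path
    return None


found = first_existing(candidate_config_paths(Path.cwd(), Path.home()))
print(found)
```

Taking the directories as parameters keeps the sketch easy to exercise against temporary directories.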
.. dropdown:: See example ``flytekit.config`` file
:title: text-muted
:animate: fade-in-slide-down
.. literalinclude:: ../../tests/flytekit/unit/configuration/configs/images.config
:language: ini
:caption: flytekit.config
.. warning::
The INI format configuration is considered a legacy configuration format. We recommend using the yaml format
instead if you're using a configuration file.
How is configuration used?
^^^^^^^^^^^^^^^^^^^^^^^^^^
Configuration usage can roughly be bucketed into the following areas:
- Compile-time settings - things like the default image, where to look for Flyte code, etc.
- Platform settings - Where to find the Flyte backend (Admin DNS, whether to use SSL)
- Run time (registration) settings - these are things like the K8s service account to use, a specific S3/GCS bucket to write off-loaded data (dataframes and files) to, notifications, labels & annotations, etc.
- Data access settings - Is there a custom S3 endpoint in use? Backoff/retry behavior for accessing S3/GCS, key and password, etc.
- Other settings - Statsd configuration, which is a run-time applicable setting but is not necessarily relevant to the Flyte platform.
- **Compile-time settings**: these are settings like the default image and named images, where to look for Flyte code, etc.
- **Platform settings**: Where to find the Flyte backend (Admin DNS, whether to use SSL)
- **Registration Run-time settings**: these are things like the K8s service account to use, a specific S3/GCS bucket to write off-loaded data (dataframes and files) to, notifications, labels & annotations, etc.
- **Data access settings**: Is there a custom S3 endpoint in use? Backoff/retry behavior for accessing S3/GCS, key and password, etc.
- **Other settings**: Statsd configuration, which is a run-time applicable setting but is not necessarily relevant to the Flyte platform.
Configuration Objects
---------------------
@@ -42,8 +86,15 @@
.. _configuration-compile-time-settings:
Compilation (Serialization) Time Settings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Serialization Time Settings
^^^^^^^^^^^^^^^^^^^^^^^^^^^
These are serialization/compile-time settings that are used by commands like
`pyflyte package <pyflyte.html#pyflyte-package>`_ or `pyflyte register <pyflyte.html#pyflyte-register>`_. These
configuration settings are typically passed in as flags to the above CLI commands.
The image configurations are typically either passed in via an `--image <pyflyte.html#cmdoption-pyflyte-package-i>`_ flag,
or can be specified in the ``yaml`` or ``ini`` configuration files (see examples above).
.. autosummary::
:template: custom.rst
@@ -60,6 +111,10 @@
Execution Time Settings
^^^^^^^^^^^^^^^^^^^^^^^
Users usually don't need to set these configurations themselves, as they are typically set by FlytePropeller or
FlyteAdmin. The configurations below are useful for authenticating to a Flyte backend, configuring data access
credentials, secrets, and statsd metrics.
.. autosummary::
:template: custom.rst
:toctree: generated/
@@ -71,7 +126,6 @@
~S3Config
~GCSConfig
~DataConfig
~Config
"""
from __future__ import annotations
@@ -190,10 +244,9 @@ def find_image(self, name) -> Optional[Image]:
def validate_image(_: typing.Any, param: str, values: tuple) -> ImageConfig:
"""
Validates the image to match the standard format. Also validates that only one default image
is provided. a default image, is one that is specified as
default=img or just img. All other images should be provided with a name, in the format
name=img
This method can be used with the CLI
is provided. A default image is one that is specified as ``default=<image_uri>`` or just ``<image_uri>``. All
other images should be provided with a name, in the format ``name=<image_uri>``. This method can be used with the
CLI.
:param _: click argument, ignored here.
:param param: the click argument, here should be "image"
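The image format rule described in this docstring can be sketched as a small standalone function. This is not the body of ``validate_image``; the name ``split_image_specs`` and its return shape are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple


def split_image_specs(values: Sequence[str]) -> Tuple[Optional[str], Dict[str, str]]:
    """Split image specs into (default image, named images).

    Hypothetical helper: accepts "default=<image_uri>", a bare "<image_uri>",
    or "name=<image_uri>", and rejects more than one default image.
    """
    default_image: Optional[str] = None
    named: Dict[str, str] = {}
    for value in values:
        name, sep, uri = value.partition("=")
        if not sep:
            # A bare "<image_uri>" has no "=", so it is treated as the default.
            name, uri = "default", value
        if name == "default":
            if default_image is not None:
                raise ValueError("Only one default image can be specified")
            default_image = uri
        else:
            named[name] = uri
    return default_image, named
```

For example, ``split_image_specs(("ghcr.io/a:1", "spark=ghcr.io/b:2"))`` treats the first value as the default image and the second as a named image.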
@@ -266,7 +319,8 @@ def from_images(cls, default_image: str, m: typing.Optional[typing.Dict[str, str
{
"spark": "ghcr.io/flyteorg/myspark:...",
"other": "...",
})
}
)
:return:
"""
@@ -557,7 +611,7 @@ def auto(cls, config_file: typing.Union[str, ConfigFile, None] = None) -> Config
@classmethod
def for_sandbox(cls) -> Config:
"""
Constructs a new Config object specifically to connect to :std:ref:`deploy-sandbox-local`.
Constructs a new Config object specifically to connect to :std:ref:`deployment-deployment-sandbox`.
If you are using a hosted Sandbox-like environment, you may need to use port forwarding or ingress URLs.
:return: Config
"""
@@ -619,15 +673,14 @@ class FastSerializationSettings(object):
distribution_location: Optional[str] = None


# TODO: ImageConfig, python_interpreter, venv_root, fast_serialization_settings.destination_dir should be combined.
@dataclass_json
@dataclass()
class SerializationSettings(object):
"""
These settings are provided while serializing a workflow and task, before registration. This is required to get
runtime information at serialization time, as well as some defaults.
TODO: ImageConfig, python_interpreter, venv_root, fast_serialization_settings.destination_dir should be combined.
Attributes:
project (str): The project (if any) with which to register entities under.
domain (str): The domain (if any) with which to register entities under.
3 changes: 2 additions & 1 deletion flytekit/extras/tasks/shell.py
@@ -120,7 +120,8 @@ def __init__(
task_config: T Configuration for the task, can be either a Pod (or coming soon, BatchJob) config
inputs: A Dictionary of input names to types
output_locs: A list of :py:class:`OutputLocations`
**kwargs: Other arguments that can be passed to :ref:class:`PythonInstanceTask`
**kwargs: Other arguments that can be passed to
:py:class:`~flytekit.core.python_function_task.PythonInstanceTask`
"""
if script and script_file:
raise ValueError("Only either of script or script_file can be provided")
4 changes: 3 additions & 1 deletion plugins/setup.py
@@ -7,6 +7,8 @@

PACKAGE_NAME = "flytekitplugins-parent"

__version__ = "0.0.0+develop"

# Please maintain an alphabetical order in the following list
SOURCES = {
"flytekitplugins-athena": "flytekit-aws-athena",
@@ -74,7 +76,7 @@ def run(self):

setup(
name=PACKAGE_NAME,
version="0.1.0",
version=__version__,
author="flyteorg",
author_email="admin@flyte.org",
description="This is a microlib package to help install all the plugins",
11 changes: 11 additions & 0 deletions tests/flytekit/unit/configuration/configs/images.config
@@ -1,3 +1,14 @@
[sdk]
workflow_packages=module1,module2

[platform]
url=flyte.mycorp.io
insecure=true

[auth]
kubernetes_service_account=demo
raw_output_data_prefix=s3://my-bucket

[images]
xyz=docker.io/xyz:latest
abc=docker.io/abc
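
To illustrate how the sections of this sample file map to values, the sketch below reads the same content with Python's standard ``configparser`` module. It is not flytekit's own loader, and the variable names are chosen for this example only.

```python
import configparser

# Same content as the images.config sample above, inlined so the sketch is self-contained.
SAMPLE = """\
[sdk]
workflow_packages=module1,module2

[platform]
url=flyte.mycorp.io
insecure=true

[auth]
kubernetes_service_account=demo
raw_output_data_prefix=s3://my-bucket

[images]
xyz=docker.io/xyz:latest
abc=docker.io/abc
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

endpoint = parser["platform"]["url"]                      # backend address
insecure = parser.getboolean("platform", "insecure")      # "true" parsed as a bool
packages = parser["sdk"]["workflow_packages"].split(",")  # comma-separated list
images = dict(parser["images"])                           # image name -> image URI
```

Each ``[section]`` and option pair here is also the pair that the ``FLYTE_{SECTION}_{OPTION}`` environment variable format refers to.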
