Commit
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Signed-off-by: mlflow-automation <mlflow-automation@users.noreply.github.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Signed-off-by: Sunish Sheth <sunishsheth2009@gmail.com>
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: mlflow-automation <mlflow-automation@users.noreply.github.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: Sunish Sheth <sunishsheth2009@gmail.com>
Co-authored-by: Prithvi Kannan <46332835+prithvikannan@users.noreply.github.com>
Co-authored-by: Ben Wilson <39283302+BenWilson2@users.noreply.github.com>
Co-authored-by: Daniel Lok <daniel.lok@databricks.com>
Co-authored-by: Prithvi Kannan <prithvi.kannan@databricks.com>
commit 2ef3a13, 1 parent: 66616de
Showing 149 changed files with 7,570 additions and 2,225 deletions.
@@ -0,0 +1,37 @@
name: Deployments

on:
  pull_request:
  push:
    branches:
      - master
      - branch-[0-9]+.[0-9]+

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.ref }}
  cancel-in-progress: true

defaults:
  run:
    shell: bash --noprofile --norc -exo pipefail {0}

jobs:
  deployments:
    if: github.event_name != 'pull_request' || github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/untracked
      - uses: ./.github/actions/setup-python
      - name: Install dependencies
        run: |
          pip install --no-dependencies tests/resources/mlflow-test-plugin
          pip install .[gateway] \
            pytest pytest-timeout pytest-asyncio httpx psutil sentence-transformers transformers
      - name: Run tests
        run: |
          pytest tests/deployments
@@ -0,0 +1,26 @@
Getting Started with MLflow Deployments for LLMs
================================================

MLflow provides a robust framework for deploying and managing machine learning models. In this tutorial, we will explore how to set up an
MLflow Deployments Server tailored for OpenAI's models, allowing seamless integration and querying of OpenAI's powerful language models.

What's in this tutorial?
------------------------

This guide will cover:

- **Installation**: Setting up the necessary dependencies and tools to get your MLflow Deployments Server up and running.

- **Configuration**: How to expose your OpenAI token, configure the deployments server, and define routes for various OpenAI models.

- **Starting the deployments server**: Launching the deployments server and ensuring it's operational.

- **Querying the deployments server**: Interacting with the deployments server using fluent APIs to query various OpenAI models, including completions, chat, and embeddings (a quick preview follows below).

By the end of this tutorial, you'll have a fully functional MLflow Deployments Server tailored for OpenAI, ready to handle and process requests.
You'll also gain insights into querying different types of routes, providers, and models through the deployments server.
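
As a quick preview, once the server configured in this tutorial is running, querying its ``chat`` endpoint with the fluent deployments client looks roughly like the following sketch (the endpoint name and local URL assume the configuration used later in this guide):

.. code-block:: python

    from mlflow.deployments import get_deploy_client

    # Connect to a locally running MLflow Deployments Server
    client = get_deploy_client("http://localhost:5000")

    # Query the chat endpoint defined in the server configuration
    response = client.predict(
        endpoint="chat",
        inputs={"messages": [{"role": "user", "content": "Hello!"}]},
    )
    print(response)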

.. toctree::
    :maxdepth: 1

    Setting Up the MLflow Deployments Server <step1-create-deployments>
    Querying the MLflow Deployments Server <step2-query-deployments>
docs/source/llms/deployments/guides/step1-create-deployments.rst (99 additions, 0 deletions)
@@ -0,0 +1,99 @@
Configuring and Starting the Deployments Server
===============================================

Step 1: Install
---------------

First, install MLflow along with the ``genai`` extras to get access to a range of serving-related
dependencies, including ``uvicorn`` and ``fastapi``. Note that direct dependencies on OpenAI are
unnecessary, as all supported providers are abstracted from the developer.

.. code-section::

    .. code-block:: bash
        :name: install-genai

        pip install 'mlflow[genai]'

Step 2: Set the OpenAI Token as an Environment Variable
--------------------------------------------------------

Next, set the OpenAI API key as an environment variable in your CLI.

This approach allows the MLflow Deployments Server to read the sensitive API key safely, reducing the risk
of leaking the token in code. The Deployments Server, when started, will read the value set by this environment
variable without any additional action required.

.. code-section::

    .. code-block:: bash
        :name: token

        export OPENAI_API_KEY=your_api_key_here
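
If you want to confirm the variable is visible before starting the server, a quick check from Python (purely illustrative, not part of MLflow) might look like:

.. code-block:: python

    import os

    # The deployments server resolves $OPENAI_API_KEY from the environment at startup
    assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"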

Step 3: Configure the Deployments Server
----------------------------------------

Third, set up several routes for the Deployments Server to host. The configuration of the Deployments Server is done by
editing a YAML file that is read by the server initialization command (covered in step 4).

Notably, the Deployments Server allows real-time updates to an active server through the YAML configuration;
a service restart is not required for changes to take effect. Simply edit the configuration file that was
supplied at server start, permitting dynamic route creation without downtime of the service.

.. code-section::

    .. code-block:: yaml
        :name: server-config

        endpoints:
          - name: completions
            endpoint_type: llm/v1/completions
            model:
              provider: openai
              name: gpt-3.5-turbo
              config:
                openai_api_key: $OPENAI_API_KEY

          - name: chat
            endpoint_type: llm/v1/chat
            model:
              provider: openai
              name: gpt-4
              config:
                openai_api_key: $OPENAI_API_KEY

          - name: chat_3.5
            endpoint_type: llm/v1/chat
            model:
              provider: openai
              name: gpt-3.5-turbo
              config:
                openai_api_key: $OPENAI_API_KEY

          - name: embeddings
            endpoint_type: llm/v1/embeddings
            model:
              provider: openai
              name: text-embedding-ada-002
              config:
                openai_api_key: $OPENAI_API_KEY
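
Because the server re-reads this file when it changes, a malformed edit is a common failure mode. A small sanity check with PyYAML (a third-party library, not an MLflow API; this sketch assumes the file is saved as ``config.yaml``) can catch missing keys before the server does:

.. code-block:: python

    import yaml

    # Load the deployments server config and verify each endpoint's required keys
    with open("config.yaml") as f:
        config = yaml.safe_load(f)

    for endpoint in config["endpoints"]:
        assert {"name", "endpoint_type", "model"} <= endpoint.keys(), endpoint
        assert {"provider", "name", "config"} <= endpoint["model"].keys(), endpoint["name"]
    print(f"{len(config['endpoints'])} endpoints look structurally valid")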

Step 4: Start the Server
------------------------

Fourth, let's test the deployments server!

To launch the deployments server using a YAML config file, use the deployments CLI command.

The deployments server will automatically start on ``localhost`` at port ``5000``, accessible via
the URL: ``http://localhost:5000``. To modify these default settings, use the
``mlflow deployments start-server --help`` command to view additional configuration options.

.. code-section::

    .. code-block:: bash
        :name: start-server

        mlflow deployments start-server --config-path config.yaml

.. note::
    The MLflow Deployments Server automatically creates API docs. You can validate that your deployments
    server is running by viewing the docs. Go to ``http://{host}:{port}`` in your web browser.
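
Once the server is up, a quick way to confirm that the configuration took effect is to list the endpoints with the deployments client (a minimal sketch; assumes the server from this guide is running on the default host and port):

.. code-block:: python

    from mlflow.deployments import get_deploy_client

    # Point the client at the locally running deployments server
    client = get_deploy_client("http://localhost:5000")

    # Each endpoint defined in config.yaml should appear here
    for endpoint in client.list_endpoints():
        print(endpoint)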