
Create mlflow.deployments.openai #10473

Merged
113 commits, merged Dec 13, 2023

Commits (113)
f536157
Remove unused loggger
harupy Nov 16, 2023
dce3cb6
Add databricks deployments client skeleton + example (#10421)
harupy Nov 16, 2023
856de75
Set up CI (#10422)
harupy Nov 16, 2023
94ba4d2
Change git labels from gateway -> deployments (#10428)
dbczumar Nov 16, 2023
e450b10
Update gateway embeddings request / response format (#10430)
dbczumar Nov 16, 2023
3fd534f
Fix typo
harupy Nov 17, 2023
ee8daca
Fix typo
harupy Nov 17, 2023
c32b9a8
Add `MLflowDeploymentClient` (still empty) (#10447)
harupy Nov 18, 2023
8566e3f
Implement CRUD for serving-endpoints (#10425)
harupy Nov 18, 2023
d220673
Use `Literal` for string constants (#10457)
harupy Nov 18, 2023
2845e82
Update gateway chat request / response format (#10454)
dbczumar Nov 20, 2023
1390a2d
Update gateway completions request / response format (#10465)
dbczumar Nov 21, 2023
ac98a64
Deprecation warning for `mlflow.gateway` (#10460)
harupy Nov 21, 2023
12f2279
Mark deployment clients as experimental (#10468)
harupy Nov 21, 2023
9f1d547
[1] Copy gateway server CLI to mlflow deployments start-server (#10426)
dbczumar Nov 21, 2023
d53c558
[2] Add /endpoints APIs to MLflow Deployments server (#10466)
dbczumar Nov 21, 2023
ca96d43
Create mlflow.deployments.openai
prithvikannan Nov 21, 2023
f25b516
create_endpoint
prithvikannan Nov 21, 2023
a483873
inline _call_openai_api fn
prithvikannan Nov 21, 2023
5ccb7f3
Fix `start_server` (#10470)
harupy Nov 21, 2023
9fcb238
[3] Support "endpoints" in deployments server YAML conf (#10467)
dbczumar Nov 22, 2023
1877ef1
Suppress Pydantic warnings (#10483)
B-Step62 Nov 22, 2023
fdb2779
Implement `MLflowDeploymentClient` (#10458)
harupy Nov 22, 2023
8ca8429
Set timeout default to `None` (#10487)
harupy Nov 22, 2023
7dad6ce
create test_openai
prithvikannan Nov 22, 2023
ced403d
setup openai plugin
prithvikannan Nov 22, 2023
f6f5171
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Nov 22, 2023
449b286
Revert "Merge remote-tracking branch 'databricks/master' into support…
prithvikannan Nov 22, 2023
f93be2e
merge gateway-migration
prithvikannan Nov 22, 2023
a3f6f62
add openai to test dep
prithvikannan Nov 22, 2023
9a44e23
import openai.error
prithvikannan Nov 22, 2023
6f97551
Rename `MLflowDeploymentClient` to `MlflowDeploymentClient` (#10500)
harupy Nov 24, 2023
5472a35
Fix `MlflowDeploymentClient.list_endpoints` to auto-paginate (#10501)
harupy Nov 24, 2023
9fc93d3
Use `/rate-limits` if `config` only contains `rate_limits` (#10502)
harupy Nov 24, 2023
8f26348
merge
prithvikannan Nov 27, 2023
dbc75bf
ci test
prithvikannan Nov 27, 2023
9ca8b7a
fix test resp
prithvikannan Nov 27, 2023
d22f058
assert_called_once_with
prithvikannan Nov 27, 2023
3d7c1f3
remove test requirements
prithvikannan Nov 27, 2023
2cef1f3
quote
prithvikannan Nov 27, 2023
fa9335f
tiktoken dep
prithvikannan Nov 27, 2023
3af47b3
OPENAI_API_VERSION for azure
prithvikannan Nov 27, 2023
090da87
response format
prithvikannan Nov 27, 2023
0aeebe2
expand test coverage
prithvikannan Nov 27, 2023
e1d39a3
remove 2x
prithvikannan Nov 27, 2023
bf0782c
Replace `TODO` in `MlflowDeploymentClient` (#10496)
harupy Nov 28, 2023
787d5f0
Replace `TODO` in `DatabricksDeploymentClient` (#10499)
harupy Nov 28, 2023
cbb627e
Add `genai` extra (#10516)
harupy Nov 28, 2023
4caba21
Fix FastAPI docs (#10518)
harupy Nov 28, 2023
87607de
merge
prithvikannan Nov 28, 2023
bd9b810
add openai to genai
prithvikannan Nov 28, 2023
99ea3ae
fill in tood
prithvikannan Nov 28, 2023
4ec7a91
spacing
prithvikannan Nov 28, 2023
0cd468e
manifest
prithvikannan Nov 28, 2023
58972f7
Deployments server docs (#10517)
harupy Nov 28, 2023
cd0d458
[FEAT] Migrating prompt lab to use MLflow deployments (#10515)
sunishsheth2009 Nov 29, 2023
ebf1c29
Fix request and response schema (#10523)
harupy Nov 30, 2023
86ec60d
Remove fluent API (#10524)
harupy Nov 30, 2023
faff091
Fix getting-started guide (#10522)
harupy Nov 30, 2023
6382432
Support `endpoints:/my-endpoint` in LLM-as-judge metrics (#10528)
prithvikannan Nov 30, 2023
ef013ae
Fix schema (#10541)
harupy Nov 30, 2023
cff30c9
Fix client API section (#10542)
harupy Nov 30, 2023
a0b30e5
Fix refs (#10544)
harupy Nov 30, 2023
2e9f299
Use `endpoints` and `endpoint_type` (#10545)
harupy Nov 30, 2023
9ac5337
Fix REST examples (#10547)
harupy Nov 30, 2023
8b36901
Fix langchain example (#10548)
harupy Nov 30, 2023
4c9a908
Update `docs/source/llms/deployments/index.rst` (#10550)
harupy Dec 1, 2023
4602e81
Update `docs/source/llms/index.rst` (#10551)
harupy Dec 1, 2023
54751eb
Use `MLFLOW_DEPLOYMENTS_TARGET` in `gateway_proxy_handler` (#10554)
harupy Dec 1, 2023
c8dfc06
Update `docs/source/llms/prompt-engineering/index.rst` (#10552)
harupy Dec 1, 2023
d2945a9
Quick fix for promptlab gateway migration (#10563)
daniellok-db Dec 1, 2023
631ce09
Fix completions params (#10565)
harupy Dec 1, 2023
ea341fd
Fix incorrect Embeddings param: `text` -> `input` (#10566)
harupy Dec 1, 2023
2f0a1fc
Replace `candidates` with `choices` (#10569)
harupy Dec 1, 2023
1861d5b
More gateway replacements (#10568)
harupy Dec 1, 2023
beaf89f
[Docs] Updating the docs for prompt engineering with MLflow deploymen…
sunishsheth2009 Dec 1, 2023
22e791f
Gateway anthropic small fix for "n" parameter (#10576)
dbczumar Dec 1, 2023
0998384
Fix example links (#10561)
harupy Dec 1, 2023
96a6d2b
[Bug-fix] Fixing the error state bug for Mlflow deployments (#10575)
sunishsheth2009 Dec 1, 2023
77252fe
Update examples to use MLflow Deployments (#10558)
dbczumar Dec 1, 2023
e502ddf
Fix deployments examples (#10582)
harupy Dec 4, 2023
7e99592
Support completions endpoints (#10577)
prithvikannan Dec 4, 2023
c764bc1
merge
prithvikannan Dec 5, 2023
5fef0f1
merge master
prithvikannan Dec 6, 2023
8a3cea1
fix
prithvikannan Dec 6, 2023
438f5c5
fix
prithvikannan Dec 6, 2023
1cabb7d
remove openai
prithvikannan Dec 6, 2023
e177afd
remove openai dep
prithvikannan Dec 6, 2023
969d7ca
tests
prithvikannan Dec 6, 2023
e1826c4
tests pass
prithvikannan Dec 6, 2023
4c51eb5
setup
prithvikannan Dec 6, 2023
f07ab92
use chat format
prithvikannan Dec 6, 2023
dd2a7cc
add to setup
prithvikannan Dec 6, 2023
ed13cfd
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Dec 6, 2023
5bea2ee
remove openai dep for _get_api_config
prithvikannan Dec 6, 2023
110635c
dont break openai flavor
prithvikannan Dec 6, 2023
083f446
comments
prithvikannan Dec 7, 2023
57fafff
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Dec 8, 2023
844535c
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Dec 11, 2023
0c8653d
list and get for openai
prithvikannan Dec 11, 2023
b7db28d
azure openai
prithvikannan Dec 11, 2023
0c4d717
comment
prithvikannan Dec 11, 2023
af59968
skinny
prithvikannan Dec 11, 2023
435871b
fix skinny
prithvikannan Dec 11, 2023
45ff05b
remove utils
prithvikannan Dec 11, 2023
c3b6af6
endpoint instead of deployment
prithvikannan Dec 11, 2023
5b284a6
comments
prithvikannan Dec 12, 2023
a3909c0
fix
prithvikannan Dec 12, 2023
47f15cf
requirements
prithvikannan Dec 12, 2023
8fc290e
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Dec 12, 2023
5f2f496
tiktoken version
prithvikannan Dec 12, 2023
1b37a77
augmented_raise_for_status
prithvikannan Dec 13, 2023
67947dc
Merge remote-tracking branch 'databricks/master' into support-openai-…
prithvikannan Dec 13, 2023
2 changes: 1 addition & 1 deletion .github/workflows/deployments.yml
@@ -31,7 +31,7 @@ jobs:
run: |
pip install --no-dependencies tests/resources/mlflow-test-plugin
pip install .[gateway] \
pytest pytest-timeout pytest-asyncio httpx psutil sentence-transformers transformers
pytest pytest-timeout pytest-asyncio httpx psutil sentence-transformers transformers tiktoken
- name: Run tests
run: |
pytest tests/deployments
2 changes: 2 additions & 0 deletions mlflow/deployments/__init__.py
@@ -19,6 +19,7 @@
from mlflow.deployments.base import BaseDeploymentClient
from mlflow.deployments.databricks import DatabricksDeploymentClient, DatabricksEndpoint
from mlflow.deployments.interface import get_deploy_client, run_local
from mlflow.deployments.openai import OpenAIDeploymentClient
from mlflow.deployments.utils import get_deployments_target, set_deployments_target
from mlflow.exceptions import MlflowException
from mlflow.protos.databricks_pb2 import INVALID_PARAMETER_VALUE
@@ -100,6 +101,7 @@ def from_json(cls, json_str):
"run_local",
"BaseDeploymentClient",
"DatabricksDeploymentClient",
"OpenAIDeploymentClient",
"DatabricksEndpoint",
"MlflowDeploymentClient",
"PredictionsResponse",
279 changes: 279 additions & 0 deletions mlflow/deployments/openai/__init__.py
@@ -0,0 +1,279 @@
import logging
import os

from mlflow.exceptions import MlflowException
from mlflow.protos.databricks_pb2 import INVALID_PARAMETER_VALUE
from mlflow.utils.openai_utils import (
    REQUEST_URL_CHAT,
    _OAITokenHolder,
    _OpenAIApiConfig,
    _OpenAIEnvVar,
)

_logger = logging.getLogger(__name__)

from mlflow.deployments import BaseDeploymentClient


class OpenAIDeploymentClient(BaseDeploymentClient):
    """
    Client for interacting with OpenAI endpoints.

    Example:

    First, set up credentials for authentication:

    .. code-block:: bash

        export OPENAI_API_KEY=...

    .. seealso::

        See https://mlflow.org/docs/latest/python_api/openai/index.html for other
        authentication methods.

    Then, create a deployment client and use it to interact with OpenAI endpoints:

    .. code-block:: python

        from mlflow.deployments import get_deploy_client

        client = get_deploy_client("openai")
        client.predict(
            endpoint="gpt-3.5-turbo",
            inputs={
                "messages": [
                    {"role": "user", "content": "Hello!"},
                ],
            },
        )
    """

    def create_deployment(self, name, model_uri, flavor=None, config=None, endpoint=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def update_deployment(self, name, model_uri=None, flavor=None, config=None, endpoint=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def delete_deployment(self, name, config=None, endpoint=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def list_deployments(self, endpoint=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def get_deployment(self, name, endpoint=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def predict(self, deployment_name=None, inputs=None, endpoint=None):
        if "OPENAI_API_KEY" not in os.environ:
            raise MlflowException(
                "OPENAI_API_KEY environment variable not set",
                error_code=INVALID_PARAMETER_VALUE,
            )

        from mlflow.openai.api_request_parallel_processor import process_api_requests

        api_config = _get_api_config_without_openai_dep()
        api_token = _OAITokenHolder(api_config.api_type)

        if api_config.api_type in ("azure", "azure_ad", "azuread"):
            api_base = api_config.api_base
            api_version = api_config.api_version
            engine = api_config.engine
            deployment_id = api_config.deployment_id

            if engine:
                # Avoid using both parameters as they serve the same purpose
                # Invalid inputs:
                # - Wrong engine + correct/wrong deployment_id
                # - No engine + wrong deployment_id
                # Valid inputs:
                # - Correct engine + correct/wrong deployment_id
                # - No engine + correct deployment_id
                if deployment_id is not None:
                    _logger.warning(
                        "Both engine and deployment_id are set. "
                        "Using engine as it takes precedence."
                    )
                inputs = {"engine": engine, **inputs}
            elif deployment_id is None:
                raise MlflowException(
                    "Either engine or deployment_id must be set for Azure OpenAI API",
                )

            request_url = (
                f"{api_base}/openai/deployments/{deployment_id}"
                f"/chat/completions?api-version={api_version}"
            )
        else:
            inputs = {"model": endpoint, **inputs}
            request_url = REQUEST_URL_CHAT

        try:
            return process_api_requests(
                [inputs],
                request_url,
                api_token=api_token,
                throw_original_error=True,
                max_workers=1,
            )[0]
        except MlflowException:
            raise
        except Exception as e:
            raise MlflowException(f"Error response from OpenAI:\n {e}")

    def create_endpoint(self, name, config=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def update_endpoint(self, endpoint, config=None):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def delete_endpoint(self, endpoint):
        """
        .. warning::

            This method is not implemented for `OpenAIDeploymentClient`.
        """
        raise NotImplementedError

    def list_endpoints(self):
        """
        List the currently available models.
        """
        if "OPENAI_API_KEY" not in os.environ:
            raise MlflowException(
                "OPENAI_API_KEY environment variable not set",
                error_code=INVALID_PARAMETER_VALUE,
            )

        api_config = _get_api_config_without_openai_dep()
        import requests

        if api_config.api_type in ("azure", "azure_ad", "azuread"):
            raise NotImplementedError(
                "List deployments is not implemented for Azure OpenAI API",
            )
        else:
            api_key = os.environ["OPENAI_API_KEY"]
            request_header = {"Authorization": f"Bearer {api_key}"}

            response = requests.get(
                "https://api.openai.com/v1/models",
                headers=request_header,
            )

            if response.status_code != 200:
                raise MlflowException(
                    f"Error response from OpenAI: {response.text}",
                    error_code=INVALID_PARAMETER_VALUE,
                )

            return response.json()

    def get_endpoint(self, endpoint):
        """
        Get information about a specific model.
        """
        if "OPENAI_API_KEY" not in os.environ:
            raise MlflowException(
                "OPENAI_API_KEY environment variable not set",
                error_code=INVALID_PARAMETER_VALUE,
            )

        api_config = _get_api_config_without_openai_dep()
        import requests

        if api_config.api_type in ("azure", "azure_ad", "azuread"):
            raise NotImplementedError(
                "Get deployment is not implemented for Azure OpenAI API",
            )
        else:
            api_key = os.environ["OPENAI_API_KEY"]
            request_header = {"Authorization": f"Bearer {api_key}"}

            response = requests.get(
                f"https://api.openai.com/v1/models/{endpoint}",
                headers=request_header,
            )

            if response.status_code != 200:
                raise MlflowException(
                    f"Error response from OpenAI: {response.text}",
                    error_code=INVALID_PARAMETER_VALUE,
                )

            return response.json()


def run_local(name, model_uri, flavor=None, config=None):
    pass


def target_help():
    pass


def _get_api_config_without_openai_dep() -> _OpenAIApiConfig:
    """
    Gets the parameters and configuration of the OpenAI API connected to.
    """
    api_type = os.getenv(_OpenAIEnvVar.OPENAI_API_TYPE.value)
    api_version = os.getenv(_OpenAIEnvVar.OPENAI_API_VERSION.value)
    api_base = os.getenv(_OpenAIEnvVar.OPENAI_API_BASE.value, None)
    engine = os.getenv(_OpenAIEnvVar.OPENAI_ENGINE.value, None)
    deployment_id = os.getenv(_OpenAIEnvVar.OPENAI_DEPLOYMENT_NAME.value, None)
    if api_type in ("azure", "azure_ad", "azuread"):
        batch_size = 16
        max_tokens_per_minute = 60_000
    else:
        # The maximum batch size is 2048:
        # https://github.com/openai/openai-python/blob/b82a3f7e4c462a8a10fa445193301a3cefef9a4a/openai/embeddings_utils.py#L43
        # We use a smaller batch size to be safe.
        batch_size = 1024
        max_tokens_per_minute = 90_000
    return _OpenAIApiConfig(
        api_type=api_type,
        batch_size=batch_size,
        max_requests_per_minute=3_500,
        max_tokens_per_minute=max_tokens_per_minute,
        api_base=api_base,
        api_version=api_version,
        engine=engine,
        deployment_id=deployment_id,
    )
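For Azure API types, `predict` above assembles a deployment-scoped chat completions URL from the configured API base, deployment name, and API version. A minimal standalone sketch of that URL construction (the helper name and sample values here are illustrative, not part of the PR):

```python
def azure_chat_completions_url(api_base: str, deployment_id: str, api_version: str) -> str:
    # Mirrors the f-string built in OpenAIDeploymentClient.predict for the
    # Azure branch: <base>/openai/deployments/<deployment>/chat/completions
    return (
        f"{api_base}/openai/deployments/{deployment_id}"
        f"/chat/completions?api-version={api_version}"
    )


# Sample values, not a real Azure resource
url = azure_chat_completions_url(
    "https://example-resource.openai.azure.com", "my-gpt-35", "2023-05-15"
)
print(url)
# -> https://example-resource.openai.azure.com/openai/deployments/my-gpt-35/chat/completions?api-version=2023-05-15
```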
6 changes: 5 additions & 1 deletion mlflow/langchain/__init__.py
@@ -491,7 +491,11 @@ def predict(
    import langchain
    from langchain.schema.retriever import BaseRetriever

    from mlflow.openai.utils import TEST_CONTENT, TEST_INTERMEDIATE_STEPS, TEST_SOURCE_DOCUMENTS
    from mlflow.utils.openai_utils import (
        TEST_CONTENT,
        TEST_INTERMEDIATE_STEPS,
        TEST_SOURCE_DOCUMENTS,
    )

    from tests.langchain.test_langchain_model_export import _mock_async_request
4 changes: 2 additions & 2 deletions mlflow/metrics/genai/model_utils.py
@@ -3,8 +3,8 @@
import urllib.parse

from mlflow.exceptions import MlflowException
from mlflow.openai.utils import REQUEST_URL_CHAT
from mlflow.protos.databricks_pb2 import INVALID_PARAMETER_VALUE
from mlflow.utils.openai_utils import REQUEST_URL_CHAT

_logger = logging.getLogger(__name__)

@@ -54,7 +54,7 @@ def _call_openai_api(openai_uri, payload, eval_parameters):

    from mlflow.openai import _get_api_config
    from mlflow.openai.api_request_parallel_processor import process_api_requests
    from mlflow.openai.utils import _OAITokenHolder
    from mlflow.utils.openai_utils import _OAITokenHolder

    api_config = _get_api_config()
    api_token = _OAITokenHolder(api_config.api_type)
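The `_get_api_config_without_openai_dep` helper added in this PR selects batch size and token-rate limits based on the API type read from the environment. The selection logic can be restated in isolation (the function name below is illustrative, not from the PR):

```python
def select_rate_limits(api_type: str) -> dict:
    # Re-statement of the per-API-type limits chosen in
    # _get_api_config_without_openai_dep: Azure variants get a smaller
    # batch size and a lower token budget than the public OpenAI API.
    if api_type in ("azure", "azure_ad", "azuread"):
        return {"batch_size": 16, "max_tokens_per_minute": 60_000}
    # Public API: the embeddings endpoint caps batches at 2048, so a
    # smaller batch size of 1024 is used to stay safely below the cap.
    return {"batch_size": 1024, "max_tokens_per_minute": 90_000}


print(select_rate_limits("azure"))
# -> {'batch_size': 16, 'max_tokens_per_minute': 60000}
print(select_rate_limits("open_ai"))
# -> {'batch_size': 1024, 'max_tokens_per_minute': 90000}
```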