diff --git a/functions/src/noise_reduction/item.yaml b/functions/src/noise_reduction/item.yaml
index d8f2cddd4..c37b4ab39 100644
--- a/functions/src/noise_reduction/item.yaml
+++ b/functions/src/noise_reduction/item.yaml
@@ -26,4 +26,5 @@ spec:
     torchaudio>=2.1.2,
   ]
   url: ''
-version: 1.1.0
\ No newline at end of file
+version: 1.1.0
+test_valid: False
\ No newline at end of file
diff --git a/modules/src/count_events/count_events.ipynb b/modules/src/count_events/count_events.ipynb
index 54f657bb0..8a3cac849 100644
--- a/modules/src/count_events/count_events.ipynb
+++ b/modules/src/count_events/count_events.ipynb
@@ -1,35 +1,829 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "2f5aea66-03d3-4ba2-a0cb-3e74e8376ff0",
+   "metadata": {},
+   "source": [
+    "# Count Events Demo"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cdadd95e-d65f-4910-b72f-ef545c09c96b",
+   "metadata": {},
+   "source": [
+    "## Overview"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c336160a-3eba-40b3-8d02-7849ca74925b",
+   "metadata": {},
+   "source": [
+    "This notebook walks through a simple example of how to monitor a real-time serving function and how to add a custom monitoring application from the hub.\n",
+    "For simplicity, we’ll use the Count Events application, which calculates the number of requests in each time window.\n",
+    "If you’d like to create your own model monitoring application (which can later be added to the hub), follow these instructions: https://docs.mlrun.org/en/stable/model-monitoring/applications.html\n",
+    "\n",
+    "To add a model monitoring application to your project from the hub, you can choose one of two approaches:\n",
+    "1. **Set it directly** – the application is deployed as is.\n",
+    "2. **Import it as a module** – this lets you test and modify the application code before deploying it.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1bcc90b4-f3c3-46ea-8348-1e7239e4e6e0",
+   "metadata": {},
+   "source": [
+    "## Demo"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2761fb6c-2c9d-4e8c-8efd-e01762b3bb22",
+   "metadata": {},
+   "source": [
+    "### Create a project"
+   ]
+  },
  {
   "cell_type": "code",
-  "execution_count": null,
-  "id": "initial_id",
+  "execution_count": 1,
+  "id": "e06ac3e1-8afd-45ab-9448-f664a4e54640",
   "metadata": {
-   "collapsed": true
+   "collapsed": true,
+   "jupyter": {
+    "outputs_hidden": true
+   },
+   "tags": []
   },
+  "outputs": [
+   {
+    "name": "stdout",
+    "output_type": "stream",
+    "text": [
+     "> 2025-11-05 15:33:39,611 [warning] Failed resolving version info. Ignoring and using defaults\n",
+     "> 2025-11-05 15:33:43,049 [warning] Server or client version is unstable. Assuming compatible: {\"client_version\":\"0.0.0+unstable\",\"server_version\":\"1.11.0\"}\n",
+     "> 2025-11-05 15:33:58,614 [info] Created and saved project: {\"context\":\"./\",\"from_template\":null,\"name\":\"count-events-demo\",\"overwrite\":false,\"save\":true}\n",
+     "> 2025-11-05 15:33:58,616 [info] Project created successfully: {\"project_name\":\"count-events-demo\",\"stored_in_db\":true}\n"
+    ]
+   }
+  ],
+  "source": [
+   "import mlrun\n",
+   "\n",
+   "project = mlrun.get_or_create_project(\"count-events-demo\", \"./\")"
+  ]
+ },
+  {
+   "cell_type": "markdown",
+   "id": "cb0c365d-243f-447d-a693-38007d38329a",
+   "metadata": {},
+   "source": [
+    "### Generate datastore profiles for model monitoring\n",
+    "Before you enable model monitoring, you must configure datastore profiles for the TSDB and streaming endpoints.\n",
+    "A datastore profile holds all the information required to address an external data source, including credentials.\n",
+    "Model monitoring supports Kafka and V3IO as streaming platforms, and TDEngine and V3IO as TSDB platforms.\n",
+    "\n",
+    "In this example we will use V3IO for both the streaming and TSDB platforms."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "10df799e-0e63-409c-a204-551635c90410",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from mlrun.datastore.datastore_profile import DatastoreProfileV3io\n",
+    "\n",
+    "v3io_profile = DatastoreProfileV3io(name=\"v3io_profile\", v3io_access_key=mlrun.mlconf.get_v3io_access_key())\n",
+    "\n",
+    "project.register_datastore_profile(v3io_profile)\n",
+    "project.set_model_monitoring_credentials(stream_profile_name=v3io_profile.name, tsdb_profile_name=v3io_profile.name)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "94af15ae-b250-4583-950d-b14876065b8a",
+   "metadata": {},
+   "source": [
+    "### Deploy model monitoring infrastructure"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "56b2adf8-dd65-4ee1-bf18-cd97eeb129b8",
+   "metadata": {},
+   "source": [
+    "Once you’ve provided the model monitoring credentials, you can enable monitoring capabilities for your project.\n",
+    "Visit MLRun's [Model Monitoring Architecture](https://docs.mlrun.org/en/stable/model-monitoring/index.html#model-monitoring-des) to read more."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "a83f95bc-e6b5-4184-84cd-d3117f394b1c",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "2025-11-05 15:41:01 (info) Deploying function\n",
+      "2025-11-05 15:41:01 (info) Building\n",
+      "2025-11-05 15:41:01 (info) Staging files and preparing base images\n",
+      "2025-11-05 15:41:01 (warn) Using user provided base image, runtime interpreter version is provided by the base image\n",
+      "2025-11-05 15:41:02 (info) Building processor image\n",
+      "2025-11-05 15:42:57 (info) Build complete\n",
+      "2025-11-05 15:43:07 (info) Function deploy complete\n",
+      "2025-11-05 15:40:57 (info) Deploying function\n",
+      "2025-11-05 15:40:57 (info) Building\n",
+      "2025-11-05 15:40:58 (info) Staging files and preparing base images\n",
+      "2025-11-05 15:40:58 (warn) Using user provided base image, runtime interpreter version is provided by the base image\n",
+      "2025-11-05 15:40:58 (info) Building processor image\n",
+      "2025-11-05 15:42:53 (info) Build complete\n",
+      "2025-11-05 15:43:12 (info) Function deploy complete\n",
+      "2025-11-05 15:40:59 (info) Deploying function\n",
+      "2025-11-05 15:40:59 (info) Building\n",
+      "2025-11-05 15:40:59 (info) Staging files and preparing base images\n",
+      "2025-11-05 15:40:59 (warn) Using user provided base image, runtime interpreter version is provided by the base image\n",
+      "2025-11-05 15:41:00 (info) Building processor image\n",
+      "2025-11-05 15:42:55 (info) Build complete\n",
+      "2025-11-05 15:43:03 (info) Function deploy complete\n"
+     ]
+    }
+   ],
+   "source": [
+    "project.enable_model_monitoring(\n",
+    "    base_period=10,\n",
+    "    deploy_histogram_data_drift_app=False,  # skip the built-in data-drift application for structured data\n",
+    "    wait_for_deployment=True,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e9f4186b-6f8f-479e-a603-d270397dd9ff",
+   "metadata": {},
+   "source": [
+    "### Log Models"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "310fed55-3f62-4af8-800f-4fb2dccfe2fd",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "We’ll generate some dummy classification models and log them to the project."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "fafcec2f-75d1-4af0-bbe0-b796367c48be",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sklearn.datasets import make_classification\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from sklearn.linear_model import LogisticRegression\n",
+    "import pickle\n",
+    "import pandas as pd"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "6cabd9aa-87f2-4af7-a5c6-ea0417ceb33f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Prepare a model and generate a training set\n",
+    "\n",
+    "X, y = make_classification(n_samples=200, n_features=5, random_state=42)\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, test_size=0.2, random_state=42)\n",
+    "model = LogisticRegression()\n",
+    "model.fit(X_train, y_train)\n",
+    "X_test = pd.DataFrame(X_test, columns=[f\"column_{i}\" for i in range(5)])\n",
+    "y_test = pd.DataFrame(y_test, columns=[\"label\"])\n",
+    "training_set = pd.concat([X_test, y_test], axis=1)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "3afde46a-9f26-4438-bedb-acad15866b03",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Log the models\n",
+    "for i in range(5):\n",
+    "    project.log_model(key=f\"model_{i}\", body=pickle.dumps(model), model_file=\"model.pkl\", training_set=training_set, label_column=\"label\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "49d820b1-9fd7-4184-9005-25d69578c995",
+   "metadata": {},
+   "source": [
+    "### Deploy Serving Function"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "19fd7570-3f91-45ff-ba2b-4aebce4a95b4",
+   "metadata": {},
+   "source": [
+    "We’ll use a basic serving function and enrich it with the logged models.\n",
+    "\n",
+    "Note that if you want to monitor a serving function along with its associated models, you must enable tracking by calling `set_tracking()` before deployment. Otherwise, the serving function’s requests won’t be monitored.\n",
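+    "\n",
+    "In outline, the flow that the next cells build step by step looks like this (a sketch for orientation only; `model_uri` stands in for a logged model's URI):\n",
+    "\n",
+    "```python\n",
+    "serving = mlrun.new_function('serving-model-v1', kind='serving')\n",
+    "serving.set_topology('router', engine='sync')\n",
+    "serving.set_tracking()  # must be called before deploy_function, or requests are not monitored\n",
+    "serving.add_model(key='model_0', model_path=model_uri, class_name='mlrun.frameworks.sklearn.SKLearnModelServer')\n",
+    "project.deploy_function(serving)\n",
+    "```"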
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "cb806c5b-a0a0-4deb-a63d-f2ea72dc3e02",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Define the serving function\n",
+    "serving = mlrun.new_function('serving-model-v1', kind='serving')\n",
+    "graph = serving.set_topology(\"router\", engine=\"sync\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "93ee54ec-0c4a-4eb1-8bc3-d065aec64c8f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Apply monitoring\n",
+    "serving.set_tracking()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "f162a254-00ce-4c8a-89df-0cf5d25da5b1",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████| 5/5 [00:00<00:00, 22052.07it/s]\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Add the logged models to the serving function\n",
+    "from tqdm import tqdm\n",
+    "\n",
+    "models_uri = [model.uri for model in project.list_models(tag=\"latest\")]\n",
+    "for i, uri in enumerate(tqdm(models_uri)):\n",
+    "    serving.add_model(key=f'model_{i}', model_path=uri, class_name='mlrun.frameworks.sklearn.SKLearnModelServer')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "ff91f360-5c85-4bc7-a3c3-80a31f1ebd3c",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "> 2025-11-05 15:55:08,989 [info] Starting remote function deploy\n",
+      "2025-11-05 15:55:09 (info) Deploying function\n",
+      "2025-11-05 15:55:09 (info) Building\n",
+      "2025-11-05 15:55:09 (info) Staging files and preparing base images\n",
+      "2025-11-05 15:55:09 (warn) Using user provided base image, runtime interpreter version is provided by the base image\n",
+      "2025-11-05 15:55:09 (info) Building processor image\n",
+      "2025-11-05 15:56:54 (info) Build complete\n",
+      "2025-11-05 15:57:06 (info) Function deploy complete\n",
+      "> 2025-11-05 15:57:10,181 [info] Model endpoint creation task completed with state succeeded\n",
+      "> 2025-11-05 15:57:10,181 [info] Successfully deployed function: {\"external_invocation_urls\":[\"count-events-demo-serving-model-v1.default-tenant.app.vmdev211.lab.iguazeng.com/\"],\"internal_invocation_urls\":[\"nuclio-count-events-demo-serving-model-v1.default-tenant.svc.cluster.local:8080\"]}\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Deploy the serving function\n",
+    "serving_function = project.deploy_function(serving)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1652a010-e086-4c62-9493-1a82bc125ad4",
+   "metadata": {},
+   "source": [
+    "### Invoke Serving"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4c937193-27bc-4b6f-bc1d-cf7472045778",
+   "metadata": {},
+   "source": [
+    "Let’s generate some dummy data and invoke our serving function.\n",
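+    "\n",
+    "We’ll use the SDK helper `serving.invoke()` below. The deployed endpoint also accepts plain HTTP requests; a minimal sketch (the URL is a placeholder: use the external invocation URL printed by the deploy step):\n",
+    "\n",
+    "```python\n",
+    "import requests\n",
+    "\n",
+    "url = 'http://<external-invocation-url>'  # placeholder: copy from deploy_function's output\n",
+    "resp = requests.post(f'{url}/v2/models/model_0/infer', json={'inputs': inputs})\n",
+    "print(resp.json())\n",
+    "```"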
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "id": "66f469db-9f5b-4e3d-bc85-160a9c90bc8f",
+   "metadata": {},
   "outputs": [],
   "source": [
-   ""
+   "serving = project.get_function(\"serving-model-v1\")"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "50305c3e-bd1b-4240-9c63-9851173af75e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "inputs = [[-0.51, 0.051, 0.6287761723991921, -0.8751269647375463, -1.0660002219502747]] * 5"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "9e8372d6-4fa7-4b45-8932-1f690b55048c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Send 100 requests to each model endpoint\n",
+    "for i in range(5):\n",
+    "    for _ in range(100):\n",
+    "        serving.invoke(f\"/v2/models/model_{i}/infer\", {\"inputs\": inputs})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4eeb44e1-9c1a-430a-b978-f58f1adeaa12",
+   "metadata": {},
+   "source": [
+    "## Evaluate App"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "936afba8-c06b-4141-a85e-5cbc9d32aa45",
+   "metadata": {},
+   "source": [
+    "Before deploying the Count Events application, let’s first test it to make sure it works as expected. We’ll import it as a module, which downloads the module file to your local filesystem, and then run it as a job using the `evaluate` mechanism."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "213425d1-8470-483e-b325-14aaa991c8c5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Import the Count Events application from the hub\n",
+    "count_events_app = mlrun.import_module(\"hub://count_events\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "d91450e4-effb-4963-b913-dcd9829e78b9",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "> 2025-11-05 15:57:37,746 [info] Changing function name - adding `\"-batch\"` suffix: {\"func_name\":\"countapp-batch\"}\n",
+      "> 2025-11-05 15:57:37,927 [info] Storing function: {\"db\":\"http://mlrun-api:8080\",\"name\":\"countapp-batch--handler\",\"uid\":\"b7c240fd99ed4c9b940db6a587a53b80\"}\n",
+      "> 2025-11-05 15:57:38,202 [info] Job is running in the background, pod: countapp-batch--handler-469fm\n",
+      "> 2025-11-05 15:57:42,390 [info] Counted events for model endpoint window: {\"count\":4,\"end\":\"NaT\",\"model_endpoint_name\":\"model_0\",\"start\":\"NaT\"}\n",
+      "> 2025-11-05 15:57:42,498 [info] To track results use the CLI: {\"info_cmd\":\"mlrun get run b7c240fd99ed4c9b940db6a587a53b80 -p count-events-demo\",\"logs_cmd\":\"mlrun logs b7c240fd99ed4c9b940db6a587a53b80 -p count-events-demo\"}\n",
+      "> 2025-11-05 15:57:42,498 [info] Or click for UI: {\"ui_url\":\"https://dashboard.default-tenant.app.vmdev211.lab.iguazeng.com/mlprojects/count-events-demo/jobs/monitor-jobs/countapp-batch--handler/b7c240fd99ed4c9b940db6a587a53b80/overview\"}\n",
+      "> 2025-11-05 15:57:42,499 [info] Run execution finished: {\"name\":\"countapp-batch--handler\",\"status\":\"completed\"}\n"
+     ]
+    },
+    {
+     "data": {
+      "text/html": [
+       "\n",
+       "| project | uid | iter | start | end | state | kind | name | labels | inputs | parameters | results |\n",
+       "| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |\n",
+       "| count-events-demo |  | 0 | Nov 05 15:57:41 | 2025-11-05 15:57:42.474376+00:00 | completed | run | countapp-batch--handler | v3io_user=iguazio kind=job owner=iguazio mlrun/client_version=0.0.0+unstable mlrun/client_python_version=3.11.12 host=countapp-batch--handler-469fm | sample_data | endpoints=['model_0'] write_output=False existing_data_handling=fail_on_overlap stream_profile=None | model_0-d25a6714a19b4027b9bccfe8adca8ddc_NaT_NaT={'metric_name': 'count', 'metric_value': 4.0} |\n",
+       "