Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Model Monitoring] Implement TSDB abstraction #5210

Merged
merged 40 commits into from
May 7, 2024

Conversation

Eyal-Danieli
Copy link
Member

@Eyal-Danieli Eyal-Danieli commented Feb 28, 2024

Add Time Series DB abstraction that will be used for storing the model monitoring real time data. The PR includes:

  • A new tsdb package under mlrun/model_monitoring/db/tsdb. The abstract class called TSDBConnector and it includes the main applications that are being used throughout the model monitoring processes.
  • An implementation of V3IOTSDBConnector that is stored under mlrun/model_monitoring/db/tsdb/v3io_connector. Note that this object is based on V3IO frames client that handles the communications with V3IO TSDB.
  • The TSDB type is defined under a new configuration value called tsdb_connector_type. At the moment, the default value is v3io-tsdb.
  • Create the TSDB application result table during the writer deployment (Instead of trying to generate it on every writer call).

Related JIRAs - https://iguazio.atlassian.net/browse/ML-5795, https://iguazio.atlassian.net/browse/ML-5796

@gtopper gtopper marked this pull request as draft April 24, 2024 07:57
@Eyal-Danieli Eyal-Danieli marked this pull request as ready for review April 25, 2024 13:12
Copy link
Contributor

@davesh0812 davesh0812 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Added few more comments :)

mlrun/model_monitoring/db/tsdb/base.py Show resolved Hide resolved
server/api/crud/model_monitoring/deployment.py Outdated Show resolved Hide resolved
server/api/crud/model_monitoring/model_endpoints.py Outdated Show resolved Hide resolved
mlrun/model_monitoring/db/tsdb/base.py Show resolved Hide resolved
mlrun/common/schemas/model_monitoring/constants.py Outdated Show resolved Hide resolved
mlrun/db/httpdb.py Outdated Show resolved Hide resolved
"""
Write a single application or metric to TSDB.

:raise mlrun.errors.MLRunInvalidArgumentError: If an error occurred while writing the event.
Copy link
Collaborator

@gtopper gtopper May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is the right exception to raise? Sounds more like an MLRunRuntimeError.

:param end: The end time of the metrics.

:return: DataFrame with the provided attributes from the data collection.
:raise: MLRunInvalidArgumentError if the provided table wasn't found.
Copy link
Collaborator

@gtopper gtopper May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one sounds more like an MLRunNotFoundError.

@Eyal-Danieli Eyal-Danieli requested a review from gtopper May 7, 2024 10:07
@assaf758 assaf758 merged commit 8c0defe into mlrun:development May 7, 2024
10 checks passed
assaf758 pushed a commit that referenced this pull request May 21, 2024
…the streaming interface (#5599)

* [Datastore] Rename `DatastoreProfileAzureBlob` 'bucket' parameter to 'container' (#5521)

* [Project] Add delete artifact method (#5491)

* [API] Add delete artifact data endpoints (#5477)

* [Docs] Add registering an HDFS datastore profile (#5396)
To 
* how to register a datastore profile

* Update docs/store/datastore.md

Co-authored-by: Gal Topper <gal.topper@gmail.com>

* Update docs/store/datastore.md

Co-authored-by: Gal Topper <gal.topper@gmail.com>

* fix: data store is 2 words

* clarify example

---------

Co-authored-by: Gal Topper <gal.topper@gmail.com>

* [Alerts] Add backwards compatibility for igz < 3.6 (#5524)

* [Coding Conventions] Fix `mlconf` usages (#5525)

* [API] Enhance usage of iguazio client (#5506)

* [Model Monitoring] Implement TSDB abstraction (#5210)

* [Automation] Don't rollout deployment before purging DB (#5527)

* [SQLDB] Refactor partition querying to minimize subquery table size (#5526)

* [Docs] Enable automodule for mlrun.comon.schemas.artifact (#5529)

* [Alerts] Filter project resources by permissions for list (#5532)

* [ProjectSummaries] Add pipelines_completed_recent_count, pipelines_failed_recent_count (#5523)

* [KFP] Merge KFP2 Feature Branch (#5533)

* [Nuclio] Delete remote function config map (#5530)

* [API] Add read model monitoring metrics values endpoint (#5519)

* [Model Monitoring] Fix up imports (#5531)

* Remove unused import
* Change import so that PyCharm detects inheritance correctly

* [Notifications] Display Step Kind In Pipeline Notification (#5538)

* Add step kind to pipeline notification

* better system test

* redundant comment

* fix ut

---------

Co-authored-by: quaark <a.melnick@icloud.com>

* [Alerts] Modify alert entity to have list of ids per alert (#5511)

* [Application] Fix test function handler assertion (#5548)

* [Docs] Feature-store + Datastore: consolidate Apache Kafka details, cleanups in datastores (#5305)

* [Model Monitoring] Fix invalid current_stats in `MonitoringApplicationContext` (#5552)

fix deepcopy bug

* [KFP] Bump version and create makefile for release (#5555)

* [Runs] Store artifact URIs instead of full artifacts in body (#5553)

* [Artifacts] Add `producer_uri` filter to list artifacts API (#5549)

* [API] Check the artifact kind while attempting to remove artifact data (#5545)

* [CI] Fix pipeline adapters makefile (#5557)

* [Model Monitoring] Update model endpoint's current stats and drift measures in the writer (#5554)

* [Model Monitoring] Add new TSDB target for all predictions (#5551)

* [Model Monitoring] Add new TSDB target for all predictions

Tracking count and latency.

[ML-6349](https://iguazio.atlassian.net/browse/ML-6349)

* Change TSDB predictions table path

* Move predictions table

* Fix

* Fix parameter names

* Fix

* [Functions] Update function with deletion task id (#5536)

* [Application] Create API Gateway by default (#5515)

* [Logger] Enhance formatter creation (#5561)

* [KFP] Suppress reusable-components warnings (#5501)

* [ProjectSummaries] Add distinct_scheduled_jobs_pending_count, distinct_scheduled_pipelines_pending_count (#5535)

* [Notifications] Fix Pipeline Notification Log Spam + Git Notification (#5563)

fix pipeline notification bugs

Co-authored-by: quaark <a.melnick@icloud.com>

* [Docs] Compile notification when compiling docs  (#5564)

* [Helpers] Ignore pre-release in `min_iguazio_version` (#5558)

* [Runs] Enrich run with artifacts when getting a single run (#5565)

* [Application] Set api gateway name on status (#5567)

* [CI] Bundling demos as release asset (#5550)

* [Nuclio-jupyter] Bump nuclio-jupyter to 0.9.17 (#5568)

* [FrontendSpec] Adding TTLCache to `try_get_grafana_service_url` function (#5543)

* [KFP] Fix `watch=True` for remote pipeline when using `set_workflow` (#5572)

* [Alerts] Add cooldown_period to notification config in alert (#5547)

* [Model] Replace yaml dump with safe_dump (#5571)

* [Versioning] Fix package version for dev feature branches (#5575)

* repace hasattr() with dir()

* Support single Parquet and CSV files

---------

Co-authored-by: moranbental <107995850+moranbental@users.noreply.github.com>
Co-authored-by: jillnogold <88145832+jillnogold@users.noreply.github.com>
Co-authored-by: Gal Topper <gal.topper@gmail.com>
Co-authored-by: Yan Burman <yanburman@users.noreply.github.com>
Co-authored-by: Adam <adamm@iguazio.com>
Co-authored-by: Liran BG <liranbg@users.noreply.github.com>
Co-authored-by: Eyal Danieli <eyald@iguazio.com>
Co-authored-by: Alon Maor <48641682+alonmr@users.noreply.github.com>
Co-authored-by: roei3000b <40743125+roei3000b@users.noreply.github.com>
Co-authored-by: Katerina Molchanova <35141662+rokatyy@users.noreply.github.com>
Co-authored-by: Jonathan Daniel <36337649+jond01@users.noreply.github.com>
Co-authored-by: quaark <a.melnick@icloud.com>
Co-authored-by: TomerShor <90552140+TomerShor@users.noreply.github.com>
Co-authored-by: daniels290813 <78727943+daniels290813@users.noreply.github.com>
Co-authored-by: ZeevRispler <73653682+ZeevRispler@users.noreply.github.com>
Co-authored-by: Alex Toker <alex_toker@mckinsey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants