In [None]:
# Upgrade Oracle ADS to pick up latest features and maintain compatibility with Oracle Cloud Infrastructure.
!pip install -U oracle-ads

Oracle Data Science service sample notebook.

Copyright (c) 2022 Oracle, Inc. All rights reserved. Licensed under the [Universal Permissive License v 1.0](https://oss.oracle.com/licenses/upl).

---

# <font color="red">Introduction to Model Version Set [Limited Availability]</font>
<p style="margin-left:10%; margin-right:10%;">by the <font color="teal">Oracle Cloud Infrastructure Data Science Service.</font></p>

---

# Overview:

The normal workflow of a data scientist is to create a model and push it into production. While the model is in production, the data scientist learns what it does well and what it doesn't, and uses this information to create an improved model. These models should be linked in some way, and a model version set provides that link. A model version set is a collection of models that are related to each other, with earlier models being the ancestors of later ones, so it is a way to track the relationships between models. As a container, the model version set takes a collection of models. Those models are assigned a sequential version number based on the order in which they are entered into the model version set.
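
To make the sequential numbering concrete, here is a minimal in-memory sketch. It does not use ADS or the Data Science service. `ToyVersionSet`, `add`, and `models` are hypothetical names chosen for illustration only, and the real service may store additional information. The optional label in the sketch corresponds to the version label discussed later in this notebook.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyVersionSet:
    """In-memory stand-in for a model version set (not the ADS class)."""

    name: str
    _entries: List[Dict[str, object]] = field(default_factory=list)

    def add(self, model_id: str, version_label: Optional[str] = None) -> int:
        # The version number is assigned by the set, in insertion order.
        version = len(self._entries) + 1
        self._entries.append(
            {"model_id": model_id, "version": version, "version_label": version_label}
        )
        return version

    def models(self) -> List[Dict[str, object]]:
        # Return copies so callers cannot modify the stored entries.
        return [dict(entry) for entry in self._entries]


toy_set = ToyVersionSet(name="demo_experiment_1")
first = toy_set.add("model-a", version_label="baseline")
second = toy_set.add("model-b")
```

In this sketch the caller chooses the label, while the set alone decides the version number.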

In ADS the class ``ModelVersionSet`` is used to represent the model version set. An object of ``ModelVersionSet`` references a model version set in the Data Science service. The ``ModelVersionSet`` class supports two APIs: the builder pattern and the traditional parameter-based pattern. You can use either of these API frameworks interchangeably and examples for both patterns are included.

Use the ``.create()`` method to create a model version set in your tenancy. If the model version set already exists in the model catalog, then use the ``.from_id()`` or ``.from_name()`` method to create a ``ModelVersionSet`` object based on the specified model version set. If you make changes to the metadata associated with the model version set, use the ``.update()`` method to push those changes to the model catalog. The ``.list()`` method lists all model version sets. To add an existing model to a model version set, use the ``.model_add()`` method. The ``.models()`` method lists the models in the model version set. Use the ``.delete()`` method to delete a model version set from the model catalog.

Compatible conda pack: [Oracle Database and Data Exploration](https://docs.oracle.com/en-us/iaas/data-science/using/conda-dem-fam.htm) for CPU Python 3.8

---

## Contents:

 - <a href='#create'>Create new Model Version Set</a>
 - <a href='#update'>Update a Model Version Set</a>
 - <a href='#list-mvs'>List Model Version Sets</a>
 - <a href='#get'>Get Model Version Sets by id</a>
 - <a href='#associate'>Associate Models with a Model Version Set</a>
 - <a href='#list'>List Models within a Model Version Set</a>
 - <a href='#delete'>Delete a Model Version Set</a>
 
   
 
---

**Important:**

Placeholder text for required values is surrounded by angle brackets that must be removed when adding the indicated content. For example, when adding a database name, `database_name = "<database_name>"` would become `database_name = "production"`.

---

<font color="gray">
Datasets are provided as a convenience.  Datasets are considered third-party content and are not considered materials 
under your agreement with Oracle.
    
You can access the `oracle_classification_dataset1` dataset license [here](https://oss.oracle.com/licenses/upl). 
</font>

---

# Introduction 

Versioning a model is a way to keep track of the relationships between the various models. Multiple model versions are created mostly during the ideation and iteration phase of the machine learning model lifecycle, when multiple experiments are run. Data scientists typically train multiple model "candidates" that are represented by different versions. Some of these candidate models will eventually be deployed. Data scientists want to keep records of the different models they have trained and their various attempts at improving the model performance on validation datasets. Versioning allows data scientists to keep track of those candidate models in a flexible way.

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
from ads.model.model_metadata import UseCaseType
from ads.model import GenericModel, ModelVersionSet
from numpy import array, ndarray
import numpy as np
import oci
import ads
import logging
import os
import random
import sys
import tempfile
import warnings

logging.basicConfig(
    format="%(levelname)s:%(message)s", level=logging.INFO, stream=sys.stdout
)
ads.set_auth("resource_principal")

<a id='create'></a>
# Create new Model Version Set

The ``.create()`` method on a ``ModelVersionSet`` object creates a model version set in the model catalog. The properties of the ``ModelVersionSet`` are used to create the model version set in the model catalog. 

The following examples create a ``ModelVersionSet`` object, define the properties of the model version set, and then create the model version set in the model catalog.



<a id='constructor'></a>
## Constructor

In [None]:
mvs = ModelVersionSet(
    compartment_id=os.environ["PROJECT_COMPARTMENT_OCID"],
    name="demo_experiment_1",
    projectId=os.environ["PROJECT_OCID"],
    description="Demo experiment number one",
)
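
The constructor cell above reads `PROJECT_COMPARTMENT_OCID` and `PROJECT_OCID` with `os.environ[...]`, which raises a bare `KeyError` when a variable is missing, for example when the notebook is run outside a notebook session. The following is a minimal sketch of a helper that fails with a clearer message. `require_env` and `DEMO_COMPARTMENT_OCID` are hypothetical names used only for this illustration, and the OCID value is made up.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "set it to the matching OCID before creating the model version set."
        )
    return value


# Demonstration with a made-up variable name so the sketch runs anywhere.
os.environ["DEMO_COMPARTMENT_OCID"] = "ocid1.compartment.oc1..example"
compartment_id = require_env("DEMO_COMPARTMENT_OCID")
```

With such a helper, `compartment_id=require_env("PROJECT_COMPARTMENT_OCID")` could replace the direct dictionary lookup.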

<a id='yaml'></a>
## From YAML

Sometimes, a user may want to define their ``ModelVersionSet`` in a YAML file because of its readability and user-friendliness.

In [None]:
mvs = ModelVersionSet.from_yaml(mvs.to_yaml())

## Create

The previous cells only create a ``ModelVersionSet`` object locally. To create the model version set in the model catalog, so that it also appears in the console, call ``create()`` on the object.

In [None]:
mvs.create()

<a id='update'></a>
# Update a Model Version Set

The ``ModelVersionSet`` object has a number of properties that you can update. When the properties in a ``ModelVersionSet`` object are updated, the model version set in the model catalog is not automatically updated. You must call the ``.update()`` method to commit the changes.

The properties that you can update are:

* ``compartment_id``: The OCID of the compartment that the model version set belongs to.
* ``description``: A description of the models in the collection.
* ``freeform_tags``: A dictionary of string values.
* ``name``: Name of the model version set.
* ``project_id``: The OCID of the data science project that the model version set belongs to.

The following example updates the description and free-form tags of a model version set and then commits the changes to the model catalog:



In [None]:
mvs.description = "Demo experiment number one with corrected description."
mvs.freeform_tags = {"test_tag": "Some tag value", "some_other_tag": "New tag"}
mvs.update()


## Version Label

Versioning lets you keep track of all of your models, how well they’ve done, and what hyperparameters they used. Versioning ML models is useful for the same reasons that it is useful to use ``git`` to version control software.

The version label is associated with the model, and not the model version set. To change the version label, you must have a ``Model`` object. Then you can change the ``version_label`` property, and then commit it to the model catalog.


<a id='list-mvs'></a>
# List Model Version Sets

The ``.list()`` method on the ``ModelVersionSet`` class takes a compartment ID and lists the model version sets in that compartment. If the compartment isn't given, then the compartment of the notebook session is used. 

The following uses a ``for`` loop to iterate over the collection of model version sets:


In [None]:
for item in ModelVersionSet.list(os.environ["PROJECT_COMPARTMENT_OCID"], limit=4):
    print(item)
    print("---------")

<a id='get'></a>
# Get Model Version Sets by id

Before you can call methods on an existing model version set, such as ``.models()`` to list the models that are associated with it, you must first obtain a ``ModelVersionSet`` object. Use the ``.from_id()`` method if you know the model version set OCID. Alternatively, use the ``.from_name()`` method if you know the name of the model version set.

In [None]:
mvs = ModelVersionSet.from_id(mvs.id)
mvs

## Get by name

In [None]:
mvs = ModelVersionSet.from_name(name=mvs.name)
mvs

<a id='associate'></a>
# Associate Models with a Model Version Set

A model version set is a collection of models. When a model is associated with a model version set, a version label can be assigned to it. The version label is different from the model version number that is maintained by the model version set.
There are a number of ways to associate a model with a model version set. Which approach you use depends on the workflow.

In [None]:
class Square:
    """Toy estimator that returns the element-wise square of its input."""

    def predict(self, x):
        x_array = np.array(x)
        return (x_array * x_array).tolist()


X = random.sample(range(0, 100), 10)

# Create a temporary directory that persists after this cell finishes,
# so the artifact files are still available when the model is saved.
artifact_dir = tempfile.mkdtemp()

generic_model = GenericModel(estimator=Square(), artifact_dir=artifact_dir)
generic_model.prepare(
    inference_conda_env="dbexp_p38_cpu_v1",
    training_conda_env="dbexp_p38_cpu_v1",
    use_case_type=UseCaseType.MULTINOMIAL_CLASSIFICATION,
    X_sample=X,
    y_sample=array(X) ** 2,
    force_overwrite=True,
)

A model does not have to be associated with a model version set when it is saved. In this case, use the ``.model_add()`` method on a ``ModelVersionSet`` object to associate the model with the model version set that the object represents. The ``.model_add()`` method requires that you provide the model OCID and, optionally, a version label.


## With ModelVersionSet

In [None]:
generic_model.save(display_name="Demo Generic Model 1")
mvs.model_add(generic_model.model_id, version_label="Version label 1")

## With Context Manager
When you have multiple models that you want to associate with some model version set, use a context manager. The ``ads.model.experiment()`` method has a required ``name`` parameter. If the model catalog has a model version set name that matches, it uses that model version set. If the parameter ``create_if_not_exists`` is ``True``, then the ``experiment()`` method attempts to use the model version set with the matching name in the model catalog or it creates a version set if needed.

Within the context manager, you can save multiple `Model Serialization` models without specifying the ``model_version_set`` parameter because it's taken from the context manager. The following example saves ``generic_model`` three times, each time with its own display name and version label, to the model version set named by ``mvs.name``. Because ``create_if_not_exists`` is ``False``, the example relies on the model version set that was created earlier in this notebook.

In [None]:
with ads.model.experiment(name=mvs.name, create_if_not_exists=False):
    # experiment 1
    generic_model.save(
        display_name="Demo Generic Model 2", version_label="Version label 2"
    )
    # experiment 2
    generic_model.save(
        display_name="Demo Generic Model 3", version_label="Version label 3"
    )
    # experiment 3
    generic_model.save(
        display_name="Demo Generic Model 4", version_label="Version label 4"
    )

<a id='list'></a>
# List Models within a Model Version Set


In [None]:
for dsc_model in mvs.models():
    print(dsc_model.display_name, dsc_model.id, dsc_model.status)

<a id='delete'></a>
# Delete a Model Version Set

To delete a model version set, all the associated models must be deleted or in a terminated state. The ``.delete()`` method on a ``ModelVersionSet`` object initiates an asynchronous delete operation. You can check the ``.status`` property on the ``ModelVersionSet`` object to determine the status of the delete request. Since all of the models associated with the model version set must be deleted or in a terminated state, set the ``delete_model`` parameter to ``True`` to delete all of the models in the model version set, and then delete the model version set.


The following deletes a model version set and its associated models.


In [None]:
mvs

In [None]:
mvs.delete(delete_model=True)

The ``status`` property has the following values:

* ``ModelVersionSet.LIFECYCLE_STATE_ACTIVE``
* ``ModelVersionSet.LIFECYCLE_STATE_DELETED``
* ``ModelVersionSet.LIFECYCLE_STATE_DELETING``
* ``ModelVersionSet.LIFECYCLE_STATE_FAILED``


In [None]:
mvs.status
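
Because the delete operation is asynchronous, a single status check may still report a transitional state. The following is a minimal polling sketch that uses only the standard library. `wait_for_status` is a hypothetical helper, and the strings `"DELETING"` and `"DELETED"` are assumed stand-ins for the lifecycle state constants listed above. With a real object, `get_status` would be a callable that returns the current lifecycle state. How that state is refreshed from the service is not covered here.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
) -> str:
    """Poll get_status() until it returns target or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for a real status lookup: reports DELETING twice, then DELETED.
_fake_states = iter(["DELETING", "DELETING", "DELETED"])
final_status = wait_for_status(
    lambda: next(_fake_states), target="DELETED", timeout_s=5.0, interval_s=0.01
)
```

The timeout keeps the loop from running indefinitely if the target state is never reached, for example when the delete request ends in a failed state.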

<a id='ref'></a>
# References

- [ADS Library Documentation](https://docs.cloud.oracle.com/en-us/iaas/tools/ads-sdk/latest/index.html)
- [Data Science YouTube Videos](https://www.youtube.com/playlist?list=PLKCk3OyNwIzv6CWMhvqSB_8MLJIZdO80L)
- [OCI Data Science Documentation](https://docs.cloud.oracle.com/en-us/iaas/data-science/using/data-science.htm)
- [Oracle Data & AI Blog](https://blogs.oracle.com/datascience/)