<div style="text-align: right;">
  <img src="https://raw.githubusercontent.com/exasol/ai-lab/refs/heads/main/assets/Exasol_Logo_2025_Dark.svg" style="width:200px; margin: 10px;" />
</div>

# Fix the Version of Python Library Scikit-learn

This notebook ensures that the AI-Lab uses the same version of the Python library `scikit-learn` as the built-in [Script Language Container (SLC)](https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm#ScriptLanguageContainer) inside the Exasol database.

## Rationale

Identical versions are required when transferring a Scikit-learn model from the AI-Lab to the database SLC.

The AI-Lab serializes the Scikit-learn model with [pickle](https://docs.python.org/3/library/pickle.html) and uploads it into the BucketFS of the database. The UDF using the built-in SLC can only reliably _deserialize_ the model if it runs the same version of Scikit-learn that was used to serialize it. The version of the library available in the built-in SLC depends on the release version of the database and cannot be controlled by the AI-Lab.
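
The version dependency described above can be made explicit by storing the library version next to the pickled object and checking it before use. The following is a minimal sketch, not the mechanism used by the AI-Lab: the helper names `dump_with_version` and `load_checked` are hypothetical, and a plain dictionary stands in for a real Scikit-learn model.

```python
import pickle


def dump_with_version(model, library_version: str) -> bytes:
    """Pickle the model together with the library version that produced it."""
    return pickle.dumps({"library_version": library_version, "model": model})


def load_checked(blob: bytes, runtime_version: str):
    """Unpickle the payload and refuse to return the model on a version mismatch."""
    payload = pickle.loads(blob)
    saved_version = payload["library_version"]
    if saved_version != runtime_version:
        raise RuntimeError(
            f"Model was pickled with version {saved_version}, "
            f"but the runtime provides version {runtime_version}."
        )
    return payload["model"]


# Stand-in for a trained model (assumption: any picklable object works here).
blob = dump_with_version({"coef": [0.5, 1.5]}, "1.0.2")
model = load_checked(blob, "1.0.2")
```

A check like this only turns a silent incompatibility into an explicit error; it does not remove the need to align the versions.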

Running the cells below updates the version of the library used in the AI-Lab, if required.

## Open Secure Configuration Storage

In [None]:
%run ../utils/access_store_ui.ipynb
display(get_access_store_ui('../'))

## Detect the Version of Scikit-learn Used in the SLC

The following cell creates a User Defined Function (UDF) called `detect_scikit_learn_version()` and then executes the UDF using the built-in SLC via an SQL statement.

The UDF queries the version of Scikit-learn available in the built-in SLC and returns it. The result is stored in the variable `slc_scikit_learn_version`.

In [None]:
import textwrap
from exasol.nb_connector.connections import open_pyexasol_connection

sql = textwrap.dedent("""
CREATE OR REPLACE PYTHON3 SCALAR SCRIPT {schema!q}.detect_scikit_learn_version() RETURNS VARCHAR(100) AS
import sklearn
def run(ctx):
    return sklearn.__version__
/
""")

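# ai_lab_config is expected to be defined by opening the configuration store above.
with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
    query_params = {'schema': ai_lab_config.db_schema}
    # Create (or replace) the UDF, then call it once to read the SLC's version.
    conn.execute(sql, query_params)
    result = conn.execute("SELECT {schema!q}.detect_scikit_learn_version()", query_params).fetchone()
    slc_scikit_learn_version = result[0]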
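
Because the string returned by the UDF is later passed to `pip install`, it can be worth checking that it looks like a version number first. The sketch below is one possible check and is not part of the original notebook: the function name `is_plausible_version` is hypothetical, and the pattern is deliberately narrow, covering only simple release, pre-release, post-release, and development versions rather than the full PEP 440 grammar.

```python
import re

# Accepts e.g. "1.0.2", "1.5.0rc1", "1.4.0.post1"; rejects anything else.
_VERSION_PATTERN = re.compile(
    r"\d+(?:\.\d+)+(?:(?:a|b|rc)\d+)?(?:\.post\d+)?(?:\.dev\d+)?"
)


def is_plausible_version(value) -> bool:
    """Return True if value is a string that looks like a simple version number."""
    return isinstance(value, str) and _VERSION_PATTERN.fullmatch(value) is not None


print(is_plausible_version("1.0.2"))
print(is_plausible_version("1.0.2; echo unexpected"))
```

In the notebook, such a check could be applied to `slc_scikit_learn_version` before the comparison step.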

## Compare the Scikit-learn Version and Update the AI-Lab if Required

The next cell compares the Scikit-learn version returned by the UDF with the Scikit-learn version in the AI-Lab environment. If they differ, the cell installs the SLC's Scikit-learn version in the AI-Lab environment.

In [None]:
import sklearn
from importlib import reload

# Version currently imported in the AI-Lab kernel.
my_version = sklearn.__version__

if slc_scikit_learn_version == my_version:
    print(f"AI-Lab scikit-learn version {my_version} is identical to that of the SLC.\nNothing to do.")
else:
    print(f"AI-Lab scikit-learn version {my_version} differs from SLC.\nInstalling version {slc_scikit_learn_version} ...")
    %pip install "scikit_learn=={slc_scikit_learn_version}"
    # reload() re-executes only the top-level package; restart the kernel if
    # previously imported sklearn submodules still behave like the old version.
    sklearn = reload(sklearn)
    print(f"Updated AI-Lab scikit-learn to version {sklearn.__version__}.")
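
The comparison step can also be expressed without importing the package, by reading the installed version from the package metadata. This is a hedged alternative sketch rather than the notebook's method: the helpers `installed_version` and `pip_requirement` are hypothetical names, while `importlib.metadata.version` is part of the Python standard library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pip_requirement(package: str, target: str, current: Optional[str]) -> Optional[str]:
    """Return a pinned requirement string, or None if the target is already installed."""
    if current == target:
        return None
    return f"{package}=={target}"


# Example with a hard-coded target; in the notebook this would be
# slc_scikit_learn_version.
requirement = pip_requirement("scikit-learn", "1.0.2", installed_version("scikit-learn"))
print(requirement)
```

Reading the metadata reflects what is installed on disk, so it can serve as a check after installation that is independent of modules already loaded in the running kernel.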