<div style="text-align: center;">
    <h1><strong>Evaluate a ResNet Model</strong></h1>
    <h1><strong>Azure PostgreSQL Database & Azure Blob Storage</strong></h1>
</div> 

# Goal of the notebook:
#### The purpose of this notebook is to load a previously trained ResNet model and to evaluate its performance on a test dataset.
#### We assume that the test dataset is generated with the same approach as the one used to train the model (see train_resnet_model_local.ipynb).
#### However, images used for testing must not have been used during training, to avoid data leakage (i.e. no date overlap).
#### The evaluation metrics displayed here are the ones that were set when compiling the model before training (loss, accuracy, precision and recall).
#### This notebook is meant to be used when the storage solution is <strong>Azure PostgreSQL Database & Azure Blob Storage</strong>.

# Summary:
### 1- Import of Packages and Dependencies
### 2- Import Environment Variables
### 3- Finding MLflow experiments and runs
### 4- Generate the test dataset
### 5- Evaluate the model on the test dataset

# 1- Import of Packages and Dependencies

In [1]:
import os
from dotenv import load_dotenv
from utils.build_dataset import *
import mlflow

2025-01-02 16:43:34.999055: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-01-02 16:43:35.002091: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-01-02 16:43:35.012053: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2025-01-02 16:43:35.029094: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1735832615.049891    3569 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1735832615.05

# 2- Import Environment Variables

In [2]:
# Load environment variables from the .env file
load_dotenv()

# Access environment variables using os.getenv()
# We need api_key and api_url to connect to the API and get the data
api_key = os.getenv("API_KEY")
api_url = os.getenv("API_URL")

# We need the following variables to connect to Azure Blob Storage
container_name = os.getenv("AZURE_STORAGE_CONTAINER_NAME")
storage_account_name = os.getenv("AZURE_STORAGE_ACCOUNT_NAME")
connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")

# We need the following variables to connect to the Azure PostgreSQL Database
pghost = os.getenv("PGHOST")
pguser = os.getenv("PGUSER")
pgport = os.getenv("PGPORT")
pgdatabase = os.getenv("PGDATABASE")
pgpassword = os.getenv("PGPASSWORD")
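
#### os.getenv() returns None for any variable that is missing from the .env file, and the problem only shows up later (for example when the tracking URI is built).
#### A minimal sketch of an early check is given here; the helper name find_missing_env_vars is hypothetical and is not part of the notebook's utils.

```python
import os

# Names of the environment variables read in the cell above
REQUIRED_VARS = [
    "API_KEY", "API_URL",
    "AZURE_STORAGE_CONTAINER_NAME", "AZURE_STORAGE_ACCOUNT_NAME",
    "AZURE_STORAGE_CONNECTION_STRING",
    "PGHOST", "PGUSER", "PGPORT", "PGDATABASE", "PGPASSWORD",
]

def find_missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

# Hypothetical usage: report missing variables before going any further
missing = find_missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```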

# 3- Finding MLflow experiments and runs

#### We first need to define the tracking URI for MLflow so that it can find the logged experiments and runs

In [3]:
# Point MLflow to the PostgreSQL database that stores the experiments and runs
tracking_uri=f"postgresql://{pguser}:{pgpassword}@{pghost}:{pgport}/{pgdatabase}"
mlflow.set_tracking_uri(tracking_uri)
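
#### If the database user name or password contains characters such as "@", ":" or "/", the URI built above can be parsed incorrectly.
#### A minimal sketch of percent-encoding the credentials with the standard library is given here; the helper name build_tracking_uri and the example values are hypothetical.

```python
from urllib.parse import quote_plus

def build_tracking_uri(user, password, host, port, database):
    """Build a PostgreSQL URI, percent-encoding the user name and password."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )

# Placeholder values for illustration only (not the notebook's credentials)
example_uri = build_tracking_uri(
    "jane", "p@ss:word/1", "example.postgres.database.azure.com", "5432", "mlflow"
)
print(example_uri)
```

#### In the notebook, the same helper could be called with pguser, pgpassword, pghost, pgport and pgdatabase before mlflow.set_tracking_uri().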

#### Now we can list our experiments and select the one we want to use

In [4]:
# List all experiments
experiments = mlflow.search_experiments()

# Print the experiment details
for experiment in experiments:
    print(f"Experiment ID: {experiment.experiment_id}")
    print(f"Name: {experiment.name}")
    print(f"Artifact Location: {experiment.artifact_location}")
    print(f"Lifecycle Stage: {experiment.lifecycle_stage}")
    print("------------------------------")

Experiment ID: 7
Name: my_experiment_v2
Artifact Location: wasbs://mlflow@mlflowstoredjerome.blob.core.windows.net?
Lifecycle Stage: active
------------------------------
Experiment ID: 6
Name: my_experiment
Artifact Location: wasbs://mlflow@mlflowstoredjerome.blob.core.windows.net?
Lifecycle Stage: active
------------------------------
Experiment ID: 0
Name: Default
Artifact Location: wasbs://mlflow@mlflowstoredjerome.blob.core.windows.net/0?sp=racwdli&st=2024-12-30T16:51:05Z&se=2025-01-30T00:51:05Z&sip=91.164.251.67&sv=2022-11-02&sr=c&sig=dw%2FwpN%2Bd0Kd%2BBC2Pm9QDfLMCAB%2BKkMXGVRxcdXAgLFE%3D
Lifecycle Stage: active
------------------------------


#### Now we choose the ID of the experiment whose runs we want to inspect

In [5]:
# Set the ID of the experiment we want to use
experiment_id = "7"
# Search for the runs in the experiment (experiment_ids expects a list)
runs = mlflow.search_runs(experiment_ids=[experiment_id])
# Display the runs in the experiment (a pandas DataFrame)
runs

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.validation_precision,metrics.validation_accuracy,metrics.accuracy,metrics.precision,...,params.validation_freq,params.optimizer_global_clipnorm,params.epochs,params.validation_split,params.optimizer_epsilon,tags.mlflow.source.name,tags.mlflow.user,tags.mlflow.log-model.history,tags.mlflow.source.type,tags.mlflow.runName
0,cd3c957afed34324b5776f888993f6db,7,FINISHED,wasbs://mlflow@mlflowstoredjerome.blob.core.wi...,2025-01-02 15:07:34.158000+00:00,2025-01-02 15:09:24.204000+00:00,1.0,1.0,1.0,1.0,...,1,,5,0.0,1e-07,/home/jerome/.pyenv/versions/3.9.17/envs/resne...,jerome,"[{""run_id"": ""cd3c957afed34324b5776f888993f6db""...",LOCAL,righteous-grouse-925
1,824ce9589ccb45d59e7d075d9538907f,7,FINISHED,wasbs://mlflow@mlflowstoredjerome.blob.core.wi...,2025-01-02 15:05:05.516000+00:00,2025-01-02 15:06:58.944000+00:00,,1.0,1.0,,...,1,,5,0.0,1e-07,/home/jerome/.pyenv/versions/3.9.17/envs/resne...,jerome,"[{""run_id"": ""824ce9589ccb45d59e7d075d9538907f""...",LOCAL,luxuriant-snake-36
2,5e20f4480ea740bf8576ee9f71fcddf8,7,FINISHED,wasbs://mlflow@mlflowstoredjerome.blob.core.wi...,2025-01-02 14:52:14.981000+00:00,2025-01-02 14:54:21.901000+00:00,,1.0,1.0,,...,1,,5,0.0,1e-07,/home/jerome/.pyenv/versions/3.9.17/envs/resne...,jerome,"[{""run_id"": ""5e20f4480ea740bf8576ee9f71fcddf8""...",LOCAL,omniscient-ant-478


In [6]:
runs = mlflow.search_runs(experiment_ids=[experiment_id])

# Print the run details
for index in runs.index:
    print(f"run ID: {runs[['run_id']].iloc[index]}")
    print(f"Mlflow name: {runs[['tags.mlflow.runName']].iloc[index]}")
    print(f"Artifact URI: {runs[['artifact_uri']].iloc[index]}")
    print("------------------------------")

run ID: run_id    cd3c957afed34324b5776f888993f6db
Name: 0, dtype: object
Mlflow name: tags.mlflow.runName    righteous-grouse-925
Name: 0, dtype: object
Artifact URI: artifact_uri    wasbs://mlflow@mlflowstoredjerome.blob.core.wi...
Name: 0, dtype: object
------------------------------
run ID: run_id    824ce9589ccb45d59e7d075d9538907f
Name: 1, dtype: object
Mlflow name: tags.mlflow.runName    luxuriant-snake-36
Name: 1, dtype: object
Artifact URI: artifact_uri    wasbs://mlflow@mlflowstoredjerome.blob.core.wi...
Name: 1, dtype: object
------------------------------
run ID: run_id    5e20f4480ea740bf8576ee9f71fcddf8
Name: 2, dtype: object
Mlflow name: tags.mlflow.runName    omniscient-ant-478
Name: 2, dtype: object
Artifact URI: artifact_uri    wasbs://mlflow@mlflowstoredjerome.blob.core.wi...
Name: 2, dtype: object
------------------------------


#### Here we use the first run returned for the experiment as an example (i.e. the one at index 0).
#### We need its run_id to load the model.

In [7]:
run_id = mlflow.search_runs(experiment_ids=[experiment_id]).iloc[0].run_id

In [8]:
print(f"The run ID is : {run_id}")

The run ID is : cd3c957afed34324b5776f888993f6db


#### Now we can load the model from the run.

In [9]:
# Build the URI of the model logged in the selected MLflow run
model_uri = f"runs:/{run_id}/model"  # run_id is substituted from the cell above

# Load the Keras model
loaded_model = mlflow.keras.load_model(model_uri)

2025-01-02 16:44:48.065646: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)


# 4- Generate the test dataset

#### We first need the name of the ResNet model from which our model was fine-tuned, in order to preprocess the test dataset
#### We can get this information from the MLflow run metadata.

In [10]:
# From MLflow, read the run parameters and get model_name (to be passed to the preprocessing function)
params = mlflow.get_run(run_id).data.params
model_name = params['model_name']
print(f"Model name: {model_name}")

Model name: ResNet50


#### We set the parameters to build the test dataset as we did for the training dataset
#### We use a slightly different function to create the test dataset because we don't need to split the data
#### Otherwise the pipeline is the same

In [11]:
image_dir = "media"
labels = ["vine", "grass", "ground"]
start_date = "2021-03-01"
end_date = "2021-05-01"

# Collect the image URLs for the test period
image_urls = get_image_urls_with_multiple_labels(labels, start_date, end_date, api_key, api_url)
# Create a sample map, then download the images
df_sample_map = create_sample_map(image_urls)
df_sample_map = download_images(df_sample_map, image_dir)

# Build the test dataset (no train/validation split)
test_dataset = create_test_dataset(df_sample_map,
                                   image_dir=image_dir,
                                   model_name=model_name)

Number of urls collected for vine: 1
Number of urls collected for grass: 28
Number of urls collected for ground: 23
Dataframe created successfully with shape : (52, 4)
Preprocess_input function for 'ResNet50' loaded successfully.
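
#### The goal of the notebook requires that the test period does not overlap the training period.
#### A minimal sketch of such a check is given here. The training dates are hypothetical placeholders (they are not given in this notebook), and both bounds are treated as inclusive, which is an assumption about how the API filters dates.

```python
from datetime import date

def ranges_overlap(start_a, end_a, start_b, end_b):
    """Return True if the inclusive date ranges [start_a, end_a] and [start_b, end_b] share a day."""
    return start_a <= end_b and start_b <= end_a

# Test window used in the cell above
test_start = date.fromisoformat("2021-03-01")
test_end = date.fromisoformat("2021-05-01")

# Hypothetical training window: replace with the dates actually used for training
train_start = date.fromisoformat("2021-05-02")
train_end = date.fromisoformat("2021-09-01")

if ranges_overlap(train_start, train_end, test_start, test_end):
    print("Warning: training and test date ranges overlap")
else:
    print("No date overlap between training and test ranges")
```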


# 5- Evaluate the model on the test dataset

#### Now we can evaluate the model on the test dataset
#### The metrics displayed here are the ones that have been set while compiling the model before training

In [12]:
# Evaluate the model
results = loaded_model.evaluate(test_dataset)

# We print the results
print(f"Test Loss: {results[0]}")
print(f"Test Accuracy: {results[1]}")
print(f"Test Precision: {results[2]}")
print(f"Test Recall: {results[3]}")

11/11 ━━━━━━━━━━━━━━━━━━━━ 4s 217ms/step - accuracy: 0.0520 - loss: 11.3382 - precision: 0.0520 - recall: 0.0520
Test Loss: 12.658583641052246
Test Accuracy: 0.01923076994717121
Test Precision: 0.01923076994717121
Test Recall: 0.01923076994717121
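
#### A test accuracy of about 0.019 is very low for a three-class problem, so it helps to relate it to the class counts printed in section 4.
#### The sketch below is a plain arithmetic check and assumes that each of the 52 images carries exactly one label.

```python
# Class counts printed when the test dataset was built (section 4)
class_counts = {"vine": 1, "grass": 28, "ground": 23}
# Accuracy reported by evaluate() above
test_accuracy = 0.01923076994717121

total_images = sum(class_counts.values())
# Accuracy obtained by always predicting the most frequent class
majority_baseline = max(class_counts.values()) / total_images
# Expected accuracy of uniform random guessing over the classes
uniform_baseline = 1 / len(class_counts)
# Number of correct predictions implied by the reported accuracy
implied_correct = round(test_accuracy * total_images)

print(f"Images: {total_images}")
print(f"Majority-class baseline: {majority_baseline:.3f}")
print(f"Uniform-guess baseline: {uniform_baseline:.3f}")
print(f"Implied correct predictions: {implied_correct}")
```

#### A score this far below both baselines may point to a pipeline issue (for example a label-to-index mapping that differs between training and testing) rather than to a weak model, so it is worth checking before drawing conclusions.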
