<div style="text-align: center;">
    <h1><strong>Evaluate a ResNet Model</strong></h1>
    <h1><strong>Azure PostgreSQL Database & Azure Blob Storage</strong></h1>
</div> 

# Goal of the notebook:
#### The purpose of this notebook is to load a previously trained ResNet model and evaluate its performance on a test dataset.
#### We assume that the test dataset is generated with the same approach as the one used to train the model (see train_resnet_model_local.ipynb).
#### However, images used for testing must not have been used during training, to avoid data leakage (i.e. no date overlap).
#### The evaluation metrics displayed here are the ones set when compiling the model before training (Loss, Accuracy, Precision and Recall).
#### The notebook is meant to be used when the storage solution is <strong>Azure PostgreSQL Database & Azure Blob Storage</strong>.

# Summary:
### 1- Import of Packages and Dependencies
### 2- Import Environment Variables
### 3- Finding MLflow experiments and runs
### 4- Generate the test dataset
### 5- Evaluate the model on the test dataset

# 1- Import of Packages and Dependencies

In [None]:
import os
from dotenv import load_dotenv
from utils.build_dataset import *
import mlflow

# 2- Import Environment Variables

In [3]:
# Load environment variables from the .env file
load_dotenv()

# Access environment variables using os.getenv() method
# We need api_key and api_url to connect to the API and get the data
api_key = os.getenv("API_KEY")
api_url = os.getenv("API_URL")

# We need the following variables to connect to Azure Blob Storage
container_name = os.getenv("AZURE_STORAGE_CONTAINER_NAME")
storage_account_name = os.getenv("AZURE_STORAGE_ACCOUNT_NAME")
connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")

# We need the following variables to connect to the Azure PostgreSQL Database
pghost = os.getenv("PGHOST")
pguser = os.getenv("PGUSER")
pgport = os.getenv("PGPORT")
pgdatabase = os.getenv("PGDATABASE")
pgpassword = os.getenv("PGPASSWORD")
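
#### `os.getenv()` returns `None` when a variable is not defined, so a missing entry in the .env file may only show up later as a connection error.
#### As a minimal sketch, the hypothetical helper `missing_env_vars` below (not part of the project) lists the variables that are unset or empty. It assumes the variable names read in the cell above.

```python
import os

# Names read in the cell above; adjust the list if your .env file differs.
REQUIRED_ENV_VARS = [
    "API_KEY", "API_URL",
    "AZURE_STORAGE_CONTAINER_NAME", "AZURE_STORAGE_ACCOUNT_NAME",
    "AZURE_STORAGE_CONNECTION_STRING",
    "PGHOST", "PGUSER", "PGPORT", "PGDATABASE", "PGPASSWORD",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```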

# 3- Finding MLflow experiments and runs

#### We first need to set the MLflow tracking URI so that MLflow can find the experiments and runs stored in the PostgreSQL database

In [4]:
# Point MLflow to the PostgreSQL database that stores the experiments and runs
tracking_uri = f"postgresql://{pguser}:{pgpassword}@{pghost}:{pgport}/{pgdatabase}"
mlflow.set_tracking_uri(tracking_uri)
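
#### If the database user or password contains characters such as `@`, `:` or `/`, a URI built by plain string formatting can be parsed incorrectly.
#### A minimal sketch, assuming the same PG* values as above: the hypothetical helper `build_tracking_uri` percent-encodes the credentials with `urllib.parse.quote_plus`. The credentials in the example are made up.

```python
from urllib.parse import quote_plus


def build_tracking_uri(user, password, host, port, database):
    """Build a PostgreSQL URI with percent-encoded credentials."""
    return (
        f"postgresql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Example with made-up credentials (not real values)
example_uri = build_tracking_uri("mlflow_user", "p@ss:word/1", "localhost", "5432", "mlflow_db")
print(example_uri)
```

#### In the notebook, the result of `build_tracking_uri(pguser, pgpassword, pghost, pgport, pgdatabase)` could then be passed to `mlflow.set_tracking_uri`.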

#### Now we can list our experiments and select the one we want to use

In [None]:
# List all experiments
experiments = mlflow.search_experiments()

# Print the experiment details
for experiment in experiments:
    print(f"Experiment ID: {experiment.experiment_id}")
    print(f"Name: {experiment.name}")
    print(f"Artifact Location: {experiment.artifact_location}")
    print(f"Lifecycle Stage: {experiment.lifecycle_stage}")
    print("------------------------------")

#### Now we choose the ID of the experiment whose runs we want to inspect

In [None]:
# We set the ID of the experiment we want to use
experiment_id = "7"
# We search for the runs in the experiment (experiment_ids expects a list)
runs = mlflow.search_runs(experiment_ids=[experiment_id])
# We view the runs in the experiment (a pd.DataFrame)
runs

In [None]:
# Print the details of each run found above
for _, run in runs.iterrows():
    print(f"Run ID: {run['run_id']}")
    print(f"MLflow run name: {run['tags.mlflow.runName']}")
    print(f"Artifact URI: {run['artifact_uri']}")
    print("------------------------------")

#### As an example, we use the first run returned for the experiment (i.e. index 0). By default, MLflow lists the most recent run first.
#### We need the run_id to load the model.

In [8]:
run_id = mlflow.search_runs(experiment_ids=[experiment_id]).iloc[0].run_id

In [None]:
print(f"The run ID is: {run_id}")

#### Now we can load the model from the run.

In [None]:
# Build the model URI from the run ID
model_uri = f"runs:/{run_id}/model"  # assumes the model was logged under the artifact path "model"

# Load the Keras model
loaded_model = mlflow.keras.load_model(model_uri)

# 4- Generate the test dataset

#### We first need the name of the ResNet model from which our model was fine-tuned, so that the test dataset is preprocessed accordingly
#### We can get this information from the MLflow run metadata.

In [None]:
# From MLflow, we read the run parameters and get the model_name (to be passed to the preprocessing function)
params = mlflow.get_run(run_id).data.params
model_name = params['model_name']
print(f"Model name: {model_name}")
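
#### If the run was logged without a `model_name` parameter, the lookup above raises a bare `KeyError`.
#### As a minimal sketch, the hypothetical helper `get_required_param` below (not part of the project) raises an error that lists the parameters actually available. The example dictionary is made up and only stands in for the run parameters.

```python
def get_required_param(params, key):
    """Return params[key], or raise a KeyError that lists the available keys."""
    if key not in params:
        available = ", ".join(sorted(params)) or "none"
        raise KeyError(f"Run has no parameter '{key}'. Available parameters: {available}")
    return params[key]


# Made-up parameters standing in for mlflow.get_run(run_id).data.params
example_params = {"model_name": "ResNet50", "epochs": "10"}
print(get_required_param(example_params, "model_name"))
```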

#### We set the parameters to build the test dataset as we did for the training dataset
#### We use a slightly different function to create the test dataset because we don't need to split the data
#### Otherwise, the pipeline is the same

In [None]:
image_dir = "media"
labels = ["vine", "grass", "ground"]
start_date = "2021-03-01"
end_date = "2021-05-01"

# We fetch the URLs of the images matching the labels and the date range
image_urls = get_image_urls_with_multiple_labels(labels, start_date, end_date, api_key, api_url)
# Create a sample map and download the images
df_sample_map = create_sample_map(image_urls)
df_sample_map = download_images(df_sample_map, image_dir)

# Build the test dataset (no train/validation split)
test_dataset = create_test_dataset(
    df_sample_map,
    image_dir=image_dir,
    model_name=model_name,
)
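
#### To avoid data leakage, the test period must not overlap the period used for training.
#### A minimal sketch of such a check: `date_ranges_overlap` is a hypothetical helper, and the training dates below are placeholders to be replaced with the ones used in the training notebook. Both ranges are treated as inclusive, which is an assumption about how the API handles the end date.

```python
from datetime import date


def date_ranges_overlap(start_a, end_a, start_b, end_b):
    """Return True if the inclusive ranges [start_a, end_a] and [start_b, end_b] share a day."""
    a0, a1 = date.fromisoformat(start_a), date.fromisoformat(end_a)
    b0, b1 = date.fromisoformat(start_b), date.fromisoformat(end_b)
    return a0 <= b1 and b0 <= a1


# Placeholder training period: replace with the dates used in the training notebook.
train_start, train_end = "2020-03-01", "2020-05-01"
# Test period used in the cell above.
test_start, test_end = "2021-03-01", "2021-05-01"

if date_ranges_overlap(train_start, train_end, test_start, test_end):
    print("Warning: the test period overlaps the training period.")
else:
    print("No date overlap between the training and test periods.")
```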

# 5- Evaluate the model on the test dataset

#### Now we can evaluate the model on the test dataset
#### The metrics displayed here are the ones that have been set while compiling the model before training

In [None]:
# Evaluate the model
results = loaded_model.evaluate(test_dataset)

# We print the results
print(f"Test Loss: {results[0]}")
print(f"Test Accuracy: {results[1]}")
print(f"Test Precision: {results[2]}")
print(f"Test Recall: {results[3]}")
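
#### Precision and recall are often summarised by the F1 score, their harmonic mean.
#### A minimal sketch, assuming `evaluate()` returns the values in the order used above (loss, accuracy, precision, recall): the hypothetical helper `summarize_metrics` names each value and adds F1. The numbers in the example are made up and are not real results.

```python
def summarize_metrics(results, names=("loss", "accuracy", "precision", "recall")):
    """Pair metric values with their names and add the F1 score."""
    if len(results) != len(names):
        raise ValueError(f"Expected {len(names)} values, got {len(results)}")
    summary = dict(zip(names, results))
    precision, recall = summary["precision"], summary["recall"]
    denominator = precision + recall
    # By convention, F1 is set to 0 when precision and recall are both 0.
    summary["f1"] = 2 * precision * recall / denominator if denominator else 0.0
    return summary


# Made-up values for illustration, not real results
example = summarize_metrics([0.35, 0.90, 0.80, 0.50])
for name, value in example.items():
    print(f"Test {name}: {value:.4f}")
```

#### In the notebook, `summarize_metrics(results)` could be called on the list returned by `loaded_model.evaluate(test_dataset)`.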