<a href="https://colab.research.google.com/github/informatics-isi-edu/eye-ai-exec/blob/main/notebooks/VGG19_Diagnosis_Predict.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# VGG19 Model Application

This notebook applies a pre-trained model to the dataset specified in the configuration file and uploads the resulting labels to the catalog.  The ROC curve is also computed and uploaded.


In [1]:
# Prerequisites to configure colab
import sys
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    !pip install deriva
    !pip install bdbag
    !pip install --upgrade --force pydantic
    !pip install git+https://github.com/informatics-isi-edu/deriva-ml git+https://github.com/informatics-isi-edu/eye-ai-ml
    !pip install setuptools_git_versioning


In [2]:
repo_dir = "Repos"   # Set this to be where your github repos are located.
%load_ext autoreload
%autoreload 2

# Update the load path so python can find modules for the model
import sys
from pathlib import Path
sys.path.insert(0, str(Path.home() / repo_dir / "eye-ai-ml"))

In [3]:
# Prerequisites

import json
import os
from eye_ai.eye_ai import EyeAI
import pandas as pd
from pathlib import Path, PurePath
import logging
# import torch

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', force=True)

In [4]:

from deriva.core.utils.globus_auth_utils import GlobusNativeLogin
catalog_id = "eye-ai" #@param
host = 'www.eye-ai.org'


gnl = GlobusNativeLogin(host=host)
if gnl.is_logged_in([host]):
    print("You are already logged in.")
else:
    gnl.login([host], no_local_server=True, no_browser=True, refresh_tokens=True, update_bdbag_keychain=True)
    print("Login Successful")

2024-06-27 08:41:39,990 - INFO - Creating client of type <class 'globus_sdk.services.auth.client.native_client.NativeAppAuthClient'> for service "auth"
2024-06-27 08:41:39,991 - INFO - Finished initializing AuthLoginClient. client_id='8ef15ba9-2b4a-469c-a163-7fd910c9d111', type(authorizer)=<class 'globus_sdk.authorizers.base.NullAuthorizer'>


You are already logged in.


Connect to the Eye-AI catalog and configure the local cache and working directories.  Then initialize Eye-AI for the pending execution based on the provided configuration file.

In [5]:
# Variables to configure the rest of the notebook.

cache_dir = '/data'        # Directory in which to cache materialized BDBags for datasets
working_dir = '/data'    # Directory in which to place output files for later upload.

configuration_rid="2-C8MM"      # Configuration file for this run.  Needs to be changed for each execution.

In [6]:
EA = EyeAI(hostname = host, catalog_id = catalog_id, cache_dir= cache_dir, working_dir=working_dir)

2024-06-27 08:41:40,943 - INFO - Creating client of type <class 'globus_sdk.services.auth.client.native_client.NativeAppAuthClient'> for service "auth"
2024-06-27 08:41:40,944 - INFO - Finished initializing AuthLoginClient. client_id='8ef15ba9-2b4a-469c-a163-7fd910c9d111', type(authorizer)=<class 'globus_sdk.authorizers.base.NullAuthorizer'>


In [7]:
# @title Initiate an Execution
configuration_records = EA.execution_init(configuration_rid=configuration_rid)
input_dataset = configuration_records.bag_paths[0] # Assumes that the configuration file only specifies one dataset.
configuration_records.model_dump()



2024-06-27 08:41:42,583 - INFO - File [/data/sreenidhi/EyeAI_working/Execution_Metadata/Execution_Config-Vgg19_Untrained_Eval_LAC_DHS_diagnosis_insert_june_27_2024_.json] transfer successful. 0.83 KB transferred. Elapsed time: 0:00:00.000082.
2024-06-27 08:41:42,584 - INFO - Verifying MD5 checksum for downloaded file [/data/sreenidhi/EyeAI_working/Execution_Metadata/Execution_Config-Vgg19_Untrained_Eval_LAC_DHS_diagnosis_insert_june_27_2024_.json]
2024-06-27 08:41:42,604 - INFO - Configuration validation successful!


{'caching_dir': PosixPath('/data'),
 'working_dir': PosixPath('/data/sreenidhi/EyeAI_working'),
 'vocabs': {'Workflow_Type': [{'name': 'VGG19_Untrained_Eval_Diagnosis_Predict',
    'rid': '2-C8MY'}],
  'Execution_Asset_Type': [{'name': 'VGG19_Untrained_Eval_Diagnosis_Predict',
    'rid': '2-C8N0'}]},
 'execution_rid': '2-C8MW',
 'workflow_rid': '2-C8MR',
 'bag_paths': [PosixPath('/data/2-277M_6b713c2652c2d45c4bc0e0633e9a739569e98dcf0f10d3cfd403e29468d9d5d0/Dataset_2-277M')],
 'assets_paths': [],
 'configuration_path': PosixPath('/data/sreenidhi/EyeAI_working/Execution_Metadata/Execution_Config-Vgg19_Untrained_Eval_LAC_DHS_diagnosis_insert_june_27_2024_.json')}

The algorithm was trained on cropped images, so crop the raw images using their bounding boxes and store the results in the working directory.
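As a rough illustration of the cropping step, the sketch below clamps a bounding box to the image bounds and slices out the region. It is not the implementation behind `EA.create_cropped_images`, which is not shown in this notebook. The function name `crop_to_box`, the `(x_min, y_min, x_max, y_max)` box layout, and the nested-list image are all assumptions made for the example.

```python
def crop_to_box(image, box):
    """Crop a row-major image (a list of rows) to a bounding box.

    `box` is (x_min, y_min, x_max, y_max) in pixel coordinates, with the
    max edges exclusive. This layout is an assumption for illustration.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    x_min, y_min, x_max, y_max = box
    # Clamp the box so that it never reaches outside the image.
    x_min, x_max = max(0, x_min), min(width, x_max)
    y_min, y_max = max(0, y_min), min(height, y_max)
    return [row[x_min:x_max] for row in image[y_min:y_max]]


# A toy 4x5 "image" whose pixel values encode their own (row, col) position.
image = [[10 * r + c for c in range(5)] for r in range(4)]
cropped = crop_to_box(image, (1, 1, 4, 9))  # y_max is clamped to the image height
print(cropped)
```

Clamping keeps a box that extends past the image edge from raising an error or wrapping around through negative indices.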

In [8]:
configuration_records

ConfigurationRecord(caching_dir=PosixPath('/data'), working_dir=PosixPath('/data/sreenidhi/EyeAI_working'), vocabs={'Workflow_Type': [Term(name='VGG19_Untrained_Eval_Diagnosis_Predict', rid='2-C8MY')], 'Execution_Asset_Type': [Term(name='VGG19_Untrained_Eval_Diagnosis_Predict', rid='2-C8N0')]}, execution_rid='2-C8MW', workflow_rid='2-C8MR', bag_paths=[PosixPath('/data/2-277M_6b713c2652c2d45c4bc0e0633e9a739569e98dcf0f10d3cfd403e29468d9d5d0/Dataset_2-277M')], assets_paths=[], configuration_path=PosixPath('/data/sreenidhi/EyeAI_working/Execution_Metadata/Execution_Config-Vgg19_Untrained_Eval_LAC_DHS_diagnosis_insert_june_27_2024_.json'))

In [9]:
str(EA.working_dir)

'/data/sreenidhi/EyeAI_working'

In [10]:
# @title Get Cropped Images
cropped_image_path, cropped_csv = EA.create_cropped_images(str(configuration_records.bag_paths[0]),
                                                           output_dir = str(EA.working_dir),
                                                           crop_to_eye=True)

In [11]:
output_path = str(EA.working_dir) + "/Execution_Assets/" + configuration_records.vocabs['Execution_Asset_Type'][0].name
output_path

'/data/sreenidhi/EyeAI_working/Execution_Assets/VGG19_Untrained_Eval_Diagnosis_Predict'
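The same path can be assembled with `pathlib` instead of string concatenation, which avoids hard-coded separators. A minimal sketch, using the values printed in this notebook as stand-ins for `EA.working_dir` and the asset type name:

```python
from pathlib import PurePosixPath

# Stand-ins for EA.working_dir and the Execution_Asset_Type name printed above.
working_dir = PurePosixPath("/data/sreenidhi/EyeAI_working")
asset_type = "VGG19_Untrained_Eval_Diagnosis_Predict"

# The / operator joins path components with the correct separator.
output_path = working_dir / "Execution_Assets" / asset_type
print(output_path)
```

`PurePosixPath` is used here only so the sketch behaves the same on any platform. In the notebook itself, `Path` (already imported) would be the natural choice, with `str(...)` applied if a downstream function expects a string.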

Import the actual model code and then run against the input dataset specified in the configuration file.  

In [12]:
# @title Execute Process algorithm (Test model)
from eye_ai.models.vgg19_untrained_eval import evaluate_untrained_model

# Named `execution` rather than `exec` to avoid shadowing the Python built-in.
with EA.execution(execution_rid=configuration_records.execution_rid) as execution:
    evaluate_untrained_model(cropped_image_path, output_path)

2024-06-27 08:44:09.191102: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-27 08:44:09.191149: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-27 08:44:09.191997: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-27 08:44:09.198120: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-06-27 08:44:10.641543: I external/local_xla/xla/

Found 1094 images belonging to 2 classes.


2024-06-27 08:44:11.695539: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8907
2024-06-27 08:44:23,264 - INFO - Predictions saved to /data/sreenidhi/EyeAI_working/Execution_Assets/VGG19_Untrained_Eval_Diagnosis_Predict/untrained_vgg19_predictions.csv
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
2024-06-27 08:44:23,391 - INFO - ROC curve saved to /data/sreenidhi/EyeAI_working/Execution_Assets/VGG19_Untrained_Eval_Diagnosis_Predict/roc_curve.png



Scikit-learn Metrics:
ROC AUC: 0.46870648797729336
F1 Score: 0.32469135802469135
F1 Score Normal: 0.0
Precision: 0.0
Recall: 0.0
Accuracy: 0.48080438756855576
Balanced Accuracy: 0.5
Matthews correlation coefficient: 0.0

Classification Report:
               precision    recall  f1-score   support

         0.0       0.48      1.00      0.65       526
         1.0       0.00      0.00      0.00       568

    accuracy                           0.48      1094
   macro avg       0.24      0.50      0.32      1094
weighted avg       0.23      0.48      0.31      1094
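
The report above corresponds to a classifier that assigns every image to class 0.0: recall is 1.00 for class 0.0 and 0.00 for class 1.0. The sketch below rebuilds several of the headline numbers from the implied confusion matrix (526 true negatives, 568 false negatives, no positive predictions). Scoring an undefined ratio as 0 is an assumption here, chosen because it reproduces the printed values. The helper names are illustrative.

```python
from math import sqrt

# Confusion matrix implied by the report: every image is predicted as class 0.0.
tn, fp = 526, 0   # images whose true class is 0.0
fn, tp = 568, 0   # images whose true class is 1.0


def safe_div(num, den):
    """Return num / den, or 0.0 when the ratio is undefined."""
    return num / den if den else 0.0


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return safe_div(2 * precision * recall, precision + recall)


accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_pos = safe_div(tp, tp + fn)        # sensitivity
recall_neg = safe_div(tn, tn + fp)        # specificity
precision_pos = safe_div(tp, tp + fp)
precision_neg = safe_div(tn, tn + fn)
balanced_accuracy = (recall_pos + recall_neg) / 2
macro_f1 = (f1(precision_pos, recall_pos) + f1(precision_neg, recall_neg)) / 2
mcc = safe_div(tp * tn - fp * fn,
               sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))

print(f"accuracy={accuracy:.4f}  balanced={balanced_accuracy:.1f}  "
      f"macro_f1={macro_f1:.4f}  mcc={mcc:.1f}")
```

Accuracy, balanced accuracy, and the Matthews coefficient agree with the printed output, and the `F1 Score` line matches the macro-averaged F1. A balanced accuracy of exactly 0.5 together with a coefficient of 0 means the predictions carry no information about the true class.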



Add the new labels to the catalog using the provided diagnosis tag for this execution.  Also upload any additional assets that were produced by this execution.

In [14]:
# @title Save Execution Assets (model) and Metadata
uploaded_assets = EA.execution_upload(configuration_records.execution_rid, True)


2024-06-27 08:46:11,959 - INFO - Initializing uploader: GenericUploader v1.7.1 [Python 3.10.13, Linux-5.10.210-201.852.amzn2.x86_64-x86_64-with-glibc2.26]
2024-06-27 08:46:11,960 - INFO - Creating client of type <class 'globus_sdk.services.auth.client.native_client.NativeAppAuthClient'> for service "auth"
2024-06-27 08:46:11,961 - INFO - Finished initializing AuthLoginClient. client_id='8ef15ba9-2b4a-469c-a163-7fd910c9d111', type(authorizer)=<class 'globus_sdk.authorizers.base.NullAuthorizer'>
2024-06-27 08:46:11,998 - INFO - Checking for updated configuration...
2024-06-27 08:46:12,118 - INFO - Updated configuration found.
2024-06-27 08:46:12,119 - INFO - Scanning files in directory [/data/sreenidhi/EyeAI_working/Execution_Assets/VGG19_Untrained_Eval_Diagnosis_Predict]...
2024-06-27 08:46:12,123 - INFO - Including file: [/data/sreenidhi/EyeAI_working/Execution_Assets/VGG19_Untrained_Eval_Diagnosis_Predict/untrained_vgg19_predictions.csv].
2024-06-27 08:46:12,124 - INFO - Including fil