# Clone GitHub Repo: grapeleafGPT

This repo contains the files needed to train AnomalyGPT on the [Kaggle Grape Leaf Disease dataset](https://www.kaggle.com/datasets/rm1000/grape-disease-dataset-original), as well as the files needed to train it on the MVTec dataset described in the AnomalyGPT documentation.

The following cells clone the repo and install the requirements for training grapeleafGPT, including the CUDA 11.7 builds of PyTorch 1.13.1, torchvision 0.14.1, and torchaudio 0.13.1.




In [None]:
!git clone https://github.com/mikejrodd/grapeleafGPT


In [None]:
%cd /content/grapeleafGPT
!pip install -r requirements.txt

In [None]:
!pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
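
Because `requirements.txt` and the pinned PyTorch install run in separate cells, it can be worth confirming which versions ended up installed before training. The cell below is a minimal sketch that reads package metadata without importing the libraries. The `EXPECTED` table and the helper names are illustrative and not part of the repo.

```python
from importlib import metadata

# Versions pinned by the pip command above. Installed builds carry a local
# tag such as "+cu117", so only the leading part of the version is compared.
EXPECTED = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package to (installed version, whether it starts with the pin)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, found is not None and found.startswith(wanted))
    return report


for package, (found, ok) in check_versions().items():
    print(f"{package}: {found} ({'ok' if ok else 'mismatch'})")
```

If any line reports a mismatch, rerunning the pinned install cell and restarting the runtime is a reasonable first step.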


# Connect Google Drive

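The model checkpoints and datasets are too large to store in the Colab runtime, so they are linked from my personal Google Drive (MyDrive). If you use your own Drive, adjust the `gtprac` paths in the cells below to match your folder layout.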

In [None]:
from google.colab import drive
drive.mount('/content/drive')
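
After mounting, a quick check that the expected files are present on Drive can save a failed training run later. The sketch below assumes the `gtprac` folder layout used by the linking cell in this notebook. `REQUIRED` and `missing_paths` are hypothetical names introduced here for illustration.

```python
import os

# Drive locations referenced by the linking cell in this notebook.
DRIVE_ROOT = "/content/drive/MyDrive/gtprac"
REQUIRED = [
    "imagebind_huge.pth",
    "7b_v0",
    "data/pandagpt4_visual_instruction_data.json",
    "leaf_images/grapeleaves",
    "pytorch_model.pt",
    "data/images",
]


def missing_paths(root, relative_paths):
    """Return the entries of `relative_paths` that do not exist under `root`."""
    return [p for p in relative_paths if not os.path.exists(os.path.join(root, p))]


missing = missing_paths(DRIVE_ROOT, REQUIRED)
if missing:
    print(f"Missing on Drive: {missing}")
else:
    print("All expected files found.")
```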


# Link to Datasets, Checkpoints, and Weights

The ImageBind, PandaGPT, and Vicuna checkpoints are symlinked into the repo's `pretrained_ckpt` directory for model training.

The grape leaf images and the PandaGPT image set are linked as well. If you are training on MVTec, be sure to add a link to your saved MVTec dataset.

In [None]:
!mkdir -p /content/grapeleafGPT/pretrained_ckpt/imagebind_ckpt
!ln -s /content/drive/MyDrive/gtprac/imagebind_huge.pth /content/grapeleafGPT/pretrained_ckpt/imagebind_ckpt/imagebind_huge.pth

!mkdir -p /content/grapeleafGPT/pretrained_ckpt/vicuna_ckpt/7b_v0
!ln -s /content/drive/MyDrive/gtprac/7b_v0/* /content/grapeleafGPT/pretrained_ckpt/vicuna_ckpt/7b_v0/

!mkdir -p /content/grapeleafGPT/data/
!ln -s /content/drive/MyDrive/gtprac/data/pandagpt4_visual_instruction_data.json /content/grapeleafGPT/data/pandagpt4_visual_instruction_data.json

!ln -s /content/drive/MyDrive/gtprac/leaf_images/grapeleaves /content/grapeleafGPT/data/grapeleaves

!mkdir -p /content/grapeleafGPT/pretrained_ckpt/pandagpt_ckpt/7b
!ln -s /content/drive/MyDrive/gtprac/pytorch_model.pt /content/grapeleafGPT/pretrained_ckpt/pandagpt_ckpt/7b/pytorch_model.pt

# Link to pandaGPT images
!mkdir -p /content/grapeleafGPT/data/images
!ln -s /content/drive/MyDrive/gtprac/data/images/* /content/grapeleafGPT/data/images/
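
`ln -s` succeeds even when its target does not exist, and a `*` pattern that matches nothing is typically passed through literally, leaving a dangling link named `*`. The sketch below walks the repo and lists any symlink whose target is missing. `broken_symlinks` is a hypothetical helper, not part of the repo.

```python
import os


def broken_symlinks(root):
    """Return the sorted paths of symlinks under `root` whose targets are missing."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Links to directories show up in dirnames, all other links in filenames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


for path in broken_symlinks("/content/grapeleafGPT"):
    print("Broken link:", path)
```

No output means every link under the repo resolves. Any path printed here points at a file that should be checked on Drive.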


# Train grapeleafGPT

The following script starts training the grapeleafGPT model. Training may take longer than the Colab runtime limit (24 hours); if it does, load the last checkpoint and continue training.

In [None]:
# Run this cell to train grapeleafGPT

%cd /content/grapeleafGPT/code
!bash ./scripts/train_grapeleaves_panda.sh
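
If the runtime disconnects before training finishes, a first step toward resuming is finding the most recent file the training script wrote. The sketch below assumes checkpoints land in `code/ckpt/train_grapeleaves`, the directory the save cell in this notebook copies from. It makes no assumption about checkpoint file names, and `newest_file` is a hypothetical helper.

```python
import os

# Assumed checkpoint directory (the same one the save cell copies from).
CKPT_DIR = "/content/grapeleafGPT/code/ckpt/train_grapeleaves"


def newest_file(directory):
    """Return the most recently modified regular file under `directory`, or None."""
    candidates = []
    for dirpath, _, filenames in os.walk(directory):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):  # skips dangling symlinks
                candidates.append(path)
    return max(candidates, key=os.path.getmtime, default=None)


print("Most recent checkpoint file:", newest_file(CKPT_DIR))
```

How the training script picks up an existing checkpoint depends on its own configuration, so check `train_grapeleaves_panda.sh` before relying on this.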

# Save trained model

Uncomment and run the cell below to copy the trained model to MyDrive.

In [None]:
# import os
# import shutil

# source_dir = '/content/grapeleafGPT/code/ckpt/train_grapeleaves'
# dest_dir = '/content/drive/MyDrive/gtprac/grapeleaves_trained_model_panda/train_grapeleaves'

# os.makedirs(dest_dir, exist_ok=True)

# for root, dirs, files in os.walk(source_dir):
#     for file in files:
#         source_file = os.path.join(root, file)
#         # Construct the destination file path
#         relative_path = os.path.relpath(root, source_dir)
#         dest_path = os.path.join(dest_dir, relative_path)
#         os.makedirs(dest_path, exist_ok=True)  # Create directories if they don't exist
#         shutil.copy(source_file, dest_path)  # Copy each file

# print("Files copied successfully.")


Files copied successfully.
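
The commented-out copy loop above can also be written with `shutil.copytree`, which merges into an existing destination when `dirs_exist_ok=True` is passed (Python 3.8 or later). This is an alternative sketch that uses the same source and destination paths as that cell. `backup_tree` is a hypothetical helper.

```python
import os
import shutil

# Same paths as the copy loop in the cell above.
SOURCE_DIR = "/content/grapeleafGPT/code/ckpt/train_grapeleaves"
DEST_DIR = "/content/drive/MyDrive/gtprac/grapeleaves_trained_model_panda/train_grapeleaves"


def backup_tree(source_dir, dest_dir):
    """Copy `source_dir` into `dest_dir`, overwriting files that already exist.

    Returns the number of files present under `dest_dir` afterwards.
    """
    shutil.copytree(source_dir, dest_dir, dirs_exist_ok=True)
    return sum(len(files) for _, _, files in os.walk(dest_dir))


if os.path.isdir(SOURCE_DIR):
    print(f"{backup_tree(SOURCE_DIR, DEST_DIR)} files now in {DEST_DIR}")
else:
    print(f"Nothing to copy: {SOURCE_DIR} does not exist.")
```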


# Test grapeleafGPT

Run the cell below to load the previously trained model from MyDrive and then run test_grapeleaves.py.

Output will be in the following format:

```text
grapeleaves right: 1437 wrong: 2288
i_AUROC: 67.24709677419354
p_AUROC: 58.694516129032266
precision: 38.57718276977539
```


In [None]:
# Load train_grapeleaves model (fully trained)
!mkdir -p /content/grapeleafGPT/code/ckpt/train_grapeleaves
!ln -s /content/drive/MyDrive/gtprac/grapeleaves_trained_model_panda/train_grapeleaves/* /content/grapeleafGPT/code/ckpt/train_grapeleaves/

# Run test script
%cd /content/grapeleafGPT/code
!python test_grapeleaves.py
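
To compare runs, the printed metrics can be parsed into a dictionary. The sketch below assumes the output follows the four-line format shown above. `parse_metrics` is a hypothetical helper, and `SAMPLE` repeats the example output from this notebook. In that sample, the reported `precision` matches `right / (right + wrong)` (1437 / 3725), which suggests it is the share of images classified correctly.

```python
import re

# Example output copied from the format description in this notebook.
SAMPLE = """grapeleaves right: 1437 wrong: 2288
i_AUROC: 67.24709677419354
p_AUROC: 58.694516129032266
precision: 38.57718276977539"""


def parse_metrics(text):
    """Extract right/wrong counts and the three scores from the test output."""
    metrics = {}
    counts = re.search(r"right:\s*(\d+)\s+wrong:\s*(\d+)", text)
    if counts:
        metrics["right"] = int(counts.group(1))
        metrics["wrong"] = int(counts.group(2))
    pattern = r"^(i_AUROC|p_AUROC|precision):\s*([0-9.]+)\s*$"
    for name, value in re.findall(pattern, text, flags=re.MULTILINE):
        metrics[name] = float(value)
    return metrics


m = parse_metrics(SAMPLE)
share_right = 100 * m["right"] / (m["right"] + m["wrong"])
print(m)
print(f"right / (right + wrong) = {share_right:.2f}%")
```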