# PGCView Image Prediction Pipeline
This Python notebook lets you load the PGCView model API endpoints, upload your images, and then process them.

Instructions:
* Connect to a CPU instance using the menu at the top right of the notebook (no need to use a GPU runtime)
* Once connected, go to the 'Files' directory on the left sidebar and upload all your images to 'RGPCV_fastapi/assets/images'
* Click 'run' on each cell, or use the toolbar and go to 'Runtime' -> 'Run all'
* Wait for all the images to finish processing and then download image predictions and tabular data from 'output'.

## Step 1
Run the cell below to mount your Google Drive

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

## Step 2
Clone the GitHub repository for the pipeline

In [None]:
# Grab the pipeline code from GitHub
!git clone https://github.com/BoMeyering/regen_pgc_inference_pipeline.git
%cd regen_pgc_inference_pipeline/

In [None]:
%cd ../

In [None]:
from pipeline import run_pipeline, get_filenames

## Step 3
Once all the above cells have run successfully, add all of your images to ```assets/images```. Wait for the uploads to finish, and then run the cell below.
It should output a list of all the image names, like

```
['image_1.jpg', 'image_2.jpg', ... , 'image_n.jpg']
```

If there is no output, or if the output is an empty list ```[]```, your images were most likely uploaded to the wrong directory.
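
If the list comes back empty, it can help to look at the folder directly before re-uploading. The snippet below is a minimal sketch using only the Python standard library. `list_image_files` and the set of extensions are hypothetical names introduced here for illustration, not part of the `pipeline` module, and the check may not match exactly what `get_filenames` considers an image.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_image_files(directory="assets/images"):
    """Return the sorted names of image files found directly inside `directory`.

    Hypothetical helper for troubleshooting; not part of the pipeline code.
    """
    folder = Path(directory)
    if not folder.is_dir():
        # A missing folder is one likely cause of an empty result.
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Prints [] if the folder is missing or contains no matching files.
print(list_image_files())
```

If this prints your filenames but the pipeline cell still shows an empty list, the notebook's working directory is probably not the one you expect.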


UPDATE: If your images are stored in Google Drive and you don't want to download them locally and re-upload them, uncomment the commented-out lines in the cell below, starting with the one that says
```
#CUSTOM_DIR = "PATH/TO/YOUR/IMAGES/HERE"
```

and replace the placeholder with the path to your images in Google Drive, like so
```
CUSTOM_DIR = 'drive/MyDrive/pgc_project/images'  # Example only
```

In [None]:
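# IF YOUR IMAGES ARE IN 'assets/images'
# Grab all of the image filenames in 'assets/images'
img_filenames = get_filenames()
img_filenames['filenames']

# IF YOUR IMAGES ARE IN A CUSTOM DIRECTORY (e.g. Google Drive)
# Uncomment the three lines below and set CUSTOM_DIR to your image folder
# CUSTOM_DIR = "PATH/TO/YOUR/IMAGES/HERE"
# img_filenames = get_filenames(CUSTOM_DIR)
# img_filenames['filenames']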

## Step 4
Run the cell below. This sends each of the images listed in ```img_filenames``` to the model API and passes the results to the image analysis pipeline.

This step might take a long time to complete (15-20 seconds per image) depending on the type of server connection.
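
To get a rough idea of the total wait, multiply the number of images by the per-image range quoted above. The sketch below assumes images are processed one at a time and that the 15-20 second figure holds for your connection. `estimate_runtime_minutes` is a hypothetical helper, not part of the pipeline.

```python
def estimate_runtime_minutes(n_images, sec_low=15, sec_high=20):
    """Return a (low, high) estimate of total processing time in minutes.

    Assumes sequential processing at sec_low to sec_high seconds per image.
    """
    return (n_images * sec_low / 60, n_images * sec_high / 60)


# Example: 120 images at 15-20 seconds each
low, high = estimate_runtime_minutes(120)
print(f"About {low:.0f}-{high:.0f} minutes for 120 images")
# prints "About 30-40 minutes for 120 images"
```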

In [None]:
# Run the image analysis pipeline
run_pipeline(img_filenames)

## Step 5
After the ```run_pipeline``` function finishes, you can check the model outputs in ```outputs/```
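
If there are many output files, bundling them into a single archive can make the download easier. The snippet below is a minimal sketch using the standard library's `shutil.make_archive`. `zip_outputs` and the archive name `pgc_outputs` are hypothetical, and the default folder name assumes the results are written to `outputs/` as described in this step.

```python
import shutil
from pathlib import Path


def zip_outputs(output_dir="outputs", archive_name="pgc_outputs"):
    """Zip `output_dir` and return the archive path, or None if the folder is missing.

    Hypothetical helper; not part of the pipeline code.
    """
    folder = Path(output_dir)
    if not folder.is_dir():
        return None
    # make_archive appends ".zip" to archive_name and returns the full path.
    return shutil.make_archive(archive_name, "zip", root_dir=str(folder))


archive = zip_outputs()
print(archive if archive else "No 'outputs' folder found yet")
```

In Colab, the resulting file can then be downloaded from the 'Files' sidebar, or with `files.download(archive)` after `from google.colab import files`.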