# Move amlr07-20221204 GCS files

The code for processing amlr07-20221204 shadowgraph images (Cutter's code, adapted from Ohman et al. methods) wrote the processed images into a single directory. The purpose of this notebook is to copy these files into their own folders, by image type, so they can be imported into VIAME-Web-AMLR.

Image types: 
- -ffPCG.png images: Flatfielded images, with Pixel Gamma Correction
- -imgff.png images: Flatfielded images
- .jpgorig-regions.jpg: Original jpg images, with red region bounding boxes pasted onto them

Note that both versions of flatfielded images have had other processing steps applied, such as masking.

Import modules, including 'sourcing' the py file with functions

In [1]:
from google.cloud import storage
from itertools import repeat
import subprocess
import pandas as pd
import multiprocessing as mp
import time

%run -m file_move
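
The `file_move` module is not included in this notebook. As a rough sketch only, the two helpers called in the next cell might look something like the following, assuming they are thin wrappers around the `google-cloud-storage` client. The argument order is inferred from how the functions are called; the function bodies and the optional `client` argument (added here so the functions can be tried with a stand-in object) are assumptions, not the actual implementation.

```python
def list_blobs_with_prefix(bucket_name, prefix, file_substr=None, client=None):
    """Return names of objects under `prefix`, optionally filtered by substring."""
    if client is None:
        # Imported lazily; assumes google-cloud-storage is installed
        from google.cloud import storage
        client = storage.Client()
    names = [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
    if file_substr is not None:
        # Keep only object names containing the substring, e.g. "-ffPCG"
        names = [n for n in names if file_substr in n]
    return names


def copy_blob_client(source_bucket_name, source_blob_name,
                     destination_bucket_name, destination_blob_name,
                     client=None):
    """Copy one object between buckets and return the new object name."""
    if client is None:
        # Each call builds its own client (assumption: this keeps the
        # function safe to use from separate worker processes)
        from google.cloud import storage
        client = storage.Client()
    source_bucket = client.bucket(source_bucket_name)
    destination_bucket = client.bucket(destination_bucket_name)
    source_blob = source_bucket.blob(source_blob_name)
    new_blob = source_bucket.copy_blob(
        source_blob, destination_bucket, destination_blob_name)
    return new_blob.name
```

Creating the client inside the function, rather than passing one in, would be one way to make the function usable with `multiprocessing`, since each worker process then creates its own connection.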

## Variables and Prep

Set variable names, and generate the list of files to copy along with their destination names

In [None]:
storage_client = storage.Client(project = "ggn-nmfs-usamlr-dev-7b99")
source_bucket_name    = "amlr-imagery-proc-dev"
destination_bucket_name = "amlr-gliders-imagery-proc-dev"

source_bucket = storage_client.bucket(source_bucket_name)
destination_bucket = storage_client.bucket(destination_bucket_name)

# file_substr    = "-ffPCG"
# file_substr    = "-imgff"
file_substr    = "jpgorig-regions"

numcores = mp.cpu_count()
print(f"Running with {numcores} cores")

print(f"\nStart time of all: {time.strftime('%Y-%m-%d %H:%M:%S')}")
for z in range(0, 32):    
    file_prefix = f"gliders/2022/amlr07-20221204/shadowgraph/images/Dir0{z:02}"
    print("------------------------------------------------------")
    print(file_prefix)
    start_time = time.time()
    
    file_list_orig = list_blobs_with_prefix(
        source_bucket_name, file_prefix, file_substr=file_substr)   
    
    file_list_destination = []
    for i in file_list_orig:
        # Map the source object name to its destination object name
        i = i.replace("gliders/2022", "FREEBYRD/2023")
        i = i.replace("/shadowgraph/images", f"/images{file_substr}")
        i = i.replace("/output", "")
        file_list_destination.append(i)
        
    print(f"copying {len(file_list_orig)} files with {file_substr} " +
        f"with the prefix {source_bucket_name}/{file_prefix}")
    print(f"destination list length: {len(file_list_destination)}")
        
    with mp.Pool(numcores) as pool:
        out_list = pool.starmap(
            copy_blob_client, 
            zip(repeat(source_bucket_name), file_list_orig, 
                repeat(destination_bucket_name), file_list_destination)
        )
        
    # Serial alternative: skip files that already exist in the destination
    # for (i, j) in zip(file_list_orig, file_list_destination):
    #     if destination_bucket.blob(j).exists():
    #         continue
    #         # print(f"skipping {destination_bucket.blob(j).name}")
    #     else:
    #         copy_blob(source_bucket, i, destination_bucket, j)
            
    print(f"Time is {time.strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"Full z runtime: {(time.time()-start_time)/60:.2f} minutes")
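
The destination naming used in the loop above can be checked in isolation before any files are copied. The sketch below wraps the same three `replace` calls in a function; the function name `destination_name` and the example object path are hypothetical and used only for illustration.

```python
def destination_name(source_name, file_substr):
    """Apply the same path rewrites as the copy loop to one object name."""
    name = source_name.replace("gliders/2022", "FREEBYRD/2023")
    # The substring is appended directly to "images" to name the new folder
    name = name.replace("/shadowgraph/images", f"/images{file_substr}")
    # Drop the intermediate "/output" path component, if present
    name = name.replace("/output", "")
    return name


# Hypothetical source object name, for illustration only
example = ("gliders/2022/amlr07-20221204/shadowgraph/images/"
           "Dir0000/output/example-ffPCG.png")
print(destination_name(example, "-ffPCG"))
# FREEBYRD/2023/amlr07-20221204/images-ffPCG/Dir0000/example-ffPCG.png
```

Because the substring is appended without a separator, `file_substr = "jpgorig-regions"` gives a folder named `imagesjpgorig-regions`, while the two flatfielded types give `images-ffPCG` and `images-imgff`.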
