<a href="https://colab.research.google.com/github/ProtossDragoon/paper_implementation_and_testing_tf2/blob/main/utils/GDrive_to_GCS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GDrive to GCS

## Author

name : Janghoo Lee <br>
github : https://github.com/ProtossDragoon <br>
contact : dlwkdgn1@naver.com <br>
circle : https://github.com/sju-coml <br>
organization : https://web.deering.co/ <br>
published date : June, 2021

# Environment

## Import

In [1]:
import os

## Global Hyperparameters

In [2]:
HOME_DIR = "/content/gdrive/MyDrive"
DATA_DIR = os.path.join(HOME_DIR, 'data')

GIT_USERNAME = None
GIT_EMAIL = None
GIT_PASSWORD = None

## 1 - Google Drive

In [4]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [5]:
!mkdir -p {DATA_DIR}

## 2 - GCP

In [10]:
GCP_BUCKET_NAME = "deer-rudolph" #@param {type:"string"}
GCP_BUCKET_DATA_FOLDER_NAME = 'data' #@param {type:"string"}
GCP_PROJECT_NAME = 'deer-deep-learning-project'#@param {type:"string"}
GCP_PROJECT_ID = 'linear-freehold-314804' #@param {type:"string"}
GCP_HOME_DIR = os.path.join('gs://', GCP_BUCKET_NAME)

from google.colab import auth
auth.authenticate_user()
!gcloud config set project {GCP_PROJECT_ID}

Updated property [core/project].
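
`GCP_HOME_DIR` above is built with `os.path.join`, which yields `gs://deer-rudolph` on Colab's POSIX runtime but would use backslashes on Windows. The following is a minimal sketch of a URI helper that avoids that dependency; the name `gcs_uri` is hypothetical and not part of the original notebook.

```python
import posixpath


def gcs_uri(bucket, *parts):
    """Build a gs:// URI from a bucket name and optional path parts."""
    # Strip stray slashes so callers can pass 'data/' or '/camvid' safely.
    cleaned = [p.strip('/') for p in parts if p and p.strip('/')]
    # posixpath.join always uses '/', regardless of the host OS.
    return 'gs://' + posixpath.join(bucket.strip('/'), *cleaned)


print(gcs_uri('deer-rudolph', 'data', 'camvid'))
```

With such a helper, the destination for a dataset could be written as `gcs_uri(GCP_BUCKET_NAME, GCP_BUCKET_DATA_FOLDER_NAME, DATASET_NAME)`.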


In [7]:
!gsutil ls -al

gs://deer-rudolph/


In [8]:
!ls -al

total 36
drwxr-xr-x 1 root root 4096 Jun 28 06:28 .
drwxr-xr-x 1 root root 4096 Jun 28 06:25 ..
-rw-r--r-- 1 root root  720 Jun 28 06:28 adc.json
drwxr-xr-x 1 root root 4096 Jun 28 06:28 .config
drwx------ 6 root root 4096 Jun 28 06:26 gdrive
drwxr-xr-x 1 root root 4096 Jun 15 13:37 sample_data
drwxr-xr-x 2 root root 4096 Jun 28 06:26 {WS}


### [Caution] Copy GDrive data to GCS

In [None]:
%cd {DATA_DIR}
!ls

/content/gdrive/MyDrive/data
aihubsidewalk  Surface_1.zip  Surface_3.zip  Surface_5.zip
imagenet       Surface_2.zip  Surface_4.zip


### [Caution] Specific Dataset Example

In [22]:
DATASET_NAME = 'camvid' #@param {type:"string"}
print('from (Gdrive) - {}\nto (GCS) - {}'.format(os.path.join(DATA_DIR, DATASET_NAME), 
                                                 os.path.join(GCP_HOME_DIR, GCP_BUCKET_DATA_FOLDER_NAME, DATASET_NAME)))
print('\nGoogle drive :: ')
%ls {DATA_DIR}/{DATASET_NAME}/
print('\nGoogle Cloud Storage :: ')
!gsutil ls gs://{GCP_BUCKET_NAME}/{GCP_BUCKET_DATA_FOLDER_NAME}/{DATASET_NAME}
# If gsutil raises `CommandException: No URLs matched: <path>`, nothing exists at <path> yet; create the folder for <path> in the GCS console if you want it to exist before copying.

from (Gdrive) - /content/gdrive/MyDrive/data/camvid
to (GCS) - gs://deer-rudolph/data/camvid

Google drive :: 
[0m[01;34mtest[0m/       test.txt  [01;34mtrainannot[0m/  [01;34mval[0m/       val.txt
[01;34mtestannot[0m/  [01;34mtrain[0m/    train.txt    [01;34mvalannot[0m/

Google Cloud Storage :: 
CommandException: One or more URLs matched no objects.
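
The `CommandException` above indicates that no objects exist under the destination prefix yet. As a rough sketch, the same check can be made from Python by inspecting the exit status of `gsutil ls`, assuming gsutil exits with a non-zero status when nothing matches. The function name `gcs_prefix_exists` and the injectable `run` parameter are hypothetical, introduced here only for illustration.

```python
import subprocess


def gcs_prefix_exists(uri, run=subprocess.run):
    """Return True if `gsutil ls <uri>` succeeds, i.e. something matches the URI."""
    # `run` is injectable so the check can be exercised without gsutil installed.
    result = run(['gsutil', 'ls', uri], capture_output=True, text=True)
    # Assumption: a non-zero exit status means gsutil found no matching objects
    # (it can also mean an authentication or network failure).
    return result.returncode == 0


# Example usage inside the notebook (requires gsutil and prior authentication):
# gcs_prefix_exists('gs://deer-rudolph/data/camvid')
```

A `False` result before the first upload is expected and does not by itself mean the copy will fail.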


In [23]:
# If you have a large number of files to transfer, you can perform a parallel multi-threaded/multi-processing copy using the top-level gsutil -m option
!gsutil -m cp -r {DATA_DIR}/{DATASET_NAME} gs://{GCP_BUCKET_NAME}/{GCP_BUCKET_DATA_FOLDER_NAME}/{DATASET_NAME}

Copying file:///content/gdrive/MyDrive/data/camvid/val.txt [Content-Type=text/plain]...
Copying file:///content/gdrive/MyDrive/data/camvid/test.txt [Content-Type=text/plain]...
Copying file:///content/gdrive/MyDrive/data/camvid/test/0001TP_008550.png [Content-Type=image/png]...
Copying file:///content/gdrive/MyDrive/data/camvid/train.txt [Content-Type=text/plain]...
Copying file:///content/gdrive/MyDrive/data/camvid/test/0001TP_008640.png [Content-Type=image/png]...
Copying file:///content/gdrive/MyDrive/data/camvid/test/0001TP_008580.png [Content-Type=image/png]...

### [Caution] All datasets

In [None]:
# If you have a large number of files to transfer, you can perform a parallel multi-threaded/multi-processing copy using the top-level gsutil -m option
!gsutil -m cp -r {DATA_DIR} gs://{GCP_BUCKET_NAME}/{GCP_BUCKET_DATA_FOLDER_NAME}

Copying file:///content/gdrive/MyDrive/data/imagenet/train/n02537525.tar [Content-Type=application/x-tar]...
Copying file:///content/gdrive/MyDrive/data/imagenet/train/n02514041.tar [Content-Type=application/x-tar]...
Copying file:///content/gdrive/MyDrive/data/imagenet/train/n01728572.tar [Content-Type=application/x-tar]...
Copying file:///content/gdrive/MyDrive/data/imagenet/train/n01630670.tar [Content-Type=application/x-tar]...
Copying file:///content/gdrive/MyDrive/data/imagenet/train/n01669191.tar [Content-Type=application/x-tar]...
