In [4]:
import os; os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'

In [5]:
%load_ext autoreload
%autoreload 2

# Terminology

- **web console**
    - accessed from a browser (login required)
- **project**
    - the projects menu is located at the top-left corner of the web console
    - **ALL** resources **MUST** belong to a project
    - in other words, **ALL** resources **MUST** be associated with exactly one project
- **bucket** (a project resource)
    - storage space associated with a project
    - accessed from the left side menu under **'Cloud Storage'**
    - access rights can be configured separately (users **DO NOT** need to be members of the associated project)
- **blob**
    - every directory and file inside a bucket is a blob
    - the web console offers only limited ways to manipulate blobs
    - accessible with the GCP command-line tools (**gcloud** and **gsutil**)
    - also accessible in Python code with ```from google.cloud import storage```
    - access rights depend on the environment variable ```GOOGLE_APPLICATION_CREDENTIALS=<key_location>```
- **key**
    - associated with a project
    - a credential (called a token in some libraries) for accessing resources
    - access rights can be managed per key (different keys can be used in different parts of the code to avoid overwriting important files)
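
The idea of using a different key in different parts of the code can be sketched as follows. This is only an illustration: the role names (`reader`, `writer`), the key file naming scheme and the helper names are hypothetical and not part of this project. The only library call used is `storage.Client.from_service_account_json`, which builds a client from an explicit key file instead of `GOOGLE_APPLICATION_CREDENTIALS`.

```python
from pathlib import Path


def key_path_for(role, key_dir):
    """Map a role name to a key file inside key_dir (hypothetical naming scheme)."""
    if role not in ("reader", "writer"):
        raise ValueError(f"unknown role: {role!r}")
    return Path(key_dir) / f"{role}-key.json"


def client_for(role, key_dir):
    """Build a storage client authenticated with the key for the given role."""
    # Imported lazily so key_path_for can be used without the library installed.
    from google.cloud import storage

    return storage.Client.from_service_account_json(str(key_path_for(role, key_dir)))


# Usage (assumes both key files exist in the hypothetical directory):
# read_only = client_for("reader", "/home/<os_user_name>/code/<github_nickname>/gcp")
# read_write = client_for("writer", "/home/<os_user_name>/code/<github_nickname>/gcp")
```

Code that only needs to read can then hold the `reader` client, so a bug in that code cannot overwrite files, provided the key itself was granted read-only access.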

# Access Bucket in web console

https://console.cloud.google.com/storage/browser/languini-ai-bucket<br/>
Click the link above to access the bucket in the web console (you must be logged in to your GCP account).<br/>
The web console provides basic operations on the bucket:
- **OBJECTS** tab: shows the file structure and provides basic features such as upload.
- **CONFIGURATION** tab: shows detailed information about the _blob_

[Download key](https://storage.cloud.google.com/languini-ai-bucket/languini-ai-336b4987d9c1.json) to enable access from ```gcloud```, ```gsutil``` and Python scripts (including IPython and Jupyter)
- place the downloaded key somewhere safe (e.g. ```/home/<os_user_name>/code/<github_nickname>/gcp/```)

# Access Bucket in Terminal, Jupyter Notebook and Python code

## Set up bucket access rights for ```gcloud```, ```gsutil``` and Python scripts

### using ```.env```
paste the following lines into the project's ```.env``` file

```
GOOGLE_APPLICATION_CREDENTIALS=/home/<os_user_name>/code/<github_nickname>/gcp/languini-ai-336b4987d9c1.json
BUCKET_NAME=languini-ai-bucket
```
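
A `.env` file is not read automatically by Python; something has to load it into the process environment. The sketch below is a minimal standard-library loader, assuming the file contains only simple `KEY=VALUE` lines like the ones above. The function name `load_env_file` is made up for this example; projects that already depend on `python-dotenv` would normally call its `load_dotenv()` instead.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from path into os.environ and return the parsed pairs."""
    loaded = {}
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip('"').strip("'")
            # Variables already set in the environment are left untouched.
            os.environ.setdefault(key, value)
            loaded[key] = value
    return loaded


# Usage: call once at start-up, before creating a storage client.
# load_env_file(".env")
```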


### using terminal
```
export GOOGLE_APPLICATION_CREDENTIALS=/home/<os_user_name>/code/<github_nickname>/gcp/languini-ai-336b4987d9c1.json
export BUCKET_NAME=languini-ai-bucket
```

### for **jupyter notebook**
Jupyter might not inherit environment variables (this depends on how you start Jupyter Notebook).<br/>
Copy the content of the cell below into your notebook to access bucket resources.

In [34]:
%env GOOGLE_APPLICATION_CREDENTIALS=/home/<os_user_name>/code/<github_nickname>/gcp/languini-ai-336b4987d9c1.json
%env BUCKET_NAME=languini-ai-bucket
BUCKET_NAME = os.getenv('BUCKET_NAME')

env: GOOGLE_APPLICATION_CREDENTIALS=/home/zach/code/matthias-403/languini-ai-336b4987d9c1.json
env: BUCKET_NAME=languini-ai-bucket


## Access bucket in python (including ipython and jupyter notebook)

**Prerequisites**<br/>
Run the cell below to check that the environment variables are set correctly.

In [62]:
!echo $GOOGLE_APPLICATION_CREDENTIALS
!echo $BUCKET_NAME

/home/zach/code/matthias-403/languini-ai-336b4987d9c1.json
languini-ai-bucket
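
`echo` prints an empty line when a variable is missing, which is easy to overlook. The same check can be done in Python with an explicit message; the sketch below also verifies that the key file exists. The helper name `check_gcp_env` is hypothetical.

```python
import os
from pathlib import Path


def check_gcp_env(environ=os.environ):
    """Return a list of problems found; an empty list means the setup looks fine."""
    problems = []
    key = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(key).is_file():
        problems.append(f"key file not found: {key}")
    if not environ.get("BUCKET_NAME"):
        problems.append("BUCKET_NAME is not set")
    return problems


# Print one warning per problem; prints nothing when everything is in place.
for problem in check_gcp_env():
    print("WARNING:", problem)
```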


The cell below is boilerplate code for accessing bucket resources in Python.

In [63]:
import os
from google.cloud import storage

BUCKET_NAME = os.getenv('BUCKET_NAME')
storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)

### list files in bucket

In [64]:
directory = 'forvo_api'
blobs = bucket.list_blobs(prefix=directory)
for blob in blobs:
    print(blob.name)

forvo_api/
forvo_api/20221130/
forvo_api/20221130/jpg/
forvo_api/20221130/jpg/awgn_noise/1st_speaker/a.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/about.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/act.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/after.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/again.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/air.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/all.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/always.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/an.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/and.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/animal.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/answer.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/appear.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/as.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/ask.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/at.jpg
forvo_api/20221130/jpg/awgn_noise/1st_speaker/back.jpg
forvo_api/202
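
As the listing shows, the results include "directory" placeholders (names ending in `/`) alongside real files. A small filter, sketched below, keeps only the files and can optionally restrict them to one extension. The helper name `file_names` is made up for this example, and the sample names are taken from the output above.

```python
def file_names(blob_names, suffix=""):
    """Keep names of real objects (not placeholders ending in '/') that end with suffix."""
    return [n for n in blob_names if not n.endswith("/") and n.endswith(suffix)]


names = [
    "forvo_api/",
    "forvo_api/20221130/jpg/",
    "forvo_api/20221130/jpg/awgn_noise/1st_speaker/a.jpg",
]
print(file_names(names, suffix=".jpg"))

# With a real bucket (requires credentials):
# jpgs = file_names((blob.name for blob in bucket.list_blobs(prefix=directory)), ".jpg")
```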

### write file to bucket

In [80]:
USER_NAME = os.getenv('USER')
file_location = f'bucket_access_testbed/{USER_NAME}.txt'
content_to_write = f'Message from {USER_NAME}: Hello world'
b = bucket.blob(file_location)
with b.open('w') as f:  # use 'wb' for binary files (e.g. .mp3, .wav)
    f.write(content_to_write)
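
When the content already exists as a file on disk, it may be simpler to upload the file directly rather than open the blob and write to it. The sketch below wraps the library's `Blob.upload_from_filename` in a hypothetical helper, `upload_file`; the paths in the usage comment are placeholders.

```python
def upload_file(bucket, local_path, destination):
    """Upload the local file at local_path to the bucket as the object named destination."""
    blob = bucket.blob(destination)
    blob.upload_from_filename(str(local_path))  # works for text and binary files
    return destination


# Usage with the `bucket` object created in the boilerplate cell:
# upload_file(bucket, "recordings/hello.wav", "bucket_access_testbed/hello.wav")
```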

### read file from bucket

In [81]:
file_location = f'bucket_access_testbed/{USER_NAME}.txt'
b = bucket.blob(file_location)
with b.open('r') as f:  # use 'rb' for binary files (e.g. .mp3, .wav)
    file = f.read()

In [82]:
file

'Message from zach: Hello world'
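
To keep a local copy instead of reading the content into memory, the library's `Blob.download_to_filename` can be used. The helper below is a sketch with a made-up name, `download_file`; it creates the local parent directory first, since the download fails if that directory does not exist.

```python
from pathlib import Path


def download_file(bucket, source, local_path):
    """Download the object named source from the bucket to local_path and return the path."""
    local_path = Path(local_path)
    local_path.parent.mkdir(parents=True, exist_ok=True)
    bucket.blob(source).download_to_filename(str(local_path))
    return local_path


# Usage with the `bucket` object created in the boilerplate cell:
# download_file(bucket, file_location, "downloads/message.txt")
```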

## Access bucket in Terminal

### List blobs in bucket
- ```gsutil ls gs://languini-ai-bucket/```
- ```gcloud storage ls gs://languini-ai-bucket/```
### Upload directory to bucket
_omit ```-r``` or ```--recursive``` when uploading a single file_
- ```gsutil cp -r <source_directory> gs://<BUCKET_NAME>/<destination_directory>```
- ```gcloud storage cp --recursive <source_directory> gs://<BUCKET_NAME>/<destination_directory>```