# Google Cloud Storage
---
Cloud Storage is a service for storing objects in Google Cloud. An object is an immutable piece of data consisting of a file of any format. You store objects in containers called buckets. Every bucket belongs to a project, and you can group projects under an organization. Each project, bucket, and object is a Google Cloud resource, as are things such as Compute Engine instances.

After you create a project, you can create Cloud Storage buckets, upload objects to them, and download objects from them. You can also grant permissions that make your data accessible to principals you specify or, for certain use cases such as hosting a website, to everyone on the public internet.
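To make the permission model a little more concrete, here is a minimal sketch of how a role binding is represented in an IAM policy document. The helper name `grant_role` and the example address are hypothetical, and the sketch ignores conditional bindings. `allUsers` is the special member that stands for everyone on the public internet, and `roles/storage.objectViewer` is the read-only object role.

```python
def grant_role(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy dict with `member` added to `role`.

    Hypothetical helper for illustration; it ignores conditional bindings.
    """
    # Copy each binding (and its member list) so the input policy is untouched.
    bindings = [
        dict(b, members=list(b.get("members", [])))
        for b in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No binding for this role yet, so create one.
        bindings.append({"role": role, "members": [member]})
    return {**policy, "bindings": bindings}


# Grant read access to one principal (example address), then to everyone.
policy = grant_role({"bindings": []}, "roles/storage.objectViewer", "user:alice@example.com")
public_policy = grant_role(policy, "roles/storage.objectViewer", "allUsers")
print(public_policy)
```

The sketch only builds the policy document. Applying it to a bucket is a separate step through the console, the CLI, or a client library.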

You can interact with Cloud Storage in several ways:
- **Console**: The Google Cloud console provides a visual interface for you to manage your data in a browser.
- **Google Cloud CLI**: The gcloud CLI allows you to interact with Cloud Storage through a terminal using gcloud storage commands.
- **Client libraries**: The Cloud Storage client libraries allow you to manage your data using one of your preferred languages, including C++, C#, Go, Java, Node.js, PHP, Python, and Ruby.
- **REST APIs**: Manage your data using the JSON or XML API.
- **Terraform**: Terraform is an infrastructure-as-code (IaC) tool that you can use to provision the infrastructure for Cloud Storage.
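As a rough illustration of the REST option, the sketch below builds the JSON API URL that returns an object's data. The function name `object_media_url` and the bucket and object names are placeholders. The sketch only constructs the URL. A real request also needs an `Authorization` header with an OAuth 2.0 token unless the object is publicly readable.

```python
from urllib.parse import quote, urlencode

JSON_API_BASE = "https://storage.googleapis.com/storage/v1"


def object_media_url(bucket_name: str, object_name: str) -> str:
    """Build the JSON API URL for downloading an object's data."""
    # Object names may contain "/", which must be percent-encoded
    # when the name is used as a single path segment.
    encoded_name = quote(object_name, safe="")
    # alt=media asks for the object's contents rather than its metadata.
    query = urlencode({"alt": "media"})
    return f"{JSON_API_BASE}/b/{bucket_name}/o/{encoded_name}?{query}"


print(object_media_url("your-bucket-name", "2021/ws2-output.csv"))
```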

Reference:
- Upload an object to a bucket: https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
- Download an object from a bucket: https://cloud.google.com/storage/docs/downloading-objects#storage-download-object-python

## Google Cloud CLI
Cloud Shell (gsutil CLI) <-> Cloud Storage (GCS buckets)

In [1]:
# !wget -O data.zip https://file.designil.com/bhXYol+

--2023-03-03 11:46:36--  https://file.designil.com/bhXYol+
Resolving file.designil.com (file.designil.com)... 2606:4700:3035::ac43:8261, 2606:4700:3031::6815:851, 104.21.8.81, ...
Connecting to file.designil.com (file.designil.com)|2606:4700:3035::ac43:8261|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-std.droplr.net/files/acc_513973/bhXYol [following]
--2023-03-03 11:46:37--  https://cdn-std.droplr.net/files/acc_513973/bhXYol
Resolving cdn-std.droplr.net (cdn-std.droplr.net)... 65.9.181.123, 65.9.181.38, 65.9.181.36, ...
Connecting to cdn-std.droplr.net (cdn-std.droplr.net)|65.9.181.123|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6290632 (6.0M) [application/zip]
Saving to: 'data.zip'


In [3]:
# !unzip data.zip

Archive:  data.zip
  inflating: ws2-output.csv          


Upload data from local to GCS buckets

In [None]:
# !gsutil cp ws2-output.csv gs://nutbodyslam053-r2de2-datalake
# !gsutil cp ws2-output.csv gs://nutbodyslam053-r2de2-datalake/2021/ws2-output.csv
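The two commands above differ only in how the destination object is named: copying to the bare bucket keeps the local file name, while a full path sets the object name explicitly. The following sketch mirrors that naming rule in Python. `resolve_destination` is a hypothetical helper, not part of gsutil, and it ignores gsutil's extra check for existing "folder" prefixes.

```python
import os


def resolve_destination(local_path: str, destination_uri: str) -> tuple[str, str]:
    """Return (bucket, object_name) for copying local_path to destination_uri."""
    scheme = "gs://"
    if not destination_uri.startswith(scheme):
        raise ValueError(f"not a gs:// URI: {destination_uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object name.
    bucket, _, object_name = destination_uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError("missing bucket name")
    # A bare bucket or a trailing slash means "keep the local file name".
    if object_name == "" or object_name.endswith("/"):
        object_name += os.path.basename(local_path)
    return bucket, object_name


print(resolve_destination("ws2-output.csv", "gs://nutbodyslam053-r2de2-datalake"))
print(resolve_destination("ws2-output.csv", "gs://nutbodyslam053-r2de2-datalake/2021/ws2-output.csv"))
```

Object names are flat strings. The `2021/` part is just a prefix of the name, which is why no directory has to exist beforehand.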

Download data from GCS buckets to local

In [None]:
# !gsutil cp gs://nutbodyslam053-r2de2-datalake/ws2-output.csv output.csv

## Client libraries
Python libraries (google-cloud-storage) <-> Cloud Storage (GCS buckets)

In [None]:
# !pip install google-cloud-storage

In [None]:
# Upload a local file to a GCS bucket, or download an object from the bucket to a local file
from google.cloud import storage

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"
    # The path to your file to upload
    # source_file_name = "local/path/to/file"
    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    # Optional: set a generation-match precondition to avoid potential race conditions
    # and data corruptions. The request to upload is aborted if the object's
    # generation number does not match your precondition. For a destination
    # object that does not yet exist, set the if_generation_match precondition to 0.
    # If the destination object already exists in your bucket, set instead a
    # generation-match precondition using its generation number.
    generation_match_precondition = 0

    blob.upload_from_filename(source_file_name, if_generation_match=generation_match_precondition)

    print(
        f"File {source_file_name} uploaded to {destination_blob_name}."
    )

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"

    # The ID of your GCS object
    # source_blob_name = "storage-object-name"

    # The path to which the file should be downloaded
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client()

    bucket = storage_client.bucket(bucket_name)

    # Construct a client side representation of a blob.
    # Note `Bucket.blob` differs from `Bucket.get_blob` as it doesn't retrieve
    # any content from Google Cloud Storage. As we don't need additional data,
    # using `Bucket.blob` is preferred here.
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Downloaded storage object {} from bucket {} to local file {}.".format(
            source_blob_name, bucket_name, destination_file_name
        )
    )

if __name__ == "__main__":
    load = input("Upload (u) or Download (d)? ")
    bucket_name = "nutbodyslam053-r2de2-datalake"
    file_name = "data/ws2-output.csv"  # local file path
    gcs_file_name = "data/output.csv"  # object name in the bucket

    if load.lower() == "u":
        upload_blob(
            bucket_name=bucket_name,
            source_file_name=file_name,
            destination_blob_name=gcs_file_name
        )

    elif load.lower() == "d":
        download_blob(
            bucket_name=bucket_name, 
            source_blob_name=gcs_file_name,
            destination_file_name=file_name
        )
    
    else:
        print("Please input upload (u) or download (d).")
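The `if_generation_match=0` precondition used in `upload_blob` means the upload only succeeds while the destination object does not exist yet, so running the upload a second time against the same object name is expected to fail. The toy model below illustrates that rule in memory. `ToyBucket` and its `PreconditionFailed` are stand-ins written for this sketch, not the client library, and real generation numbers are large server-assigned values rather than a simple counter.

```python
class PreconditionFailed(Exception):
    """Raised when a generation-match precondition is not met (toy stand-in)."""


class ToyBucket:
    """In-memory model of object generations; not the real client library."""

    def __init__(self):
        self._objects = {}  # object name -> (generation, data)
        self._next_generation = 1

    def generation(self, name):
        """Current generation of an object, or 0 if it does not exist."""
        return self._objects.get(name, (0, None))[0]

    def upload(self, name, data, if_generation_match=None):
        # With a precondition, the write only proceeds when the stored
        # generation equals the expected one (0 means "must not exist yet").
        if if_generation_match is not None and self.generation(name) != if_generation_match:
            raise PreconditionFailed(name)
        generation = self._next_generation
        self._next_generation += 1
        self._objects[name] = (generation, data)
        return generation


bucket = ToyBucket()
first = bucket.upload("data/output.csv", b"v1", if_generation_match=0)
try:
    # Same precondition again: the object now exists, so this is rejected.
    bucket.upload("data/output.csv", b"v2", if_generation_match=0)
except PreconditionFailed:
    print("object already exists, upload aborted")
# Overwriting is allowed when the expected generation matches the current one.
second = bucket.upload("data/output.csv", b"v2", if_generation_match=first)
print(first, second)
```

In the script above, overwriting an existing object would therefore require either passing the object's current generation number or dropping the precondition.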

### gsutil CLI

```Bash
# Create a new Bucket
gsutil mb gs://[BUCKET]

# List files in bucket
gsutil ls gs://[BUCKET]

# Check bucket usage (du: disk usage)
gsutil du -sh gs://[BUCKET]

# Copy (upload) a file to the bucket root
gsutil cp [File] gs://[BUCKET]
# > upload to a nested path (no need to create the directory first)
gsutil cp [File] gs://[BUCKET]/path/to/file
# > copy (upload) a directory recursively
gsutil cp -r [Folder] gs://[BUCKET]
# > run the copy in parallel (-m: multi-threaded/multi-processing)
gsutil -m cp -r [Folder] gs://[BUCKET]
# > copy (download) a file
gsutil cp gs://[BUCKET]/path/to/file [File]
# > copy (download) a directory to the current local directory
gsutil -m cp -r gs://[BUCKET]/path .
# > copy (download) txt files to the current local directory
gsutil -m cp gs://[BUCKET]/*.txt .

# Move or rename object
gsutil mv gs://[BUCKET]/path/to/old_name gs://[BUCKET]/path/to/new_name

# Remove file or directory 
# > remove file
gsutil rm gs://[BUCKET]/path/to/file
# > remove directory
gsutil rm -r gs://[BUCKET]/path/to/directory


# Doc: https://cloud.google.com/storage/docs/how-to
```
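For comparison, a rough Python counterpart to `gsutil ls` and `gsutil du -s` can be sketched with the client library's `Client.list_blobs`. The helper name `bucket_usage` is hypothetical. The client is passed in as an argument so the function can also be tried with a stand-in object, and the real-client usage is left commented out because it needs credentials.

```python
def bucket_usage(client, bucket_name, prefix=None):
    """Return (object_count, total_bytes) for objects under an optional prefix."""
    count = 0
    total_bytes = 0
    # list_blobs iterates over the objects page by page; each blob
    # exposes its name and its size in bytes.
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        count += 1
        total_bytes += blob.size or 0
    return count, total_bytes


# Example with the real client (assumes google-cloud-storage is installed
# and application default credentials are configured):
# from google.cloud import storage
# print(bucket_usage(storage.Client(), "your-bucket-name", prefix="2021/"))
```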
