# GCP Read Write Testing

## Imports

### *Library imports*

In [1]:
from google.cloud import storage

import os
import pandas as pd
import sys

### *Custom function imports from src*

In [2]:
sys.path.append(os.path.abspath('../src'))

from utils import gcp_read_csv, gcp_write_csv

## Other Prep

I created a new bucket (gcp_read_write_testing) in the Data Science project (ID: gamebeast-data-science) of the gamebeast.gg org.  I placed stineman_siblings.csv in this bucket.

## (1) astineman@gamebeast.gg

### Authentication

Running the following in cmd/bash to authenticate BEFORE executing any of the reads/writes:

```gcloud auth application-default login```

The above depends on first installing the [Google Cloud CLI tooling](https://cloud.google.com/sdk/docs/install).

Authenticating as astineman@gamebeast.gg.

In [3]:
client = storage.Client()

With authentication complete, I can initialize the google-cloud-storage client once for all tests.

### Read Test

In [4]:
stineman_siblings = gcp_read_csv(gcp_client_obj = client,
                                 bucket_name = 'gcp_read_write_testing',
                                 csv_file_name = 'stineman_siblings.csv')

In [5]:
stineman_siblings

Unnamed: 0,name,birthdate,height_inches
0,Alexandra,5/12/2011,68.0
1,Nathan,3/20/2009,71.5
2,Sarah,4/8/2006,71.0
3,Nicholas,8/27/2000,74.0
4,Andrew,10/21/1997,76.75


Import successful!

### Write Test

In [6]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gamebeast.csv')

Write complete.


### Overwrite Test

In [8]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gamebeast.csv')

Write complete.


Because the gcp_read_write_testing bucket has versioning enabled, this action (writing a file with an identical name to the bucket) makes the existing version noncurrent and the newly written file the current version.
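To see the current and noncurrent versions described above, the object listing can be asked to include every generation. This is a minimal sketch, not part of `src/utils`: `describe_versions` is a hypothetical helper name, and it assumes the credentials in use are allowed to list objects in the bucket (`storage.objects.list`).

```python
def describe_versions(client, bucket_name, object_name):
    """Return sorted (generation, is_current) pairs for every stored version of one object."""
    # versions=True asks Cloud Storage to include noncurrent generations in the listing.
    blobs = client.list_blobs(bucket_name, prefix=object_name, versions=True)
    versions = []
    for blob in blobs:
        if blob.name != object_name:
            continue  # prefix matching can also return other objects sharing the prefix
        # time_deleted is populated only once a generation has become noncurrent.
        versions.append((blob.generation, blob.time_deleted is None))
    return sorted(versions)
```

Calling `describe_versions(client, 'gcp_read_write_testing', 'stineman_siblings_gamebeast.csv')` after the overwrite should, if versioning behaves as described, return two pairs with only the newest generation flagged as current.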

## (2) andrewtstineman@gmail.com

Why this user?

astineman@gamebeast.gg has tons of permissions by virtue of being the Owner of the overarching project.  andrewtstineman@gmail.com allows us to experiment with what happens when permissions are missing.

### Authentication

Running the following in cmd/bash to remove the astineman@gamebeast.gg credentials:

```gcloud auth application-default revoke```

Now running the following in cmd/bash to authenticate as andrewtstineman@gmail.com BEFORE executing any of the reads/writes:

```gcloud auth application-default login```

Succeeded, but WARNING has appeared:

*Cannot add the project "gamebeast-data-science" to ADC as the quota project because the account in ADC does not have the "serviceusage.services.use" permission on this project. You might receive a "quota_exceeded" or "API not enabled" error. Run $ gcloud auth application-default set-quota-project to add a quota project.*

Response:

1. Navigated to the IAM service for the Data Science project, then added an "Allow" access for andrewtstineman@gmail.com with the "Service Usage Consumer" role
2. Ran ```gcloud auth application-default revoke``` to remove andrewtstineman@gmail.com credentials
3. Ran ```gcloud auth application-default login``` to again attempt to authenticate as andrewtstineman@gmail.com

Success!  I am now authenticated as andrewtstineman@gmail.com and gamebeast-data-science is appearing as the billing project.
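One way to double-check which quota (billing) project ended up attached to the credentials is to read the Application Default Credentials file that `gcloud auth application-default login` writes. A minimal sketch, assuming the default file location (it moves if `CLOUDSDK_CONFIG` is set) and user credentials; `adc_file_path` and `read_quota_project` are hypothetical helper names, not part of `src/utils`.

```python
import json
import os
from pathlib import Path


def adc_file_path():
    """Return the default location of the Application Default Credentials file."""
    if os.name == "nt":
        # Windows keeps the file under %APPDATA%\gcloud
        return Path(os.environ["APPDATA"]) / "gcloud" / "application_default_credentials.json"
    # Linux and macOS keep it under ~/.config/gcloud
    return Path.home() / ".config" / "gcloud" / "application_default_credentials.json"


def read_quota_project(path):
    """Return the quota project recorded in an ADC file, or None if none is set."""
    with open(path, encoding="utf-8") as handle:
        credentials = json.load(handle)
    return credentials.get("quota_project_id")
```

Usage would be `read_quota_project(adc_file_path())`. A result of `None` would correspond to the situation in the warning above, where no quota project could be added.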

In [9]:
client = storage.Client()

With authentication complete, I can initialize the google-cloud-storage client once for all tests.

### Read Test

In [10]:
stineman_siblings = gcp_read_csv(gcp_client_obj = client,
                                 bucket_name = 'gcp_read_write_testing',
                                 csv_file_name = 'stineman_siblings.csv')

Forbidden: 403 GET https://storage.googleapis.com/download/storage/v1/b/gcp_read_write_testing/o/stineman_siblings.csv?alt=media: andrewtstineman@gmail.com does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

Error! 403 GET: andrewtstineman@gmail.com does not have storage.objects.get access to the Google Cloud Storage object.

Response:

1. Navigated to the Permissions settings for the gcp_read_write_testing bucket, then added access for andrewtstineman@gmail.com with the "Storage Object Viewer" role
2. Ran ```gcloud auth application-default revoke``` to remove andrewtstineman@gmail.com credentials
3. Ran ```gcloud auth application-default login``` to again attempt to authenticate as andrewtstineman@gmail.com
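The 403 message names the exact permission that is missing, which is what pointed me at the right role. That lookup can be sketched as a small text match. This assumes the message wording stays as it appears in this notebook, which is not a documented contract; `missing_permission` is a hypothetical helper name.

```python
import re

# Matches the "does not have <permission> access" phrasing seen in the 403 messages here.
PERMISSION_PATTERN = re.compile(r"does not have (storage\.[A-Za-z]+\.[A-Za-z]+) access")


def missing_permission(error_message):
    """Return the permission named in a Cloud Storage 403 message, or None if absent."""
    match = PERMISSION_PATTERN.search(error_message)
    return match.group(1) if match else None
```

In practice the input would be the text of the caught exception, e.g. `missing_permission(str(err))`.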

In [11]:
client = storage.Client()

Re-initializing the client

In [12]:
stineman_siblings = gcp_read_csv(gcp_client_obj = client,
                                 bucket_name = 'gcp_read_write_testing',
                                 csv_file_name = 'stineman_siblings.csv')

In [13]:
stineman_siblings

Unnamed: 0,name,birthdate,height_inches
0,Alexandra,5/12/2011,68.0
1,Nathan,3/20/2009,71.5
2,Sarah,4/8/2006,71.0
3,Nicholas,8/27/2000,74.0
4,Andrew,10/21/1997,76.75


Import successful!

### Write Test

In [14]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gmail.csv')

Forbidden: 403 POST https://storage.googleapis.com/upload/storage/v1/b/gcp_read_write_testing/o?uploadType=multipart: {
  "error": {
    "code": 403,
    "message": "andrewtstineman@gmail.com does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).",
    "errors": [
      {
        "message": "andrewtstineman@gmail.com does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may not exist).",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}
: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)

Error! 403 POST: andrewtstineman@gmail.com does not have storage.objects.create access to the Google Cloud Storage object.

Response:

1. Navigated to the Permissions settings for the gcp_read_write_testing bucket, then added access for andrewtstineman@gmail.com with the "Storage Object Creator" role
2. Ran ```gcloud auth application-default revoke``` to remove andrewtstineman@gmail.com credentials
3. Ran ```gcloud auth application-default login``` to again attempt to authenticate as andrewtstineman@gmail.com

In [15]:
client = storage.Client()

Re-initializing the client

In [16]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gmail.csv')

Write complete.


### Overwrite Test

In [18]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gmail.csv')

Forbidden: 403 POST https://storage.googleapis.com/upload/storage/v1/b/gcp_read_write_testing/o?uploadType=multipart: {
  "error": {
    "code": 403,
    "message": "andrewtstineman@gmail.com does not have storage.objects.delete access to the Google Cloud Storage object.",
    "errors": [
      {
        "message": "andrewtstineman@gmail.com does not have storage.objects.delete access to the Google Cloud Storage object.",
        "domain": "global",
        "reason": "forbidden"
      }
    ]
  }
}
: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>)

Error! 403 POST: andrewtstineman@gmail.com does not have storage.objects.delete access to the Google Cloud Storage object.

Are we really trying to delete a file?  In essence, yes - we have to delete the existing file and replace it with the new file.  The existing file is then logged as a noncurrent version, given that we have versioning enabled for our bucket.
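The create-versus-overwrite distinction can be sketched as a small pre-check: a brand-new object name needs only create, while an existing name also needs delete. This is a sketch of the behavior observed in this notebook, not of what `gcp_write_csv` does internally; `permissions_needed_for_write` is a hypothetical helper, and the existence check itself assumes the caller can read object metadata (`storage.objects.get`).

```python
def permissions_needed_for_write(bucket, object_name):
    """List the permissions a write to object_name needs, per the behavior seen above."""
    needed = ["storage.objects.create"]
    # Blob.exists() makes a metadata request, so it needs storage.objects.get itself.
    if bucket.blob(object_name).exists():
        # Replacing an existing object also requires permission to delete it.
        needed.append("storage.objects.delete")
    return needed
```

Usage would be `permissions_needed_for_write(client.bucket('gcp_read_write_testing'), 'stineman_siblings_gmail.csv')`.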

Response:

1. Navigated to the Permissions settings for the gcp_read_write_testing bucket, then added access for andrewtstineman@gmail.com with the "Storage Object Admin" role
    1. No predefined role grants only the storage.objects.delete permission, so I used Storage Object Admin, which includes it.
    2. Because Storage Object Admin is a superset of (i.e., is inclusive of) the Storage Object Viewer and Storage Object Creator roles, I also removed the duplicative Viewer and Creator roles.
2. Ran ```gcloud auth application-default revoke``` to remove andrewtstineman@gmail.com credentials
3. Ran ```gcloud auth application-default login``` to again attempt to authenticate as andrewtstineman@gmail.com

In [19]:
client = storage.Client()

Re-initializing the client

In [21]:
gcp_write_csv(gcp_client_obj = client,
              dataframe = stineman_siblings,
              bucket_name = 'gcp_read_write_testing',
              csv_file_name = 'stineman_siblings_gmail.csv')

Write complete.


### Clean-up

I don't want andrewtstineman@gmail.com to have lasting access to my bucket/Data Science project billing, so I have taken the following actions to remove andrewtstineman@gmail.com permissions:

1. Navigated to the IAM service for the Data Science project, then removed the "Service Usage Consumer" role access for andrewtstineman@gmail.com
2. Navigated to the Permissions settings for the gcp_read_write_testing bucket, then removed "Storage Object Admin" role access for andrewtstineman@gmail.com

andrewtstineman@gmail.com no longer has access to any Data Science project services.

## Conclusion

Experimented with reads/writes at https://console.cloud.google.com/storage/browser/gcp_read_write_testing?inv=1&invt=Ab0fGw&project=gamebeast-data-science

Key findings:

1. As owner of the project, I can do whatever I want in terms of reading/writing/overwriting
2. For my personal gmail account, I need several specific permissions:
    1. I need a project-level role (Service Usage Consumer) to use project resources
    2. I need a bucket-level role (Storage Object Viewer) to read objects
    3. I need a bucket-level role (Storage Object Creator) to write new objects
    4. I need a bucket-level role to allow for overwriting of existing objects (Storage Object Admin - this is inclusive of the Viewer and Creator roles, so those roles can be dropped as duplicates)

Need to leverage the above research to inform the IAM scheme I use to share my projects within Gamebeast.  Perhaps a certain group gets read/write access, while I maintain read/write/overwrite access?