# Storage Commands

The notebook environment provides a set of commands for working with data stored in Google Cloud Storage. They help you work with data files that are not stored in BigQuery, and manage data imported into or exported from BigQuery.

This notebook introduces several of the Cloud Storage commands available in the notebook environment.

## The Commands

The commands can list storage buckets and their contained objects, manage those objects, and read from and write to those objects.

In [4]:
%%gcs --help

usage: gcs [-h] {copy,create,delete,list,read,view,write} ...

Execute various Google Cloud Storage related operations. Use "%gcs <command>
-h" for help on a specific command.

positional arguments:
  {copy,create,delete,list,read,view,write}
                        commands
    copy                Copy one or more Google Cloud Storage objects to a
                        different location.
    create              Create one or more Google Cloud Storage buckets.
    delete              Delete one or more Google Cloud Storage buckets or
                        objects.
    list                List buckets in a project, or contents of a bucket.
    read                Read the contents of a Google Cloud Storage object
                        into a Python variable.
    view                View the contents of a Google Cloud Storage object.
    write               Write the value of a Python variable to a Google Cloud
                        Storage object.

optional arguments:
  -h, --help            show this help message and exit

# Buckets and Objects

Items or files held in Cloud Storage are called `objects`. These objects are immutable once written. They are organized into buckets.
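
The commands in this notebook address buckets and objects with `gs://` paths. As a minimal sketch of how such a path breaks down, the hypothetical helper `split_gcs_path` below (not part of any library) separates the bucket name from the object name. Note that everything after the first `/` belongs to the object name, because a bucket holds a flat set of objects rather than real directories.

```python
def split_gcs_path(path):
    """Split 'gs://bucket/object/name' into (bucket, object_name).

    object_name is '' when the path names only a bucket.
    """
    prefix = 'gs://'
    if not path.startswith(prefix):
        raise ValueError('expected a path starting with gs://: %r' % path)
    # Only the first '/' separates bucket from object; later ones are
    # ordinary characters in the object name.
    bucket, _, object_name = path[len(prefix):].partition('/')
    if not bucket:
        raise ValueError('missing bucket name in %r' % path)
    return bucket, object_name


print(split_gcs_path('gs://cloud-datalab-samples/carprices/testing.csv'))
# ('cloud-datalab-samples', 'carprices/testing.csv')
print(split_gcs_path('gs://cloud-datalab-samples'))
# ('cloud-datalab-samples', '')
```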

## Listing

First, a couple of commands to list Datalab sample data. Try `%%gcs list` without arguments to list all buckets within the current project:

In [None]:
%%gcs list

In [5]:
%%gcs list --objects gs://cloud-datalab-samples

Name,Type,Size,Updated
applogs,application/octet-stream,506050,2015-11-24 00:06:07.588000+00:00
carprices/testing.csv,text/csv,3635,2015-10-06 09:02:03.638000+00:00
carprices/training.csv,text/csv,15018,2015-10-06 09:01:46.040000+00:00
cars.csv,text/csv,248,2015-10-05 04:58:10.481000+00:00
cars2.csv,text/csv,92,2015-10-05 05:41:30.935000+00:00
census/,application/x-www-form-urlencoded;charset=UTF-8,0,2017-03-05 05:51:55.107000+00:00
census/ACS2014_PUMS_README.pdf,application/pdf,289316,2017-03-05 05:52:31.193000+00:00
census/ss14psd.csv,binary/octet-stream,8189323,2017-03-05 05:53:54.728000+00:00
hello.txt,text/plain,14,2015-10-05 04:48:39.433000+00:00
httplogs/logs20140615.csv,text/csv,23799981,2015-10-06 08:39:42.605000+00:00


You can also use wildcards to list all objects matching a pattern:

In [9]:
%%gcs list --objects gs://cloud-datalab-samples/udf*

Name,Type,Size,Updated
udfsample/,application/x-www-form-urlencoded;charset=UTF-8,0,2015-11-23 23:57:38.494000+00:00
udfsample/2015_station_data.csv,text/csv,4230,2015-11-24 00:20:14.575000+00:00
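
To get a feel for which names a pattern such as `udf*` selects, the following sketch filters a few object names from the listings above using Python's standard `fnmatch` module. This is only a local illustration: `match_names` is a hypothetical helper, and it is an assumption that `fnmatch` semantics approximate what the `%%gcs list` command does with wildcards.

```python
from fnmatch import fnmatchcase

# A few object names taken from the listings in this notebook.
names = [
    'applogs',
    'cars.csv',
    'cars2.csv',
    'hello.txt',
    'udfsample/',
    'udfsample/2015_station_data.csv',
]


def match_names(names, pattern):
    """Return the names that match a shell-style pattern, case-sensitively."""
    return [n for n in names if fnmatchcase(n, pattern)]


print(match_names(names, 'udf*'))
# ['udfsample/', 'udfsample/2015_station_data.csv']
print(match_names(names, 'cars*.csv'))
# ['cars.csv', 'cars2.csv']
```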


## Creating a Bucket

In [14]:
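# Build a unique bucket name for the purposes of the sample
from google.cloud import storage
import random, string

project = "project_id"  # placeholder: replace with your project ID
storage_client = storage.Client()
# A random five-letter suffix keeps the bucket name unique
suffix = ''.join(random.choice(string.ascii_lowercase) for _ in range(5))
sample_bucket_name = project + '-datalab-samples-' + suffix
sample_bucket_path = 'gs://' + sample_bucket_name
sample_bucket_object = sample_bucket_path + '/Hello.txt'

print('Bucket: ' + sample_bucket_path)
print('Object: ' + sample_bucket_object)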


Bucket: gs://mysampleproject-datalab-samples-abcde
Object: gs://mysampleproject-datalab-samples-abcde/Hello.txt


In [None]:
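# Bucket takes the client first, then the bucket name
sample_bucket = storage.Bucket(storage_client, sample_bucket_name)
if not sample_bucket.exists():
    sample_bucket.create()
sample_bucket.exists()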
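
Bucket creation fails if the name breaks Cloud Storage's naming rules, so it can help to check a generated name locally first. The sketch below is a simplification under stated assumptions: `make_bucket_name` and `looks_like_valid_bucket_name` are hypothetical helpers, and the check covers only the basic rules (3 to 63 characters; lowercase letters, digits, dashes, underscores and dots; starting and ending with a letter or digit). It does not cover every restriction, such as reserved prefixes, and it cannot tell whether the name is already taken.

```python
import random
import re
import string

# Basic shape of a bucket name: 3-63 characters drawn from lowercase letters,
# digits, '-', '_' and '.', beginning and ending with a letter or digit.
_BUCKET_NAME_RE = re.compile(r'[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]')


def make_bucket_name(project, rng=random):
    """Return project + '-datalab-samples-' + five random lowercase letters."""
    suffix = ''.join(rng.choice(string.ascii_lowercase) for _ in range(5))
    return project + '-datalab-samples-' + suffix


def looks_like_valid_bucket_name(name):
    """Check only the basic length and character rules (a simplification)."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


name = make_bucket_name('mysampleproject')
print(name, looks_like_valid_bucket_name(name))
```

Passing a seeded `random.Random` instance as `rng` makes the generated suffix reproducible, which can be convenient when testing.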