# Using Google Cloud Datalab - Accessing Cloud Data

This notebook describes how Google Cloud Datalab integrates with your Google Cloud project, and how you can work with data, manage your notebooks, and invoke APIs that are part of Google Cloud Platform.

## Google Cloud Integration

In [1]:
from google.datalab import Context

context = Context.default()
print('The current project is %s' % context.project_id)

The current project is is833-demo


Datalab handles authentication automatically: it detects the current project and obtains the OAuth token used to invoke APIs. In particular, it uses the OAuth token representing the project's service account, rather than an individual user's credentials.

## Service Accounts

Running under the project's service account, rather than under your own user credentials, is an important detail.

The code you author and the data you access are stored in notebooks that are shared across the project. As such, the authorization used to execute code and retrieve that data is based upon the project.

Also, any applications or data pipelines you produce within Datalab are deployed using the project's service account, not individual accounts; this use of the project's service account is generally considered good practice.

Consequently, to access resources contained within another project, you will need to authorize the service account of your Datalab project within that other project, rather than authorize a particular user.

In [1]:
!gcloud auth list

                  Credentialed Accounts
ACTIVE  ACCOUNT
*       429069221461-compute@developer.gserviceaccount.com

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
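
As a rough sketch of the cross-project authorization described above, the snippet below builds the `gcloud projects add-iam-policy-binding` command that would grant the listed service account a role in another project. The target project ID `other-project-id` and the role `roles/storage.objectViewer` are placeholders chosen for illustration, and the helper names are hypothetical; substitute whatever project and role your case requires.

```python
import shlex


def compute_default_service_account(project_number: str) -> str:
    """Return the Compute Engine default service account email for a project number."""
    return f"{project_number}-compute@developer.gserviceaccount.com"


def grant_command(target_project: str, service_account: str, role: str) -> str:
    """Build a gcloud command that grants `role` on `target_project` to the account."""
    args = [
        "gcloud", "projects", "add-iam-policy-binding", target_project,
        f"--member=serviceAccount:{service_account}",
        f"--role={role}",
    ]
    # Quote each argument so the command is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


# Project number taken from the `gcloud auth list` output above.
account = compute_default_service_account("429069221461")

# 'other-project-id' and the role are assumptions for illustration only.
command = grant_command("other-project-id", account, "roles/storage.objectViewer")
print(command)
```

The command itself has to be run by someone who is allowed to change IAM policy on the other project; the snippet only assembles the string.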



## Reading Data from a Google Cloud Storage Bucket

Upload `data/GOOGL.csv` to your bucket. Then, in the code below, replace `is833` with your own bucket name and run the cell:

In [67]:
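import google.datalab.storage as storage
import pandas as pd
from io import BytesIO

# Build the URI of the uploaded object; replace 'is833' with your bucket name.
uri = storage.Bucket('is833').object('GOOGL.csv').uri

# Read the object's contents into the Python variable `data`.
%gcs read --object $uri --variable data

# Parse the CSV bytes with Date as a parsed index, then turn Date back into a regular column.
df = pd.read_csv(BytesIO(data), parse_dates=True, index_col='Date')
df = df.reset_index()
df.head()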

|   | Date | High | Low | Open | Close | Volume | Adj Close |
|---|------|------|-----|------|-------|--------|-----------|
| 0 | 2004-08-19 | 52.082081 | 48.028027 | 50.050049 | 50.220219 | 44659000.0 | 50.220219 |
| 1 | 2004-08-20 | 54.594593 | 50.300301 | 50.555557 | 54.209209 | 22834300.0 | 54.209209 |
| 2 | 2004-08-23 | 56.796795 | 54.579578 | 55.430431 | 54.754753 | 18256100.0 | 54.754753 |
| 3 | 2004-08-24 | 55.855854 | 51.836838 | 55.675674 | 52.487488 | 15247300.0 | 52.487488 |
| 4 | 2004-08-25 | 54.054054 | 51.991993 | 52.532532 | 53.053055 | 9188600.0 | 53.053055 |
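
The cell above wraps the raw bytes in `BytesIO` because the parser expects a file-like object rather than a byte string. The sketch below illustrates that bytes-to-rows step using only the standard library, so it can be run without a bucket or pandas. It is not what the notebook does internally: the sample bytes are copied from the first two rows of the output above, the column layout of the file is assumed to match that output, and `parse_rows` is a hypothetical helper.

```python
import csv
from datetime import datetime
from io import BytesIO, TextIOWrapper

# Sample bytes copied from the first two rows shown above. In the notebook,
# `data` would instead hold the bytes read from the bucket. The header layout
# is an assumption based on the displayed columns.
data = (
    b"Date,High,Low,Open,Close,Volume,Adj Close\n"
    b"2004-08-19,52.082081,48.028027,50.050049,50.220219,44659000.0,50.220219\n"
    b"2004-08-20,54.594593,50.300301,50.555557,54.209209,22834300.0,54.209209\n"
)


def parse_rows(raw: bytes) -> list:
    """Parse CSV bytes into dicts with a date and float-valued price columns."""
    # BytesIO exposes the bytes as a binary file; TextIOWrapper decodes it to text.
    text = TextIOWrapper(BytesIO(raw), encoding="utf-8", newline="")
    rows = []
    for record in csv.DictReader(text):
        row = {"Date": datetime.strptime(record["Date"], "%Y-%m-%d").date()}
        for name, value in record.items():
            if name != "Date":
                row[name] = float(value)
        rows.append(row)
    return rows


rows = parse_rows(data)
print(rows[0]["Date"], rows[0]["Close"])
```

In practice `pd.read_csv` handles the decoding and type conversion in one call, which is why the notebook only needs the `BytesIO` wrapper.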


## Visualization

In [69]:
%%chart annotation --fields Date,Close --data df