# **BA870 - Colab notebook with Python 3.0 examples for loading and saving data from external sources on Google Colab**

# Uploading and Downloading files from your computer (local file system)

## Uploading files from your local file system

`files.upload` returns a dictionary of the files which were uploaded.
The dictionary is keyed by the file name and values are the data which were uploaded.

In [0]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

## Downloading files to your local file system

`files.download` will invoke a browser download of the file to your local computer.


In [0]:
from google.colab import files

with open('example.txt', 'w') as f:
  f.write('some content')

files.download('example.txt')
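The same pattern extends to tabular results: write a DataFrame to a CSV file, then hand that file to `files.download`. The sketch below is a minimal illustration under some assumptions: the file name `results.csv` and the sample data are hypothetical, and the `try`/`except` guard is only there so the cell also runs outside Colab, where `google.colab` is not installed.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({'Day': ['Mon', 'Tues'], 'Number': [10, 20]})

# index=False keeps the row index out of the CSV file.
df.to_csv('results.csv', index=False)

try:
    # google.colab is only importable inside a Colab runtime.
    from google.colab import files
    files.download('results.csv')
except ImportError:
    print('Not running in Colab; results.csv was only written locally.')
```

Passing `index=False` is a design choice: without it, the row index is written as an extra unnamed first column in the CSV file.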

## Uploading a CSV data file and reading it into a Pandas DataFrame

Again, we can use `files.upload` to upload a file from our local computer. When you run the cell below, you will be prompted to select a file on your local computer to upload.

In [0]:
from google.colab import files
uploaded = files.upload()

We can then read the CSV file into a [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html). Replace 'csvfilename.csv' in the code below with the actual name of the CSV file you uploaded.



In [0]:
import pandas as pd

# read_csv already returns a DataFrame, so no further conversion is needed.
df = pd.read_csv('csvfilename.csv')
df  # display the DataFrame as the cell's output
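Because `files.upload` returns the file contents as bytes keyed by file name, the CSV can also be parsed straight from that dictionary without hard-coding the file name. The following is a minimal sketch, not the only way to do this: the `uploaded` dictionary here is a hand-made stand-in for what `files.upload()` would return in Colab, and its contents are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the dictionary returned by files.upload() in Colab:
# keys are file names, values are the raw bytes of each file.
uploaded = {'csvfilename.csv': b'Day,Number\nMon,10\nTues,20\nWed,30\n'}

# Take the first (here, the only) uploaded file name.
name = next(iter(uploaded))

# Wrap the bytes in a file-like object so read_csv can parse them.
df = pd.read_csv(io.BytesIO(uploaded[name]))
print(df.head())
```

In a real session, delete the stand-in assignment and use the `uploaded` variable produced by the upload cell.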

# Google Drive

You can access files in Drive in a number of ways, including:
- Mounting your Google Drive in the runtime's virtual machine
- Using a wrapper around the API such as [PyDrive](https://gsuitedevs.github.io/PyDrive/docs/build/html/index.html)
- Using the [native REST API](https://developers.google.com/drive/v3/web/about-sdk)



Examples of each are below.

## Mounting Google Drive locally

The example below shows how to mount your Google Drive on your runtime using an authorization code, and how to write and read files there. Once the cells have run, you will be able to see the new file (`example.txt`) at [https://drive.google.com/](https://drive.google.com/).

Note that a mounted Drive only supports reading, writing, and moving files; to programmatically modify sharing settings or other metadata, use one of the other options below.

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


### Writing a file to your Google Drive

`f.write` writes the line of text "Hello BA870!" to the file "example.txt" located on your Google Drive.


In [0]:
with open('/content/drive/My Drive/example.txt', 'w') as f:
  f.write('Hello BA870!')
# Print the file that was just written to confirm its contents.
!cat /content/drive/My\ Drive/example.txt

Hello BA870!
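The file can also be read back in plain Python instead of with `!cat`. The sketch below assumes Drive is mounted at `/content/drive` as in the cell above; the fallback to a temporary directory is an addition for illustration only, so that the code still runs when Drive is not mounted.

```python
import tempfile
from pathlib import Path

# Mount point used earlier in this notebook.
DRIVE_ROOT = Path('/content/drive/My Drive')

# Assumption for illustration: fall back to a temporary directory
# when Drive is not mounted (for example, outside Colab).
base_dir = DRIVE_ROOT if DRIVE_ROOT.is_dir() else Path(tempfile.mkdtemp())

target = base_dir / 'example.txt'
target.write_text('Hello BA870!')

# Read the file back to confirm what was written.
content = target.read_text()
print(content)
```

Using `pathlib` avoids having to escape the space in "My Drive", which the shell command requires.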

### Reading a CSV file from Google Drive into a Pandas DataFrame

In [0]:
import pandas as pd

# read_csv already returns a DataFrame, so no further conversion is needed.
dfa = pd.read_csv('/content/drive/My Drive/csvfilename.csv')
dfa  # display the DataFrame as the cell's output

|   | Day | Number |
|---|---|---|
| 0 | Mon | 10 |
| 1 | Tues | 20 |
| 2 | Wed | 30 |
| 3 | Thurs | 40 |
| 4 | Fri | 50 |


### Closing and unmounting Google Drive

In [0]:
drive.flush_and_unmount()
print('All changes made in this colab session should now be visible in Drive.')

All changes made in this colab session should now be visible in Drive.


## PyDrive

The examples below demonstrate authentication and file upload/download using PyDrive. More examples are available in the [PyDrive documentation](https://gsuitedevs.github.io/PyDrive/docs/build/html/index.html).

In [0]:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

Authenticate and create the PyDrive client.


In [0]:
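auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
# Note: this rebinds the name `drive`, so it no longer refers to the
# google.colab `drive` module imported in the mounting example above.
drive = GoogleDrive(gauth)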

Create and upload a text file.


In [0]:
uploaded = drive.CreateFile({'title': 'Sample upload.txt'})
uploaded.SetContentString('Sample upload file content')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 14vDAdqp7BSCQnoougmgylBexIr2AQx2T


Load a file by ID and print its contents.


In [0]:
downloaded = drive.CreateFile({'id': uploaded.get('id')})
print('Downloaded content "{}"'.format(downloaded.GetContentString()))

Downloaded content "Sample upload file content"


## Drive REST API

In order to use the Drive API, we must first authenticate and construct an API client.


In [0]:
from google.colab import auth
auth.authenticate_user()
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

With this client, we can use any of the functions in the [Google Drive API reference](https://developers.google.com/drive/v3/reference/). Examples follow.


### Creating a new Drive file with data from Python

First, create a local file to upload.

In [0]:
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

/tmp/to_upload.txt contains:
my sample file

Upload it using the [`files.create`](https://developers.google.com/drive/v3/reference/files/create) method. Further details on uploading files are available in the [developer documentation](https://developers.google.com/drive/v3/web/manage-uploads).

In [0]:
from googleapiclient.http import MediaFileUpload

file_metadata = {
  'name': 'Sample file',
  'mimeType': 'text/plain'
}
media = MediaFileUpload('/tmp/to_upload.txt', 
                        mimetype='text/plain',
                        resumable=True)
created = drive_service.files().create(body=file_metadata,
                                       media_body=media,
                                       fields='id').execute()
print('File ID: {}'.format(created.get('id')))

File ID: 1Cw9CqiyU6zbXFD9ViPZu_3yX-sYF4W17


After executing the cell above, you will see a new file named 'Sample file' at [https://drive.google.com/](https://drive.google.com/).

### Downloading data from a Drive file into Python

Download the file we uploaded above.

In [0]:
file_id = created.get('id')

import io
from googleapiclient.http import MediaIoBaseDownload

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
  # _ is a placeholder for a progress object that we ignore.
  # (Our file is small, so we skip reporting progress.)
  _, done = downloader.next_chunk()

downloaded.seek(0)
print('Downloaded file contents are: {}'.format(downloaded.read()))

Downloaded file contents are: b'my sample file'


In order to download a different file, set `file_id` above to the ID of that file, which will look like "1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz".
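The `b'...'` prefix in the output above shows that the downloaded contents are bytes rather than text. A minimal sketch of turning them into a string follows; the `io.BytesIO` buffer here is a hand-filled stand-in for the one populated by `MediaIoBaseDownload`, and UTF-8 is an assumed encoding that fits this sample file but may not fit every file.

```python
import io

# Stand-in for the buffer filled by MediaIoBaseDownload above.
downloaded = io.BytesIO(b'my sample file')

# Rewind to the start of the buffer before reading.
downloaded.seek(0)

# Decode the raw bytes into text (assumes the file is UTF-8 encoded).
text = downloaded.read().decode('utf-8')
print('Downloaded text is: {}'.format(text))
```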

# Google Sheets

Our examples below use the open-source [`gspread`](https://github.com/burnash/gspread) library for interacting with Google Sheets.

First, install the package using `pip`.

In [0]:
!pip install --upgrade --quiet gspread

Import the library, authenticate, and create the interface to Sheets.

In [0]:
from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

Below is a small set of `gspread` examples. Additional examples are available at the [`gspread` GitHub page](https://github.com/burnash/gspread#more-examples).

## Creating a new sheet with data from Python

In [0]:
sh = gc.create('My cool spreadsheet')

After executing the cell above, you will see a new spreadsheet named 'My cool spreadsheet' at [https://sheets.google.com](https://sheets.google.com/).

Open our new sheet and add some random data.

In [0]:
worksheet = gc.open('My cool spreadsheet').sheet1

cell_list = worksheet.range('A1:C2')

import random
for cell in cell_list:
  cell.value = random.randint(1, 10)

worksheet.update_cells(cell_list)

{'spreadsheetId': '1dsQeN0YzXuM387l_CuyEbsYzL2ew9TJFzR-E-RQnwxs',
 'updatedCells': 6,
 'updatedColumns': 3,
 'updatedRange': 'Sheet1!A1:C2',
 'updatedRows': 2}

## Downloading data from a sheet into Python as a Pandas DataFrame

Read back the random data that we inserted above and convert the result into a [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

In [0]:
worksheet = gc.open('My cool spreadsheet').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

import pandas as pd
pd.DataFrame.from_records(rows)

[['6', '3', '4'], ['7', '2', '1']]


|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 6 | 3 | 4 |
| 1 | 7 | 2 | 1 |
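As the printed `rows` show, the values come back as strings (`'6'`, not `6`), so arithmetic on the resulting DataFrame needs a conversion step first. The sketch below hard-codes the rows from the output above in place of a live `worksheet.get_all_values()` call; converting with `pd.to_numeric` is one reasonable option, not the only one.

```python
import pandas as pd

# Rows copied from the printed output above; in a live session these
# would come from worksheet.get_all_values().
rows = [['6', '3', '4'], ['7', '2', '1']]

df_text = pd.DataFrame.from_records(rows)

# Convert every column from strings to numbers.
df_numeric = df_text.apply(pd.to_numeric)

# Numeric operations now work as expected.
total = int(df_numeric.to_numpy().sum())
print(df_numeric.dtypes)
print('Sum of all cells:', total)
```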
