# Google Sheets Integration from Google Colab / Python Script

This notebook provides recipes for loading data from, and saving data to, Google Sheets.

Our examples below use the open-source [`gspread`](https://github.com/burnash/gspread) library for interacting with Google Sheets.

The library [`gspread-dataframe`](https://pythonhosted.org/gspread-dataframe/#from-github) is used to facilitate working with pandas dataframes (view [source here](https://github.com/robin900/gspread-dataframe/blob/master/gspread_dataframe.py)).

First, install the packages using `pip`.

In [None]:
!pip install --upgrade --quiet gspread gspread-dataframe

Import the library, authenticate, and create the interface to Sheets.

In [None]:
from google.colab import auth
auth.authenticate_user()

import gspread
from gspread_dataframe import get_as_dataframe, set_with_dataframe
import pandas as pd
from google.auth import default

# Use google-auth application default credentials (oauth2client is deprecated).
creds, _ = default()
gc = gspread.authorize(creds)

Below is a small set of `gspread` examples. Additional examples are available at the [`gspread` GitHub page](https://github.com/burnash/gspread#more-examples).

## Creating a new sheet with data from Python

In [None]:
sh = gc.create('My cool spreadsheet')

After executing the cell above, you will see a new spreadsheet named 'My cool spreadsheet' at [https://sheets.google.com](https://sheets.google.com/).

Open our new sheet and add some random data.

In [None]:
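import random

worksheet = gc.open('My cool spreadsheet').sheet1

# Fetch the six cells of A1:C2 as a flat list of Cell objects.
cell_list = worksheet.range('A1:C2')

# Assign a random integer to each cell locally.
for cell in cell_list:
    cell.value = random.randint(1, 10)

# Send all changed cells to the sheet in a single call.
worksheet.update_cells(cell_list)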

{'spreadsheetId': '1q_Ziym70qyOx80pzoRWQJhcUXp8fR12HBXSK7lWhfmA',
 'updatedCells': 6,
 'updatedColumns': 3,
 'updatedRange': 'Sheet1!A1:C2',
 'updatedRows': 2}
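
The list returned by `worksheet.range('A1:C2')` is flat, so it helps to know which list position corresponds to which cell. The sketch below assumes the cells come back row by row (left to right, then top to bottom), which matches the `updatedRange` shown above as far as we can tell. The helpers `column_letter` and `a1_labels` are hypothetical and not part of `gspread`; the sample values are made up.

```python
def column_letter(col):
    """Convert a 1-based column number to A1 letters, e.g. 1 -> 'A', 27 -> 'AA'."""
    letters = ''
    while col > 0:
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters


def a1_labels(first_row, first_col, last_row, last_col):
    """List the A1 labels of a rectangular range in row-major order."""
    return [
        f'{column_letter(col)}{row}'
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# Labels for A1:C2, in the order assumed for the flat cell list.
labels = a1_labels(1, 1, 2, 3)

# Pair each label with a value from a flat list of the same length (made-up data).
values = [4, 4, 3, 6, 6, 6]
grid = dict(zip(labels, values))
print(grid)
```

Under that assumption, the third element of `cell_list` is `C1` and the fourth is `A2`.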

Or write a pandas DataFrame to the worksheet:

In [None]:
d = [pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])]
df = pd.DataFrame(d)

# Write the DataFrame to the worksheet, starting at cell A1.
set_with_dataframe(worksheet, df)

### Saving the data 

See also https://stackoverflow.com/questions/36936449/creating-a-worksheet-using-gspread

In [None]:
spreadsheet = gc.open('My cool spreadsheet')
new_worksheet = spreadsheet.add_worksheet(title="DSL Results", rows=100, cols=20)

In [None]:
# `dsl_last_results` is assumed to be a pandas DataFrame created earlier in the session.
set_with_dataframe(new_worksheet, dsl_last_results)

## Downloading data from a sheet into Python as a Pandas DataFrame

Read back the random data that we inserted above and convert the result into a [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

In [None]:
worksheet = gc.open('My cool spreadsheet').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

import pandas as pd
df = pd.DataFrame.from_records(rows)

[['4', '4', '3'], ['6', '6', '6']]
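
Note that the values in the output above are strings, not numbers. A minimal sketch of converting them before building a DataFrame, using only the standard library; `to_number` is a hypothetical helper, and the sample `rows` are copied from the output above:

```python
def to_number(text):
    """Return an int or float if the text parses as one, else the original string."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return text


# Sample rows as printed above; every cell value is a string.
rows = [['4', '4', '3'], ['6', '6', '6']]

# Convert cell by cell, leaving non-numeric text (including empty cells) untouched.
numeric_rows = [[to_number(value) for value in row] for row in rows]
print(numeric_rows)
```

Passing `numeric_rows` to `pd.DataFrame.from_records` instead of `rows` should then give numeric columns rather than string ones.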


Or use the `gspread-dataframe` shortcut:

In [None]:
df = get_as_dataframe(worksheet)

### All together now 

P.S. You don't have to `pip install` anything on Google Colab!


In [None]:
from google.colab import auth
auth.authenticate_user()

import gspread
from gspread_dataframe import get_as_dataframe, set_with_dataframe
import pandas as pd
from google.auth import default

# Use google-auth application default credentials (oauth2client is deprecated).
creds, _ = default()
gc = gspread.authorize(creds)

title = 'My cool spreadsheet'
sh = gc.create(title)
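# Reuse the spreadsheet we just created; opening by title could match another file with the same name.
worksheet = sh.sheet1
# `data` is assumed to be a pandas DataFrame defined earlier.
set_with_dataframe(worksheet, data)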
spreadsheet_url = "https://docs.google.com/spreadsheets/d/%s" % sh.id
print(spreadsheet_url)
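
The cell above builds a URL from the spreadsheet ID. Going the other way, from a URL copied out of the browser back to the ID, is handy when a function expects the key. A minimal sketch, assuming the URL has the `/spreadsheets/d/<key>` form used above; `url_from_key` and `key_from_url` are hypothetical helpers, not `gspread` functions:

```python
import re


def url_from_key(key):
    """Build the browser URL for a spreadsheet ID."""
    return f'https://docs.google.com/spreadsheets/d/{key}'


def key_from_url(url):
    """Extract the spreadsheet ID from a docs.google.com spreadsheet URL."""
    match = re.search(r'/spreadsheets/d/([A-Za-z0-9_-]+)', url)
    if match is None:
        raise ValueError(f'No spreadsheet key found in {url!r}')
    return match.group(1)


# Made-up key used only for illustration.
example_url = url_from_key('abc123_XYZ-789') + '/edit#gid=0'
print(key_from_url(example_url))
```

The extracted key can be passed to `gc.open_by_key(...)`.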

## Connect to a new Google Sheet - Offline Version

This approach uses a service-account key file from Google. See how to obtain one here: https://gspread.readthedocs.io/en/latest/oauth2.html

In [9]:
!pip install gspread df2gspread oauth2client -U --quiet

In [4]:
import pandas as pd
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from df2gspread import df2gspread as d2g

In [5]:
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    'gas.json', scope)
gc = gspread.authorize(credentials)

NOTE: you can only open spreadsheets where the email address from your credentials has been added as an editor!
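
The address that has to be added as an editor is stored in the key file itself, in the `client_email` field of a service-account JSON key. A minimal sketch for reading it; `service_account_email` is a hypothetical helper, not part of `gspread` or `oauth2client`:

```python
import json


def service_account_email(keyfile_path):
    """Return the `client_email` stored in a service-account JSON key file."""
    with open(keyfile_path, encoding='utf-8') as handle:
        return json.load(handle)['client_email']
```

Calling `service_account_email('gas.json')` should return the address to share your spreadsheet with, assuming `gas.json` is a service-account key like the one loaded above.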

In [6]:
gc.list_spreadsheet_files()

[{'kind': 'drive#file',
  'id': '1-kTZJZ1GAhJ2m4GAIhw1ZdlgO46JpvX0ZQa232VWRmw',
  'name': 'Dimensions COVID-19 publications, data sets, clinical trials  - updated daily',
  'mimeType': 'application/vnd.google-apps.spreadsheet'},
 {'kind': 'drive#file',
  'id': '1m52H7KjahhUxcTHzUFUeXNr_p8MwFsZILw5fuwyUGcI',
  'name': 'gspread python test',
  'mimeType': 'application/vnd.google-apps.spreadsheet'}]
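
Because the listing above is a plain list of dictionaries, a spreadsheet ID can be looked up by name without another API call. The sketch below uses a copy of that output as sample data; `find_spreadsheet_ids` is a hypothetical helper, not a `gspread` function. It returns a list because Drive allows several files with the same name.

```python
def find_spreadsheet_ids(files, name):
    """Return the IDs of all listed spreadsheets whose name matches exactly."""
    return [entry['id'] for entry in files if entry['name'] == name]


# Sample data copied from the output above (only the keys used here).
files = [
    {'id': '1-kTZJZ1GAhJ2m4GAIhw1ZdlgO46JpvX0ZQa232VWRmw',
     'name': 'Dimensions COVID-19 publications, data sets, clinical trials  - updated daily'},
    {'id': '1m52H7KjahhUxcTHzUFUeXNr_p8MwFsZILw5fuwyUGcI',
     'name': 'gspread python test'},
]

matches = find_spreadsheet_ids(files, 'gspread python test')
print(matches)
```

With a live client, `gc.list_spreadsheet_files()` would take the place of the sample `files` list.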

In [24]:
wks = gc.open("gspread python test").sheet1

In [None]:
wks.col_values(1)

### Save the data 

This creates a new worksheet named 'Master' in our spreadsheet, which is accessed via its ID. See also the [df2gspread docs](https://df2gspread.readthedocs.io/en/latest/examples.html).


In [31]:
spreadsheet_key = '1m52H7KjahhUxcTHzUFUeXNr_p8MwFsZILw5fuwyUGcI'
wks_name = 'Master'
# `data` is assumed to be an existing pandas DataFrame; row_names=True also writes its index.
d2g.upload(data, spreadsheet_key, wks_name, credentials=credentials, row_names=True)

<Worksheet 'Master' id:186578311>

In [None]:
data