<a href="https://colab.research.google.com/github/Davron030901/Google_Colaboratory/blob/main/External_data_Local_Files%2C_Drive%2C_Sheets%2C_and_Cloud_Storage.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook provides recipes for loading and saving data from external sources.

# Local file system

## Uploading files from your local file system

`files.upload` returns a dictionary of the uploaded files.
The dictionary is keyed by file name, and each value is the content of that file as bytes.

In [1]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

Saving 07 - Introduction to Stim.pdf to 07 - Introduction to Stim.pdf
User uploaded file "07 - Introduction to Stim.pdf" with length 719524 bytes
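
Because each value in the returned dictionary holds one file's raw bytes, the data can be parsed in memory. The sketch below shows one way to do that for a CSV upload. `read_uploaded_csv` and the `scores.csv` sample are hypothetical names, and the hand-built dictionary stands in for the result of `files.upload()`.

```python
import csv
import io

def read_uploaded_csv(uploaded, name, encoding='utf-8'):
  """Parse one uploaded CSV file (bytes) into a list of row dictionaries."""
  text = uploaded[name].decode(encoding)
  return list(csv.DictReader(io.StringIO(text)))

# Stand-in for the dictionary returned by files.upload() (assumed sample data).
uploaded = {'scores.csv': b'name,score\nada,3\nlin,5\n'}
rows = read_uploaded_csv(uploaded, 'scores.csv')
print(rows)
```

Every parsed value is still a string, so numeric columns need an explicit conversion before arithmetic.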


## Downloading files to your local file system

`files.download` will invoke a browser download of the file to your local computer.


In [2]:
from google.colab import files

with open('example.txt', 'w') as f:
  f.write('some content')

files.download('example.txt')
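
The call above hands `files.download` a single file name. When several output files are needed, one possible approach is to bundle them into a zip archive first and download that. The sketch below assumes a hypothetical helper `bundle_for_download` and uses a temporary directory so that it runs outside Colab; the final `files.download` call is left as a comment.

```python
import os
import shutil
import tempfile

def bundle_for_download(src_dir, archive_base):
  """Zip everything under src_dir and return the path of the .zip file."""
  return shutil.make_archive(archive_base, 'zip', root_dir=src_dir)

# Build a throwaway directory with two files to archive.
work = tempfile.mkdtemp()
src = os.path.join(work, 'results')
os.makedirs(src)
for name in ('a.txt', 'b.txt'):
  with open(os.path.join(src, name), 'w') as f:
    f.write('some content')

zip_path = bundle_for_download(src, os.path.join(work, 'bundle'))
print(zip_path)
# In Colab, the archive could then be passed on: files.download(zip_path)
```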


# Google Drive

You can access files in Drive in a number of ways, including:
- Mounting your Google Drive in the runtime's virtual machine
- Using a wrapper around the API such as [PyDrive2](https://docs.iterative.ai/PyDrive2/)
- Using the [native REST API](https://developers.google.com/drive/v3/web/about-sdk)



Examples of each are below.

## Mounting Google Drive locally

The example below shows how to mount your Google Drive on your runtime using an authorization code, and how to write and read files there. Once executed, you will be able to see the new file (`foo.txt`) at [https://drive.google.com/](https://drive.google.com/).

This only supports reading, writing, and moving files; to programmatically modify sharing settings or other metadata, use one of the other options below.

**Note:** When using the 'Mount Drive' button in the file browser, no authentication codes are necessary for notebooks that have only been edited by the current user.

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
with open('/content/drive/My Drive/foo.txt', 'w') as f:
  f.write('Hello Google Drive!')
!cat /content/drive/My\ Drive/foo.txt

Hello Google Drive!
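
Once mounted, Drive is reachable through ordinary file paths, so standard-library file operations apply to it. The sketch below copies a file into a subfolder. `copy_into_drive` and the `colab_outputs` folder name are hypothetical, and a temporary directory stands in for the mount point so the example runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path

def copy_into_drive(src, drive_root, subdir='colab_outputs'):
  """Copy src into drive_root/subdir, creating the folder if needed."""
  dest_dir = Path(drive_root) / subdir
  dest_dir.mkdir(parents=True, exist_ok=True)
  return Path(shutil.copy2(src, dest_dir))

# In Colab, drive_root would be '/content/drive/My Drive' after drive.mount().
fake_drive = Path(tempfile.mkdtemp())
src = fake_drive / 'foo.txt'
src.write_text('Hello Google Drive!')
copied = copy_into_drive(src, fake_drive / 'My Drive')
print(copied.read_text())
```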

In [5]:
drive.flush_and_unmount()
print('All changes made in this colab session should now be visible in Drive.')

All changes made in this colab session should now be visible in Drive.


## PyDrive2

The examples below demonstrate authentication and file upload/download using PyDrive2. More examples are available in the [PyDrive2 documentation](https://docs.iterative.ai/PyDrive2/).

In [6]:
from pydrive2.auth import GoogleAuth
from pydrive2.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

Authenticate and create the PyDrive2 client.


In [7]:
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

Create and upload a text file.


In [8]:
uploaded = drive.CreateFile({'title': 'Sample upload.txt'})
uploaded.SetContentString('Sample upload file content')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 1KyaAknO6Bg5deYngEeETd-Tbu9Yp1Hcf


Load a file by ID and print its contents.


In [9]:
downloaded = drive.CreateFile({'id': uploaded.get('id')})
print('Downloaded content "{}"'.format(downloaded.GetContentString()))

Downloaded content "Sample upload file content"


## Drive REST API

In order to use the Drive API, we must first authenticate and construct an API client.


In [10]:
from google.colab import auth
auth.authenticate_user()
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

With this client, we can use any of the functions in the [Google Drive API reference](https://developers.google.com/drive/v3/reference/). Examples follow.
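
As one illustration of calling the API through this client, the sketch below wraps the `files.list` method to return names and IDs of a few files. The helper name `list_some_files` is hypothetical, and the function expects an authenticated `drive_service` like the one built above, so the call itself is shown only as a comment.

```python
def list_some_files(drive_service, page_size=10):
  """Return [(name, id), ...] for up to page_size files visible to the user."""
  response = drive_service.files().list(
      pageSize=page_size, fields='files(id, name)').execute()
  return [(f['name'], f['id']) for f in response.get('files', [])]

# In Colab, after authenticating and building the client:
# for name, file_id in list_some_files(drive_service):
#   print(name, file_id)
```

Restricting `fields` to `id` and `name` keeps the response small. No ordering is requested here, so the files returned are not necessarily the most recent ones.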


### Creating a new Drive file with data from Python

First, create a local file to upload.

In [11]:
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

/tmp/to_upload.txt contains:
my sample file

Upload it using the [`files.create`](https://developers.google.com/drive/v3/reference/files/create) method. Further details on uploading files are available in the [developer documentation](https://developers.google.com/drive/v3/web/manage-uploads).

In [12]:
from googleapiclient.http import MediaFileUpload

file_metadata = {
  'name': 'Sample file',
  'mimeType': 'text/plain'
}
media = MediaFileUpload('/tmp/to_upload.txt',
                        mimetype='text/plain',
                        resumable=True)
created = drive_service.files().create(body=file_metadata,
                                       media_body=media,
                                       fields='id').execute()
print('File ID: {}'.format(created.get('id')))

File ID: 1A5t6BJt5xYZ2Vg7GPPRBd4p2zRA6KB-3


After executing the cell above, you will see a new file named 'Sample file' at [https://drive.google.com/](https://drive.google.com/).

### Downloading data from a Drive file into Python

Download the file we uploaded above.

In [13]:
file_id = created.get('id')

import io
from googleapiclient.http import MediaIoBaseDownload

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
  # _ is a placeholder for a progress object that we ignore.
  # (Our file is small, so we skip reporting progress.)
  _, done = downloader.next_chunk()

downloaded.seek(0)
print('Downloaded file contents are: {}'.format(downloaded.read()))

Downloaded file contents are: b'my sample file'


In order to download a different file, set `file_id` above to the ID of that file, which will look like "1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz".
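
File IDs are often only at hand as part of a sharing link. The sketch below pulls the ID out of a URL with two regular expressions. It assumes the ID appears either after `/d/` or in an `id=` query parameter; other link formats may exist and are not handled. `extract_file_id` is a hypothetical helper.

```python
import re

# Assumed URL shapes: .../d/<ID>/... and ...?id=<ID>
_ID_PATTERNS = (
    re.compile(r'/d/([A-Za-z0-9_-]+)'),
    re.compile(r'[?&]id=([A-Za-z0-9_-]+)'),
)

def extract_file_id(url):
  """Return the file ID embedded in a Drive or Docs URL, or None."""
  for pattern in _ID_PATTERNS:
    match = pattern.search(url)
    if match:
      return match.group(1)
  return None

file_id = extract_file_id(
    'https://drive.google.com/file/d/1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz/view')
print(file_id)
```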

# Google Sheets


## Google Sheets Workspace Extension

We have a Workspace Extension, [Sheets to Colab](https://workspace.google.com/u/0/marketplace/app/sheets_to_colab/945625412720), which allows you to directly import data from Google Sheets into Colab from the Sheets UI. Follow the link to the Sheets to Colab Workspace Extension to learn more.

## Interacting with Google Sheets using gspread

You can also use the open-source [`gspread`](https://github.com/burnash/gspread) library to interact with Google Sheets. The code below shows how to set up and authenticate `gspread`.

In [14]:
from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
creds, _ = default()

gc = gspread.authorize(creds)

Below is a small set of `gspread` examples. Additional examples are available at the [`gspread` GitHub page](https://github.com/burnash/gspread#more-examples).

### Creating a new sheet with data from Python

In [15]:
sh = gc.create('My cool spreadsheet')

After executing the cell above, you will see a new spreadsheet named 'My cool spreadsheet' at [https://sheets.google.com](https://sheets.google.com/).

Open our new sheet and add some random data.

In [16]:
worksheet = gc.open('My cool spreadsheet').sheet1

cell_list = worksheet.range('A1:C2')

import random
for cell in cell_list:
  cell.value = random.randint(1, 10)

worksheet.update_cells(cell_list)

{'spreadsheetId': '1reeL-dHg17d77q5m-SZeA6TSGS_s6HYCfsQTBvNflTs',
 'updatedRange': "'Лист1'!A1:C2",
 'updatedRows': 2,
 'updatedColumns': 3,
 'updatedCells': 6}

### Downloading data from a sheet into Python as a Pandas DataFrame

Read back the random data that we inserted above and convert the result into a [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

In [17]:
worksheet = gc.open('My cool spreadsheet').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

import pandas as pd
pd.DataFrame.from_records(rows)

[['10', '7', '10'], ['6', '5', '10']]


|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 10 | 7 | 10 |
| 1 | 6 | 5 | 10 |
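
The printed rows show that `get_all_values` returned every cell as a string, so the values need converting before any arithmetic. The sketch below does this with plain Python. `rows_to_ints` is a hypothetical helper that assumes every cell holds an integer, and the sample rows are copied from the printed output above.

```python
def rows_to_ints(rows):
  """Convert a list of rows of numeric strings into rows of ints."""
  return [[int(value) for value in row] for row in rows]

rows = [['10', '7', '10'], ['6', '5', '10']]  # copied from the printed output
numeric_rows = rows_to_ints(rows)
print(numeric_rows)
print([sum(row) for row in numeric_rows])
```

The converted rows can be passed to `pd.DataFrame` in place of the string rows. A sheet with empty or non-numeric cells would raise `ValueError` here and needs its own handling.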


# InteractiveSheet

You can now embed live Google Sheets in Colab with the `InteractiveSheet` library. This means you can create and edit data in Google Sheets and seamlessly incorporate it into your notebook with Pandas DataFrames all from Colab.

In [18]:
from google.colab import sheets

# Create a new interactive sheet and add data to it.
sheet = sheets.InteractiveSheet()

https://docs.google.com/spreadsheets/d/1nYVy2GXdoTdc1SeKOatphyIWtNqt2SSCAi45zTfBA6o#gid=0


In [19]:
# Get a Pandas DataFrame from the selected worksheet
df = sheet.as_df()

In [20]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))

# Create a new sheet and include the column names as the first row.
sheet = sheets.InteractiveSheet(df=df, title='foo', include_column_headers=True)

https://docs.google.com/spreadsheets/d/1eXeqVsf75r6c3DwxHGFdYTdJjXtwYDHHc9zU_n7mV0M#gid=0


In [21]:
# Push data from Colab to the selected worksheet
df2 = pd.DataFrame(np.random.randn(10, 4), columns=list('ABCD'))
sheet.update(df=df2)

In [22]:
# Display the sheet in the output of the current cell
sheet.display()

# Google Cloud Storage (GCS)

In order to use Colaboratory with GCS, you'll need to create a [Google Cloud project](https://cloud.google.com/storage/docs/projects) or use a pre-existing one.

Specify your project ID below:

In [23]:
project_id = 'Your_project_ID_here'

Files in GCS are contained in [buckets](https://cloud.google.com/storage/docs/buckets).

Buckets must have a globally-unique name, so we generate one here.

In [24]:
import uuid
bucket_name = 'colab-sample-bucket-' + str(uuid.uuid1())
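
Bucket names must be unique and must also follow naming rules. The sketch below checks a simplified subset of those rules: 3 to 63 characters, only lowercase letters, digits, dashes, underscores and dots, and a letter or digit at each end. It deliberately ignores the extra rules for dotted names and reserved prefixes, and `looks_like_valid_bucket_name` is a hypothetical helper.

```python
import re
import uuid

# Simplified check; the full naming rules contain further restrictions.
_BUCKET_RE = re.compile(r'[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]')

def looks_like_valid_bucket_name(name):
  """Return True if name passes the simplified bucket-name check."""
  return _BUCKET_RE.fullmatch(name) is not None

bucket_name = 'colab-sample-bucket-' + str(uuid.uuid1())
print(len(bucket_name), looks_like_valid_bucket_name(bucket_name))
```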

In order to access GCS, we must authenticate.

In [25]:
from google.colab import auth
auth.authenticate_user()

GCS can be accessed via the `gsutil` command-line utility or via the native Python API.

## `gsutil`

First, we configure `gsutil` to use the project we specified above by using `gcloud`.

In [26]:
!gcloud config set project {project_id}

Are you sure you wish to set property [core/project] to Your_project_ID_here?

Do you want to continue (Y/n)?  y

[1;31mERROR:[0m (gcloud.config.set) The project property must be set to a valid project ID, not the project name [Your_project_ID_here]
To set your project, run:

  $ gcloud config set project PROJECT_ID

or to unset it, run:

  $ gcloud config unset project
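
The error above appears because the placeholder was never replaced with a real project ID. A quick format check before calling `gcloud` can catch this early. The sketch below assumes the documented project ID format: 6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen. `looks_like_project_id` and `my-sample-project-123` are hypothetical.

```python
import re

# 1 leading letter + 4..28 middle characters + 1 trailing letter/digit = 6..30.
_PROJECT_ID_RE = re.compile(r'[a-z][a-z0-9-]{4,28}[a-z0-9]')

def looks_like_project_id(value):
  """Return True if value has the shape of a Google Cloud project ID."""
  return _PROJECT_ID_RE.fullmatch(value) is not None

for candidate in ('Your_project_ID_here', 'my-sample-project-123'):
  print(candidate, looks_like_project_id(candidate))
```

Passing this check only means the string is well formed; it does not confirm that the project exists or that the account can access it.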


Create a local file to upload.

In [27]:
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

/tmp/to_upload.txt contains:
my sample file

Make a bucket to which we'll upload the file ([documentation](https://cloud.google.com/storage/docs/gsutil/commands/mb)).

In [28]:
!gsutil mb gs://{bucket_name}

Creating gs://colab-sample-bucket-bdce244a-df39-11ef-8507-0242ac1c000c/...
You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.


Copy the file to our new bucket ([documentation](https://cloud.google.com/storage/docs/gsutil/commands/cp)).

In [29]:
!gsutil cp /tmp/to_upload.txt gs://{bucket_name}/

Copying file:///tmp/to_upload.txt [Content-Type=text/plain]...
/ [0 files][    0.0 B/   14.0 B]
NotFoundException: 404 The destination bucket gs://colab-sample-bucket-bdce244a-df39-11ef-8507-0242ac1c000c does not exist or the write to the destination must be restarted


Dump the contents of our newly copied file to make sure everything worked ([documentation](https://cloud.google.com/storage/docs/gsutil/commands/cat)).


In [30]:
!gsutil cat gs://{bucket_name}/to_upload.txt

BucketNotFoundException: 404 gs://colab-sample-bucket-bdce244a-df39-11ef-8507-0242ac1c000c bucket does not exist.


In [31]:
# @markdown Once the upload has finished, the data will appear in the Cloud Console storage browser for your project:
print('https://console.cloud.google.com/storage/browser?project=' + project_id)

https://console.cloud.google.com/storage/browser?project=Your_project_ID_here


Finally, we'll download the file we just uploaded in the example above. It's as simple as reversing the order in the `gsutil cp` command.

In [32]:
!gsutil cp gs://{bucket_name}/to_upload.txt /tmp/gsutil_download.txt

# Print the result to make sure the transfer worked.
!cat /tmp/gsutil_download.txt

BucketNotFoundException: 404 gs://colab-sample-bucket-bdce244a-df39-11ef-8507-0242ac1c000c bucket does not exist.
cat: /tmp/gsutil_download.txt: No such file or directory


## Python API

These snippets are based on [a larger example](https://github.com/GoogleCloudPlatform/storage-file-transfer-json-python/blob/master/chunked_transfer.py) that shows additional uses of the API.

First, we create the service client.

In [38]:
from google.colab import auth
auth.authenticate_user()

In [39]:
project_id = 'global-wharf-445417-n9'  # Enter your own project ID

In [33]:
from googleapiclient.discovery import build
gcs_service = build('storage', 'v1')

In [40]:
from googleapiclient.discovery import build
from google.colab import auth

# Authenticate
auth.authenticate_user()

# Build the Google Cloud Storage API client
gcs_service = build('storage', 'v1')

# Create a local file to upload
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

# Generate a bucket name
import uuid
bucket_name = 'colab-sample-bucket-' + str(uuid.uuid1())

# Prepare the request body for creating the bucket
body = {
  'name': bucket_name,
  'location': 'us',  # Bucket location
}

# Set the project ID
project_id = 'your-project-id'  # Enter your own project ID

# Create the bucket
try:
  gcs_service.buckets().insert(project=project_id, body=body).execute()
  print(f'Bucket {bucket_name} created successfully!')
except Exception as e:
  print(f'An error occurred: {e}')

/tmp/to_upload.txt contains:
my sample file



An error occurred: <HttpError 403 when requesting https://storage.googleapis.com/storage/v1/b?project=your-project-id&alt=json returned "davronaliqulov81@gmail.com does not have storage.buckets.create access to the Google Cloud project. Permission 'storage.buckets.create' denied on resource (or it may not exist).". Details: "[{'message': "davronaliqulov81@gmail.com does not have storage.buckets.create access to the Google Cloud project. Permission 'storage.buckets.create' denied on resource (or it may not exist).", 'domain': 'global', 'reason': 'forbidden'}]">


Create a local file to upload.

In [34]:
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

/tmp/to_upload.txt contains:
my sample file

Create a bucket in the project specified above.

Upload the file to our newly created bucket.
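
No code accompanies this step in the notebook, so the following is a sketch of how the upload might look with the `gcs_service` client, using `objects().insert` with a resumable `MediaFileUpload`. The helper names `run_until_complete` and `upload_to_bucket` are hypothetical, and the real call is left as a comment because it needs an authenticated client and an existing bucket.

```python
def run_until_complete(request):
  """Drive a resumable request chunk by chunk and return the final response."""
  response = None
  while response is None:
    # The first item is a progress object, ignored here (the file is small).
    _, response = request.next_chunk()
  return response

def upload_to_bucket(gcs_service, bucket_name, local_path, object_name):
  """Upload local_path to gs://bucket_name/object_name."""
  # Imported here so the helper above loads without the client library.
  from googleapiclient.http import MediaFileUpload
  media = MediaFileUpload(local_path, mimetype='text/plain', resumable=True)
  request = gcs_service.objects().insert(
      bucket=bucket_name, name=object_name, media_body=media)
  return run_until_complete(request)

# In Colab, once the bucket exists:
# upload_to_bucket(gcs_service, bucket_name,
#                  '/tmp/to_upload.txt', 'to_upload.txt')
```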

In [43]:
# @markdown Once the upload has finished, the data will appear in the Cloud Console storage browser for your project:
print('https://console.cloud.google.com/storage/browser?project=' + project_id)

https://console.cloud.google.com/storage/browser?project=your-project-id


Download the file we just uploaded.
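
The notebook has no code for this step either. The sketch below shows one way the download might be written, mirroring the Drive example earlier: `objects().get_media` supplies the request and `MediaIoBaseDownload` streams it into a local file. `drain` and `download_from_bucket` are hypothetical helper names, and the real call is commented out because it needs an authenticated client and an uploaded object.

```python
def drain(downloader):
  """Call next_chunk() until the download reports done; return chunk count."""
  done = False
  chunks = 0
  while not done:
    # The first item is a progress object, ignored here (the file is small).
    _, done = downloader.next_chunk()
    chunks += 1
  return chunks

def download_from_bucket(gcs_service, bucket_name, object_name, local_path):
  """Download gs://bucket_name/object_name into local_path."""
  # Imported here so the helper above loads without the client library.
  from googleapiclient.http import MediaIoBaseDownload
  with open(local_path, 'wb') as f:
    request = gcs_service.objects().get_media(
        bucket=bucket_name, object=object_name)
    return drain(MediaIoBaseDownload(f, request))

# In Colab, after a successful upload:
# download_from_bucket(gcs_service, bucket_name,
#                      'to_upload.txt', '/tmp/downloaded_from_gcs.txt')
```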

Inspect the downloaded file.


In [42]:
!cat /tmp/downloaded_from_gcs.txt

cat: /tmp/downloaded_from_gcs.txt: No such file or directory
