# Notebook 3: Reading Files from Google Drive

**Session 2: The AI-Empowered Coder**  
*Generative AI for Scholarship — Harvard HDSI & FAS*

© 2026 President and Fellows of Harvard College. Licensed under CC BY-NC 4.0.

---

## What This Notebook Covers

Google Colab runs on Google's servers, not on your laptop. That means you
**cannot access files on your local hard drive** directly. But you can access
files stored in your **Google Drive**.

This notebook shows you how to:

1. **Mount** your Google Drive so Colab can see your files
2. **Navigate** the Drive file structure from Python
3. **Read and display an image** (testimage.png) from your Drive
4. **Understand the file path** mapping between Drive and Colab

### Before You Start

Make sure you have a file called **`testimage.png`** in the top level of your
Google Drive (the "My Drive" folder). Any PNG image will work — a photo,
a diagram, a screenshot, etc.

In [None]:
# ============================================================
# Run this cell first to increase font sizes for presentation.
# ============================================================

from IPython.display import HTML, display
display(HTML('''<style>
/* Larger fonts for presenting on a projector */
.text_cell_render, .markdown-cell-content, .rendered_html {
    font-size: 18px !important;
    line-height: 1.6 !important;
}
.CodeMirror pre, .monaco-editor .view-lines {
    font-size: 16px !important;
    line-height: 1.5 !important;
}
.output pre, .output_text pre, .output_area pre {
    font-size: 15px !important;
    line-height: 1.4 !important;
}
</style>'''))
print("Font sizes increased for presentation.")

---

## Step 1: Mount Google Drive

"Mounting" makes your Google Drive accessible as a folder inside Colab.
After mounting, all your Drive files appear under the path
`/content/drive/MyDrive/`.

**When you run this cell:**
- Colab will ask you to **authorize access** to your Google Drive
- A pop-up window will appear asking you to sign in with your Google account
- Click **Allow** to grant access
- This authorization is needed once per Colab session
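
The mount step can also be written defensively, so the same notebook does not crash when it runs outside Colab. The sketch below is one possible variant, not part of the course code. The helper name `mount_drive_if_in_colab` is hypothetical, and it assumes the default mount point `/content/drive`.

```python
import os


def mount_drive_if_in_colab(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return True if Drive is visible.

    Hypothetical helper for illustration only.
    """
    try:
        # The google.colab package only exists inside a Colab runtime
        from google.colab import drive
    except ImportError:
        return False

    my_drive = os.path.join(mount_point, "MyDrive")
    # Nothing to do if Drive is already mounted in this session
    if not os.path.isdir(my_drive):
        drive.mount(mount_point)
    return os.path.isdir(my_drive)


in_colab = mount_drive_if_in_colab()
print("Drive mounted:", in_colab)
```

Outside Colab the function returns `False` instead of raising an `ImportError`, which lets later cells decide whether to read from Drive or from a local folder.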

In [None]:
# ============================================================
# Mount Google Drive.
#
# After this runs, your Drive files are accessible at:
#   /content/drive/MyDrive/
#
# A pop-up will ask you to authorize access — click Allow.
# ============================================================

from google.colab import drive

drive.mount('/content/drive')

print("\nGoogle Drive is now mounted.")
print("Your files are at: /content/drive/MyDrive/")

---

## Step 2: Explore Your Drive Contents

Let's list the files and folders at the top level of your Drive
to confirm the mount worked and to see what's there.

### How the Path Mapping Works

| What you see in Google Drive | Path in Colab |
|------------------------------|---------------|
| My Drive (top level) | `/content/drive/MyDrive/` |
| My Drive → some_folder | `/content/drive/MyDrive/some_folder/` |
| My Drive → testimage.png | `/content/drive/MyDrive/testimage.png` |
| My Drive → Colab Notebooks | `/content/drive/MyDrive/Colab Notebooks/` |
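
The mapping in the table is a plain path join, which can be sketched as a small helper. The function name `drive_to_colab_path` and the file `data.csv` are hypothetical, used only for illustration, and the root assumes Drive was mounted at `/content/drive`.

```python
import posixpath

# Default location of "My Drive" after drive.mount('/content/drive')
DRIVE_ROOT = "/content/drive/MyDrive"


def drive_to_colab_path(*parts):
    """Join folder and file names as seen under "My Drive" into a Colab path."""
    return posixpath.join(DRIVE_ROOT, *parts)


print(drive_to_colab_path("testimage.png"))
print(drive_to_colab_path("Colab Notebooks"))
print(drive_to_colab_path("some_folder", "data.csv"))  # data.csv is a made-up example
```

Folder names that contain spaces, such as `Colab Notebooks`, need no escaping when the path is an ordinary Python string.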

In [None]:
# ============================================================
# List the files and folders at the top level of your Drive.
#
# os.listdir() returns a list of file/folder names.
# We sort them alphabetically for easier reading.
# ============================================================

import os

drive_path = '/content/drive/MyDrive'

# List everything in the top level of your Drive
contents = sorted(os.listdir(drive_path))

print(f"Files and folders in your Google Drive ({len(contents)} items):")
print("=" * 50)
for item in contents:
    full_path = os.path.join(drive_path, item)
    if os.path.isdir(full_path):
        print(f"  [folder]  {item}")
    else:
        # Show file size in KB
        size_kb = os.path.getsize(full_path) / 1024
        print(f"  [file]    {item}  ({size_kb:.1f} KB)")

### Check for testimage.png

Let's verify that `testimage.png` exists in your Drive before we try to open it.
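
If the file was uploaded into a subfolder by mistake, a recursive search can locate it. The following is a minimal sketch with a hypothetical helper, `find_files`. It is demonstrated on a temporary folder that stands in for Drive so that it runs anywhere. In Colab you would pass `/content/drive/MyDrive` as the root, and the search may be slow on a large Drive.

```python
import tempfile
from pathlib import Path


def find_files(root, filename):
    """Return sorted paths under root (including subfolders) named filename."""
    return sorted(Path(root).rglob(filename))


# Demo on a throwaway folder that stands in for /content/drive/MyDrive
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "some_folder"
    nested.mkdir()
    (nested / "testimage.png").write_bytes(b"")  # empty placeholder file
    matches = find_files(tmp, "testimage.png")
    # Express each match relative to the search root for readable output
    relative = [m.relative_to(tmp).as_posix() for m in matches]

print(relative)
```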

In [None]:
# ============================================================
# Check whether testimage.png exists in your Drive.
#
# If this says "NOT FOUND", make sure you uploaded testimage.png
# to the top level of your Google Drive (My Drive), not
# inside a subfolder.
# ============================================================

image_path = '/content/drive/MyDrive/testimage.png'

if os.path.exists(image_path):
    size_kb = os.path.getsize(image_path) / 1024
    print(f"FOUND: {image_path}")
    print(f"File size: {size_kb:.1f} KB")
else:
    print(f"NOT FOUND: {image_path}")
    print("\nPlease upload testimage.png to the top level of your Google Drive.")
    print("Go to drive.google.com, click '+ New' → 'File upload', and select your image.")

---

## Step 3: Read and Display the Image

We'll use two libraries to read and display the image:
- **PIL (Pillow)**: a widely used Python imaging library for opening image files (installed as `Pillow`, imported as `PIL`)
- **Matplotlib**: to display the image in the notebook

Both are pre-installed in Colab.
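
To see what Pillow reports about an image without needing Drive access, the sketch below builds a tiny image in memory, writes it out as PNG bytes, and reopens it. The 4×2 size and the red color are arbitrary choices for illustration.

```python
import io

from PIL import Image

# Build a tiny 4x2 red image in memory (no Drive access needed)
original = Image.new("RGB", (4, 2), color=(255, 0, 0))

# Write it out as PNG bytes, then reopen it the way Image.open() reads a file
buffer = io.BytesIO()
original.save(buffer, format="PNG")
buffer.seek(0)
reopened = Image.open(buffer)

print(reopened.format)  # file format detected from the data
print(reopened.size)    # Pillow reports size as (width, height)
print(reopened.mode)    # color mode, such as RGB or RGBA
```

This prints `PNG`, `(4, 2)`, and `RGB`. Note that Pillow's `size` is ordered (width, height), which is the reverse of the row-first order NumPy uses for array shapes.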

In [None]:
# ============================================================
# Read testimage.png from Google Drive and display it.
#
# PIL.Image.open() reads the image file.
# matplotlib's imshow() displays it in the notebook.
#
# We turn off the axis labels since they're just pixel
# coordinates and aren't meaningful for viewing a photo.
# ============================================================

from PIL import Image
import matplotlib.pyplot as plt

# Path to the image in Google Drive
image_path = '/content/drive/MyDrive/testimage.png'

# Open the image
img = Image.open(image_path)

# Print basic information about the image
print("Image loaded successfully!")
print(f"  Format: {img.format}")
print(f"  Size:   {img.size[0]} × {img.size[1]} pixels")
print(f"  Mode:   {img.mode}")

# Display the image
fig, ax = plt.subplots(figsize=(8, 6))
ax.imshow(img)
ax.set_title('testimage.png from Google Drive')
ax.axis('off')    # Hide pixel coordinate axes
plt.tight_layout()
plt.show()

---

## Step 4: Working with the Image Data

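Images in Python are just arrays of numbers. In an RGB image, each pixel has
values for its Red, Green, and Blue color channels, each ranging from 0 to 255.
PNG files can also be grayscale or carry a transparency (alpha) channel, which
changes the shape of the array.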

We can convert the image to a NumPy array to inspect or manipulate the
pixel values directly.
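
How the array looks depends on the image's color mode. The sketch below uses small synthetic images rather than `testimage.png` to show the shapes NumPy produces for grayscale (`L`), RGB, and RGBA images, and how converting to RGB gives a predictable three-channel array. The 4×2 size is an arbitrary choice for illustration.

```python
import numpy as np
from PIL import Image

# Record the array shape produced by each common color mode
shapes = {}
for mode in ("L", "RGB", "RGBA"):
    tiny = Image.new(mode, (4, 2))  # width=4, height=2
    shapes[mode] = np.array(tiny).shape

for mode, shape in shapes.items():
    print(mode, shape)

# Converting to RGB first always yields a (height, width, 3) array
rgb_array = np.array(Image.new("RGBA", (4, 2)).convert("RGB"))
print(rgb_array.shape, rgb_array.dtype)
```

A grayscale image yields a 2-D array with no channel axis, so indexing such as `arr[:, :, 0]` only works once the image has been converted to a mode with channels.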

In [None]:
# ============================================================
# Convert the image to a NumPy array and examine it.
#
# The array shape is (height, width, channels).
# For an RGB image, channels = 3 (Red, Green, Blue).
# Each value is an integer from 0 (no intensity) to 255 (max intensity).
#
# PNG files may be grayscale or include transparency, so we
# convert to RGB first to guarantee exactly three channels.
# ============================================================

import numpy as np

# Convert image to an RGB array (handles grayscale, palette, and RGBA PNGs)
img_array = np.array(img.convert('RGB'))

print("Image as NumPy array:")
print(f"  Shape: {img_array.shape}")
print(f"  Data type: {img_array.dtype}")
print(f"  Value range: {img_array.min()} to {img_array.max()}")

# Average color of the entire image
avg_r = img_array[:, :, 0].mean()
avg_g = img_array[:, :, 1].mean()
avg_b = img_array[:, :, 2].mean()
print(f"\nAverage pixel color (RGB): ({avg_r:.0f}, {avg_g:.0f}, {avg_b:.0f})")

---

## Why Google Drive and Not My Laptop?

A common question: **"Why can't I just read files from my laptop's hard drive?"**

The answer is that Google Colab runs on **Google's servers**, not on your
laptop. Your browser is just a window into a remote machine. When you write
`open('file.txt')`, Python looks on Google's server — not your laptop.

| File Location | Accessible from Colab? | How |
|---------------|----------------------|-----|
| Your laptop | No (not directly) | Upload to Colab or to Google Drive first |
| Google Drive | Yes | `drive.mount()` then read from `/content/drive/MyDrive/` |
| Colab session | Yes (temporary) | Files in `/content/` — deleted when session ends |
| Web URL | Yes | `!wget URL` or `requests.get(URL)` |

For **persistent storage** of your research data in Colab, use Google Drive:
files saved there remain available after the session ends.
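
Because files under `/content/` disappear with the session, results worth keeping need to be copied to Drive before the session ends. Here is a minimal sketch of that step, assuming a hypothetical helper `copy_to_folder` and a hypothetical output folder named `colab_outputs`. It is demonstrated on temporary folders so that it runs without Drive.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_folder(source_file, dest_folder):
    """Copy source_file into dest_folder (created if missing); return the new path."""
    dest_folder = Path(dest_folder)
    dest_folder.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source_file, dest_folder))


# Demo: temporary folders stand in for /content/ and /content/drive/MyDrive/
with tempfile.TemporaryDirectory() as tmp:
    session_file = Path(tmp) / "results.txt"
    session_file.write_text("example output\n")
    fake_drive = Path(tmp) / "MyDrive" / "colab_outputs"
    saved = copy_to_folder(session_file, fake_drive)
    saved_ok = saved.exists() and saved.read_text() == "example output\n"
    saved_name = saved.name

print(saved_name, saved_ok)
```

In Colab, with Drive mounted, the equivalent call would look like `copy_to_folder('/content/results.txt', '/content/drive/MyDrive/colab_outputs')`.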

---

## Try It: Use Gemini to Help with Image Analysis

Click on the empty cell below, use the **magic wand**, and try one of these prompts:

- **"Convert this image to grayscale and display the original and grayscale side by side"**
- **"Create a histogram showing the distribution of pixel brightness values"**
- **"Resize the image to 200×200 pixels and save it as thumbnail.jpg in Google Drive"**

In [None]:
# Use the magic wand to generate image analysis code here



---

## Summary

| Operation | Code |
|-----------|------|
| Mount Google Drive | `from google.colab import drive; drive.mount('/content/drive')` |
| Path to Drive files | `/content/drive/MyDrive/your_file.ext` |
| List Drive contents | `os.listdir('/content/drive/MyDrive')` |
| Open an image | `Image.open('/content/drive/MyDrive/testimage.png')` |
| Display an image | `plt.imshow(img); plt.axis('off'); plt.show()` |
| Image to array | `np.array(img)` — shape is (height, width, 3) for RGB |

**Key takeaway:** Colab cannot see your laptop's files. Use Google Drive
as your file storage when working in Colab.