# GCS ↔ Google Drive Sync Utility

Mount Google Drive and authenticate to a GCS bucket so that caches and checkpoints can be copied between the two.

**Use cases:**
- Copy KD caches from Colab (Drive) to TPU (GCS)
- Download TPU training results (GCS) to Drive for Colab access
- Back up data from one storage system to the other

In [None]:
# ============================================================
# 1. MOUNT GOOGLE DRIVE
# ============================================================

from google.colab import drive
drive.mount('/content/drive')

# Standard paths
DRIVE_ROOT = '/content/drive/MyDrive'
DRIVE_CACHES = f'{DRIVE_ROOT}/qwen3_caches'
DRIVE_RUNS = f'{DRIVE_ROOT}/qwen3_runs'
DRIVE_NOTEBOOKS = f'{DRIVE_ROOT}/notebooks'

# Create the directories if they do not exist
!mkdir -p {DRIVE_CACHES} {DRIVE_RUNS} {DRIVE_NOTEBOOKS}

print(f'Drive caches: {DRIVE_CACHES}')
print(f'Drive runs: {DRIVE_RUNS}')

In [None]:
# ============================================================
# 2. AUTHENTICATE GCS
# ============================================================

from google.colab import auth
auth.authenticate_user()

# GCS bucket config
GCS_PROJECT = 'nodal-seer-483020-h6'
GCS_BUCKET = 'anemll_tpu'

GCS_ROOT = f'gs://{GCS_BUCKET}'
GCS_CACHES = f'{GCS_ROOT}/qwen3_caches'
GCS_RUNS = f'{GCS_ROOT}/qwen3_runs'

# Set project
!gcloud config set project {GCS_PROJECT}

print(f'GCS caches: {GCS_CACHES}')
print(f'GCS runs: {GCS_RUNS}')

In [None]:
# ============================================================
# 3. VERIFY ACCESS
# ============================================================

print('=== Testing GCS Access ===')
!gsutil ls {GCS_ROOT}/ 2>&1 | head -5

print('\n=== Testing Drive Access ===')
!ls {DRIVE_ROOT}/ 2>&1 | head -5

---
## List Contents

In [None]:
# ============================================================
# LIST GCS CACHES
# ============================================================

print('=== GCS Caches ===')
!gsutil ls {GCS_CACHES}/

In [None]:
# ============================================================
# LIST GCS RUNS
# ============================================================

print('=== GCS Runs ===')
!gsutil ls {GCS_RUNS}/

In [None]:
# ============================================================
# LIST DRIVE CACHES
# ============================================================

print('=== Drive Caches ===')
!ls -la {DRIVE_CACHES}/

In [None]:
# ============================================================
# LIST DRIVE RUNS
# ============================================================

print('=== Drive Runs ===')
!ls -la {DRIVE_RUNS}/

In [None]:
# ============================================================
# LIST DRIVE NOTEBOOKS
# ============================================================

print('=== Drive Notebooks ===')
!ls -la {DRIVE_NOTEBOOKS}/

In [None]:
# ============================================================
# DETAILED GCS CACHE INFO (with sizes)
# ============================================================

print('=== GCS Cache Sizes ===')
!gsutil du -sh {GCS_CACHES}/*

In [None]:
# ============================================================
# DETAILED DRIVE CACHE INFO (with sizes)
# ============================================================

print('=== Drive Cache Sizes ===')
!du -sh {DRIVE_CACHES}/*
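
The two size listings above are easier to compare as numbers than as human-readable strings. The following is a minimal sketch, not part of the original notebook: `parse_gsutil_du` and `local_dir_bytes` are hypothetical helper names, and the parser assumes each line has the form `<bytes> <gs://url>`, which is what `gsutil du -s` prints when the `-h` flag is left out. The local side sums file sizes directly because `du` on a mounted filesystem may also count directory entries.

```python
import os


def parse_gsutil_du(text: str) -> dict[str, int]:
    """Map each folder name to its size in bytes.

    Assumption: every useful line looks like "<bytes> <gs://url>",
    as printed by `gsutil du -s` without -h.
    """
    sizes: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and anything else
        size, url = parts
        # Keep only the last path component, e.g. the cache name.
        name = url.strip().rstrip("/").rsplit("/", 1)[-1]
        sizes[name] = int(size)
    return sizes


def local_dir_bytes(path: str) -> int:
    """Sum the sizes of all files below `path` (directories not counted)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            total += os.path.getsize(os.path.join(dirpath, filename))
    return total
```

In a notebook cell the GCS listing can be captured with `out = !gsutil du -s {GCS_CACHES}/*` and passed in as `parse_gsutil_du('\n'.join(out))`. Each entry can then be compared with `local_dir_bytes(f'{DRIVE_CACHES}/{name}')`.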

---
## Copy Operations

In [None]:
# ============================================================
# COPY: GCS → DRIVE (Cache)
# ============================================================

# Set the cache name to copy
CACHE_NAME = 'openhermes_2.5_L128_K128_N50K'  # <-- EDIT THIS

print(f'Copying {CACHE_NAME} from GCS to Drive...')
!mkdir -p {DRIVE_CACHES}/{CACHE_NAME}
!gsutil -m cp -r {GCS_CACHES}/{CACHE_NAME}/* {DRIVE_CACHES}/{CACHE_NAME}/

print('\nDone! Verifying...')
!ls -la {DRIVE_CACHES}/{CACHE_NAME}/ | head -10

In [None]:
# ============================================================
# COPY: DRIVE → GCS (Cache)
# ============================================================

# Set the cache name to copy
CACHE_NAME = 'alpaca_chat_think_both_L128_K128_R1024'  # <-- EDIT THIS

print(f'Copying {CACHE_NAME} from Drive to GCS...')
!gsutil -m cp -r {DRIVE_CACHES}/{CACHE_NAME}/* {GCS_CACHES}/{CACHE_NAME}/

print('\nDone! Verifying...')
!gsutil ls {GCS_CACHES}/{CACHE_NAME}/ | head -10

In [None]:
# ============================================================
# COPY: GCS → DRIVE (Run/Checkpoint)
# ============================================================

# Set the run name to copy
RUN_NAME = 'SR-008B-stage1-mlp-hermes'  # <-- EDIT THIS

print(f'Copying {RUN_NAME} from GCS to Drive...')
!mkdir -p {DRIVE_RUNS}/{RUN_NAME}
!gsutil -m cp -r {GCS_RUNS}/{RUN_NAME}/* {DRIVE_RUNS}/{RUN_NAME}/

print('\nDone! Verifying...')
!ls -la {DRIVE_RUNS}/{RUN_NAME}/

In [None]:
# ============================================================
# COPY: DRIVE → GCS (Run/Checkpoint)
# ============================================================

# Set the run name to copy
RUN_NAME = 'anemll_q4_a4_e2e_v2'  # <-- EDIT THIS

print(f'Copying {RUN_NAME} from Drive to GCS...')
!gsutil -m cp -r {DRIVE_RUNS}/{RUN_NAME}/* {GCS_RUNS}/{RUN_NAME}/

print('\nDone! Verifying...')
!gsutil ls {GCS_RUNS}/{RUN_NAME}/
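
The copy cells above print `Done!` whether or not the `gsutil` line succeeded, because a failing `!` command does not stop the cell. One way to make failures visible is to run the command from Python and check its exit code. This is a minimal sketch with a hypothetical helper name (`run_step`). It uses `shell=True` so that wildcards behave as they do with `!`, and it captures output, so nothing is shown until the command finishes.

```python
import subprocess


def run_step(command: str) -> bool:
    """Run a shell command and report whether it exited with status 0.

    Hypothetical helper, not part of the original notebook.
    """
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    print(result.stdout, end="")
    if result.returncode != 0:
        # Show the error text so the cause is visible in the cell output.
        print(result.stderr, end="")
        print(f"FAILED (exit code {result.returncode}): {command}")
        return False
    print(f"OK: {command}")
    return True
```

A copy cell could then guard its message, for example `if run_step(f'gsutil -m cp -r {DRIVE_RUNS}/{RUN_NAME}/* {GCS_RUNS}/{RUN_NAME}/'): print('Done!')`.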

---
## Sync Operations (rsync-style)

In [None]:
# ============================================================
# SYNC ALL CACHES: GCS → DRIVE
# ============================================================
# Only copies new/changed files

print('Syncing all caches from GCS to Drive...')
!gsutil -m rsync -r {GCS_CACHES}/ {DRIVE_CACHES}/
print('Done!')

In [None]:
# ============================================================
# SYNC ALL CACHES: DRIVE → GCS
# ============================================================
# Only copies new/changed files

print('Syncing all caches from Drive to GCS...')
!gsutil -m rsync -r {DRIVE_CACHES}/ {GCS_CACHES}/
print('Done!')

In [None]:
# ============================================================
# SYNC ALL RUNS: GCS → DRIVE
# ============================================================
# Only copies new/changed files

print('Syncing all runs from GCS to Drive...')
!gsutil -m rsync -r {GCS_RUNS}/ {DRIVE_RUNS}/
print('Done!')
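
The sync cells above call `gsutil rsync` without `-d`, so they never delete anything at the destination. Before a large sync it can help to preview the transfer with `-n` (dry run). The sketch below only builds the command string; `rsync_command` is a hypothetical helper, and its default of `dry_run=True` is a design choice made here, not something the original notebook does.

```python
def rsync_command(src: str, dst: str, dry_run: bool = True, delete: bool = False) -> str:
    """Build a `gsutil -m rsync` command line as a string.

    Hypothetical helper, not part of the original notebook.
    """
    flags = ["-r"]  # recurse into subfolders, as in the cells above
    if dry_run:
        flags.append("-n")  # list what would be copied without copying
    if delete:
        flags.append("-d")  # also delete destination files missing from src
    return f"gsutil -m rsync {' '.join(flags)} {src} {dst}"
```

In a cell, `cmd = rsync_command(f'{GCS_RUNS}/', f'{DRIVE_RUNS}/')` followed by `!{cmd}` prints the planned transfers. Passing `dry_run=False` produces the same command the cells above run.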

---
## Delete Operations (Use with caution!)

In [None]:
# ============================================================
# DELETE FROM GCS
# ============================================================

# UNCOMMENT AND EDIT TO DELETE
# DELETE_PATH = 'old_cache_name'
# !gsutil -m rm -r {GCS_CACHES}/{DELETE_PATH}

print('Uncomment the lines above to delete from GCS')

In [None]:
# ============================================================
# DELETE FROM DRIVE
# ============================================================

# UNCOMMENT AND EDIT TO DELETE
# DELETE_PATH = 'old_cache_name'
# !rm -rf {DRIVE_CACHES}/{DELETE_PATH}

print('Uncomment the lines above to delete from Drive')
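
Both delete cells build the target as `<root>/{DELETE_PATH}`. If `DELETE_PATH` is left empty, the target is the whole caches folder and the command removes every cache. A small guard can reject such values before anything is deleted. This is a sketch under the assumption that only a single top-level folder name should ever be deleted; `delete_target` is a hypothetical helper.

```python
def delete_target(root: str, name: str) -> str:
    """Return `<root>/<name>` for a single folder name, or raise ValueError.

    Hypothetical helper, not part of the original notebook. It refuses
    empty names, '.', '..', nested paths, and wildcards.
    """
    cleaned = name.strip().strip("/")
    if not cleaned or cleaned in (".", "..") or "/" in cleaned or "*" in cleaned:
        raise ValueError(f"refusing to build a delete path from {name!r}")
    return f"{root.rstrip('/')}/{cleaned}"
```

For example, `target = delete_target(GCS_CACHES, DELETE_PATH)` followed by `!gsutil -m rm -r {target}` fails early on an empty name instead of deleting the whole folder.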

---
## Quick Reference

### GCS Commands
```bash
# List bucket
gsutil ls gs://anemll_tpu/

# List with sizes
gsutil du -sh gs://anemll_tpu/qwen3_caches/*

# Copy folder
gsutil -m cp -r SOURCE DEST

# Sync (only new/changed)
gsutil -m rsync -r SOURCE DEST

# Delete
gsutil -m rm -r PATH
```

### Paths
| Location | Caches | Runs |
|----------|--------|------|
| GCS | `gs://anemll_tpu/qwen3_caches/` | `gs://anemll_tpu/qwen3_runs/` |
| Drive | `/content/drive/MyDrive/qwen3_caches/` | `/content/drive/MyDrive/qwen3_runs/` |
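
Because the caches and runs folders use the same names on both sides, a path on one side can be translated to the other by swapping the root. The sketch below assumes the two roots from the table; `gcs_to_drive` and `drive_to_gcs` are hypothetical helpers. The Drive `notebooks` folder has no GCS counterpart in this notebook, so translating it is only a string operation.

```python
GCS_ROOT = "gs://anemll_tpu"
DRIVE_ROOT = "/content/drive/MyDrive"


def gcs_to_drive(url: str) -> str:
    """Translate a gs:// URL under GCS_ROOT to the matching Drive path."""
    if not url.startswith(GCS_ROOT + "/"):
        raise ValueError(f"not under {GCS_ROOT}: {url!r}")
    return DRIVE_ROOT + url[len(GCS_ROOT):]


def drive_to_gcs(path: str) -> str:
    """Translate a Drive path under DRIVE_ROOT to the matching gs:// URL."""
    if not path.startswith(DRIVE_ROOT + "/"):
        raise ValueError(f"not under {DRIVE_ROOT}: {path!r}")
    return GCS_ROOT + path[len(DRIVE_ROOT):]
```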

---
## Save Notebooks to Drive

In [None]:
# ============================================================
# SAVE THIS NOTEBOOK TO GOOGLE DRIVE
# ============================================================

NOTEBOOK_NAME = 'GCS_Drive_Sync.ipynb'

# Copy from Colab runtime to Drive
!cp /content/qwen3_apple_style_2bit_qat_lora/notebooks/{NOTEBOOK_NAME} {DRIVE_NOTEBOOKS}/
print(f'Saved {NOTEBOOK_NAME} to {DRIVE_NOTEBOOKS}/')

In [None]:
# ============================================================
# SAVE ANY NOTEBOOK TO GOOGLE DRIVE
# ============================================================

# List available notebooks
print('=== Available Notebooks ===')
!ls /content/qwen3_apple_style_2bit_qat_lora/notebooks/*.ipynb

# Copy a specific notebook (uncomment the lines below and edit the name)
# NOTEBOOK_TO_SAVE = 'Generate_KD_Cache_Qwen32B_All3.ipynb'  # <-- EDIT THIS
# !cp /content/qwen3_apple_style_2bit_qat_lora/notebooks/{NOTEBOOK_TO_SAVE} {DRIVE_NOTEBOOKS}/
# print(f'Saved {NOTEBOOK_TO_SAVE} to {DRIVE_NOTEBOOKS}/')

In [None]:
# ============================================================
# SAVE ALL NOTEBOOKS TO GOOGLE DRIVE
# ============================================================

print('Copying all notebooks to Google Drive...')
!cp /content/qwen3_apple_style_2bit_qat_lora/notebooks/*.ipynb {DRIVE_NOTEBOOKS}/

print('\n=== Notebooks in Drive ===')
!ls -la {DRIVE_NOTEBOOKS}/*.ipynb
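
The `cp` cells above assume the repository is already cloned under `/content`. If it is not, `cp` fails, and the only sign is an error message in the cell output. A Python version can fail loudly and report exactly what was copied. This is a minimal sketch; `copy_notebooks` is a hypothetical helper, and it copies file contents only (no metadata).

```python
import shutil
from pathlib import Path


def copy_notebooks(src_dir: str, dst_dir: str) -> list[str]:
    """Copy every *.ipynb file in src_dir to dst_dir and return their names.

    Hypothetical helper, not part of the original notebook.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"notebook source folder not found: {src}")
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for notebook in sorted(src.glob("*.ipynb")):
        shutil.copyfile(notebook, dst / notebook.name)  # contents only
        copied.append(notebook.name)
    return copied
```

In this notebook it would be called as `copy_notebooks('/content/qwen3_apple_style_2bit_qat_lora/notebooks', DRIVE_NOTEBOOKS)`.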