# Google Drive connection

In [None]:
from google.colab import drive
drive.mount('/content/drive')


### Complete Linux sandbox

In [2]:
# Current directory
!pwd
# List contents
!ls
# Root
!ls /
# Home
!ls ~/
# Copy: cp, make directory: mkdir, ...

/content
drive  sample_data
bin			    datalab  kaggle  libx32		       opt	   run	 tmp
boot			    dev      lib     media		       proc	   sbin  tools
content			    etc      lib32   mnt		       python-apt  srv	 usr
cuda-keyring_1.0-1_all.deb  home     lib64   NGC-DL-CONTAINER-LICENSE  root	   sys	 var


In [3]:
!ls /content/

drive  sample_data


In [4]:
!ls ~/

### Ready with Python, TensorFlow, Keras, ...

In [5]:
!python --version
# import <tab>

Python 3.10.12


# Kaggle API
### After downloading the key from kaggle.com (My Account -> Create new API token) to Google Drive:
The cell below finds kaggle.json (the Kaggle API credentials) on Google Drive, downloads it in chunks to a directory on the Colab instance while reporting progress, and restricts the file permissions.  
Authenticating your Google account is part of the process (once per session).
This enables the Kaggle API in Colab without uploading the file manually.

In [6]:
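import io, os
from google.colab import auth
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

# Authenticate the Google account (once per session)
auth.authenticate_user()
drive_service = build('drive', 'v3')

# Search Google Drive for kaggle.json
results = drive_service.files().list(
        q="name = 'kaggle.json'", fields="files(id)").execute()
kaggle_api_key = results.get('files', [])
if not kaggle_api_key:
    raise FileNotFoundError("kaggle.json not found on Google Drive")

# Note: this is /.kaggle in the filesystem root, not ~/.kaggle
filename = "/.kaggle/kaggle.json"
os.makedirs(os.path.dirname(filename), exist_ok=True)

# Download the file in chunks and report progress
request = drive_service.files().get_media(fileId=kaggle_api_key[0]['id'])
with io.FileIO(filename, 'wb') as fh:
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))

# Owner read/write only; the mode must be octal (0o600), not decimal 600
os.chmod(filename, 0o600)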

Download 100%.


In [None]:
# Check that the file exists
!ls /.kaggle
# cat prints the API key: clear this cell's output before sharing the notebook
!cat /.kaggle/kaggle.json

In [8]:
# Notice the difference in path names with/without ~
!mkdir -p ~/.kaggle
!cp /.kaggle/kaggle.json ~/.kaggle/

In [9]:
# Install the kaggle Python package (provides the kaggle command)
!pip install kaggle



### Check if connection is successful

In [11]:
# Create the .kaggle directory in the home directory if it doesn't exist
!mkdir -p /root/.kaggle

# Move the kaggle.json file to the correct location
!mv /.kaggle/kaggle.json /root/.kaggle/kaggle.json

# Set the correct permissions for the kaggle.json file
!chmod 600 /root/.kaggle/kaggle.json
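
Before calling the API, the credentials file can be sanity-checked locally. The following is a minimal sketch, not part of the original notebook: `check_kaggle_credentials` is a hypothetical helper that only inspects the file (existence, permissions, and the `username` and `key` fields that kaggle.json contains). It does not contact kaggle.com, so an empty result means the file looks right, not that the key is valid.

```python
import json
import os
import stat


def check_kaggle_credentials(path):
    """Return a list of problems found with a kaggle.json file (empty if none)."""
    if not os.path.isfile(path):
        return ["file not found: " + path]

    problems = []

    # The file should not be accessible by group or others (chmod 600)
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append("permissions %o are too open; expected 600" % mode)

    # The file should be a JSON object with non-empty 'username' and 'key'
    try:
        with open(path) as fh:
            data = json.load(fh)
    except (OSError, ValueError) as exc:
        return problems + ["cannot parse JSON: %s" % exc]
    if not isinstance(data, dict):
        return problems + ["expected a JSON object"]
    for field in ("username", "key"):
        if not data.get(field):
            problems.append("missing field: " + field)
    return problems


# Prints [] when the file in the home directory looks usable
print(check_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json")))
```

The function returns messages instead of raising, so the cell can be rerun after each fix.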


### Download MNIST-like fashion data from Zalando Research

In [14]:
!mkdir -p /content/kaggle

In [15]:
!kaggle datasets download -d zalando-research/fashionmnist -p /content/kaggle

Dataset URL: https://www.kaggle.com/datasets/zalando-research/fashionmnist
License(s): other
Downloading fashionmnist.zip to /content/kaggle
 99% 68.0M/68.8M [00:04<00:00, 19.8MB/s]
100% 68.8M/68.8M [00:04<00:00, 14.5MB/s]


See what you got

In [16]:
!ls /content/kaggle

fashionmnist.zip


Oooh. A zip file. Let's unzip it.

In [17]:
!unzip /content/kaggle/*.zip -d /content/kaggle/
!ls /content/kaggle/

Archive:  /content/kaggle/fashionmnist.zip
  inflating: /content/kaggle/fashion-mnist_test.csv  
  inflating: /content/kaggle/fashion-mnist_train.csv  
  inflating: /content/kaggle/t10k-images-idx3-ubyte  
  inflating: /content/kaggle/t10k-labels-idx1-ubyte  
  inflating: /content/kaggle/train-images-idx3-ubyte  
  inflating: /content/kaggle/train-labels-idx1-ubyte  
fashion-mnist_test.csv	 fashionmnist.zip	 t10k-labels-idx1-ubyte   train-labels-idx1-ubyte
fashion-mnist_train.csv  t10k-images-idx3-ubyte  train-images-idx3-ubyte


Go mad using Python
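
As a starting point, the `*-ubyte` files from the archive can be read with the standard library alone. This is a rough sketch, assuming the files follow the usual IDX layout (two zero bytes, a type byte, a dimension count, then big-endian 32-bit dimension sizes followed by the raw data). `read_idx` is a hypothetical helper name, and it only handles the unsigned-byte type.

```python
import os
import struct


def read_idx(path):
    """Read an unsigned-byte IDX file and return (dims, data)."""
    with open(path, "rb") as fh:
        # Header: two zero bytes, data type code, number of dimensions
        zero, dtype, ndim = struct.unpack(">HBB", fh.read(4))
        if zero != 0 or dtype != 0x08:
            raise ValueError("not an unsigned-byte IDX file: " + path)
        # One big-endian 32-bit size per dimension
        dims = struct.unpack(">" + "I" * ndim, fh.read(4 * ndim))
        data = fh.read()

    # The payload must hold exactly one byte per element
    expected = 1
    for size in dims:
        expected *= size
    if len(data) != expected:
        raise ValueError("expected %d bytes, found %d" % (expected, len(data)))
    return dims, data


# Only runs where the unzipped file from the cell above exists
labels_path = "/content/kaggle/t10k-labels-idx1-ubyte"
if os.path.exists(labels_path):
    dims, labels = read_idx(labels_path)
    print(dims, list(labels[:10]))
```

For the image files, `dims` has three entries (number of images, rows, columns), so image `i` occupies `data[i * rows * cols:(i + 1) * rows * cols]`.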

## Colab tips

In [18]:
# Check which GPU you got
!nvidia-smi -L

GPU 0: Tesla T4 (UUID: GPU-75f90a7a-8b76-4e38-5649-01e2141675d7)
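
The same check can be done from Python, for example to skip GPU-only cells on a CPU runtime. This is a minimal sketch: `parse_gpu_list` and `list_gpus` are hypothetical helpers, and the regular expression assumes the `GPU <index>: <name> (UUID: <uuid>)` line format shown in the output above.

```python
import re
import shutil
import subprocess

# Matches lines such as "GPU 0: Tesla T4 (UUID: GPU-...)"
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: ([^)]+)\)\s*$")


def parse_gpu_list(text):
    """Parse the output of `nvidia-smi -L` into a list of dicts."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append({
                "index": int(match.group(1)),
                "name": match.group(2),
                "uuid": match.group(3),
            })
    return gpus


def list_gpus():
    """Return the visible GPUs, or [] when nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"],
                            capture_output=True, text=True)
    return parse_gpu_list(result.stdout)


print(list_gpus())
```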


In [19]:
# Check current resource allocation
!nvidia-smi

Tue Sep  3 19:10:54 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [20]:
# CPU type
!lscpu |grep 'Model name'

Model name:                           Intel(R) Xeon(R) CPU @ 2.20GHz


In [21]:
# CPU information
!lscpu

Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   2
  On-line CPU(s) list:    0,1
Vendor ID:                GenuineIntel
  Model name:             Intel(R) Xeon(R) CPU @ 2.20GHz
    CPU family:           6
    Model:                79
    Thread(s) per core:   2
    Core(s) per socket:   1
    Socket(s):            1
    Stepping:             0
    BogoMIPS:             4399.99
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 cl
                          flush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc re
                          p_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3
                           fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand
                           hypervisor lahf_lm abm 3dnowprefetch i

## Time-consuming calculations
- Connect to Google Drive
- Save checkpoints
- Test on something manageable
- https://saturncloud.io/blog/how-to-save-a-tensorflow-checkpoint-file-from-google-colaboratory-when-using-tpu-mode/
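
The checkpoint idea from the list above can be sketched without any framework. This is a minimal illustration, not a TensorFlow recipe: `save_checkpoint`, `load_checkpoint` and `run` are hypothetical names, the state is a small JSON-serializable dict, and the loop body is a stand-in for real work. Writing to a temporary file and renaming it avoids leaving a half-written checkpoint on a local filesystem; whether the rename is equally safe on the mounted Drive folder is an assumption.

```python
import json
import os
import tempfile


def save_checkpoint(state, path):
    """Write state as JSON via a temporary file, then rename it into place."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or default when no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as fh:
        return json.load(fh)


def run(total_steps, path, save_every=10):
    """Resume from the last checkpoint and continue up to total_steps."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    for step in range(state["step"], total_steps):
        state["total"] += step  # stand-in for one unit of real work
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state
```

With Drive mounted, a call such as `run(1000, "/content/drive/MyDrive/checkpoints/state.json")` (a hypothetical path) would pick up where it left off after a session restart.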