# Knowledge Distillation: Colab Runner

This notebook installs dependencies, runs a quick smoke test to validate the pipeline, and verifies the presence of the dataset under `Data/`.

Run the cells in order. For full teacher and student training, switch to a GPU runtime.
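
Before starting a long training run, it can help to confirm that the runtime actually exposes a GPU. The cell below is a minimal sketch that relies only on the standard library and assumes the NVIDIA `nvidia-smi` tool is on the `PATH` when a GPU runtime is attached; the helper name `gpu_visible` is hypothetical and not part of this repository.

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which('nvidia-smi')
    if exe is None:
        # No NVIDIA tooling found, so assume a CPU-only runtime.
        return False
    try:
        # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: ...".
        out = subprocess.run([exe, '-L'], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return out.returncode == 0 and 'GPU' in out.stdout

print('GPU visible:', gpu_visible())
```

If this prints `False`, the smoke test should still run, but full training is likely to be slow.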

In [ ]:
# 1) Install pinned dependencies
!pip install -q -r requirements.txt

In [ ]:
# 2) Run the quick smoke test (fast, validates end-to-end pipeline)
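import subprocess, sys
print('Running smoke_test.py...')
# Capture the script's output so it is shown in the notebook, and report the exit code
result = subprocess.run([sys.executable, 'smoke_test.py'], capture_output=True, text=True)
print(result.stdout, result.stderr, sep='\n')
print('Exit code:', result.returncode)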

### 3) Verify dataset files
This cell lists the files under `Data/` and, if any CSV is present, previews the first five rows of the first one it finds.

In [ ]:
import os, pandas as pd
proj = '.'
data_root = os.path.join(proj, 'Data')
print('Data folder exists:', os.path.exists(data_root))
if os.path.exists(data_root):
    for root, dirs, files in os.walk(data_root):
        rel = os.path.relpath(root, proj)
        print(f'\nFolder: {rel}')
        for f in sorted(files):
            print('  -', f)
    # show small preview of first csv found
    csvs = []
    for root, dirs, files in os.walk(data_root):
        for f in files:
            if f.lower().endswith('.csv'):
                csvs.append(os.path.join(root, f))
    if csvs:
        print('\nPreview of', csvs[0])
        display(pd.read_csv(csvs[0], nrows=5))
else:
    print('No Data/ folder found. Upload Data/ into repo or mount Drive and copy files.')

### 4) (Optional) Run a safe single-epoch quick test on the real data
Uncomment and modify the next cell if you want to run a one-epoch check (keep the batch size small). It may take longer than the smoke test and may need a GPU for reasonable speed.

In [ ]:
# Example quick test (commented). Edit TEXT_COL/LABEL_COL if needed.
# Runs in the notebook kernel; a `!python - <<"PY"` heredoc does not span lines in a notebook cell.
# from pathlib import Path
# import pandas as pd
# proj = '.'
# train = Path(proj)/'Data'/'Tamil_codemix'/'tam_train.csv'
# if train.exists():
#     df = pd.read_csv(train)
#     print('Train rows:', len(df))
# else:
#     print('Train file not found, skip quick test')
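
Before a one-epoch run, a cheap sanity check is to confirm that the text and label columns exist and to look at the label balance in the first rows. The sketch below uses only the standard library. The column names `text` and `label` are placeholders (the real names in the CSVs are not stated here), and `label_counts` is a hypothetical helper, not a function from this repository.

```python
import csv
from collections import Counter
from pathlib import Path

TEXT_COL = 'text'    # hypothetical column name, edit to match your CSV
LABEL_COL = 'label'  # hypothetical column name, edit to match your CSV

def label_counts(csv_path, text_col=TEXT_COL, label_col=LABEL_COL, limit=1000):
    """Count labels in the first `limit` rows; raise early if a column is missing."""
    with open(csv_path, newline='', encoding='utf-8') as fh:
        reader = csv.DictReader(fh)
        missing = {text_col, label_col} - set(reader.fieldnames or [])
        if missing:
            raise KeyError(f'Missing columns: {sorted(missing)}')
        counts = Counter()
        for i, row in enumerate(reader):
            if i >= limit:
                break
            counts[row[label_col]] += 1
    return counts

# Only runs when the training file is present.
train = Path('Data') / 'Tamil_codemix' / 'tam_train.csv'
if train.exists():
    print(label_counts(train))
else:
    print('Train file not found, skip column check')
```

A `KeyError` here usually means `TEXT_COL` or `LABEL_COL` needs editing before the training cell will work.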

If you store datasets externally (Google Drive, Zenodo, S3), add a new cell to mount Drive and copy files into `/content/Knowledge-distillation-Codemix/Data/` before running the preview cells.
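
For the Google Drive case, such a cell could look like the sketch below. It assumes the dataset sits in a Drive folder at the path held in `DRIVE_DATA`, which is a made-up location you will need to change; `copy_dataset` is a hypothetical helper. The `google.colab` import is guarded so the cell does nothing outside Colab.

```python
import shutil
from pathlib import Path

def copy_dataset(src, dst):
    """Copy the folder tree at src into dst and return the copied file paths."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f'Source folder not found: {src}')
    # dirs_exist_ok requires Python 3.8+; existing files in dst are overwritten.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob('*') if p.is_file())

# Assumed source location in Drive (hypothetical, adjust to your layout).
DRIVE_DATA = '/content/drive/MyDrive/Knowledge-distillation-Codemix/Data'
# Target folder named in the text above.
REPO_DATA = '/content/Knowledge-distillation-Codemix/Data'

try:
    from google.colab import drive  # only importable inside Colab
except ImportError:
    drive = None

if drive is not None:
    drive.mount('/content/drive')
    for name in copy_dataset(DRIVE_DATA, REPO_DATA):
        print('  -', name)
```

After the copy finishes, rerun the dataset verification cell to confirm the files are visible under `Data/`.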