# ML Training & Evaluation with MLflow and DVC

## Prerequisites

### DVC + Google Drive authentication

**Option 1 (Simple):** DVC prompts for browser authentication when you run `dvc pull`.

**Option 2 (Service Account):**
1. Create a service account in the [Google Cloud Console](https://console.cloud.google.com/).
2. Download its JSON key and upload it to `/content/service-key.json`.
3. Share your Google Drive folder with the service account's email address.
4. Copy the folder ID from the Drive URL: `drive.google.com/drive/folders/YOUR_FOLDER_ID`
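
Steps 3 and 4 can be scripted to avoid copy-paste mistakes. The following is a minimal sketch, not part of the template repository: `extract_folder_id` and `service_account_email` are hypothetical helper names, and the sketch assumes the key file is a standard service account JSON key, which stores the account's address under `client_email`.

```python
import json
import re
from urllib.parse import urlparse

# Hypothetical helpers (not part of the template repo).
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(url_or_id: str) -> str:
    """Return the folder ID from a Drive folder URL, or the input if it is already a bare ID."""
    text = url_or_id.strip()
    if "/" not in text:
        return text  # no path separators: treat it as a bare ID
    # urlparse only splits off the path when a scheme is present
    path = urlparse(text if "://" in text else "https://" + text).path
    match = _FOLDER_RE.search(path)
    if match is None:
        raise ValueError(f"No folder ID found in: {url_or_id!r}")
    return match.group(1)


def service_account_email(key_path: str) -> str:
    """Read the address to share the Drive folder with from the JSON key file."""
    with open(key_path, encoding="utf-8") as fh:
        return json.load(fh)["client_email"]


print(extract_folder_id("drive.google.com/drive/folders/YOUR_FOLDER_ID"))
```

Calling `service_account_email('/content/service-key.json')` after uploading the key gives the address to enter in the Drive sharing dialog.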

### Setting up codebase and DVC

In [2]:
import os

GITHUB_ACCOUNT = input('Insert GitHub account: ')
GITHUB_REPO = input('Insert GitHub repo: ')
GDRIVE_FOLDER_ID = input('Insert Google Drive folder ID: ')
SERVICE_ACCOUNT_KEY_PATH = '/content/service-key.json'

Insert GitHub account: pavelihno
Insert GitHub repo: colab-dvc-template
Insert Google Drive folder ID: 1Oax82eCmH7CN_e2eoW8bADkdGOdBji8R


In [3]:
%%capture
!pip install "dvc[gdrive]" mlflow pyyaml
!git clone "https://github.com/{GITHUB_ACCOUNT}/{GITHUB_REPO}.git"

In [4]:
%cd "$GITHUB_REPO/"

/content/colab-dvc-template


In [5]:
%%capture
!dvc remote add -d -f gdrive "gdrive://$GDRIVE_FOLDER_ID"
!dvc remote modify gdrive gdrive_use_service_account true
!dvc remote modify gdrive gdrive_service_account_json_file_path "$SERVICE_ACCOUNT_KEY_PATH"
!dvc pull
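
The cell above always enables service account authentication. As a hedged sketch of how both options from the prerequisites could be supported, the hypothetical helper `dvc_remote_commands` below (not part of the template repository) builds the same DVC CLI calls but adds the service account settings only when a key file is actually present. Otherwise it leaves the remote in its default mode, where `dvc pull` asks for browser authentication.

```python
import os
import shlex


def dvc_remote_commands(folder_id, key_path=None, remote="gdrive"):
    """Return the DVC CLI calls as argument lists (hypothetical helper)."""
    cmds = [["dvc", "remote", "add", "-d", "-f", remote, f"gdrive://{folder_id}"]]
    # Option 2: only switch to the service account if the key file exists
    if key_path and os.path.isfile(key_path):
        cmds.append(["dvc", "remote", "modify", remote,
                     "gdrive_use_service_account", "true"])
        cmds.append(["dvc", "remote", "modify", remote,
                     "gdrive_service_account_json_file_path", key_path])
    cmds.append(["dvc", "pull"])
    return cmds


# Preview the commands without running them
for cmd in dvc_remote_commands("YOUR_FOLDER_ID", "/content/service-key.json"):
    print(shlex.join(cmd))
```

To execute rather than preview, each argument list can be passed to `subprocess.run(cmd, check=True)` from inside the cloned repository.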

### Import libraries

In [6]:
%cd "/content/$GITHUB_REPO/src"

/content/colab-dvc-template/src


In [7]:
import tensorflow as tf

from train import train_model
from eval import evaluate_model
from utils import load_config, generate_sample_data
from model.sample import SampleNeuralNetwork

### Training and evaluating model

In [8]:
config = load_config('../configs/default.yaml')

run_id, model_path = train_model(config)

print('\nTraining completed!')
print(f'Run ID: {run_id}')
print(f'Model saved to: {model_path}')

2025/12/26 15:56:43 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2025/12/26 15:56:43 INFO mlflow.store.db.utils: Updating database tables
2025/12/26 15:56:43 INFO alembic.runtime.migration: Context impl SQLiteImpl.
2025/12/26 15:56:43 INFO alembic.runtime.migration: Will assume non-transactional DDL.
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running upgrade  -> 451aebb31d03, add metric step
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running upgrade 451aebb31d03 -> 90e64c465722, migrate user column to tags
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running upgrade 90e64c465722 -> 181f10493468, allow nulls for metric values
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running upgrade 181f10493468 -> df50e92ffc5e, Add Experiment Tags Table
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running upgrade df50e92ffc5e -> 7ac759974ad8, Update run tags with larger limit
2025/12/26 15:56:43 INFO alembic.runtime.migration: Running 

Generating training data...
Saving sample data...
Data saved to: /content/colab-dvc-template/data
Creating model...
Model architecture:


Training model...
Epoch 1/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - accuracy: 0.3547 - loss: 1.2077 - val_accuracy: 0.4700 - val_loss: 0.8966
Epoch 2/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.4416 - loss: 0.9354 - val_accuracy: 0.5900 - val_loss: 0.7072
Epoch 3/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.5791 - loss: 0.7117 - val_accuracy: 0.7100 - val_loss: 0.5778
Epoch 4/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7100 - loss: 0.5853 - val_accuracy: 0.7950 - val_loss: 0.4921
Epoch 5/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8077 - loss: 0.4975 - val_accuracy: 0.8550 - val_loss: 0.4340
Epoch 6/50
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8673 - loss: 0.4230 - val_accuracy: 0.8900 - val_loss: 0.3934
Epoch 7/50
[1m25/25