# Installation

In [None]:
! pip install flordb

# Getting Started

We start by selecting (or creating) a `git` repository to save our model training code as we iterate and experiment. Flor automatically commits your changes on every run, so no change is lost. Below we provide a sample repository you can use to follow along:

In [None]:
!git clone git@github.com:ucbepic/ml_tutorial ../ml_tutorial

In [2]:
import os
os.chdir('../ml_tutorial/')

Run the `train.py` script to train a small linear model and test your `flordb` installation.

In [None]:
! python train.py --flor myFirstRun

Flor manages checkpoints, logs, command-line arguments, code changes, and other experiment metadata on each run (more details [below](#storage--data-layout)). All of this data is then exposed to the user via SQL or Pandas queries.


# View your experiment history
From the same directory where you ran the examples above, open an IPython terminal, then load and pivot the log records.


In [3]:
from flor import full_pivot, log_records
df = full_pivot(log_records())

df.head()

STEPPING IN b6208b62f23ff3312364f3d1a3a7ebba0425be47


| | projid | runid | tstamp | vid | epoch | step | loss | epochs | lr | hidden | batch_size |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ml_tutorial_flor.shadow.readme | myFirstRun | 2023-07-20T11:51:45 | b6208b62f23ff3312364f3d1a3a7ebba0425be47 | 1 | 100 | 0.4003249704837799 | 5 | 0.001 | 500 | 32 |
| 1 | ml_tutorial_flor.shadow.readme | myFirstRun | 2023-07-20T11:51:45 | b6208b62f23ff3312364f3d1a3a7ebba0425be47 | 1 | 200 | 0.3539811670780182 | 5 | 0.001 | 500 | 32 |
| 2 | ml_tutorial_flor.shadow.readme | myFirstRun | 2023-07-20T11:51:45 | b6208b62f23ff3312364f3d1a3a7ebba0425be47 | 1 | 300 | 0.4992929399013519 | 5 | 0.001 | 500 | 32 |
| 3 | ml_tutorial_flor.shadow.readme | myFirstRun | 2023-07-20T11:51:45 | b6208b62f23ff3312364f3d1a3a7ebba0425be47 | 1 | 400 | 0.2370115667581558 | 5 | 0.001 | 500 | 32 |
| 4 | ml_tutorial_flor.shadow.readme | myFirstRun | 2023-07-20T11:51:45 | b6208b62f23ff3312364f3d1a3a7ebba0425be47 | 1 | 500 | 0.1678403466939926 | 5 | 0.001 | 500 | 32 |
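
Since the pivoted result is displayed as a regular pandas DataFrame, the usual pandas operations should apply to it. The sketch below does not call Flor: it rebuilds a small stand-in frame by hand from the rows above (a subset of the columns) and summarizes the logged loss per run and epoch. The `summary` name is ours, and the column names are taken from the output shown.

```python
import pandas as pd

# Stand-in for the DataFrame returned by full_pivot(log_records()),
# typed in by hand from the rows displayed above (subset of columns).
df = pd.DataFrame(
    {
        "runid": ["myFirstRun"] * 5,
        "epoch": [1] * 5,
        "step": [100, 200, 300, 400, 500],
        "loss": [
            0.4003249704837799,
            0.3539811670780182,
            0.4992929399013519,
            0.2370115667581558,
            0.1678403466939926,
        ],
        "hidden": [500] * 5,
    }
)

# Lowest and mean logged loss for each (run, epoch) pair.
summary = (
    df.groupby(["runid", "epoch"])["loss"]
    .agg(["min", "mean"])
    .reset_index()
)
print(summary)
```

Once several runs have been logged, the same `groupby` gives one row per run and epoch, which makes runs easy to compare side by side.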


# Run some more experiments
The `train.py` script has been prepared in advance to define and manage four different hyper-parameters:

In [4]:
%cat train.py | grep flor.arg

hidden_size = flor.arg("hidden", default=500)
num_epochs = flor.arg("epochs", 5)
batch_size = flor.arg("batch_size", 32)
learning_rate = flor.arg("lr", 1e-3)
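
To make the pattern concrete, here is a minimal sketch of a hyper-parameter that falls back to a default unless it is overridden on the command line. The `arg` helper below is hypothetical and written only for illustration; it is not Flor's implementation, and it assumes that each flag is passed as `--name value`.

```python
import sys


def arg(name, default, argv=None):
    """Return the value following `--name` in argv, cast to type(default).

    Hypothetical helper for illustration only; this is not Flor's code.
    """
    argv = sys.argv[1:] if argv is None else argv
    flag = f"--{name}"
    if flag in argv:
        idx = argv.index(flag)
        if idx + 1 < len(argv):
            # Cast the raw string to the default's type (int, float, ...).
            return type(default)(argv[idx + 1])
    return default


# Simulated command line: only `hidden` is overridden.
cli = ["--hidden", "75"]
hidden_size = arg("hidden", 500, cli)
learning_rate = arg("lr", 1e-3, cli)
print(hidden_size, learning_rate)  # 75 0.001
```

Casting to the type of the default keeps `hidden` an integer and `lr` a float without any extra declarations.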


You can control any of the hyper-parameters (e.g. `hidden`) using Flor's command-line interface:

In [6]:
! python train.py --flor mySecondRun --hidden 75

Epoch [1/5], Step [100/1875], Loss: 0.7720
Epoch [1/5], Step [200/1875], Loss: 0.5216
Epoch [1/5], Step [300/1875], Loss: 0.2941
Epoch [1/5], Step [400/1875], Loss: 0.3155
Epoch [1/5], Step [500/1875], Loss: 0.5852
Epoch [1/5], Step [600/1875], Loss: 0.3453
Epoch [1/5], Step [700/1875], Loss: 0.1697
Epoch [1/5], Step [800/1875], Loss: 0.2625
Epoch [1/5], Step [900/1875], Loss: 0.2256
Epoch [1/5], Step [1000/1875], Loss: 0.3984
Epoch [1/5], Step [1100/1875], Loss: 0.6364
Epoch [1/5], Step [1200/1875], Loss: 0.3515
Epoch [1/5], Step [1300/1875], Loss: 0.1398
Epoch [1/5], Step [1400/1875], Loss: 0.1809
Epoch [1/5], Step [1500/1875], Loss: 0.2526
Epoch [1/5], Step [1600/1875], Loss: 0.0254
Epoch [1/5], Step [1700/1875], Loss: 0.0561
Epoch [1/5], Step [1800/1875], Loss: 0.0543
Epoch [2/5], Step [100/1875], Loss: 0.0632
Epoch [2/5], Step [200/1875], Loss: 0.2124
Epoch [2/5], Step [300/1875], Loss: 0.1795
Epoch [2/5], Step [400/1875], Loss: 0.1249
Epoch [2/5], Step [500/1875], Loss: 0.0871
...

### Advanced (Optional): Batch Processing
Alternatively, we can call `flor.batch()` from an interactive environment
inside our model training repository to dispatch a group of potentially long-running jobs:

In [None]:
import flor

# Cross product of five hidden sizes (100..500) and two learning rates.
jobs = flor.cross_prod(hidden=[i * 100 for i in range(1, 6)], lr=(1e-4, 1e-3))
assert jobs is not None
flor.batch(jobs)
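
For a sense of how many jobs such a grid describes, the sketch below enumerates the same hyper-parameter combinations with the standard library's `itertools.product`. It does not call Flor, and the list-of-dicts layout is an assumption made for illustration, not the format `flor.cross_prod` returns.

```python
from itertools import product

# Same grid as in the cell above.
grid = {
    "hidden": [i * 100 for i in range(1, 6)],  # 100, 200, ..., 500
    "lr": (1e-4, 1e-3),
}

# One dict per combination; this layout is illustrative, not Flor's job format.
names = list(grid)
combos = [dict(zip(names, values)) for values in product(*grid.values())]

print(len(combos))  # 10
print(combos[0])    # {'hidden': 100, 'lr': 0.0001}
```

Five hidden sizes times two learning rates gives ten combinations, so the grid grows multiplicatively with every hyper-parameter added.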