# WindmillAI 5-minute start

In this codelab we'll train a toy nonlinear regression
using [WindmillAI](http://www.WindmillAI.com) as the
experiment orchestrator. If you're not familiar with Google Colab, just follow the instructions and click the play button on the left side of each code cell to work your way down the codelab. If you're viewing this notebook on GitHub rather than in Google Colab, [open it in Colab](https://colab.research.google.com/github/WindmillAI/codelabs/blob/main/codelabs/quickstart/notebook.ipynb).



As a first step we'll install and import the [WindmillAI client](https://github.com/windmillAI/windmillaipy) and some common machine learning libraries.

In [None]:
!pip install git+https://github.com/WindmillAI/windmillaipy.git

In [None]:
import io
import numpy as np
import matplotlib.pyplot as pl
import tensorflow as tf
import tqdm
import windmillaipy.client as wm

For the next step you'll need to have signed up for the [WindmillAI](http://www.WindmillAI.com) service. Just click the sign-up button in the top right corner. Don't worry, it's free and only takes a few seconds.

Once you have signed up you'll need to retrieve an API key from [your profile page](https://www.windmillai.com/profile). API keys are secrets specific to you, and you can create and delete them on your profile page as needed.

In your own code you can manage the key like any other secret, or export it as the `WINDMILLAI_API_KEY` environment variable; the client picks that up automatically when you initialize it with no arguments.
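
As a minimal sketch of the environment-variable route, the snippet below looks up `WINDMILLAI_API_KEY` and falls back to an interactive prompt. The helper name `resolve_api_key` is hypothetical and not part of the WindmillAI client; only the variable name comes from the text above.

```python
import getpass
import os
from typing import Optional

ENV_VAR = 'WINDMILLAI_API_KEY'  # variable name taken from the text above


def resolve_api_key(explicit_key: Optional[str] = None) -> str:
  """Return an API key from an argument, the environment, or a prompt.

  Hypothetical helper for illustration; not part of the WindmillAI client.
  """
  if explicit_key:
    return explicit_key
  env_key = os.environ.get(ENV_VAR, '')
  if env_key:
    return env_key
  # Last resort: ask interactively without echoing the key to the screen.
  return getpass.getpass(f'{ENV_VAR}: ')
```

The returned string can be passed as `api_key=` when constructing the client, which keeps the key itself out of the notebook source.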

Enter your API key in the text box below and then create a client associated with your account:

In [None]:
API_KEY = ''  # @param{type: 'string'}

In [None]:
client: wm.WindmillClient = wm.WindmillClient(api_key=API_KEY)

In this next block of code we're going to write a data generator and a tiny deep neural network that learns a sine wave from noisy data. You can skip over this if you're just interested in the WindmillAI-specific parts.

In [None]:
def data_generator(num_samples: int = 512):
  'Generate (x, y) data.'
  
  x = np.random.uniform(0, 25, size=num_samples)
  y = np.sin(x) + np.random.normal(scale=0.5, size=num_samples)
  
  return x, y
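
Because the generator adds Gaussian noise with scale 0.5 to sin(x), even a perfect regressor cannot push the mean squared error much below 0.5² = 0.25. The sketch below estimates that floor empirically. It uses the standard-library `random` module instead of NumPy so that it stays self-contained, and the function name `noise_floor_mse` is hypothetical.

```python
import math
import random


def noise_floor_mse(num_samples: int = 100_000, sigma: float = 0.5,
                    seed: int = 0) -> float:
  """Estimate the MSE of the true curve sin(x) against noisy samples."""
  rng = random.Random(seed)
  total = 0.0
  for _ in range(num_samples):
    x = rng.uniform(0, 25)
    y = math.sin(x) + rng.gauss(0.0, sigma)  # same recipe as data_generator
    total += (y - math.sin(x)) ** 2          # error of a perfect regressor
  return total / num_samples


print(f'estimated floor: {noise_floor_mse():.3f}, sigma^2: {0.5 ** 2:.3f}')
```

This gives a reference point for reading the loss curve: a training loss hovering near 0.25 means the model has captured about as much of the signal as the noise allows.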

In [None]:
def plot_result(x, y, regression=None, fileish=None):
  'Plot (x, y) data samples and overlay regression (if available).'

  pl.figure(figsize=(10, 5))

  if regression:
    xr = np.linspace(0, 25, 51)
    xr = np.expand_dims(xr, axis=1)
    yr = regression(xr)
    xr = np.squeeze(xr)
    yr = np.squeeze(yr)

  pl.subplot(1, 2, 1)
  pl.scatter(x, y)
  if regression:
    pl.plot(xr, yr, color='red', linewidth=8)

  pl.subplot(1, 2, 2)
  pl.hexbin(x, y, gridsize=20)
  if regression:
    pl.plot(xr, yr, color='red', linewidth=8)

  if fileish:
    pl.savefig(fileish, format='png')
  else:
    pl.show()

  pl.close()

In [None]:
x, y = data_generator(1000)

plot_result(x, y)

In [None]:
class Learner(tf.keras.Model):
  def __init__(self, network_width=25, num_hidden_layers=1):
    super(Learner, self).__init__()

    self.hidden_layers = []
    for _ in range(num_hidden_layers):
      self.hidden_layers.append(tf.keras.layers.Dense(units=network_width, activation=tf.nn.relu))

    self.out = tf.keras.layers.Dense(units=1, activation=None)

  def call(self, x):
    net = x  # Keras has already converted x to a float tensor; no NumPy round trip needed
    for layer in self.hidden_layers:
      net = layer(net)
    return self.out(net)
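
To get a feel for how small this network is, the sketch below counts the trainable parameters of such a stack of `Dense` layers. It assumes one input feature per sample, as in the calls made in this codelab; the function name `count_parameters` is hypothetical.

```python
def count_parameters(network_width: int = 25, num_hidden_layers: int = 1,
                     input_dim: int = 1) -> int:
  """Count weights and biases in a stack of Dense layers ending in 1 unit."""
  total = 0
  fan_in = input_dim
  for _ in range(num_hidden_layers):
    total += fan_in * network_width + network_width  # kernel + bias
    fan_in = network_width
  total += fan_in * 1 + 1  # output layer: kernel + bias
  return total


print(count_parameters())                     # 76
print(count_parameters(num_hidden_layers=2))  # 726
```

With the defaults the model has only 76 parameters, which is why it trains quickly even without a GPU. Once the model has been called on data, the figure should agree with Keras's own `learner.count_params()`.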

We'll now create an experiment on WindmillAI. You'll be able to monitor and archive training progress below.

In [None]:
wu: wm.WorkUnit = client.create_experiment('quickstart-codelab',
                                           tags=['codelab', 'quickstart'])

print(f'Experiment page at http://www.windmillai.com/experiment/{wu.xid}')

This part contains the training loop, which may take a few minutes to run. Experiment progress is reported to WindmillAI as training proceeds; use the link above to see what is being archived.

In [None]:
NUM_EPOCHS = 100000

learner = Learner()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)

wu.add_diary_entry('Learning sin(x) + N(0, sigma)')

for epoch in tqdm.tqdm(range(NUM_EPOCHS)):
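  xs, ys = data_generator()
  # Add a trailing feature axis so both arrays have shape (num_samples, 1).
  xs = np.expand_dims(xs, axis=1)
  ys = np.expand_dims(ys, axis=1)

  # Record only the forward pass and the loss on the tape.
  with tf.GradientTape() as tape:
    yp = learner(xs, training=True)
    loss = tf.reduce_mean(tf.square(ys - yp))

  # Compute and apply gradients once the tape context has closed.
  gradients = tape.gradient(loss, learner.trainable_variables)
  optimizer.apply_gradients(zip(gradients, learner.trainable_variables))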

  if epoch % 1000 == 0 or epoch == NUM_EPOCHS-1:
    wu.record_measurements([{
        'label': 'loss',
        'steps': epoch,
        'value': float(np.mean(loss)),
        }])

  if epoch % 10000 == 0 or epoch == NUM_EPOCHS-1:
    img = io.BytesIO()
    plot_result(x, y, learner, img)
    wu.create_artifact(f'regression_at_{epoch}.png', img.getvalue())
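
The two `if` conditions in the loop decide how much is sent to the experiment page. As a quick check, this sketch lists the steps each condition selects; the helper `logged_steps` is hypothetical and only mirrors the loop's arithmetic.

```python
NUM_EPOCHS = 100000  # same value as in the training cell


def logged_steps(every: int, num_epochs: int = NUM_EPOCHS) -> list:
  """Steps where `epoch % every == 0 or epoch == num_epochs - 1` holds."""
  return [epoch for epoch in range(num_epochs)
          if epoch % every == 0 or epoch == num_epochs - 1]


loss_steps = logged_steps(1000)
plot_steps = logged_steps(10000)
print(len(loss_steps), loss_steps[:3], loss_steps[-2:])
print(len(plot_steps), plot_steps[-2:])
```

So a full run records 101 loss measurements and uploads 11 regression plots, with the final step (99999) always included.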

We can quickly plot the results here, but you can also see this plot (and similar ones from throughout training) on the WindmillAI experiment page.

In [None]:
plot_result(x, y, learner)

Finally, we mark the experiment complete. If we wanted to save this model, we would checkpoint the graph and upload it as an artifact first.

In [None]:
wu.complete_experiment()
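
Had we wanted to keep the trained model, one option is to serialize its weights to bytes and upload them before marking the experiment complete. The sketch below is not WindmillAI's prescribed method: it saves only the weights as an `.npz` archive rather than a full TensorFlow checkpoint, the helper names are hypothetical, and it assumes `create_artifact` accepts arbitrary bytes in the same way it accepted the PNG images during training.

```python
import io

import numpy as np


def weights_to_npz_bytes(weights) -> bytes:
  """Pack a list of NumPy arrays into an in-memory .npz archive."""
  buffer = io.BytesIO()
  np.savez(buffer, *weights)  # stored as arr_0, arr_1, ... in list order
  return buffer.getvalue()


def npz_bytes_to_weights(payload: bytes) -> list:
  """Inverse of weights_to_npz_bytes: recover the arrays in order."""
  with np.load(io.BytesIO(payload)) as archive:
    return [archive[f'arr_{i}'] for i in range(len(archive.files))]


# Stand-in arrays shaped like the default Learner's weights; in the notebook
# these would come from learner.get_weights().
demo_weights = [np.ones((1, 25)), np.zeros(25), np.ones((25, 1)), np.zeros(1)]
payload = weights_to_npz_bytes(demo_weights)
restored = npz_bytes_to_weights(payload)
# Assumed upload call, mirroring the PNG upload in the training loop, and
# made before complete_experiment():
# wu.create_artifact('learner_weights.npz', payload)
```

The restored list can be handed to `set_weights` on a freshly built `Learner` with the same architecture.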

Typically we wouldn't use this type of instrumentation in a Python notebook, but it illustrates the general flow: WindmillAI augments your ML code so you can focus on the models rather than on how to keep track of data and monitor jobs running in the cloud.