In [None]:
# Copyright 2023 Golem Factory GmbH
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [None]:
# This example is a modification of the "Linear Regression with a Real Dataset" notebook from the Machine Learning Crash Course created by Google, available here: https://github.com/google/eng-edu/blob/main/ml/cc/exercises/linear_regression_with_a_real_dataset.ipynb
# This exercise is made available under the Apache License Version 2.0. You may obtain a copy of the License here: http://www.apache.org/licenses/LICENSE-2.0
#
# The data set used in this example contains data drawn from the 1990 U.S. Census. It is also available as part of the Machine Learning Crash Course created by Google, here: https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv
# A description of the data set is available here: https://developers.google.com/machine-learning/crash-course/california-housing-data-description

## Jupyter on Golem - Goerli Testnet Example

Let's start by typing the %help command to display all Jupyter on Golem magic commands.

In [None]:
%help

Now let's check our wallet address and available funds with the %status command.

In [None]:
%status

If you do not have any tGLM and tETH in your Jupyter on Golem wallet yet, you need to fund it first. To do that, use the %fund command.

In [None]:
%fund goerli

Run the %status command again to verify that your funds have arrived.

In [None]:
%status

Looks like we have some tGLM and tETH from the Golem Faucet. We are ready to start some real fun with Jupyter on Golem!

Are you worried that your tokens will be stuck in the Jupyter on Golem wallet after some playtime? Do not worry, they won't be. If you decide to take them back, you can always export the private key using the Yagna CLI. More information can be found here: https://handbook.golem.network/payments/using-golem-on-mainnet/backing-up-your-golem-wallet.

OK, now is the time to allocate some GLM tokens for future computations. Let's start with 2 GLM.

In [None]:
%budget goerli 2

With the allocation done, let's look for a provider. Nothing too fancy: 4 GB of RAM, 10 GB of disk space and 2 cores will be more than enough for this example. By default, the %connect command has a 10-minute timeout. If you cannot connect within this time, just try again, optionally with a higher timeout (e.g. timeout=15m). Once your provider is up and running, you will see a "Ready." message.

In [None]:
%connect mem>4 disk>10 cores>2

Now we should be on the provider's host. Let's verify that with a few commands.

In [None]:
pwd

In [None]:
ls -l

In [None]:
%status

Yep! We definitely are not in Kansas anymore :)

Let's upload a data set to the provider so that we can run some computations on it. Just remember that the data set must be stored locally first.
You can get the data set for this example (california_housing_train.csv) from here: https://github.com/golemfactory/golem-kernel-python/blob/master/examples/california_housing_train.csv.
Make sure that the file is visible in the left JupyterLab panel!

In [None]:
%upload california_housing_train.csv

Great! Our data set should now be on the provider's host. Keep in mind that the %upload command puts files into the ./workdir folder. Let's go there and check.

In [None]:
cd workdir

In [None]:
ls -l

You should be able to see the california_housing_train.csv file in the workdir folder. Let's go back.

In [None]:
cd ..
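
Since %upload places files in ./workdir, the cells that follow refer to the data set by a path relative to the directory we started in. Here is a minimal sketch of that path handling, using only the standard library. The helper name `uploaded_path` is hypothetical and is not part of Jupyter on Golem.

```python
from pathlib import Path


def uploaded_path(filename, workdir="workdir"):
    """Return the relative path at which an uploaded file is expected."""
    # Assumption: uploads land in ./workdir, as described above.
    return Path(workdir) / filename


csv_path = uploaded_path("california_housing_train.csv")

# Report whether the file is present before trying to read it.
if csv_path.is_file():
    print(f"Found {csv_path}")
else:
    print(f"{csv_path} is missing; run %upload first")
```

A check like this can save a confusing error later, when pandas tries to read a file that was never uploaded.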

It is time to start our computations. Let's import some pre-installed modules first. It might take a moment.

In [None]:
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt

Time to start some Data Analytic magic!

In [None]:
# The following lines adjust the granularity of reporting. 
pd.options.display.max_rows = 10
pd.options.display.float_format = "{:.1f}".format
training_df = pd.read_csv(filepath_or_buffer="workdir/california_housing_train.csv")
# Scale the label.
training_df["median_house_value"] /= 1000.0

In [None]:
def build_model(my_learning_rate):
  """Create and compile a simple linear regression model."""
  # Most simple tf.keras models are sequential.
  model = tf.keras.models.Sequential()

  # Describe the topography of the model.
  # The topography of a simple linear regression model
  # is a single node in a single layer.
  model.add(tf.keras.layers.Dense(units=1, 
                                  input_shape=(1,)))

  # Compile the model topography into code that TensorFlow can efficiently
  # execute. Configure training to minimize the model's mean squared error. 
  model.compile(optimizer=tf.keras.optimizers.experimental.RMSprop(learning_rate=my_learning_rate),
                loss="mean_squared_error",
                metrics=[tf.keras.metrics.RootMeanSquaredError()])

  return model        


def train_model(model, df, feature, label, epochs, batch_size):
  """Train the model by feeding it data."""

  # Feed the model the feature and the label.
  # The model will train for the specified number of epochs. 
  history = model.fit(x=df[feature],
                      y=df[label],
                      batch_size=batch_size,
                      epochs=epochs)

  # Gather the trained model's weight and bias.
  trained_weight = model.get_weights()[0][0][0]
  trained_bias = model.get_weights()[1][0]

  # The list of epochs is stored separately from the rest of history.
  epochs = history.epoch
  
  # Isolate the error for each epoch.
  hist = pd.DataFrame(history.history)

  # To track the progression of training, we're going to take a snapshot
  # of the model's root mean squared error at each epoch. 
  rmse = hist["root_mean_squared_error"]

  return trained_weight, trained_bias, epochs, rmse
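
To make the two functions above more concrete: a single Dense unit with one input computes y = weight * x + bias, and the root mean squared error is the square root of the mean squared difference between predictions and labels. The sketch below reproduces that arithmetic in plain Python on made-up numbers. The toy data and helper names are assumptions for illustration, not values from the data set.

```python
import math


def linear_predict(weight, bias, xs):
    """Apply the linear model y = weight * x + bias to each input."""
    return [weight * x + bias for x in xs]


def root_mean_squared_error(predictions, labels):
    """Root mean squared error between predictions and labels."""
    squared_errors = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up toy data: three inputs and their labels.
toy_features = [1.0, 2.0, 3.0]
toy_labels = [3.0, 5.0, 7.0]

# A model with weight 2 and bias 1 fits the toy data exactly.
perfect = linear_predict(2.0, 1.0, toy_features)
print(root_mean_squared_error(perfect, toy_labels))

# A model with weight 2 and bias 0 is off by 1 on every example.
offset = linear_predict(2.0, 0.0, toy_features)
print(root_mean_squared_error(offset, toy_labels))
```

Training searches for the weight and bias that push this error as low as possible on the real data.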

The functions below will plot our results and save them as .png files.

In [None]:
def plot_and_save_the_model(trained_weight, trained_bias, feature, label):
  """Plot the trained model against 200 random training examples."""

  # Label the axes.
  plt.xlabel(feature)
  plt.ylabel(label)

  # Create a scatter plot from 200 random points of the dataset.
  random_examples = training_df.sample(n=200)
  plt.scatter(random_examples[feature], random_examples[label])

  # Create a red line representing the model. The red line starts
  # at coordinates (x0, y0) and ends at coordinates (x1, y1).
  x0 = 0
  y0 = trained_bias
  x1 = random_examples[feature].max()
  y1 = trained_bias + (trained_weight * x1)  
  plt.plot([x0, x1], [y0, y1], c='r')

  # Save the figure into workdir, then render the scatter plot and the red line.
  plt.savefig('workdir/model_results.png')
  plt.show()


def plot_and_save_the_loss_curve(epochs, rmse):
  """Plot a curve of loss vs. epoch."""

  plt.figure()
  plt.xlabel("Epoch")
  plt.ylabel("Root Mean Squared Error")

  plt.plot(epochs, rmse, label="Loss")
  plt.legend()
  plt.ylim([rmse.min()*0.97, rmse.max()])
  plt.savefig('workdir/loss_results.png')
  plt.show()  
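
The red line drawn by plot_and_save_the_model is fully determined by the learned weight and bias: it runs from x = 0 to the largest sampled feature value. Here is a small sketch of that endpoint computation, with hypothetical numbers standing in for trained values.

```python
def line_endpoints(weight, bias, x_max):
    """Return ((x0, y0), (x1, y1)) for the line y = weight * x + bias."""
    x0, y0 = 0.0, bias
    x1, y1 = x_max, bias + weight * x_max
    return (x0, y0), (x1, y1)


# Hypothetical values, not results from an actual training run.
start, end = line_endpoints(weight=0.5, bias=100.0, x_max=1000.0)
print(start, end)
```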

And now let's ask our provider to do some "heavy" lifting for us :)

In [None]:
# The following variables are the hyperparameters.
learning_rate = 0.01
epochs = 30
batch_size = 30

# Specify the feature and the label.
my_feature = "total_rooms"  # the total number of rooms on a specific city block.
my_label = "median_house_value"  # the median value of a house on a specific city block.
# That is, you're going to create a model that predicts house value based 
# solely on total_rooms.  

# Discard any pre-existing version of the model.
my_model = None

# Invoke the functions.
my_model = build_model(learning_rate)
weight, bias, epochs, rmse = train_model(my_model, training_df, 
                                         my_feature, my_label,
                                         epochs, batch_size)

print("\nThe learned weight for your model is %.4f" % weight)
print("The learned bias for your model is %.4f\n" % bias )
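
The batch_size and epochs hyperparameters together determine how many weight updates the model performs. Here is a rough sketch of that arithmetic, assuming the training CSV holds 17,000 rows. That row count is an assumption; check len(training_df) to confirm it.

```python
import math


def updates_per_epoch(num_examples, batch_size):
    """Number of batches (and thus weight updates) in one epoch."""
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(num_examples / batch_size)


assumed_rows = 17000  # assumption; confirm with len(training_df)
per_epoch = updates_per_epoch(assumed_rows, batch_size=30)
total_updates = per_epoch * 30  # 30 epochs, as configured above
print(per_epoch, total_updates)
```

A smaller batch size means more updates per epoch, and a larger one means fewer.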

It is time to plot our results to see what they look like.

In [None]:
plot_and_save_the_model(weight, bias, my_feature, my_label)

In [None]:
plot_and_save_the_loss_curve(epochs, rmse)

Looks nice. Let's download them to our local machine so that we won't lose them!

In [None]:
%download model_results.png

In [None]:
%download loss_results.png

OK. That was cool, but what if I want to use some other pip packages that are not pre-installed?

That is simple. Just use the %pip install command. Let's take this opportunity to add some color to this terminal.

In [None]:
%pip install colorama

In [None]:
from colorama import Fore
print(Fore.BLUE + 'Jupyter on Golem is great!')
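
Under the hood, colorama's Fore constants are ANSI escape sequences, so a color stays active until it is reset. Here is a minimal sketch using the raw escape codes. The helper name `blue` is hypothetical. With colorama installed, Style.RESET_ALL serves the same purpose as the reset code used here.

```python
BLUE = "\x1b[34m"   # same sequence as colorama's Fore.BLUE
RESET = "\x1b[0m"   # same sequence as colorama's Style.RESET_ALL


def blue(text):
    """Wrap text in blue and reset the color afterwards."""
    return f"{BLUE}{text}{RESET}"


print(blue("Jupyter on Golem is great!"))
print("This line is back to the default color.")
```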

We are getting close to the end of this example. It is high time to disconnect from our provider and initiate the payment.

In [None]:
%disconnect

And that is it! We hope you liked Jupyter on Golem. Please consider giving us feedback.