# Test cases requiring or benefiting from the context of a notebook

If the notebook runs successfully from start to finish, the test is successful!

TODO(all): Add additional tests and/or tests with particular assertions, as we encounter Python package version incompatibilities not currently detected by these tests.

In general, only add test cases here that require the context of a notebook. This is because this notebook, as currently written, will abort at the **first** failure. Compare this to a proper test suite where all cases are run, giving much more information about the full extent of any problems encountered.
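To make the contrast above concrete, here is a minimal sketch of a runner that executes every check and collects failures instead of stopping at the first one. The helper `run_all_checks` and both example checks are hypothetical and are not part of this notebook.

```python
import traceback


def run_all_checks(checks):
    """Run every (name, callable) pair and return a dict of failures.

    Unlike sequential notebook cells, a failing check does not stop the rest.
    """
    failures = {}
    for name, check in checks:
        try:
            check()
        except Exception:  # record the failure and keep going
            failures[name] = traceback.format_exc()
    return failures


def check_json_import():
    import json  # stdlib module, expected to succeed
    assert json.loads("[1, 2]") == [1, 2]


def check_missing_package():
    # Hypothetical package name, chosen so that the import fails.
    import a_package_that_does_not_exist


failures = run_all_checks([
    ("json", check_json_import),
    ("missing", check_missing_package),
])
print(sorted(failures))
```

Each entry in `failures` holds the full traceback, so a single run reports every problem rather than only the first.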

# Package versions

In [None]:
!pip3 freeze

# Test cases requiring the context of a notebook 

## Test package installations

NOTE: installing packages via `%pip` installs them into the running kernel - no kernel restart needed.
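As a complement to parsing `pip3 show` output, the following sketch asks the running interpreter directly whether a distribution is visible to it. It assumes Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` and the nonexistent distribution name are hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# 'pip' is queried only because it is usually present in a Python environment.
print(installed_version("pip"))
# A made-up distribution name, expected to be absent.
print(installed_version("a-distribution-that-does-not-exist"))
```

Because the lookup runs inside the kernel process, it should reflect what the kernel can see rather than what a separate `pip3` subprocess reports.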

In [None]:
import sys

In [None]:
sys.path

In [None]:
!env | grep PIP

### Install a package we do not anticipate already being installed on the base image

In [None]:
output = !pip3 show rich
print(output)  # Should show not yet installed.
assert(0 == output.count('Name: rich'))

In [None]:
%pip install rich==10.16.1

In [None]:
output = !pip3 show rich
print(output)  # Should show that it is now installed!
assert(1 == output.count('Name: rich'))

### Install a package **from source** that we do not anticipate already being installed on the base image

In [None]:
output = !pip3 show docstring-parser
print(output)  # Should show not yet installed.
assert(0 == output.count('Name: docstring-parser'))

In [None]:
%pip install docstring-parser==0.13

In [None]:
output = !pip3 show docstring-parser
print(output)  # Should show that it is now installed!
# TODO uncomment this test after https://github.com/DataBiosphere/terra-docker/issues/285
# is fixed.
#assert(1 == output.count('Name: docstring-parser'))

## Test ipython widgets

In [None]:
import ipywidgets as widgets

widgets.IntSlider()

In [None]:
## Test python images come with base google image

In [None]:
from markdown import *
markdown

import readline
readline.parse_and_bind('tab: complete')

# Test scipy
from scipy import misc
import matplotlib.pyplot as plt

face = misc.face()
plt.imshow(face)
plt.show()

## Test BigQuery magic

* As of release [google-cloud-bigquery 1.26.0 (2020-07-20)](https://github.com/googleapis/python-bigquery/blob/master/CHANGELOG.md#1260-2020-07-20) the BigQuery Python client uses the BigQuery Storage client by default.
* This currently causes an error on Terra Cloud Runtimes `the user does not have 'bigquery.readsessions.create' permission for '<Terra billing project id>'`.
* To work around this, we do two things:
  1. remove the dependency `google-cloud-bigquery-storage` from the `terra-jupyter-python` image
  1. use flag `--use_rest_api` with `%%bigquery`
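The same workaround can be sketched for code that uses the Python client directly instead of the cell magic. This assumes a `google-cloud-bigquery` version whose `to_dataframe` accepts `create_bqstorage_client`, plus default credentials and a default project in the runtime; the function name `query_via_rest` is hypothetical.

```python
QUERY = """
SELECT country_name, alpha_2_code
FROM `bigquery-public-data.utility_us.country_code_iso`
WHERE alpha_2_code LIKE 'A%'
LIMIT 5
"""


def query_via_rest(sql):
    """Run a query and download the results without the BigQuery Storage API."""
    # Imported lazily so that defining this function needs no Google libraries.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    job = client.query(sql)
    # create_bqstorage_client=False keeps the result download on the REST API.
    return job.to_dataframe(create_bqstorage_client=False)
```

In a runtime with credentials, `query_via_rest(QUERY)` would be expected to return a small pandas DataFrame, much as the `%%bigquery --use_rest_api` cell does.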

In [None]:
%load_ext google.cloud.bigquery

In [None]:
%%bigquery --use_rest_api

SELECT country_name, alpha_2_code
FROM `bigquery-public-data.utility_us.country_code_iso`
WHERE alpha_2_code LIKE 'A%'
LIMIT 5

## Test pandas profiling

In [None]:
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame(
    np.random.rand(100, 5),
    columns=['a', 'b', 'c', 'd', 'e']
)

profile = ProfileReport(df, title='Pandas Profiling Report')
profile

# Test cases benefiting from the context of a notebook

Strictly speaking, these could be moved into the Python test cases, if desired.

## Test matplotlib

In [None]:
from __future__ import print_function, division
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
x = np.random.randn(10000)  # example data, random normal distribution
num_bins = 50
n, bins, patches = plt.hist(x, num_bins, facecolor="green", alpha=0.5)
plt.xlabel(r"Description of $x$ coordinate (units)")
plt.ylabel(r"Description of $y$ coordinate (units)")
plt.title(r"Histogram title here (remove for papers)")
plt.show();

## Test plotnine

In [None]:
from plotnine import ggplot, geom_point, aes, stat_smooth, facet_wrap
from plotnine.data import mtcars

(ggplot(mtcars, aes('wt', 'mpg', color='factor(gear)'))
 + geom_point()
 + stat_smooth(method='lm')
 + facet_wrap('~gear'))

## Test ggplot

In [None]:
from ggplot import *
ggplot

## Test source control tool availability

In [None]:
%%bash

which git
which ssh-agent
which ssh-add

## Test gcloud tools

In [None]:
%%bash

gcloud version 

In [None]:
%%bash

gcloud auth activate-service-account --key-file $GOOGLE_APPLICATION_CREDENTIALS

In [None]:
%%bash

gsutil ls gs://gcp-public-data--gnomad

In [None]:
%%bash

bq --project_id bigquery-public-data ls gnomAD

## Test Google Libraries

In [None]:
from google.cloud import datastore
datastore_client = datastore.Client()

In [None]:
from google.api_core import operations_v1

In [None]:
from google.cloud import storage

In [None]:
%%bash

# test composite object, requires python crcmod to be installed
gsutil cp gs://terra-docker-image-documentation/test-composite.cram . 

In [None]:
from google.cloud import bigquery

## Test TensorFlow
### See https://www.tensorflow.org/tutorials/quickstart/beginner

>Please redirect standard output and standard error to stdout.txt and stderr.txt by starting Jupyter Notebook with the command below.
```
jupyter notebook --ip=0.0.0.0 > stdout.txt 2>stderr.txt
```

>The oneAPI Deep Neural Network Library (oneDNN) optimizations are available in the official x86-64 TensorFlow builds after v2.5. Users can enable these CPU optimizations by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1.

>We enable the oneDNN verbose log via the DNNL_VERBOSE environment variable to validate that oneDNN optimizations are in use, and set CUDA_VISIBLE_DEVICES to -1 to run the workload on the CPU.

In [None]:
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1'
os.environ['DNNL_VERBOSE'] = '1'
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

In [None]:
import tensorflow as tf
tf.executing_eagerly() 

In [None]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

In [None]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

In [None]:
predictions = model(x_train[:1]).numpy()
predictions

In [None]:
tf.nn.softmax(predictions).numpy()

In [None]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

In [None]:
loss_fn(y_train[:1], predictions).numpy()

In [None]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

In [None]:
model.fit(x_train, y_train, epochs=5)

In [None]:
model.evaluate(x_test,  y_test, verbose=2)

In [None]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [None]:
probability_model(x_test[:5])

### Validate usage of oneDNN optimization 
First, check whether the oneDNN (dnnl) verbose log was produced while TensorFlow was tested in the previous section.

In [None]:
!cat /tmp/stdout.txt | grep dnnl

Second, analyze which oneDNN primitives were used while running the workload, using the profile_utils.py script.
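Independently of that script, a rough tally of primitives can be sketched in a few lines. This is not how profile_utils.py works. It assumes each execution line in the verbose log contains the comma-separated fields `exec`, an engine name, and the primitive kind in that order, which may differ between oneDNN versions. The function names are hypothetical.

```python
from collections import Counter


def count_primitives(lines):
    """Count primitive kinds in oneDNN verbose output lines."""
    counts = Counter()
    for line in lines:
        if "dnnl" not in line:  # same filter as the grep in the previous cell
            continue
        fields = line.strip().split(",")
        if "exec" not in fields:
            continue  # skip lines that do not record an execution
        kind_index = fields.index("exec") + 2  # assumed: engine, then kind
        if kind_index < len(fields):
            counts[fields[kind_index]] += 1
    return counts


def count_primitives_in_file(path):
    """Apply count_primitives to every line of a log file."""
    with open(path, errors="replace") as handle:
        return count_primitives(handle)
```

Calling `count_primitives_in_file('/tmp/stdout.txt')` would then return a `Counter` keyed by primitive kind, provided the log follows the assumed layout.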

In [None]:
!wget https://raw.githubusercontent.com/oneapi-src/oneAPI-samples/master/Libraries/oneDNN/tutorials/profiling/profile_utils.py

In [None]:
import warnings
warnings.filterwarnings('ignore')

Finally, users should see that the inner_product oneDNN primitive is used for the workload.

In [None]:
%run profile_utils.py /tmp/stdout.txt