# Test Notebook

## Test PyData Stack

**This block tests whether Python, Jupyter and all required packages are installed and working properly.**

Please run the cell below (select and press <kbd>SHIFT+ENTER</kbd>). You should see the following output on the last line: 

```python
Congratulations, your Python stack is ready to go
```

In [None]:
fail = False
import sys

# Compare against sys.version_info instead of parsing the version string;
# the tuple comparison also stays correct for a future major version.
version_string = sys.version
if sys.version_info >= (3, 6):
    print(f"""Your Python interpreter is ready. Your version:
    {version_string}
    """)
else:
    print(f"""Your version of Python is older than required:
        {version_string}
    """)
    fail = True

try:
    import pandas
except ImportError:
    print(f"""Importing package failed: pandas""")
    fail = True

try:
    import numpy
except ImportError:
    print(f"""Importing package failed: numpy""")
    fail = True
    
try:
    import matplotlib
except ImportError:
    print(f"""Importing package failed: matplotlib""")
    fail = True
    
try:
    import sklearn
except ImportError:
    print(f"""Importing package failed: sklearn""")
    fail = True
    
try:
    import data_science_learning_paths
except ImportError:
    print(f"""Importing package failed: data_science_learning_paths""")
    fail = True

if not fail:
    print("")
    print(f"""Congratulations, your Python stack is ready to go""")
else:
    print("")
    print("Your Python stack is not ready, please check error messages above")
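
The repeated `try`/`except` blocks above can also be written as a loop. The following is a minimal sketch and not part of the original notebook: `check_imports` is a hypothetical helper name, and it relies only on `importlib.import_module` from the standard library. The package list is copied from the cell above.

```python
import importlib


def check_imports(package_names):
    """Return the names of the packages that could not be imported."""
    failed = []
    for name in package_names:
        try:
            importlib.import_module(name)
        except ImportError:
            print(f"Importing package failed: {name}")
            failed.append(name)
    return failed


# Package list taken from the test cell above
required = ["pandas", "numpy", "matplotlib", "sklearn", "data_science_learning_paths"]
missing = check_imports(required)
print("Missing packages:", missing if missing else "none")
```

An empty `missing` list corresponds to the `fail = False` outcome of the original cell.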

## Test PySpark Stack

**This block tests whether PySpark and all required packages are installed and working properly.**

Please run the cell below (select and press <kbd>SHIFT+ENTER</kbd>). You should see the following output on the last line: 

```python
Congratulations, your PySpark stack is ready to go
```

### PySpark via Jupyter Notebook

In [None]:
fail = False
import sys

# Compare against sys.version_info instead of parsing the version string;
# the tuple comparison also stays correct for a future major version.
version_string = sys.version
if sys.version_info >= (3, 6):
    print(f"""Your Python interpreter is ready. Your version:
    {version_string}
    """)
else:
    print(f"""Your version of Python is older than required:
        {version_string}
    """)
    fail = True

try:
    import pandas
except ImportError:
    print(f"""Importing package failed: pandas""")
    fail = True

try:
    import findspark
    findspark.init()
except ImportError:
    print(f"""Importing package failed: findspark""")
    fail = True
    
try:
    import pyspark
except ImportError:
    print(f"""Importing package failed: pyspark""")
    fail = True

if not fail:
    print("")
    print(f"""Congratulations, your PySpark stack is ready to go""")
else:
    print("")
    print("Your PySpark stack is not ready, please check error messages above")

### PySpark Batch Jobs

Now run the cells below. The first writes a script to disk, and the second submits it to your PySpark installation with `spark-submit`. The output should look something like this:

    ##########################################
    PySpark uses Python version:  3.6.5 (default, Apr 25 2018, 14:23:58) 
    [GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)]
    Congratulations, submitting a PySpark job is working
    ##########################################


Make sure PySpark is using the right Python version. This can be achieved by setting the environment variable `PYSPARK_PYTHON` to the appropriate Python binary.
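
One way to set the variable from inside the notebook is sketched below. It assumes that the interpreter running this notebook is the one PySpark should use, and that it is run before the `spark-submit` cell so that the shell command inherits the kernel's environment. Neither assumption comes from the original notebook, so adjust the path if your setup differs.

```python
import os
import sys

# Assumption: PySpark should use the same interpreter as this notebook kernel.
os.environ["PYSPARK_PYTHON"] = sys.executable

print("PYSPARK_PYTHON =", os.environ["PYSPARK_PYTHON"])
```

Setting the variable in your shell profile instead makes the choice persistent across sessions.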

In [None]:
%%file spark/scripts/spark_job_test.py

SPARK_APP_NAME='sparkjob_test'

import sys
from contextlib import contextmanager
from pyspark import SparkContext, SparkConf

@contextmanager
def use_spark_context(appName):
    # Create a SparkContext for the given app name and stop it on exit
    conf = SparkConf().setAppName(appName)
    spark_context = SparkContext(conf=conf)

    try:
        print("starting", appName)
        yield spark_context
    finally:
        spark_context.stop()
        print("stopping", appName)


with use_spark_context(appName=SPARK_APP_NAME) as sc:
    rdd = sc.range(100)
    print()
    print("##########################################")
    print("PySpark uses Python version: ", sys.version)
    print("Congratulations, submitting a PySpark job is working")
    print("##########################################")
    print()


In [None]:
!spark-submit spark/scripts/spark_job_test.py

## Test TensorFlow Stack

**This block tests whether TensorFlow and all required packages are installed and working properly.**

Please run the cell below (select and press <kbd>SHIFT+ENTER</kbd>). You should see the following output on the last line: 

```python
Congratulations, your TensorFlow stack is ready to go
```

In [None]:
fail = False
gpu_support = None

try:
    import tensorflow
except ImportError:
    print(f"""Importing package failed: tensorflow""")
    fail = True

try:
    from tensorflow import keras
except ImportError:
    print(f"""Importing package failed: keras""")
    fail = True
    
try:
    import numpy
except ImportError:
    print(f"""Importing package failed: numpy""")
    fail = True
    
try:
    # Note: is_gpu_available() is deprecated in TensorFlow 2.x
    gpu_support = tensorflow.test.is_gpu_available()
except Exception:
    print("testing GPU support failed")
    
try:
    # try fitting a minimal network
    neuron = keras.models.Sequential(
        [
          keras.layers.Dense(
              units=1, 
              input_shape=(1,),
              activation="tanh",
              kernel_initializer="uniform", 
            )  
        ]
    )
    n = 1000
    X = numpy.linspace(0.0, 1.0, n)
    y = 0.8 * X + numpy.random.normal(0.0, 0.2, n)
    neuron.compile(optimizer="adam", loss="mse")
    neuron.fit(X, y, epochs=1)
except Exception as ex:
    print(ex)
    print("training a minimal network failed")
    fail = True
    

if not fail:
    print("")
    print(f"""Congratulations, your TensorFlow stack is ready to go""")
else:
    print("")
    print("Your TensorFlow stack is not ready, please check error messages above")
    

if gpu_support:
    print("GPU support is available")
else:
    print("GPU support is not available")
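
On TensorFlow 2.x, `tensorflow.test.is_gpu_available()` is deprecated in favour of `tensorflow.config.list_physical_devices("GPU")`. The following is a minimal sketch of a GPU check based on that call. `list_gpus` is a hypothetical helper name, and the fallback to an empty list when TensorFlow is missing or too old is an assumption of this sketch, not behaviour of the cell above.

```python
def list_gpus():
    """Return the names of the GPUs visible to TensorFlow, or [] if none."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so no GPU can be reported
        return []
    try:
        return [device.name for device in tf.config.list_physical_devices("GPU")]
    except AttributeError:
        # Assumption: older TensorFlow releases lack this function
        return []


gpus = list_gpus()
if gpus:
    print("GPU support is available:", gpus)
else:
    print("GPU support is not available")
```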

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2025 [Point 8 GmbH](https://point-8.de)_