![Top <](./images/watsonxdata.png "watsonxdata")

# External Jupyter Notebook Requirements
The notebooks used in the watsonx.data system can also be run in an external JupyterLab or classic Jupyter Notebook environment. To run the examples, the following Python libraries must be installed.

#### Jupyter Notebook Libraries
* jupyterlab - JupyterLab environment (choose this or the classic notebook)
* notebook - Classic Jupyter Notebook
* jupyterthemes - Notebook themes
* jupyter_contrib_nbextensions - Support for notebook extensions (used for classic notebooks)
* jupyter_nbextensions_configurator - Extension configurator (used for classic notebooks)

The following command installs JupyterLab.

```python
%system python3 -m pip install jupyterlab
```

The following command installs the classic Jupyter Notebook and its extensions.

```python
%system python3 -m pip install notebook jupyterthemes jupyter_contrib_nbextensions jupyter_nbextensions_configurator
```

#### Pandas Support, Graphing and Plotting
* matplotlib - Plotting library
* seaborn - Statistical plotting library built on matplotlib
* pandas - Pandas dataframes
* pyarrow - Apache Arrow columnar data library (can be used alongside or as a backend for pandas)
* ipydatagrid - Grid control for displaying data
* graphviz - Python interface to Graphviz for rendering graphs (for example, as PNG files)

```python
%system python3 -m pip install matplotlib seaborn pandas pyarrow ipydatagrid graphviz
```

#### Database Support (Db2, PostgreSQL, MySQL, PrestoDB)
* ipython-sql - SQL support in Python
* ibm_db - Db2 library
* cryptography - Cryptography library required by the MySQL driver
* presto-python-client - Presto client
* sqlalchemy - SQLAlchemy library used to connect to many databases
* pyhive[presto] - Presto support for SQLAlchemy (requires pyhive)
* psycopg - PostgreSQL support library
* psycopg[binary,pool] - Extensions used by PostgreSQL
* pymysql - MySQL Library

```python
%system python3 -m pip install ipython-sql sqlalchemy
%system python3 -m pip install ibm_db
%system python3 -m pip install pymysql cryptography
%system python3 -m pip install presto-python-client "pyhive[presto]"
%system python3 -m pip install psycopg "psycopg[binary,pool]"
```
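After running the installs, it can be useful to confirm that the database libraries are actually importable. The following is a minimal sketch, not part of the original notebooks: the `REQUIRED_MODULES` mapping and the `missing_packages` helper are hypothetical names, and the import names (for example `sql` for ipython-sql and `prestodb` for presto-python-client) are assumptions based on how these projects are usually packaged.

```python
from importlib.util import find_spec

# pip distribution name -> importable module name.
# The import names are assumptions; adjust them if your versions differ.
REQUIRED_MODULES = {
    "ipython-sql": "sql",
    "sqlalchemy": "sqlalchemy",
    "ibm_db": "ibm_db",
    "pymysql": "pymysql",
    "cryptography": "cryptography",
    "presto-python-client": "prestodb",
    "pyhive": "pyhive",
    "psycopg": "psycopg",
}


def missing_packages(required):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]


print("Missing packages:", missing_packages(REQUIRED_MODULES) or "none")
```

Any name reported as missing can be installed with the matching `pip install` command from this section.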

#### Spark Support
* pyspark==3.4.1 - Spark extensions to Python (this exact version is required)
* pyspark[sql]==3.4.1 - Spark SQL support
* Note: running the Spark client on Windows does not work

```python
%system python3 -m pip install "pyspark==3.4.1" "pyspark[sql]==3.4.1"
```
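Because the Spark examples depend on an exact version, a quick check of the installed version can save some debugging. This is a small sketch using the standard library's `importlib.metadata`; the `pin_status` helper is a hypothetical name introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_status(package, expected):
    """Compare the installed version of `package` with the pinned version."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "missing"
    if installed == expected:
        return "ok"
    return f"mismatch (found {installed})"


# 3.4.1 is the version pinned in this section.
print("pyspark:", pin_status("pyspark", "3.4.1"))
```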

## GitHub Notebook Library
The library of Jupyter notebooks is available in the [watsonx-data-notebooks](https://github.com/IBM/watsonx-data-notebooks) project. To download the notebooks to your system, use the following git command:
```bash
git clone https://github.com/IBM/watsonx-data-notebooks.git /tmp/notebooks
```
Replace the target location `/tmp/notebooks` with a directory on your system. Make sure that the directory does not exist (or is empty), otherwise the `git` command will fail.
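The target directory can be checked before cloning. The sketch below relies on the fact that `git clone` accepts a target that is either absent or an existing empty directory; `clone_target_ok` is a hypothetical helper name, not something provided by the notebooks.

```python
from pathlib import Path


def clone_target_ok(target):
    """True if `target` is absent or an empty directory, so `git clone` can use it."""
    path = Path(target).expanduser()
    if not path.exists():
        return True
    # An existing path is only usable if it is a directory with no entries.
    return path.is_dir() and not any(path.iterdir())


print("/tmp/notebooks usable:", clone_target_ok("/tmp/notebooks"))
```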

## MinIO CLI
The MinIO command-line interface (`mc`) is used in some examples to create buckets and load data into them. For these commands to work, you must install the MinIO client using the instructions found on the [MinIO download page](https://min.io/docs/minio/linux/reference/minio-mc.html).
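A quick way to see whether a client is already on your `PATH` is sketched below. It assumes the MinIO client binary is installed under its usual name `mc`; note that other tools (such as Midnight Commander) use the same name, so a match is only a hint. The `find_tool` helper is a hypothetical name.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is absent."""
    return shutil.which(name)


location = find_tool("mc")
print("mc found at:", location if location else "not installed")
```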

## Notebook Permissions
Some of the commands used in the notebooks require root access. If you only run the SQL examples, a regular user account is sufficient. To run all of the notebooks on your system, you must allow Jupyter to run with root access.
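To see which case applies to a running notebook, the effective user ID can be inspected from a cell. This is a minimal sketch for POSIX systems (Linux, macOS); on platforms without `os.geteuid` it simply reports `False`. The `running_as_root` helper is a hypothetical name.

```python
import os


def running_as_root():
    """True when the current process has an effective user ID of 0 (POSIX only)."""
    geteuid = getattr(os, "geteuid", None)  # not available on Windows
    return geteuid is not None and geteuid() == 0


print("Running as root:", running_as_root())
```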

## Custom Fonts and CSS
The notebooks were designed using the IBM Plex fonts. If these fonts are not installed on your workstation, the notebooks fall back to its default sans-serif font. To change the fonts used by the notebooks, update your Jupyter notebook configuration files, which are usually stored in a hidden directory named `.jupyter` in your home directory.

In the `.jupyter` directory you will find configuration files and possibly a directory called `custom`. If the `custom` directory does not exist, create it, then place the contents of the following code block into a file called `custom.css` inside it. Update the fonts to whatever you prefer to use in your notebooks.

```css
.text_cell, p { 
    font-size: 12pt; 
    font-family: "IBM Plex Sans";
}

ol, ul {
   font-family: "IBM Plex Sans";
   font-size: 12pt;
}

.CodeMirror, pre {
    font-family: "IBM Plex Mono";
    font-size: 11pt;
}

h1, h2, h3, h4, h5, h6 {
    font-family: "IBM Plex Sans";
    color: #466bb0; 
}

.container { 
  width:98% !important; 
}

img[alt$=">"] {
  float: right;
}

img[alt$="<"] {
  float: left;
}

img[alt$="><"] {
  display: block;
  max-width: 100%;
  height: auto;
  margin: auto;
  float: none!important;
}
```
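The directory and file can also be created from Python. The sketch below assumes the configuration directory is `~/.jupyter`, as described above (running `jupyter --config-dir` prints the location actually in use); `install_custom_css` is a hypothetical helper name, and it overwrites any existing `custom.css`.

```python
from pathlib import Path


def install_custom_css(css_text, jupyter_dir="~/.jupyter"):
    """Write css_text to <jupyter_dir>/custom/custom.css and return the file path."""
    custom_dir = Path(jupyter_dir).expanduser() / "custom"
    # Create the custom directory (and any missing parents) if needed.
    custom_dir.mkdir(parents=True, exist_ok=True)
    css_file = custom_dir / "custom.css"
    css_file.write_text(css_text, encoding="utf-8")
    return css_file
```

For example, `install_custom_css('.CodeMirror, pre { font-family: "IBM Plex Mono"; }')` would write a one-rule stylesheet into the default location.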

## Spark Examples
The Spark notebook uses a specific Spark runtime to connect to watsonx.data. It extracts this runtime library into `/usr/local`, so you need write permission on that directory to run it.

```python
%system tar -xf /spark/spark.tgz -C /usr/local
```
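Before running the extraction, you can check whether the current user is able to write to the destination. This is a small sketch using `os.access`; the `can_write_to` helper is a hypothetical name, and the check reflects the permissions of the user running the notebook.

```python
import os


def can_write_to(directory):
    """True if the directory exists and the current user may create files in it."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


print("/usr/local writable:", can_write_to("/usr/local"))
```

If this prints `False`, the `tar` command is likely to fail unless Jupyter is run with sufficient privileges.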

The Spark library can be downloaded to your workstation by running the following command in a terminal window. Replace `port` with the SSH port number of the server. You may have to modify the command for your operating system.

```bash
scp -P port watsonx@techzone.server.com:/spark/spark.tgz ~/Downloads
```

#### Credits: IBM 2024, George Baklarz [baklarz@ca.ibm.com]