## Integrate Spark 2 with Jupyter Lab

Let us see how to integrate Spark 2 with Jupyter Lab using Python, so that we can explore PySpark interactively from notebooks.

* Make sure to activate the virtual environment.

```shell
source dl-venv/bin/activate
```

* List the existing Jupyter kernels to see what is already installed.

```shell
jupyter kernelspec list
```
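The same listing can be read from Python when `--json` is passed to `jupyter kernelspec list`. The following is a minimal sketch; `kernel_dirs` is a hypothetical helper name and the `sample` string only imitates the shape of the command's JSON output.

```python
import json


def kernel_dirs(listing_json):
    """Map each kernel name to the directory that holds its kernel.json."""
    specs = json.loads(listing_json).get("kernelspecs", {})
    return {name: info.get("resource_dir") for name, info in specs.items()}


# Hypothetical sample imitating the output of `jupyter kernelspec list --json`.
sample = """
{"kernelspecs": {"python3": {
    "resource_dir": "/home/itversity/dl-venv/share/jupyter/kernels/python3",
    "spec": {"display_name": "Python 3", "language": "python"}}}}
"""

for name, path in kernel_dirs(sample).items():
    print(name, path)
```

In practice the input could come from `subprocess.run(["jupyter", "kernelspec", "list", "--json"], capture_output=True, text=True, check=True).stdout` instead of the sample string.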

* Create a folder for the new Pyspark 2 kernel.

```shell
mkdir /home/itversity/dl-venv/share/jupyter/kernels/pyspark2
```

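* Add a **kernel.json** file in the above location with the following contents. Adjust the paths, including the py4j version in `PYTHONPATH`, to match your Spark installation.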

```json
{
      "argv": [
        "python",
        "-m",
        "ipykernel_launcher",
        "-f",
        "{connection_file}"
      ],
      "display_name": "Pyspark 2",
      "language": "python",
      "env": {
        "PYSPARK_PYTHON": "/usr/bin/python3",
        "SPARK_HOME": "/opt/spark2/",
        "SPARK_OPTS": "--master yarn --conf spark.ui.port=0",
        "PYTHONPATH": "/opt/spark2/python/lib/py4j-0.10.7-src.zip:/opt/spark2/python/"
      }
}
```
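The py4j file name in `PYTHONPATH` changes between Spark releases, so a hard-coded version can silently go stale. As a rough sketch, the same kernel spec can be generated from Python by looking up the zip file under `SPARK_HOME`. `build_kernel_spec` is a hypothetical helper, and it assumes the usual `python/lib/py4j-*-src.zip` layout of a Spark distribution.

```python
import glob
import json


def build_kernel_spec(spark_home, python_bin="/usr/bin/python3"):
    """Return the kernel.json contents as a dict for the given Spark home."""
    spark_home = spark_home.rstrip("/")
    lib_dir = f"{spark_home}/python/lib"
    # Assumption: the distribution ships its py4j sources as py4j-<version>-src.zip.
    matches = sorted(glob.glob(f"{lib_dir}/py4j-*-src.zip"))
    if not matches:
        raise FileNotFoundError(f"no py4j-*-src.zip found under {lib_dir}")
    return {
        "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": "Pyspark 2",
        "language": "python",
        "env": {
            "PYSPARK_PYTHON": python_bin,
            "SPARK_HOME": f"{spark_home}/",
            "SPARK_OPTS": "--master yarn --conf spark.ui.port=0",
            # If several zips match, the lexicographically last one is used.
            "PYTHONPATH": f"{matches[-1]}:{spark_home}/python/",
        },
    }


def write_kernel_spec(spec, path):
    """Write the spec dict to path as indented JSON."""
    with open(path, "w") as handle:
        json.dump(spec, handle, indent=2)
```

For example, `write_kernel_spec(build_kernel_spec("/opt/spark2/"), "/home/itversity/dl-venv/share/jupyter/kernels/pyspark2/kernel.json")` would produce a file equivalent to the one above on a machine where Spark is installed at that location.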

* Install the **pyspark2** kernel for the current user.

```shell
jupyter kernelspec \
    install /home/itversity/dl-venv/share/jupyter/kernels/pyspark2 \
    --user
```

* Create a notebook using the newly added **Pyspark 2** kernel and validate it by running the code below.

```python
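from pyspark.sql import SparkSession

# Build (or reuse) a Hive-enabled Spark session that runs on YARN.
spark = SparkSession. \
    builder. \
    enableHiveSupport(). \
    appName('Demo'). \
    master('yarn'). \
    getOrCreate()

# Listing the databases confirms that the session can reach the metastore.
spark.sql('SHOW databases').show()

# Counting rows in a table runs an actual job on the cluster.
spark.sql('SELECT count(1) FROM retail_db.orders').show()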
```
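If `import pyspark` fails in the notebook, a likely cause is that the `env` entries from **kernel.json** did not reach the kernel process. The sketch below checks for them from inside a notebook cell; `missing_settings` is a hypothetical helper, and the list of required names simply mirrors the variables set in the kernel spec above.

```python
import os

# Variables that the kernel.json above is expected to set for the kernel.
REQUIRED = ("SPARK_HOME", "PYSPARK_PYTHON", "PYTHONPATH")


def missing_settings(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_settings(os.environ)
print("Missing kernel settings:", ", ".join(missing) if missing else "none")
```

Any name reported as missing points to a problem in the `env` section of the kernel spec, or to the notebook running on a different kernel than **Pyspark 2**.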