diff --git a/docs/modules/demos/pages/jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data.adoc b/docs/modules/demos/pages/jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data.adoc
index 5b0236e9..138dac25 100644
--- a/docs/modules/demos/pages/jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data.adoc
+++ b/docs/modules/demos/pages/jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data.adoc
@@ -123,8 +123,6 @@ Click on the double arrow (⏩️) to execute the Python scripts (click on the i
 
 image::jupyterhub-pyspark-hdfs-anomaly-detection-taxi-data/jupyter_hub_run_notebook.png[link=https://github.com/stackabletech/demos/blob/main/stacks/jupyterhub-pyspark-hdfs/notebook.ipynb,window=_blank]
 
-You can also inspect the `hdfs` folder where the `core-site.xml` and `hdfs-site.xml` from the discovery ConfigMap of the HDFS cluster are located.
-
 The Python notebook uses libraries such as `pandas` and `scikit-learn` to analyze the data.
 In addition, since the model training is delegated to a Spark Connect server, some of these dependencies, most notably `scikit-learn`, must also be made available on the Spark Connect pods.
 For convenience, a custom image is used in this demo that bundles all the required libraries for both the notebook and the Spark Connect server.