_ _
| |_ _ _| |_ ___ ___ ___
| __| | | | __/ _ \/ __/ _ \
| |_| |_| | || __/ (_| (_) |
(_)__|\__,_|\__\___|\___\___/
-- data & knowledge experts --
The base for this Docker image is the jupyter-pyspark-notebook image.
We have extended the image for use with AWS and preinstalled several JupyterLab extensions. Check out our cookiecutter template for a quick start; it includes some instructional examples.
- boto3 - Python library for the AWS API
- faker - create fake data
- faker_music - faker provider for music genres and instruments
- faker-vehicle - faker provider for vehicle related data
- s3fs - simplified interface for AWS S3
- plotly - interactive, open-source, and browser-based graphing library
- psycopg2-binary - PostgreSQL database adapter
- spellchecker - spell checker for English, French, German, + more
- Variable Inspector
- Code Formatter - pretty format your python code
- IPyDrawiO - create DrawIO images
- Language Server Protocol implementation - code completion functionality
- Jedi Language Server - for code completion
- Chart Editor - work easily with plotly charts
- jupyterlab-plotly - required for Chart Editor to work
- ipython-sql - run SQL directly in a notebook cell
- Interactive Tables
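To confirm that the Python libraries listed above are available inside a running container, a small check along these lines can help. This is a minimal sketch: the import names are assumptions (they can differ from the package names, for example psycopg2-binary is imported as psycopg2), and `check_packages` is a hypothetical helper, not part of the image. It covers only the Python libraries, not the JupyterLab extensions.

```python
from importlib.util import find_spec

# Import names are assumptions; they can differ from the package names above
# (for example, psycopg2-binary is imported as psycopg2).
ASSUMED_IMPORT_NAMES = [
    "boto3",
    "faker",
    "faker_music",
    "faker_vehicle",
    "s3fs",
    "plotly",
    "psycopg2",
    "spellchecker",
]


def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be located."""
    result = {}
    for name in names:
        try:
            result[name] = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules
            result[name] = False
    return result


if __name__ == "__main__":
    for name, found in check_packages(ASSUMED_IMPORT_NAMES).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Run it in a notebook cell or a terminal inside the container; any line reported as missing points to a package that is not installed under the assumed import name.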
We use Python Invoke instead of Unix make. The advantage is that it runs wherever Python is installed, including MS Windows.
To run a local build:
invoke build-local
The result is the tuteco/jupyter_datascience_pyspark:latest-dev container image. Be aware that the local build has the tag "latest-dev".
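For readers new to Invoke, the following shows roughly how a `build-local` task could be defined. It is a hypothetical sketch, not the project's actual task file: the `build_command` helper, the build context directory, and the exact `docker build` arguments are assumptions. Only the image name and the "latest-dev" tag come from the text above.

```python
# tasks.py (hypothetical sketch; the project's real task file may differ)
try:
    from invoke import task
except ImportError:
    # Fallback so this sketch can still be imported without Invoke installed.
    def task(func):
        return func

IMAGE = "tuteco/jupyter_datascience_pyspark"


def build_command(tag="latest-dev", context_dir="."):
    """Compose the docker build command for the given tag (assumed arguments)."""
    return f"docker build -t {IMAGE}:{tag} {context_dir}"


@task
def build_local(c):
    """Build the image locally; Invoke exposes this function as `build-local`."""
    c.run(build_command())
```

By default Invoke converts underscores in task function names to dashes on the command line, which is why a function named `build_local` would be called as `invoke build-local`.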