tuteco/jupyter_datascience_pyspark

AWS ready Jupyter LAB for datascience

   _         _                 
  | |_ _   _| |_ ___  ___ ___  
  | __| | | | __/ _ \/ __/ _ \ 
  | |_| |_| | ||  __/ (_| (_) |
 (_)__|\__,_|\__\___|\___\___/ 
 
 -- data & knowledge experts --                              

This Docker image is based on jupyter-pyspark-notebook.

We have extended the image for use with AWS and already included some Jupyter Lab extensions. Check out our cookiecutter template for a jump start; it includes some instructional examples.

Python Packages added

  • boto3 - Python library for the AWS API
  • faker - create fake data
    • faker_music - faker provider for music genres and instruments
    • faker-vehicle - faker provider for vehicle related data
  • s3fs - simplified interface for AWS S3
  • plotly - interactive, open-source, and browser-based graphing library
  • psycopg2-binary - PostgreSQL database adapter

Jupyter extensions added

Image local build

We use Python invoke instead of Unix make. The advantage is that it runs wherever Python is installed, including MS Windows.

To run a local build:

invoke build-local

The result is the container image tuteco/jupyter_datascience_pyspark:latest-dev. Note that a local build is tagged "latest-dev", not "latest".
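
For readers new to invoke, the following is a minimal sketch of how a build-local task could be defined in a tasks.py file. It is not the repository's actual tasks.py: the helper name build_command, the constants, and the assumption that the Dockerfile sits in the current directory are all hypothetical. invoke exposes a function named build_local on the command line as build-local.

```python
import shlex

# Image name and tag taken from the text above.
IMAGE = "tuteco/jupyter_datascience_pyspark"
LOCAL_TAG = "latest-dev"


def build_command(context="."):
    """Return the docker build command as an argument list.

    Assumption: the Dockerfile lives in the build context directory.
    """
    return ["docker", "build", "--tag", f"{IMAGE}:{LOCAL_TAG}", context]


try:
    from invoke import task
except ImportError:
    # invoke is not installed; the helper above can still be used directly.
    task = None

if task is not None:

    @task
    def build_local(c):
        """Build the image locally; run with: invoke build-local"""
        c.run(shlex.join(build_command()))
```

Keeping the command in a plain function makes it easy to inspect what would be executed without starting a Docker build.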
