Writing good Jupyter notebooks

Adapted from an invited lecture presented in Dr. Marques' Introduction to Data Science class (Fall 2020): Answering Questions with Data, bridging the gap between technical analysis and the stakeholders' point of view with Jupyter notebooks. It covers two topics:

  • How to write well-structured, understandable, resilient, flexible Jupyter notebooks
  • How to present the results of our investigations to the people who asked the questions, the stakeholders

We start with a Jupyter notebook that produces the right result but lacks good structure and proper coding practices, then transform it into a good notebook.

What is a good notebook?

  • The overall organization is logical
  • Important assumptions and decisions are spelled out
  • Code is easy to understand
  • Code is flexible (easy to modify)
  • Code is resilient (hard to break)
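
As a rough illustration of the last two items, the sketch below uses named constants (flexible) and early checks with clear messages (resilient). It is not taken from the notebooks in this repository: the column name, the age limits, and the `average_age` function are hypothetical, and plain Python lists stand in for whatever data structure a real notebook would use.

```python
# Hypothetical names and values, for illustration only.
AGE_COLUMN = "age"     # flexible: the column name is defined once
MIN_VALID_AGE = 0
MAX_VALID_AGE = 120    # assumption: larger values are data-entry errors


def average_age(rows):
    """Return the mean of the valid ages in a list of row dictionaries."""
    # Resilient: fail early, with a clear message, if the input changed shape.
    assert rows, "Expected at least one row"
    assert all(AGE_COLUMN in row for row in rows), f"Missing column: {AGE_COLUMN}"

    valid_ages = [
        row[AGE_COLUMN]
        for row in rows
        if MIN_VALID_AGE <= row[AGE_COLUMN] <= MAX_VALID_AGE
    ]
    assert valid_ages, "No rows left after filtering invalid ages"
    return sum(valid_ages) / len(valid_ages)


sample_rows = [{"age": 30}, {"age": 40}, {"age": 999}]
print(average_age(sample_rows))
```

Changing the valid age range now means editing one constant, and a renamed or missing column stops the notebook at the check instead of producing a misleading number further down.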

We will transform the original notebook into a good one, step by step. Each step addresses a set of related items.

  • Step 1: the original notebook, the one that lacks structure and proper coding practices.
  • Step 2: add a description, organize into sections, add exploratory data analysis.
  • Step 3: make data clean-up more explicit, and explain why certain numbers were chosen (the assumptions behind them).
  • Step 4: make the code more flexible with constants, and make the code more difficult to break (more resilient).
  • Step 5: make the graphs easier to read.
  • Step 6: describe the limitations of the conclusion.
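
To make step 3 more concrete, here is a minimal sketch of an explicit clean-up cell. The data, the `hours` field, and the 24-hour threshold are invented for this example and do not come from the notebooks. The point is that the cell states the assumption behind the number and reports how many rows each rule removed.

```python
# Hypothetical data and threshold, for illustration only.
# Assumption: "hours" is measured per day, so values above 24 are errors.
MAX_PLAUSIBLE_HOURS = 24

raw_rows = [
    {"name": "a", "hours": 8},
    {"name": "b", "hours": None},  # missing value
    {"name": "c", "hours": 30},    # implausible value
    {"name": "d", "hours": 6},
]


def clean_rows(rows):
    """Drop rows with missing or implausible hours and report what was dropped."""
    kept = []
    removed_missing = 0
    removed_implausible = 0
    for row in rows:
        hours = row["hours"]
        if hours is None:
            removed_missing += 1
        elif hours > MAX_PLAUSIBLE_HOURS:
            removed_implausible += 1
        else:
            kept.append(row)

    # Make the effect of each clean-up rule visible to the reader.
    print(f"Removed {removed_missing} row(s) with missing hours")
    print(f"Removed {removed_implausible} row(s) above {MAX_PLAUSIBLE_HOURS} hours")
    print(f"Kept {len(kept)} of {len(rows)} rows")
    return kept


clean = clean_rows(raw_rows)
```

A reader who disagrees with the threshold can see exactly where it is applied and how many rows it affects.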

Reworked sections are marked with this note:

Rework note

Invited lecture presentation

The presentation used in the class is in this file.

This blog post is a written, simplified version of the presentation.

Running the notebooks

  1. Clone this repository
  2. cd <folder for the cloned repository>
  3. Create a Python environment: python3 -m venv env
  4. Activate the environment: source env/bin/activate (Mac and Linux), or env\Scripts\activate.bat (Windows)
  5. Update pip: python -m pip install --upgrade pip
  6. Install dependencies (only once): pip install -r requirements.txt
  7. Run the notebooks: jupyter lab
