# Prep
- Log into W&B. Going to show the COVID-Labrador fine-tuning runs & filtering on experimental parameters. These parameters were stored in a config JSON.
- Clone Jeremy Howard's fastai repo to local and open in new VSCode window: https://github.com/fastai/fastai.git
- Clone Transformer repo to local and open in new VSCode window: https://github.com/beamlab-hsph/lab_transformers.git
- Open Joel Grus’ talk: first 5 minutes (up to 5:09).
    - https://www.youtube.com/watch?v=7jiPeIFXb6U

# Outline

- Background and motivation
- Watch the first 5 minutes of Joel Grus’ talk
- Review good changes to notebooks since Joel's talk in 2018
- Review bad stuff about notebooks that is still the same
- Demo some of the powerful features of VSCode that don’t work with notebooks.

# Background
## Motivation
- We do a lot of programming at FL97, and there is only more to come. Coding is an inevitable component of standardizing and automating the work we do.
    - In fact, I would love to be able to say that we are the most code-literate company with a wet lab.
- Whenever we write code, we care about it ***actually working as expected for ourselves and for others***. Right?
    - From that simple premise, I will argue that notebooks are fundamentally bad.
    - They encourage bad habits.
- There is a *notebook mafia* out there that has been forcing the use of notebooks in data science courses both online and on university campuses for years.
    - This is producing more and more data scientists and machine learning practitioners who don’t know how to write good code.
    - This causes a division in many organizations between the notebook users and the software engineers. The notebook users (typically with titles like “data scientist” or “machine learning scientist”) do scrappy work and then toss it over the fence to “the real engineers” once they have a finding. The real engineers are then tasked with migrating the notebooks to robust modules and scripts. In practice, this requires the engineers to rewrite and retest all of the code, and often the code will not work as the notebook user expected!
    - But we don’t like tossing things over the fence here. We know that this doesn’t work well!
    - Notebooks are fundamentally bad for reproducibility and are therefore bad for data science and machine learning, which already suffer from reproducibility issues.
- The list of advantages that you get by moving away from notebooks and towards modules and scripts is so long that I couldn’t even cover it in 10 hours.
    - Instead, I chose to highlight the most widely applicable and immediately valuable features for those of you who would like to level up your programming right now.

## Who am I to comment on this?
- I went through this notebook-ridden education system! I took university courses and online classes that do all their assignments in the form of notebooks.
- I have been a teaching assistant for classes that use notebooks and have had to correct and grade hundreds of student submissions in the form of notebooks.
- You can view me as a survivor or someone who has made it to the other side and escaped the notebook mafia. So I’m here to tell you what it looks like from the outside. @img/chatgpt_reject.png

# Good changes to notebooks since 2018
- Debugger introduced in 2021
- Copilot

A good example of a notebook from Generate last month: https://colab.research.google.com/github/generatebio/chroma/blob/main/notebooks/ChromaDemo.ipynb

# Bad stuff that is still the same

* They conflate beginner-friendliness with bad habits:
    * Import sorting & dependency management
    * Less modular code → worse project directory structure.
    * Not using 'jump to definition' or parameter hints.
* No interactive debugger
* Very limited refactoring tools
    * Example: renaming symbols
* Version control is a mess
* No integrated testing frameworks
* No task automation, e.g. automatically passing in command-line arguments
* Linters & static analysers work better in .py files than in .ipynb files.
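
To make the version-control point concrete, here is a minimal sketch, assuming the standard nbformat v4 JSON layout for `.ipynb` files. The cell source, outputs, and file names are made up for illustration. It shows that re-running a cell without touching its code still changes the file on disk, because execution counts and outputs are stored alongside the source.

```python
import difflib
import json


def make_notebook(execution_count, output_text):
    """Build a minimal one-cell notebook dict (nbformat v4 layout, illustrative content)."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [output_text],
                    }
                ],
                "source": ["print(len(rows))"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }


def notebook_diff(before, after):
    """Return unified-diff lines between two notebooks as they would be serialized."""
    a = json.dumps(before, indent=1, sort_keys=True).splitlines()
    b = json.dumps(after, indent=1, sort_keys=True).splitlines()
    return list(difflib.unified_diff(a, b, "before.ipynb", "after.ipynb", lineterm=""))


# Same source code, but the cell was re-run later in the session on different data.
first_run = make_notebook(execution_count=3, output_text="100\n")
second_run = make_notebook(execution_count=6, output_text="250\n")

# Keep only added/removed lines, skipping the two file-header lines.
changed_lines = [
    line
    for line in notebook_diff(first_run, second_run)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

if __name__ == "__main__":
    print("\n".join(changed_lines))
```

Every changed line in the diff is an execution count or an output; the code itself is untouched. In a `.py` file the same re-run would produce no diff at all.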

# Feature demo
- Refactoring tools: renaming symbols
- The interactive debugger
- Scripts vs. modules and making re-usable code (functions, classes, and more)
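
As a rough illustration of the scripts-versus-modules point, here is a sketch of a parameterized variant of the `fibonacci` function from the demo cell, written so that it can be imported as a module or run as a script. The file name `fib_tools.py`, the `--count` flag, and the `parse_args`/`main` helpers are assumptions made for this example, not part of the original demo.

```python
"""fib_tools.py: hypothetical module name, used only for illustration."""
import argparse


def fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 0 and 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    fib = [0, 1]
    for i in range(2, n):
        fib.append(fib[i - 1] + fib[i - 2])
    # Slicing handles n = 0 and n = 1, where the seed list is too long.
    return fib[:n]


def parse_args(argv=None):
    """Parse command-line arguments; argv=None means 'use sys.argv'."""
    parser = argparse.ArgumentParser(description="Print the first N Fibonacci numbers.")
    parser.add_argument("-n", "--count", type=int, default=10,
                        help="how many numbers to print")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    print(fibonacci(args.count))


# Runs only when executed as a script, not when imported as a module.
if __name__ == "__main__":
    main()
```

Run as `python fib_tools.py --count 5`, this prints the list; imported with `from fib_tools import fibonacci`, the same function is reusable elsewhere and nothing executes on import. Accepting `argv` as a parameter also lets an editor task or a test pass arguments in without touching `sys.argv`.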

In [3]:
import pandas as pd

def fibonacci():
    fib = [0, 1]
    for i in range(2, 40):
        fib.append(fib[i-1] + fib[i-2])
    return fib

In [6]:
fibonacci()

[0,
 1,
 1,
 1,
 2,
 2,
 3,
 4,
 5,
 7,
 9,
 12,
 16,
 21,
 28,
 37,
 49,
 65,
 86,
 114,
 151,
 200,
 265,
 351,
 465,
 616,
 816,
 1081,
 1432,
 1897,
 2513,
 3329,
 4410,
 5842,
 7739,
 10252,
 13581,
 17991,
 23833,
 31572,
 41824]
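
One observation about the cells above, offered as a reading rather than a statement of intent: the printed list does not match the function as written. `fibonacci()` as defined in `In [3]` returns 40 values beginning 0, 1, 1, 2, 3, 5, whereas the output has 41 values and each entry equals the sum of the entries two and three positions earlier. The jump in execution counts from `In [3]` to `In [6]` is consistent with the definition having been edited or re-run in between, which is the hidden-state problem in miniature. In a module, a small test pins the behaviour down. A sketch, assuming a pytest-style test function (the test name is hypothetical):

```python
def fibonacci():
    """Same definition as the In [3] cell above."""
    fib = [0, 1]
    for i in range(2, 40):
        fib.append(fib[i - 1] + fib[i - 2])
    return fib


def test_fibonacci_matches_recurrence():
    fib = fibonacci()
    assert len(fib) == 40
    assert fib[:8] == [0, 1, 1, 2, 3, 5, 8, 13]
    # Every entry after the first two is the sum of the previous two.
    assert all(fib[i] == fib[i - 1] + fib[i - 2] for i in range(2, len(fib)))


# First entries of the output printed under In [6]; they fail the same check.
displayed = [0, 1, 1, 1, 2, 2, 3, 4, 5, 7, 9, 12]
displayed_is_fibonacci = all(
    displayed[i] == displayed[i - 1] + displayed[i - 2]
    for i in range(2, len(displayed))
)

if __name__ == "__main__":
    test_fibonacci_matches_recurrence()
    print("displayed output is Fibonacci:", displayed_is_fibonacci)
```

A test like this runs against the code on disk, so it cannot silently drift from the definition the way a stale cell output can.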