In [None]:
# default_exp disadvantages

# Disadvantages

> Here we summarize the disadvantages of Jupyter notebooks.

Based on Joel Grus' [I Don't Like Notebooks](https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g3da7b464f1_0_3) and Felix Knorr's [An introduction to Jupyter - and why I don't like it](https://felix-knorr.net/posts/2020-11-09-jupyter.html).

## Crowded notebooks
"Usually, for me, I found that 70% - 90% of the visible area in a notebook was covered with code, rather than with results. To get to a result, you'd have to scroll a lot."
### Solution
Split analyses across several smaller notebooks and use a table of contents to navigate them.

## Hidden states 
"[...] because of the easy access to code that was typed in earlier, I'd regularly go up, make some changes, rerun a cell, and then continue in the bottom. This would lead to notebooks that will not be able to execute from top to bottom because some cells would rely on something that was changed away in a cell above it."

### Example 1

In [None]:
#print(afterthought)

I came up with this later, but should also mention this above


In [None]:
# imagine a lot of code here

In [None]:
afterthought = "I came up with this later, but should also mention this above"

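This can be especially frustrating for beginners learning Python, as these problems are very unintuitive: the output above only exists because `afterthought` was defined in a later cell first. On a fresh top-to-bottom run, `print(afterthought)` would raise a `NameError`.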

### Example 2

In [None]:
outcome = 10

In [None]:
# imagine lots of code here

In [None]:
print(f"Our outcome is {outcome}. Great, let's put this in our paper!")

Our outcome is 14. Great, let's put this in our paper!


In [None]:
# imagine lots of code here

In [None]:
def lets_try_this_fancy_outcome_filter(outcome):
    return outcome + 2
    
outcome = lets_try_this_fancy_outcome_filter(outcome)
print(outcome)

14
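
The reported value of 14 can only appear if the cells ran in a different order than they are shown. A minimal sketch of one execution history that would produce it, assuming the filter cell was run twice before the report cell was re-run (the `cell_*` functions and `run` are hypothetical stand-ins for notebook cells and the kernel, not part of Jupyter):

```python
# Hypothetical stand-ins for the three notebook cells of Example 2.
def cell_define(ns):
    ns["outcome"] = 10

def cell_report(ns):
    return f"Our outcome is {ns['outcome']}."

def cell_filter(ns):
    ns["outcome"] = ns["outcome"] + 2

def run(cells):
    """Run cells in the given order against one shared namespace, as a kernel does."""
    ns = {}
    last = None
    for cell in cells:
        result = cell(ns)
        if result is not None:
            last = result
    return last

# Clean top-to-bottom run: the report cell sees the original value.
top_to_bottom = run([cell_define, cell_report, cell_filter])

# Assumed interactive history: filter run twice, then the report cell re-run.
interactive = run([cell_define, cell_filter, cell_filter, cell_report])

print(top_to_bottom)  # Our outcome is 10.
print(interactive)    # Our outcome is 14.
```

The same cells give two different "results" depending only on the order in which they were executed, which is exactly what the notebook file does not record.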


### Real-life example

<img src="images/old_plot.png" width="400" height="400">

<img src="images/email_lorenz.png" width="600" height="400">

<img src="images/new_plot.png" width="400" height="400">

### Solution: Tests

In [None]:
def long_analysis():
    # a lot of code
    p = .023
    return p

In [None]:
assert long_analysis() == .023, "The result of long analysis has changed."
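
One caveat with the test above: comparing floats with `==` can fail even when nothing meaningful has changed, because of rounding. A minimal sketch of a tolerance-based check using `math.isclose` from the standard library (the arithmetic inside `long_analysis` and the value of `EXPECTED` are illustrative assumptions, not the real analysis):

```python
import math

def long_analysis():
    # Stand-in for a lot of code; the arithmetic is only illustrative.
    return 0.1 + 0.2

EXPECTED = 0.3

result = long_analysis()

# Exact comparison is False here: 0.1 + 0.2 evaluates to 0.30000000000000004.
exact_match = result == EXPECTED

# A relative tolerance accepts harmless rounding differences.
close_match = math.isclose(result, EXPECTED, rel_tol=1e-9)

assert close_match, "The result of long analysis has changed."
```

The tolerance should be tight enough that a real change in the analysis still trips the assertion.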

## Code repetition

"Sometimes I did an analysis in Jupyter, and at some point, I would want to run that same analysis with a different dataset. What would I do? Obviously I'd copy that notebook, exchange the path in the top, and run it again [...]. AND THEN you find a bug in that notebook, and have to fix it in ALL of those pesky copies, \<sarcasm>oh the pleasure … \</sarcasm>."

"Also, I would copy a lot of code between notebooks, which is a terrible thing to do."

### Solution: Write functions

In [None]:
# export
def plus_two(x): 
    return x + 2
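
The same idea applies to the copied-notebook scenario in the quote: instead of duplicating a notebook per dataset, the analysis can live in one function that is called once per dataset. A minimal sketch, assuming a hypothetical `summarize` analysis and in-memory datasets (in practice these might be loaded from different paths):

```python
from statistics import mean

def summarize(values):
    """Hypothetical analysis step: return the mean and range of a dataset."""
    return {"mean": mean(values), "range": max(values) - min(values)}

# Hypothetical datasets standing in for the files each notebook copy would load.
datasets = {
    "study_a": [1, 2, 3, 4],
    "study_b": [10, 20, 30],
}

# One function, many datasets: a bug fix in summarize() reaches every result.
results = {name: summarize(values) for name, values in datasets.items()}
print(results)
```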

## Notebooks do not work well with GitHub
"People tend to have notebooks stored in git repositories, or at least I used to. Sometimes git would see a notebook as changed when I just opened it, and scrolled down without actually making changes. Also notebooks contain images, these Will blow up your repository hugely. Also it's a very bad practice to store data in git. Git is for source code."

### Solution: Export to packages
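
Plain `.py` modules contain no outputs or images, so they diff cleanly in git. A simplified sketch of the idea, modelled on the `# export` comment used in this notebook: it collects flagged code cells from the notebook's JSON into module source. The function `export_cells` is a hypothetical illustration, not nbdev's implementation, which does considerably more.

```python
import json

def export_cells(notebook, flag="# export"):
    """Collect the source of code cells whose first line is the export flag."""
    exported = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        if lines and lines[0].strip() == flag:
            exported.append("\n".join(lines[1:]).strip())
    return "\n\n".join(exported) + "\n"

# A tiny in-memory notebook in the v4 JSON layout (normally read with json.load).
notebook = json.loads("""
{
  "cells": [
    {"cell_type": "markdown", "source": ["## Code repetition"]},
    {"cell_type": "code", "source": ["# export\\n", "def plus_two(x):\\n", "    return x + 2"]},
    {"cell_type": "code", "source": ["print(plus_two(1))"]}
  ]
}
""")

module_source = export_cells(notebook)
print(module_source)
```

Only the flagged cell ends up in the module; the exploratory `print` cell and the markdown stay in the notebook.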

# Summary

Notebooks are great for mixing code, documentation, and results, which is especially valuable for open science and sharing analyses. However, they come with several downsides, including crowded notebooks, hidden states that break analyses, code repetition, and poor compatibility with GitHub.

## General solution
We should start using nbdev (https://www.youtube.com/watch?v=9Q6sLbz37gk), which allows us to:
- build packages from notebooks
- keep notebooks under version control
- test our code
- publish analyses as a book (one of my long-time personal dreams) using fastdoc.