# Jupyter Idiosyncrasies

This notebook contains **intentional errors** to help highlight how issues can arise when running Jupyter.

Topics:

- The kernel
- Ways to confuse yourself with Jupyter
- Ways to reduce risk

Time: 5 minutes

# The Kernel

A Jupyter notebook works by having a _kernel_ (as in "important, central part") running behind the scenes.
This works almost identically to opening a terminal and running a particular `python` command.
When you select a kernel you choose a particular Python executable to keep _always running_ in the back-end.

For how long does it run?

- It starts up when you run a cell
- It ends when you close the notebook
- You can force it to restart early
- It's independent of any other instances of Python

When you run code cells they are sent off to the kernel to process.
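To check which Python executable a kernel is actually running, a cell along these lines can help. This is a minimal sketch using only the standard library; the `kernel_info` name is just for illustration.

```python
import os
import sys

# Collect a few facts about the Python process that is executing this code.
kernel_info = {
    "executable": sys.executable,     # path to the interpreter behind the kernel
    "version": sys.version_info[:3],  # (major, minor, micro)
    "process_id": os.getpid(),        # differs between separate Python instances
}

for key, value in kernel_info.items():
    print(f"{key}: {value}")
```

Running the same lines in a terminal Python session should show a different process ID, which is one way to see that the kernel is its own separate instance.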

## (Overly) Persistent Data

After running the cell below the kernel will have its internal state looking like:

|Variable|Value|
|---|---|
|typo|`5`|





In [1]:
# We'll come back and change this cell after running it
typo = 5
# correction = 5

In [2]:
typo

5

Let's go back and tidy that up, and fix the typo while we're at it.
We change the assignment to `correction = 5`, and run the cell again.
Now the state of the kernel is:

|Variable|Value|
|---|---|
|typo|`5`|
|correction|`5`|

We never told the kernel that the `typo` assignment was a mistake, so it is still there!
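A rough way to picture what happened is to treat the kernel's state as a dictionary that every cell run writes into. The sketch below is only a model (IPython's real machinery is more involved), and `kernel_state` and `run_cell` are illustrative names.

```python
# Model the kernel's long-lived state as a single shared dictionary.
kernel_state = {}


def run_cell(source):
    """Execute a cell's source against the shared, long-lived state."""
    exec(source, kernel_state)


run_cell("typo = 5")        # first version of the cell
run_cell("correction = 5")  # edited version of the same cell

# Ignore the entries Python adds itself (such as __builtins__).
user_names = sorted(name for name in kernel_state if not name.startswith("__"))
print(user_names)  # ['correction', 'typo']
```

Editing the cell's text changes what gets run next time, but nothing already stored in the state is removed.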

### Isn't this fine?

Currently everything that "matters" inside our state (`correction`) is set to the correct value.
The only problem is if we make the same typo again...

In [7]:
# Should error
correction = 12
for x in range(a_novel_typo):
    print(x)

NameError: name 'a_novel_typo' is not defined

If we hadn't made that first `typo`, Python would notice something is amiss, as it did above, because we'd be trying to use a variable that hasn't been given a value.
But `typo` is still in the kernel's state, so repeating the typo raises no error:

In [8]:
correction = 12
for x in range(typo):
    print(x)

0
1
2
3
4


It's rather contrived when the variables are called `typo` and `correction`,
but when they're called `centre_of_mass` and `center_of_mass` it's far harder to spot.

This can only happen if you make the same typo twice.
But the longer you spend on a notebook without restarting the kernel, the more likely you are to make this sort of error.

What's more, Python can't spot that we didn't mean to do this.
It just sees the code inside the cells we send to it.
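If a stray name is spotted before a full restart is convenient, it can be removed by hand with `del`, after which the usual `NameError` protection returns. A minimal sketch, reusing the names from the cells above (`stale_name_detected` is just an illustrative flag):

```python
typo = 5        # the stale assignment from the first run
correction = 5  # the name we actually meant to use

del typo  # remove the stale name from the current state

try:
    for x in range(typo):  # the repeated typo
        print(x)
except NameError as err:
    # With the stale name gone, Python complains again as we'd hope.
    stale_name_detected = True
    print(f"Caught: {err}")
else:
    stale_name_detected = False
```

This only helps for names you already know are stale; it does nothing about the ones you haven't noticed.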

## Partially evaluated cells

When the Python kernel encounters an error it raises (or "throws") the error, and Jupyter then displays it to the user.
Depending on the type of error, one of two different things will have happened.

In [4]:
# Will error
# Run the next cell before fixing this one
integer_ten = 10
never_assigned_variable = integer_ten[2] #<- this doesn't make sense

TypeError: 'int' object is not subscriptable

In [5]:
integer_ten

10

What about this one?

In [38]:
# Will error
# Run the next cell before fixing this one
x = 5
fro y in range(x): #<- intentional syntax error
    print(y)

SyntaxError: invalid syntax (786464068.py, line 2)

So what value does `x` now have?

In [39]:
x

5
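The two error cells above behave differently because a cell is parsed as a whole before any of it runs. A `SyntaxError` is found at that first step, so none of the cell's lines execute; a `TypeError` only appears at run time, after the earlier lines have already taken effect. The sketch below models this with `compile` and `exec` on a fresh dictionary; `run_cell` is an illustrative helper, not how IPython is implemented.

```python
runtime_error_cell = "integer_ten = 10\nnever_assigned_variable = integer_ten[2]\n"
syntax_error_cell = "x = 5\nfro y in range(x):\n    print(y)\n"


def run_cell(source):
    """Run source in an empty namespace; report the outcome and the names it created."""
    namespace = {}
    try:
        # compile() checks the syntax of the whole cell before exec() runs any line.
        exec(compile(source, "<cell>", "exec"), namespace)
        outcome = "ok"
    except SyntaxError:
        outcome = "SyntaxError"
    except Exception as err:
        outcome = type(err).__name__
    names = sorted(name for name in namespace if not name.startswith("__"))
    return outcome, names


print(run_cell(runtime_error_cell))  # ('TypeError', ['integer_ten'])
print(run_cell(syntax_error_cell))   # ('SyntaxError', [])
```

So the syntax-error cell cannot have assigned `x` itself: whatever value `x` holds afterwards was set by something run earlier in the session.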

### Wait... why does `x` have this value?


Values created as part of a loop _are still available afterwards_.
It's easy to forget that these "temporary" variables (as they might be in other contexts and languages)
are not so temporary in e.g. Python `for` loops.
Partly this is just down to the way Python works,
but when code is a long way away, or from a cell run a long time ago,
or when the order you run the cells matters,
then it can cause confusion.
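A short illustration of names outliving the loop that created them (the names `y` and `squared` are arbitrary):

```python
for y in range(3):
    squared = y * y  # assigned inside the loop body

# Both names are still defined once the loop has finished,
# holding the values from the final iteration.
print(y)        # 2
print(squared)  # 4
```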

## What do these problems have in common?

They all arise because the internal state of the Python kernel is hidden from the user.
Being able to chop code into blocks (cells) and then run them in arbitrary order
is very powerful... but allows you to shoot yourself in the foot.

## So how do we deal with these problems?

Before sending a notebook to anyone else:

1. Restart the kernel
2. Click "Run All"

This will eliminate large classes of errors.

Don't send anyone a notebook without checking it can be run from top to bottom with no human prompting.
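The same check can be scripted so it doesn't rely on remembering to click the buttons. The sketch below assumes the `jupyter nbconvert` command-line tool is installed and on the `PATH`; the helper names and the `example.ipynb` and `executed_copy.ipynb` file names are hypothetical.

```python
import subprocess


def build_run_all_command(notebook_path, output_name="executed_copy.ipynb"):
    """Build an nbconvert command that runs a notebook top to bottom in a fresh kernel."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--output", output_name,  # write results to a copy, leaving the original alone
        notebook_path,
    ]


def runs_top_to_bottom(notebook_path):
    """Return True if the command exits cleanly, i.e. no cell raised an error."""
    result = subprocess.run(build_run_all_command(notebook_path))
    return result.returncode == 0


# Show the command without running it.
print(build_run_all_command("example.ipynb"))
```

Calling `runs_top_to_bottom("example.ipynb")` would then give a yes/no answer before the notebook is shared.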

## Magic commands

Jupyter (and IPython, which sits between Jupyter and Python)
add some special commands that aren't part of standard Python.
These start with `%`, or their more dangerous cousin `!`, which passes the rest of the line to the system shell.

For example:

In [None]:

%pip install pandas

The above code isn't Python... but it is understood by the IPython interpreter that Jupyter uses.

Magic commands can do anything from configuring your Jupyter environment (safe) to running arbitrary code on someone's computer (unhelpful).
In general it's best not to include them at all,
and just phrase the desired effect as a request to the user.
The sole exception I know of to this is when using Google Colab (or anything like it),
where you don't have any other means of requesting a `pip` install.
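One way to phrase the request without running an installer is to check for the package and print a message. This is a sketch using the standard library's `importlib.util.find_spec`; `missing_packages` is an illustrative helper, and it assumes top-level import names (which can differ from the name used with `pip`).

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


required = ["pandas"]
missing = missing_packages(required)

if missing:
    print("Please install before continuing: " + ", ".join(missing))
else:
    print("All required packages are available.")
```

Nothing is installed or changed on the reader's machine; they decide how to act on the message.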

## Recap

Python is powerful.
Jupyter is also powerful.
With great power comes the ability to shoot yourself in the foot.

Before sending anyone a notebook, press "restart and run all" to check it will work for them.

**This one doesn't**, but that's for teaching purposes.