## Advent of Code

For those of you who have never heard of it, Advent of Code (https://adventofcode.com/) is a set of puzzles released every year in December. They are not easy (at least for me), and they get slightly more difficult every day. Every year has a theme: last year, it was a holiday on tropical islands, and in previous years, it always took place at the North Pole, where you were supposed to help Santa and his elves save Christmas.

The puzzles don't have to be solved programmatically at all. You can solve them on a piece of paper, but since the input data is usually large, it's easier to use a computer.

#### For example (AoC 2020 Day 1):

_Before you leave, the Elves in accounting just need you to fix your expense report (your puzzle input); apparently, something isn't quite adding up._

_Specifically, they need you to find the two entries that sum to 2020 and then multiply those two numbers together._

_For example, suppose your expense report contained the following:_
```
1721
979
366
299
675
1456
```
_In this list, the two entries that sum to 2020 are `1721` and `299`. Multiplying them together produces `1721 * 299 = 514579`, so the correct answer is **514579**._
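
A minimal brute-force sketch of this example in Python. The function name `find_product` is hypothetical, and the entries are the sample values from the puzzle text above, not a real puzzle input:

```python
from itertools import combinations

# Sample entries from the puzzle description above
entries = [1721, 979, 366, 299, 675, 1456]


def find_product(numbers, target=2020):
    """Return the product of the first pair that sums to target, or None."""
    for a, b in combinations(numbers, 2):
        if a + b == target:
            return a * b
    return None


print(find_product(entries))  # 514579
```

Checking every pair is quadratic, which is fine for an input of a few hundred lines.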

If you're solving those riddles in Python, then using Jupyter notebooks is a good idea because you can incrementally work on your solution. You can output the intermediate solution until you get to the answer.

Many of those puzzles will require you to do some mathematical operations, make products or combinations of different data or match some string patterns on a large set of data. So using libraries like numpy, regex, functools, and so on is usually quite handy to solve them.

## 1. Get the documentation

One way to get the documentation of some functions is to google it.
But there is a better way that doesn't require you to leave the notebook at all!

In [3]:
re?
re.match?
np.select?
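
The `?` suffix is IPython syntax, so it won't work in a plain Python script. A rough standard-library counterpart, as a sketch (it prints less than IPython does):

```python
import inspect
import re

# Roughly what `re.match?` displays: the signature and the docstring
print(inspect.signature(re.match))
print(inspect.getdoc(re.match))
```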

## 2. Wildcard search

When you don't know the name of some function, you can do a wildcard search on the whole module.

For example, I remember that the `re` module has a function to find all matches, but I don't remember its name:

In [4]:
re.find*?

In the previous example I could also hit `<TAB>` and get the autocompletion for the name.  
But what if I only know a part of the function's name?  
For example, I know that the `os` module has a function to make a directory and obviously it has to have **dir** in its name.  
But is it `makedir` or `mkdir` or maybe `dir_create`?

In [5]:
import os
os.*dir*?
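
Outside of IPython, a similar search can be sketched with `fnmatch` over the names returned by `dir()`. This is only an approximation of the wildcard search, not how IPython implements it:

```python
import fnmatch
import os

# Names in the os module that contain "dir", similar to `os.*dir*?`
matches = fnmatch.filter(dir(os), "*dir*")
print(matches)
```

The list includes both `mkdir` and `makedirs`, which answers the question above.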

## 3. Executing code at startup

By the way, did you notice that I didn't import `re` in my previous example? And yet, the code still worked!


You can put some code inside the IPython startup directory, and it will be automatically executed when you start Jupyter or IPython.

In [6]:
from IPython.paths import get_ipython_dir
get_ipython_dir()

'/Users/switowski/.ipython'

Inside that folder, go to `profile_default/startup`. Whatever Python files you put there will be executed **each** time you start Jupyter or IPython.

If you put too much there, startup will be slow.

## 4. Running shell commands

We could go to the terminal to see what's inside our startup folder, but we might as well check it right in the notebook. Jupyter can run shell commands:

In [7]:
!ls /Users/switowski/.ipython/profile_default/startup/

README  aoc.py


In [8]:
!cat /Users/switowski/.ipython/profile_default/startup/aoc.py

import re
import numpy as np
import pandas as pd

def read_input(name):
    with open(name, 'r') as f:
        return f.readlines()


We could even add something to that file without leaving Jupyter:

In [9]:
!echo "import os" >> /Users/switowski/.ipython/profile_default/startup/aoc.py

In [10]:
!cat /Users/switowski/.ipython/profile_default/startup/aoc.py

import re
import numpy as np
import pandas as pd

def read_input(name):
    with open(name, 'r') as f:
        return f.readlines()
import os


Editing files on your local filesystem by `echo`-ing them through Jupyter is a terrible idea, but moving around the file system without closing and reopening the notebook is not. This is useful when you started your notebook in the wrong folder: instead of always prepending the long path to another folder, you can just move there. Note that `!cd` won't work for this, because each `!` command runs in its own subshell and the change is lost when it exits. Use the `%cd` magic instead:

```
%cd some/other/folder/
read_input('day1.txt')
read_input('day2.txt')
...
```

## 5. Store and restore variables

Sometimes you might have some data that you want to "save" when you close your Jupyter notebook session.
For AoC it could be the input data, which is not a big deal. But if you are a data scientist working with large datasets, you might have just spent a day cleaning one, cell by cell, printing the output along the way. To get that dataset back after you close the notebook, you would have to rerun everything, and that might take a lot of time.

Or maybe you don't even have access to the initial data anymore.

With Jupyter, you can "save" and "restore" variables.

In [11]:
day1 = read_input('day1.txt')

In [None]:
day1

In [13]:
# Store day1 variable
%store day1

Stored 'day1' (list)


In [14]:
# Show all stored variables
%store

Stored variables and their in-db values:
day1             -> ['1140\n', '1736\n', '1711\n', '1803\n', '1825\n',
day2             -> ['7-9 l: vslmtglbc\n', '2-3 s: hpbs\n', '1-3 v: pv


In [15]:
%store -r day2

In [None]:
day2
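
For comparison, a plain-Python analogue of storing and restoring a variable, sketched with `pickle`. This is not a description of how `%store` works internally; the file location and the sample data are hypothetical:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the puzzle input
day1 = ["1140\n", "1736\n", "1711\n"]

# Hypothetical location chosen for this sketch
store_path = Path(tempfile.gettempdir()) / "day1.pickle"

# "Store" the variable
with store_path.open("wb") as f:
    pickle.dump(day1, f)

# "Restore" it, for example in a later session
with store_path.open("rb") as f:
    restored = pickle.load(f)

print(restored == day1)  # True
```

With `%store`, you don't have to manage the file yourself.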

## 6. List all variables

After a long Jupyter session, you might lose track of what variables you have. So you try to print the list of available variables and find the one that you are interested in:

In [None]:
dir()

In [None]:
globals()

In [None]:
locals()

But Jupyter has much nicer commands that print only the variables and functions that you defined or imported:

In [19]:
%who

day1	 day2	 get_ipython_dir	 os	 


In [20]:
%whos

Variable          Type        Data/Info
---------------------------------------
day1              list        n=200
day2              list        n=1000
get_ipython_dir   function    <function get_ipython_dir at 0x104fbab80>
os                module      <module 'os' from '/Users<...>9.0/lib/python3.9/os.py'>
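
To get a similar table without the magic commands, you could filter a namespace yourself. The sketch below is a rough imitation: `describe` is a hypothetical helper, it only skips names starting with an underscore, and it does not hide the names that IPython itself adds to `globals()`:

```python
def describe(namespace):
    """Return (name, type, info) rows for the public names in a namespace."""
    rows = []
    for name in sorted(namespace):
        if name.startswith("_"):
            continue  # skip private and dunder names
        value = namespace[name]
        if isinstance(value, (list, tuple, dict, set)):
            info = f"n={len(value)}"
        else:
            info = repr(value)
        rows.append((name, type(value).__name__, info))
    return rows


# Hypothetical namespace standing in for globals() in a notebook
example_namespace = {"day1": ["1140\n", "1736\n"], "_hidden": 1, "answer": 42}
for row in describe(example_namespace):
    print(*row, sep="\t")
```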


## 7. Post-mortem debugging

Sometimes an error happens and you wish you had run that code with a debugger or some breakpoints (because rerunning it will take ages).

Jupyter has a post-mortem debugger that you can use to debug the exception that just happened:

In [21]:
read_input('day3.txt')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 3: invalid continuation byte

In [22]:
%debug

> /Users/switowski/.pyenv/versions/3.9.0/lib/python3.9/codecs.py(322)decode()
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]
ipdb> d

ipdb> c


Huh, turns out that my data is in a binary format!
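
One way to deal with such a file is to read it in binary mode and decide how to decode it later. A minimal sketch, assuming a hypothetical sample file and a hypothetical helper `read_input_bytes` that mirrors `read_input`:

```python
import tempfile
from pathlib import Path

# Hypothetical file containing a byte (0xc8) that is not valid UTF-8 here
path = Path(tempfile.gettempdir()) / "day3_sample.bin"
path.write_bytes(b"abc\xc8def\n")


def read_input_bytes(name):
    """Like read_input, but returns the lines as raw bytes without decoding."""
    with open(name, "rb") as f:
        return f.readlines()


try:
    path.read_text(encoding="utf-8")
except UnicodeDecodeError as error:
    print(f"Text mode failed: {error}")

print(read_input_bytes(path))  # [b'abc\xc8def\n']
```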

## 8. Automatically start the debugger

You can make the debugger start automatically whenever an exception is raised with the `%pdb` switch:

In [23]:
%pdb

Automatic pdb calling has been turned ON


In [None]:
read_input('day3.txt')

In [25]:
# Turn off the automatic debugger
%pdb

Automatic pdb calling has been turned OFF


## 9. Share your code with someone

If you want to quickly share some lines from a notebook, you can use the `%pastebin` command. You can select which input lines you want to share, using the same numbers that appear in the `In [n]` prompts:

In [26]:
%pastebin 1-10 12 14-16

'http://dpaste.com/8FTQE9KB6'

## Bonus: @jit for brute-force

If you, like me, sometimes try to brute-force a solution (because you are ~lazy~ in a rush) and you get to the point where your code takes ages to run, you can use the `@jit` decorator from the `Numba` library:

In [None]:
import time

In [None]:
def calculate():
    # Some slow computation
    total = 0
    for n in range(10_000):
        for m in range(n):
            total += m
    return total

In [None]:
start = time.time()
print(calculate())
end = time.time()
end-start

In [None]:
# Install it first with `pip install numba`
from numba import jit

In [None]:
@jit
def calculate_fast():
    # Some slow computation
    total = 0
    for n in range(10_000):
        for m in range(n):
            total += m
    return total

In [None]:
start = time.time()
print(calculate_fast())
end = time.time()
end-start

In the above example, the first run _only_ improves the execution time from 3.5s to 0.x s, because the first call also includes compiling the function. Subsequent runs are a few orders of magnitude faster.

I hope that everything I showed you here can be applied not only once per year to solve the Advent of Code, but also in your daily work.
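
To see the difference between the first call and later calls, you can time each call separately. The sketch below uses only the standard library and the plain Python function, so it runs without Numba; `time_calls` is a hypothetical helper, and the smaller default `limit` is an assumption to keep the example quick. Passing a `@jit`-decorated function instead should show a slower first call followed by much faster ones:

```python
import time


def calculate(limit=1_000):
    # Same nested loop as above, with a smaller (hypothetical) default limit
    total = 0
    for n in range(limit):
        for m in range(n):
            total += m
    return total


def time_calls(func, repeats=3):
    """Return the last result and a list of wall-clock durations, one per call."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        durations.append(time.perf_counter() - start)
    return result, durations


result, durations = time_calls(calculate)
print(result)  # 166167000
print([f"{d:.4f}s" for d in durations])
```

`time.perf_counter()` is used here instead of `time.time()` because it is meant for measuring short durations.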

Want to learn more?
https://switowski.com/blog/25-ipython-tips-for-your-next-advent-of-code

This notebook will be available at: https://switowski.com/talks