# Jupyter talk 

# 1. Basics 
If you've been using Jupyter for over a month, you can probably skip this bit.

### Getting started: 
You can [install Jupyter from here](http://jupyter.org/install.html), or simply use Anaconda.  
When you have it installed, open a terminal and type `jupyter notebook`. Your browser should open automatically, but if it doesn't, go to `localhost:8888`. 

You should have a `New` button on the top right. Click it, and choose Python (or Julia, if you think you're one of those people)

!['image'](https://i.snag.gy/zHqIE5.jpg)

By now you should have something that looks a lot like this. 

### Code 

Jupyter is excellent for mixing code and markdown. You can change a cell's type by going to the toolbar and switching from Markdown to Code, but it's much easier to memorize the following shortcuts: 
- [y] for code
- [m]arkdown

Select a cell, press `Esc` to enter command mode, then hit [y] or [m] to switch from one to the other. (Careful: [c] copies the cell instead.)

Let's start with some basic code: 

In [None]:
3 + 4 

In [None]:
print('Ducks are awesome')

In [None]:
'Seriously, ducks are awesome'

You might have noticed that the last cell returned the string, even without the print. As a rule of thumb, a cell displays the value of its last expression. If you want to suppress it, either save the output in a variable or end the line with `;`.  

In [None]:
quick_maths = (2 + 2) - 1

In [None]:
quick_maths

In [None]:
dog_insult = 'dogs are inferior to ducks'

In [None]:
dog_insult

In [None]:
'duck haters will dislike this notebook';
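As a rough mental model of the rule above (a simplified sketch, not IPython's actual implementation): a cell echoes a value when its last statement is a bare expression and the cell does not end in `;`. The helper name `ends_with_expression` is made up for illustration.

```python
import ast

def ends_with_expression(cell_source):
    """Guess whether a cell would echo a value (simplified model)."""
    source = cell_source.rstrip()
    # A trailing semicolon suppresses the echo
    if not source or source.endswith(';'):
        return False
    statements = ast.parse(source).body
    # Only a bare expression (not an assignment, import, def...) is echoed
    return bool(statements) and isinstance(statements[-1], ast.Expr)

for cell in ["3 + 4", "quick_maths = (2 + 2) - 1", "quick_maths", "'ducks';"]:
    print(repr(cell), '->', ends_with_expression(cell))
```

One thing this model ignores: an expression that evaluates to `None` (like a `print(...)` call) is not echoed either.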

### Markdown 

With the markdown cells, you can do all of the normal stuff you can do in the wonderful, sexy sexy [markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).

# Why ducks are awesome 
### Chapter one: rise of the quacks 
###### Tiny detail in italics about the feather composition 

Formatting: 
I can write the word **Donald** in bold, *Duck* in italics, etc etc 

Link to the [wikipedia page about ducks](https://en.wikipedia.org/wiki/Duck)

# 2. Intermediate 

If you've been using Jupyter for over 3 months, you can probably skip this 

## 2.1 Images: 

#### In markdown, from a URL 

![](https://farm8.static.flickr.com/7366/27112174395_c27df6b84a_b.jpg)

#### In markdown, from disk 

!['not a duck'](images/totally_not_a_duck_picture.png)

#### In code

In [None]:
from IPython.display import Image

In [None]:
Image(url='http://coolestone.com/thumbs/e7893e1fbecf.jpg') 

This is generally better, because you can control the size:

In [None]:
Image(url='images/running.jpg', width=400) 

And yes, since you ask, of course you can use GIFs. 

#### Markdown

!['duck'](images/yawn.gif)

#### Code

In [None]:
Image(url='https://media.giphy.com/media/NfL15G1Fh7TK8/giphy.gif')

If you hate life, you can also use HTML. This naturally gives you full control over size, color, etc.

<img src="https://cdn.shopify.com/s/files/1/1321/6369/products/DSCF1977_large.jpg?v=1506921493" alt="HTMLduck" style="width:50%">

# Run scripts 

In [None]:
import super_cool_script
import os 

def look_clever_as_hell():
    print('Choose a file to run deep neural networks on the blockchain:')
    for file in os.listdir('files'):
        print(file)
    answer = input('\nFile: ')
    
    super_cool_script.run(answer)


In [None]:
look_clever_as_hell()

# Inspect functions 

In [None]:
super_cool_script??
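`??` prints the source of whatever you point it at. Outside a notebook, the standard library's `inspect.getsource` gives a similar result for pure-Python objects. A small sketch, using `json.dumps` as a stand-in since `super_cool_script` only exists on my machine:

```python
import inspect
import json

# `json.dumps` stands in for any function whose source you want to read
source = inspect.getsource(json.dumps)
print(source[:300])
```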

# List global variables by type 

In [None]:
duck = 'duck'
nr_of_ducks = 35
emotion = 'love'
amount_of_love = 9000
dog_to_duck_quality_ratio = 0.0001

In [None]:
%who str

In [None]:
%who int

In [None]:
%who float

In [None]:
import pandas as pd 
from pandas import DataFrame

In [None]:
q = pd.DataFrame()

In [None]:
%who DataFrame
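If you wonder what `%who <type>` is doing, here is a plain-Python approximation. This is only a sketch of the idea (filter a namespace by type name); the function `who` and the `variables` dict are illustrative, not IPython's own code.

```python
def who(namespace, type_name):
    """List the public names in `namespace` whose type is called `type_name`."""
    return sorted(
        name for name, value in namespace.items()
        if not name.startswith('_') and type(value).__name__ == type_name
    )

variables = {
    'duck': 'duck',
    'nr_of_ducks': 35,
    'emotion': 'love',
    'amount_of_love': 9000,
    'dog_to_duck_quality_ratio': 0.0001,
}
print(who(variables, 'str'))    # ['duck', 'emotion']
print(who(variables, 'int'))    # ['amount_of_love', 'nr_of_ducks']
print(who(variables, 'float'))  # ['dog_to_duck_quality_ratio']
```

In a live session you could pass `globals()` instead of the hand-made dict.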

# Running functions in the notebook

In [None]:
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
import graphviz

import pandas as pd 
%matplotlib inline 
from matplotlib import pyplot as plt 

In [None]:
def get_wine_data():
    data = pd.read_csv('files/wineQualityReds.csv')
    features = ['alcohol', 'sulphates', 'volatile.acidity', 'total.sulfur.dioxide']
    target = 'quality'
    # Binarize quality: 1 if at or above the median, else 0
    median_quality = data[target].median()
    data[target] = (data[target] >= median_quality).astype(int)
    return data, features, target


def explain_data(max_depth, min_impurity_split):
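    # Note: `min_impurity_split` was removed in scikit-learn 1.0; on newer
    # versions, `min_impurity_decrease` is the closest replacement.
    t = DecisionTreeClassifier(max_depth=max_depth, 
                               min_impurity_split=min_impurity_split)
    
    # `data`, `features` and `target` are read from the notebook's globals
    t.fit(data[features], data[target])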
    dot_data = tree.export_graphviz(t, out_file=None, feature_names=features,  
                             class_names=['Bad wine', 'Good Wine'], 
                             filled=True, rounded=True,  
                             special_characters=True)
    graph = graphviz.Source(dot_data)  
    return graph 

In [None]:
data, features, target = get_wine_data()
explain_data(max_depth=3, 
             min_impurity_split=.2)   # <-- Notice that it remembered the global variables 

# Interact 

If you haven't done it before, you might need to run the following on a console:  
> pip install ipywidgets  
> jupyter nbextension enable --py widgetsnbextension  

Then re-launch your notebook.  

In [None]:
from ipywidgets import interact

In [None]:
def a_power_b(a, b): 
    return a ** b

In [None]:
interact(a_power_b, a=(0, 10, 1), b=(4, 7, 1))

Boring... 

In [None]:
import numpy as np

In [None]:
def plot_sin(x, y):
    t = np.linspace(0.0, 1.0, 100)
    plt.plot(t, np.sin(2 * np.pi * t * x) + y)
    plt.ylim(-3, 3)  # call ylim; assigning to it would overwrite the function
    plt.show()
    
interact(plot_sin, x=(1.0, 10.0, .01), y=(-2, 2))
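The tuples passed to `interact` above are shorthand for sliders: `(min, max, step)`. To make that concrete, here is a small sketch that lists the positions such a slider can take. The helper `slider_values` is hypothetical and only mimics the range logic, it does not use ipywidgets at all.

```python
def slider_values(lo, hi, step=1):
    """Every position a (lo, hi, step) slider can take, ends included."""
    count = int((hi - lo) / step + 1e-9)  # small epsilon guards float rounding
    return [lo + i * step for i in range(count + 1)]

def a_power_b(a, b):
    return a ** b

a_values = slider_values(0, 10, 1)
b_values = slider_values(4, 7, 1)
print(len(a_values) * len(b_values), 'possible slider combinations')
print(a_power_b(a_values[-1], b_values[-1]))  # largest value reachable
```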

# This is your Dataset, Mr. Riskman.

In [None]:
data, features, target = get_wine_data()
interact(explain_data, max_depth=(1, 5, 1), min_impurity_split=(0, .5, .02))  

# Presentation mode 
This is a slide! 

This is a Fragment! 
Details: [RISE](https://damianavila.github.io/RISE/)

In [None]:
# to install: 
#! pip install RISE
#! jupyter-nbextension install rise --py --sys-prefix
#! jupyter-nbextension enable rise --py --sys-prefix

To edit the slideshow options, go to `View -> Cell Toolbar -> Slideshow`.   
To run, click the `Enter/Exit RISE Slideshow` button in the toolbar. 

This is a sub-slide! 

## Some explanation 
Some simple sub explanation 

In [None]:
def code_i_dont_want_to_show():
    #secret sauce 
    #secret sauce 
    #secret sauce 
    Hugo = 'Secret Sauce'
    return Hugo 

In [None]:
# did you notice we skipped the `code_i_dont_want_to_show()` cell?
some_awesome_code = 'bit too much hype'

In [None]:
Image(url='https://media.licdn.com/mpr/mpr/shrinknp_200_200/AAEAAQAAAAAAAAakAAAAJDJkZmRiYzdkLWY4NTItNDBhNi1iMGFkLWY0YmFhNmY5NTIxNg.jpg') 

# Next slide 

In [None]:
def more_stuff():
    return 'And so on and so forth'

# Unix commands
(enough RISE for now) 

#### This is when non-developers might want to go get a beer. If you are brave, stick around! 

Where is this notebook? 

In [None]:
!pwd

What is in it? 

In [None]:
!ls   # ! is optional 

Can I haz folder? 

In [None]:
!mkdir useless_folder

Can I go into it? (no need for `!` on this one)

In [None]:
cd useless_folder 

In [None]:
!htop

Can I put stuff in it? 

In [None]:
!touch my_file.txt   # we are already inside useless_folder 

Can I go back? 

In [None]:
cd ..

Does it now exist? 

In [None]:
!ls useless_folder/

I kill? Yes? 

In [None]:
rm -rf useless_folder/  # if you don't know unix, never rm -rf

## 2.2 LaTeX 

In display mode, the quadratic formula is $$ x = \frac{-b\pm\sqrt{b^2-4ac}}{2a} $$

Inline, the quadratic formula is $ x = \frac{-b\pm\sqrt{b^2-4ac}}{2a} $
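Since the formula is sitting right there, here is a minimal sketch that turns it into code. The function name `solve_quadratic` is my own, and `cmath` is used so that a negative discriminant gives complex roots instead of an error.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a must be non-zero)."""
    if a == 0:
        raise ValueError('a must be non-zero')
    root = cmath.sqrt(b**2 - 4*a*c)
    return (-b + root) / (2*a), (-b - root) / (2*a)

print(solve_quadratic(1, -3, 2))   # roots 2 and 1 (as complex numbers)
print(solve_quadratic(1, 0, 1))    # roots i and -i
```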

## 2.3 Debugging 

In [None]:
def my_buggy_function(): 
    """
    Deliberately confusing and buggy code
    """
    a = 3
    b = None 
    c = 5 
    # lots of code 
    # lots of code 
    def what_do_you_think_this_is_JavaScript(variable, names, matter):
        intermediate_stuff = variable + names / matter
        return intermediate_stuff  
    # lots of code 
    # lots of code 
    what_do_you_think_this_is_JavaScript(a, b, 3)
    # lots of code 
    # lots of code 
    return correct_answer_with_any_luck

Try running it: 

In [None]:
try: 
    my_buggy_function()
except Exception as e:
    print('The hell is this error? Python must be wrong!\n--->', e)

In [None]:
%pdb

In [None]:
my_buggy_function()  

# run this, then write this in the box: type(names)
# when you've explored the variable names, write exit 

In [None]:
%pdb 

How cool is that? You can live debug in the notebook! 
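If you ever need the same trick without an interactive prompt, the traceback already carries the local variables of the frame that crashed. A rough sketch of that idea follows; `buggy` and `locals_at_crash` are illustrative names, and this is not how `%pdb` itself is implemented.

```python
import sys

def buggy(variable, names, matter):
    return variable + names / matter   # fails when names is None

def locals_at_crash(func, *args):
    """Run func and, if it raises, return the locals of the innermost frame."""
    try:
        func(*args)
    except Exception:
        tb = sys.exc_info()[2]
        # Walk down to the frame where the exception was actually raised
        while tb.tb_next is not None:
            tb = tb.tb_next
        return dict(tb.tb_frame.f_locals)
    return None

crash_locals = locals_at_crash(buggy, 3, None, 3)
print(type(crash_locals['names']))   # the culprit
```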

# Code profiling 

In [None]:
%time my_sum = sum(range(10000))

In [None]:
%timeit my_sum = sum(range(10000))
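Both magics have a plain-Python cousin in the standard library's `timeit` module, handy when the code has to live in a script. A minimal sketch; the repeat counts are arbitrary choices of mine, and IPython picks its own loop counts and reports its statistics differently.

```python
import timeit

# Roughly what `%time` does: run the statement once
one_run = timeit.timeit('sum(range(10000))', number=1)

# Roughly what `%timeit` does: repeat many times and summarize
runs = timeit.repeat('sum(range(10000))', repeat=5, number=1000)
best_per_loop = min(runs) / 1000

print('single run: %.1f us' % (one_run * 1e6))
print('best of 5:  %.1f us per loop' % (best_per_loop * 1e6))
```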

Function by function: 

In [None]:
import time 
def run(answer):
    print(
        '\nPreparing to run the deep neural nets on the blockchain on %s'
        % answer)
    def first_block():
        time.sleep(1)
        print('\n\nFitting the Deep Neural Net...')
        for i in range(5):
            print('.', end='')
            time.sleep(0.05)
    def second_block():
        time.sleep(.5)
        print('\n\nFitting the even deeper neural network with RNNs')
        for i in range(30):
            print('.', end='')
            time.sleep(0.05)
    def third_block():    
        print('\n\nDoing completely useless sums')
        sum(range(10000000))
        time.sleep(.5)
    
    def fourth_block():
        print('\n\nPrinting the Smart Contract on the Blockchain')
        for i in range(10):
            print('.', end='')
            time.sleep(0.2)
        time.sleep(1)
        
    def last():
        print('\n\nWaiting for the ink to dry...')
        time.sleep(1)
        
    first_block()
    second_block()
    third_block()
    fourth_block()
    last()
    print('\n\nComplete. Damn that was impressive.')

In [None]:
%prun run('big_csv.csv')
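`%prun` is a front end for Python's built-in profiler, so the same report can be produced with `cProfile` and `pstats` directly. A small sketch, with made-up functions (`slow_step`, `fast_step`, `pipeline`) standing in for the much slower `run` above:

```python
import cProfile
import io
import pstats
import time

def slow_step():
    time.sleep(0.05)

def fast_step():
    sum(range(1000))

def pipeline():
    slow_step()
    fast_step()

profiler = cProfile.Profile()
profiler.runcall(pipeline)          # profile a single call

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumulative').print_stats(5)   # five most expensive entries
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts `slow_step` near the top, which is the whole point: you see which function is eating the time.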

Line-by-line profiling (did I mention we can pip install from here?)

In [None]:
! pip install line_profiler

In [None]:
%load_ext line_profiler

In [None]:
%lprun -f run run('awesome_file')

# Memory profiling 

In [None]:
! pip install memory_profiler

In [None]:
%load_ext memory_profiler

In [None]:
%memit run('big_csv.csv')
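If you can't install `memory_profiler`, the standard library's `tracemalloc` gets you part of the way. Be aware this is a different measurement: it only counts memory allocated by Python itself, whereas `%memit` looks at the whole process. The helper names below are hypothetical.

```python
import tracemalloc

def peak_memory_mib(func, *args):
    """Peak Python-level allocation (in MiB) while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 2**20

def build_big_list():
    return [0] * 1_000_000   # one million references

peak = peak_memory_mib(build_big_list)
print('peak: %.1f MiB' % peak)
```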

**`%mprun`** is the memory equivalent of **`%lprun`**, but unfortunately it's [crazy buggy](https://perso.crans.org/besson/publis/notebooks/Profiling_in_a_Jupyter_notebook.html) (skipped here). 

In [None]:
# %mprun -T res.txt -f run_demo long_function('as_if_it_made_any_difference.csv')   # now run and profile it 

### Writing straight to file: 

In [None]:
%%file run_demo.py 

import tensorflow as tf  # <-- valuation += 9000 

def something_i_want_on_file():
    print('thank heavens this is on file now, using control-c is so much trouble')

In [None]:
cat run_demo.py

Lazily passing data between notebooks: `%store` saves a variable, and `%store -r answer` in another notebook brings it back.

In [None]:
answer = 'this is the correct answer'

%store answer 
del answer 
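Conceptually, storing a variable like this is just "pickle it somewhere both notebooks can reach". The sketch below imitates that with a temporary folder. The `store` and `restore` helpers and the storage location are assumptions for illustration; IPython manages its own storage and file format.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical storage location, shared by both "notebooks"
store_dir = Path(tempfile.mkdtemp())

def store(name, value):
    (store_dir / (name + '.pkl')).write_bytes(pickle.dumps(value))

def restore(name):
    return pickle.loads((store_dir / (name + '.pkl')).read_bytes())

answer = 'this is the correct answer'
store('answer', answer)
del answer                      # gone from this session...
answer = restore('answer')      # ...and back again, possibly in another notebook
print(answer)
```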

# Working on remote 

This is where things get hardcore. You can switch kernels within the notebook... 

In [None]:
import psutil
print('I have %0.0f CPUs to play with' % psutil.cpu_count())
print('I have %0.2f GB of memory available' % (psutil.virtual_memory().total * 10**-9))

Yay! More cores! 

    workon my-virtualenv-name  
    pip install ipykernel  
    python -m ipykernel install --user --name=my-virtualenv-name  

----

# References, further reading:

- Highly inspired by [DataQuest's notebook with tips and tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)  
- Also contains some tricks by [DominoDataLab](https://blog.dominodatalab.com/lesser-known-ways-of-using-notebooks/)
- [RISE documentation](https://damianavila.github.io/RISE/)

- [More cool DominoDataLab Stuff](https://blog.dominodatalab.com/interactive-dashboards-in-jupyter/)
- [Code profiling by JakeVDP](https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html) 
- [Fortran and Cython](http://arogozhnikov.github.io/2016/09/10/jupyter-features.html)  
- Add Big data Analysis: [link](http://arogozhnikov.github.io/2016/09/10/jupyter-features.html)   
- Also figure out if this is safe: [link](http://arogozhnikov.github.io/2016/09/10/jupyter-features.html#Let-others-to-play-with-your-code-without-installing-anything)