## Python Path

`sys.path` is the list of directories that Python searches for modules and packages when you run an import. It is an ordinary list, so you can modify it like any other list. Python searches the entries in order, so if the same package is installed in two places, the first one found is used.

Insert at the beginning of the search path:

    sys.path.insert(0, '/path/to/package')

Add to the end of the search path:

    sys.path.append('/path/to/package')

In [None]:
import sys
sys.path
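
To see the "first match wins" rule in action, here is a minimal sketch. It assumes two temporary directories that each hold a module with the same made-up name, `demo_shadowed_module`; neither the name nor the directories come from this notebook.

```python
import os
import sys
import tempfile

# Two throwaway directories, each holding a module with the same
# (hypothetical) name but a different value inside.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory, label in [(first_dir, "first"), (second_dir, "second")]:
    with open(os.path.join(directory, "demo_shadowed_module.py"), "w") as f:
        f.write("WHERE = %r\n" % label)

sys.path.append(second_dir)    # end of the list: searched last
sys.path.insert(0, first_dir)  # start of the list: searched first

import demo_shadowed_module as demo

print(demo.WHERE)  # the copy from first_dir is the one that gets imported
```

Swapping the `insert` and `append` targets should make the other copy win, which is a quick way to check which installation an import actually picks up.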

## PDB - the anywhere debugger


Insert the pdb line below into your code where you want the initial breakpoint to be:
    
    import pdb; pdb.set_trace()
    
    
Commands I use to navigate in pdb mode:

* `n` (next)
* `s` (step into)
* `l` (list lines)
* `b <line number>` (set breakpoint)
* `c` (continue until end or next breakpoint)
* `q` (quit)

In [None]:
M = []
for i in range(10):
    import pdb; pdb.set_trace()
    if i%2:
        print i, "continue"
        continue  # continue skips the rest of this iteration and moves on to the next one
    if i==4:
        try:
            asdf  # undefined name: raises NameError on purpose
        except Exception as e:
            print i, e 
    if i==8:
        print i, 'break'
        break # break exits the for loop
    M.append(i)

M
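
The commands listed above can also be fed to pdb from a string, which is handy for seeing what a session looks like without typing at the prompt. The following is only a sketch: it assumes Python 3.6 or later (for the `readrc` argument), and `running_total` is a hypothetical function invented for the illustration.

```python
import io
import pdb


def running_total(values):
    """Hypothetical function to debug: sums a list of numbers."""
    total = 0
    for v in values:
        total += v
    return total


# The commands we would otherwise type at the (Pdb) prompt, one per line:
# print the argument, then continue to the end.
commands = io.StringIO("p values\nc\n")
transcript = io.StringIO()

# Read commands from our string and capture pdb's output instead of
# using the terminal.
debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                   nosigint=True, readrc=False)
result = debugger.runcall(running_total, [1, 2, 3])

print(transcript.getvalue())  # what pdb would have shown interactively
print(result)
```

`runcall` stops on the first line of the function, so the scripted `p values` runs before any of the loop body does.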

## Code profiling

In [None]:
M =[]
for x in range(10): 
    M.append(x)
M

In [None]:
[x for x in range(10)]

In [None]:
[x**2 for x in range(10)]

In [None]:
[ x**2 for x in range(10) if x%2]

In [None]:
%%timeit
M =[]
for x in range(10000000): 
    M.append(x**2)

In [None]:
%%timeit
[ x**2 for x in range(10000000)]
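
Outside IPython, where the `%%timeit` magic is not available, roughly the same comparison can be sketched with the standard-library `timeit` module. The function names and the smaller problem size below are assumptions chosen to keep the run short.

```python
import timeit


def with_loop(n):
    # Build the list by appending inside an explicit for loop
    squares = []
    for x in range(n):
        squares.append(x ** 2)
    return squares


def with_comprehension(n):
    # Build the same list with a list comprehension
    return [x ** 2 for x in range(n)]


n = 100000
# number=10 runs each function ten times and returns the total seconds
loop_seconds = timeit.timeit(lambda: with_loop(n), number=10)
comp_seconds = timeit.timeit(lambda: with_comprehension(n), number=10)

print("loop:          %.3f s" % loop_seconds)
print("comprehension: %.3f s" % comp_seconds)
```

The absolute numbers depend on the machine and the Python version, so only the ratio between the two is worth reading.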

In [None]:
%%timeit
[ x**2 for x in xrange(10000000)]

Why is this faster?

`range` builds the entire list in memory before the loop starts, while `xrange` produces each value only when it is needed.

(Note: in Python 3, `range` is lazy by default and `xrange` no longer exists.)
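
A rough way to see the difference in Python 3 is to compare the size of a lazy `range` object with the size of the list it would expand to. This sketch assumes Python 3; under Python 2, `range(n)` would already be a list.

```python
import sys

n = 1000000
lazy = range(n)         # Python 3: lazy, comparable to Python 2's xrange
eager = list(range(n))  # roughly what Python 2's range(n) returned

# getsizeof reports the container only, not the integers it refers to,
# so the real gap in memory use is even larger than shown here.
print(sys.getsizeof(lazy))   # small, and does not grow with n
print(sys.getsizeof(eager))  # grows with n
```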

In [None]:
def simple_range(x):
    M = []
    i=0
    while i < x: 
        M.append(i)
        i+=1
    return M

In [None]:
simple_range(10)

In [None]:
def iterator_range(x):
    i=0
    while i < x: 
        yield i
        i+=1

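What does `yield` do? A function that contains `yield` is a generator function. Calling it returns a generator object without running the body; each call to `next()` then runs the body up to the next `yield`, hands back that value, and pauses there.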

In [None]:
iterate = iterator_range(10)
iterate

In [None]:
print(next(iterate))
print(next(iterate))
print(next(iterate))

In [None]:
[x for x in iterate]
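
One consequence of the cells above is that a generator can only be consumed once: values already taken with `next()` do not come back, and once it is exhausted it yields nothing more. A small sketch, using a hypothetical `countdown` generator rather than `iterator_range`:

```python
def countdown(n):
    # Yield n, n-1, ..., 1, pausing after each value
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
first = next(gen)     # runs the body up to the first yield
rest = list(gen)      # consumes whatever is left
leftover = list(gen)  # the generator is now exhausted

print(first)     # 3
print(rest)      # [2, 1]
print(leftover)  # []
```

To iterate a second time, call the generator function again to get a fresh generator object.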

### Look at the difference in memory usage

In [None]:
%load_ext memory_profiler

In [None]:
%memit for x in simple_range(1000000): pass

In [None]:
%memit  for x in iterator_range(1000000): pass
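
If `memory_profiler` is not installed, a similar comparison can be sketched with the standard-library `tracemalloc` module. This is a different tool from `%memit` and reports only memory allocated by Python while tracing is on, so the numbers will not match. It assumes Python 3, and `peak_bytes` is a helper invented for this illustration.

```python
import tracemalloc


def peak_bytes(make_iterable):
    """Return the peak traced memory while looping over make_iterable()."""
    tracemalloc.start()
    for _ in make_iterable():
        pass
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


n = 200000
list_peak = peak_bytes(lambda: [i for i in range(n)])  # whole list at once
gen_peak = peak_bytes(lambda: (i for i in range(n)))   # one value at a time

print("list peak:      %d bytes" % list_peak)
print("generator peak: %d bytes" % gen_peak)
```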

## Interactive Visualizations

You will need both ipywidgets and bqplot installed and enabled to run this next section.

ipywidgets for menus, selection, and text boxes: https://ipywidgets.readthedocs.io/

bqplot for data visualization: https://github.com/bloomberg/bqplot/

In [None]:
from ipywidgets import HBox, VBox, Dropdown

In [None]:
colors = ['red','blue','green','yellow', 'orange']
for index, value in enumerate(colors):
    print index, value

In [None]:
# Map each color name to its position in the list
color_map = {key:value for value,key in enumerate(colors)}
color_map

In [None]:
d = Dropdown(options=color_map)
d

In [None]:
d

In [None]:
d.value

In [None]:
import numpy as np
from bqplot import Axis, Figure, Hist, LinearScale, Scatter, Tooltip

In [None]:
size = 100
scale = 100.
delta=5
np.random.seed(0)
x_data = np.arange(0, size, delta)
y_data = np.cumsum(np.random.randn(size // delta) * scale)  # // keeps the count an integer

### Simple Line Chart using bqplot

In [None]:
x_sc = LinearScale()
y_sc = LinearScale()

ax_x = Axis(label='X', scale=x_sc, grid_lines='solid')
ax_y = Axis(label='Y', scale=y_sc, orientation='vertical', grid_lines='solid')

line = Scatter(x=x_data, y=y_data, scales={'x': x_sc, 'y': y_sc}, colors=['red'])
fig = Figure(axes=[ax_x, ax_y], marks=[line], title='First Example')
fig

### Combine a dropdown widget and line chart to change the color

In [None]:
x_sc = LinearScale()
y_sc = LinearScale()

ax_x = Axis(label='X', scale=x_sc, grid_lines='solid')
ax_y = Axis(label='Y', scale=y_sc, orientation='vertical', grid_lines='solid')

line = Scatter(x=x_data, y=y_data, scales={'x': x_sc, 'y': y_sc}, colors=['red'])
fig = Figure(axes=[ax_x, ax_y], marks=[line], title='First Example')
d = Dropdown(options=['red','blue','green'])


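def change_color(change):
    # Recolor the marks whenever the dropdown selection changes
    line.colors = [change['new']]
d.observe(change_color, names='value')  # react only to changes of the selected value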
VBox([d, fig])

### Make a histogram and add tooltips on hover

In [None]:
# Adding tooltip for Histogram
x_sc = LinearScale()
y_sc = LinearScale()

sample_data = np.random.randn(100)

def_tt = Tooltip(formats=['', '.2f'], fields=['count', 'midpoint'])
hist = Hist(sample=sample_data, scales= {'sample': x_sc, 'count': y_sc},
                       tooltip=def_tt, display_legend=True, labels=['Test Hist'], select_bars=True)
ax_x = Axis(scale=x_sc, tick_format='0.2f')
ax_y = Axis(scale=y_sc, orientation='vertical', tick_format='0.2f')

Figure(marks=[hist], axes=[ax_x, ax_y])

### Complicated example to show the power

Get some feature vectors

In [None]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale


np.random.seed(42)

iris = load_iris()
data = scale(iris.data)  # standardize each feature to zero mean, unit variance
n_features=4
#n_pca=3
#pca = PCA(n_components=n_pca).fit(data)
df = pd.DataFrame(data, columns=['feature_{}'.format(x) for x in range(n_features)])
df['leaf'] = iris.target
df['extra_info'] = [np.random.randint(100) for x in range(iris.target.shape[0])]

Try to import the function.

In [None]:
from feature_vector_distribution import feature_vector_distribution

In [None]:
ls lib/

Oops, it's not on our path. Add the `lib` folder in this directory to the search path.

In [None]:
sys.path.append('lib')
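
A relative entry such as `'lib'` is resolved against the current working directory, so it can stop working if the working directory changes later. One possible variant, shown here only as a sketch, stores the absolute path and avoids adding a duplicate when the cell is re-run. The variable name `lib_dir` is an assumption.

```python
import os
import sys

# Resolve 'lib' against the current working directory once, up front
lib_dir = os.path.abspath('lib')

# Re-running this cell will not add the same entry twice
if lib_dir not in sys.path:
    sys.path.append(lib_dir)

print(lib_dir in sys.path)
```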

In [None]:
from feature_vector_distribution import feature_vector_distribution

In [None]:
feature_vector_distribution(df, 'leaf',
                                group_columns=['extra_info'],
                                bins=25,
                                f_lim = {'min':-3, 'max':3}
                                )