# CGM Jupyter and DesignSafe Breakout Session
## Adam B. Price <br> May 12, 2017

# Introduction

1. [What is the Jupyter Notebook](https://nbviewer.jupyter.org/github/jupyter/notebook/blob/master/docs/source/examples/Notebook/What%20is%20the%20Jupyter%20Notebook.ipynb)
2. [Notebook Basics](https://nbviewer.jupyter.org/github/jupyter/notebook/blob/master/docs/source/examples/Notebook/Notebook%20Basics.ipynb)
3. [Jupyter Notebook Docs](https://jupyter-notebook.readthedocs.io/en/latest/notebook.html)

# Outline
* [1. Cell types](#1.-Cell-types)
* [2. Imports](#2.-Imports)
* [3. Tab completion and documentation](#3.-Tab-completion-and-documentation)
* [4. Data types](#4.-Data-types)
* [5. Read demo csv file into pandas DataFrame](#5.-Read-demo-csv-file-into-pandas-DataFrame)
* [6. Make a plot from DataFrame](#6.-Make-a-plot-from-DataFrame)
* [7. Read fast data binary file](#7.-Read-fast-data-binary-file)
* [8. Plot data from binary file](#8.-Plot-data-from-binary-file)
* [9. Store data from binary file in a dictionary and make a plot](#9.-Store-data-from-binary-file-in-a-dictionary-and-make-a-plot)
* [10. Widgets](#10.-Widgets)

# 1. Cell types

## 1.1 Markdown cells

# Heading 1
## Heading 1.1
## Heading 1.2
### Heading 1.2.1

Markdown is useful for taking notes...

*italic* and **bold** text

Bulleted or numbered lists:

* Item 1
* Item 2


1. Ordered item 1
2. Ordered item 2

And rendering equations with LaTeX

Inline equations like $A=\pi r^2$

Or equations on their own line...

$$E=mc^2$$


<!-- display image in markdown cell with HTML -->
### Look at this huge centrifuge; it is for geotechnical modeling.
<img src='images/centrifuge_top.jpg' >

### Some helpful Markdown resources:
1. [Jupyter Markdown Cells Documentation](https://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Working%20With%20Markdown%20Cells.html)
2. [Markdown Cheatsheet by Adam Pritchard](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)

## 1.2 Code cells

In [1]:
# This is a code cell.

In [2]:
# Press Shift+Enter to execute a code cell in the notebook.
print('Hello World!')

Hello World!


In [3]:
# Jupyter displays the value of the last expression in a code cell
# without requiring a print statement.
5 + 6

11

In [4]:
# The same goes for the name of a variable...
myvar = 5 + 6
myvar

11

# 2. Imports

In [5]:
# Imports.

# Set matplotlib backend for interactive notebook use.
%matplotlib notebook

# Importing allows for access to code that is in another module.
import pandas as pd  # imports pandas as pd in local namespace.
import numpy as np
from matplotlib import pyplot as plt

# 3. Tab completion and documentation

In [6]:
pd.__version__

u'0.19.2'

In [7]:
print(pd.DataFrame.__doc__)

 Two-dimensional size-mutable, potentially heterogeneous tabular data
    structure with labeled axes (rows and columns). Arithmetic operations
    align on both row and column labels. Can be thought of as a dict-like
    container for Series objects. The primary pandas data structure

    Parameters
    ----------
    data : numpy ndarray (structured or homogeneous), dict, or DataFrame
        Dict can contain Series, arrays, constants, or list-like objects
    index : Index or array-like
        Index to use for resulting frame. Will default to np.arange(n) if
        no indexing information part of input data and no index provided
    columns : Index or array-like
        Column labels to use for resulting frame. Will default to
        np.arange(n) if no column labels are provided
    dtype : dtype, default None
        Data type to force, otherwise infer
    copy : boolean, default False
        Copy data from inputs. Only affects DataFrame / 2d ndarray input

    Examples
    ---

In [8]:
print(plt.plot.__doc__)

Plot lines and/or markers to the
:class:`~matplotlib.axes.Axes`.  *args* is a variable length
argument, allowing for multiple *x*, *y* pairs with an
optional format string.  For example, each of the following is
legal::

    plot(x, y)        # plot x and y using default line style and color
    plot(x, y, 'bo')  # plot x and y using blue circle markers
    plot(y)           # plot y using x as index array 0..N-1
    plot(y, 'r+')     # ditto, but with red plusses

If *x* and/or *y* is 2-dimensional, then the corresponding columns
will be plotted.

If used with labeled data, make sure that the color spec is not
included as an element in data, as otherwise the last case
``plot("v","r", data={"v":..., "r":...)``
can be interpreted as the first case which would do ``plot(v, r)``
using the default line style and color.

If not used with labeled data (i.e., without a data argument),
an arbitrary number of *x*, *y*, *fmt* groups can be specified, as in::

    a.plot(x1, y1, 'g^', x2, y2, 'g-
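Printing `__doc__` is one way to read documentation. In the notebook, pressing `Tab` after a dot lists the available attributes, and `Shift+Tab` inside a call shows the docstring as a tooltip. The sketch below shows rough plain-Python equivalents using the standard `inspect` module and `dir()`. It assumes Python 3, and `circle_area` is a hypothetical function defined only for this illustration.

```python
import inspect


def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return 3.141592653589793 * radius ** 2


# inspect.getdoc returns the docstring with its indentation cleaned up.
doc = inspect.getdoc(circle_area)
print(doc)

# inspect.signature shows the call signature without running the function.
sig = str(inspect.signature(circle_area))
print(sig)

# dir() lists attribute names, roughly what tab completion offers.
public_names = [name for name in dir(str) if not name.startswith('_')]
print(public_names[:5])
```

The same calls work on library objects, for example `inspect.getdoc(pd.DataFrame)`.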

# 4. Data types

## 4.1 Built-in data types

In [9]:
# String.
my_string = 'this is my string'
type(my_string)

str

In [10]:
# Integer.
my_int = 3
type(my_int)

int

In [11]:
# Float.
my_float = 3.1
type(my_float)

float

In [12]:
# List (mutable).
my_list = [my_string, my_int, my_float]
type(my_list)

list

In [13]:
my_list.append('appended entry')
my_list

['this is my string', 3, 3.1, 'appended entry']

In [14]:
# First element in list.
my_list[0]

'this is my string'

In [15]:
# List comprehension.
a = [1., 2., 3., 4.]
[x**2 for x in a]


[1.0, 4.0, 9.0, 16.0]

In [16]:
# Tuple (immutable).
my_tuple = (my_string, my_int, my_float)
type(my_tuple)

tuple

In [17]:
# Second element in tuple.
my_tuple[1]

3
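The difference between a mutable list and an immutable tuple is easiest to see by trying to change an element. The following is a small sketch with made-up values: assigning into the list works, while the same assignment on the tuple raises a `TypeError`.

```python
demo_list = ['a', 1, 2.5]
demo_tuple = ('a', 1, 2.5)

# Lists can be changed in place.
demo_list[0] = 'changed'

# Tuples cannot; item assignment raises TypeError.
try:
    demo_tuple[0] = 'changed'
    tuple_error = None
except TypeError as err:
    tuple_error = err

print(demo_list)
print(type(tuple_error).__name__)
```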

In [18]:
# Dictionary (key: value pairs).
my_dict = {'1': 'first value', 2: 'second value'}
type(my_dict)

dict

In [19]:
my_dict['1']

'first value'

In [20]:
# Sets (unordered collection with no duplicate elements).
my_list.append('appended entry')
print(my_list)
my_set = set(my_list)
print(my_set)
type(my_set)

['this is my string', 3, 3.1, 'appended entry', 'appended entry']
set(['this is my string', 3.1, 3, 'appended entry'])


set

## 4.2 A couple of other data types

In [21]:
# Numpy array.
my_array = np.array([1., 2., 3., 4.])
output = my_array**2
print(output)
type(my_array)

[  1.   4.   9.  16.]


numpy.ndarray
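The array expression `my_array**2` does the same work as the list comprehension in section 4.1, but applies the operation to every element at once. A minimal sketch comparing the two, using the same four values:

```python
import numpy as np

a = [1., 2., 3., 4.]

# Element-by-element with a list comprehension.
squares_list = [x**2 for x in a]

# Vectorized: the power operator acts on the whole array.
squares_array = np.array(a)**2

# Both approaches give the same numbers.
same = np.array_equal(squares_array, np.array(squares_list))
print(same)
```

Note that `a**2` on the plain list would raise a `TypeError`; element-wise arithmetic is a feature of the numpy array, not of the list.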

In [22]:
# Pandas DataFrame.
my_df = pd.DataFrame(data=my_array, columns=['my_array'])
print(type(my_df))
my_df

<class 'pandas.core.frame.DataFrame'>


|   | my_array |
|---|----------|
| 0 | 1.0 |
| 1 | 2.0 |
| 2 | 3.0 |
| 3 | 4.0 |


In [23]:
my_df['output'] = output
my_df

|   | my_array | output |
|---|----------|--------|
| 0 | 1.0 | 1.0 |
| 1 | 2.0 | 4.0 |
| 2 | 3.0 | 9.0 |
| 3 | 4.0 | 16.0 |


In [24]:
my_df['output']

0     1.0
1     4.0
2     9.0
3    16.0
Name: output, dtype: float64

In [25]:
my_df2 = pd.DataFrame(data=np.asarray([my_array, output]).transpose(), columns=['my_array', 'output'])
my_df2

|   | my_array | output |
|---|----------|--------|
| 0 | 1.0 | 1.0 |
| 1 | 2.0 | 4.0 |
| 2 | 3.0 | 9.0 |
| 3 | 4.0 | 16.0 |
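Stacking the arrays and transposing works, but a dictionary of columns is often easier to read. The sketch below builds the same two-column DataFrame both ways and checks that they match. It assumes a Python and pandas version that preserve dictionary insertion order (Python 3.7 or later with a recent pandas).

```python
import numpy as np
import pandas as pd

my_array = np.array([1., 2., 3., 4.])
output = my_array**2

# One key per column; each value becomes that column's data.
df_from_dict = pd.DataFrame({'my_array': my_array, 'output': output})

# The stack-and-transpose construction used above, for comparison.
df_from_2d = pd.DataFrame(
    data=np.asarray([my_array, output]).transpose(),
    columns=['my_array', 'output'],
)

frames_match = df_from_dict.equals(df_from_2d)
print(frames_match)
```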


# 5. Read demo csv file into pandas DataFrame

In [26]:
# Read csv file into pandas DataFrame.
df = pd.read_csv('data/demo_data.csv')

In [27]:
# Display the first 5 rows of the DataFrame.
df.head()

|   | time, s | CPT, N | FB, m | ICP1, g | LC, N | LP1, m | PPT5858, kPa | PPT5859, kPa | rpm, rpm |
|---|---------|--------|-------|---------|-------|--------|--------------|--------------|----------|
| 0 | 0.732 | -8.193489 | 0.001727 | 0.002858 | -3.875657 | 0.0003 | -2.148362 | 1.676645 | 999.0 |
| 1 | 2.757 | -8.141563 | 0.001723 | 0.002965 | -3.836547 | 0.000302 | -2.148026 | 1.676952 | 999.0 |
| 2 | 3.761 | -8.106561 | 0.001715 | 0.00295 | -3.909718 | 0.000302 | -2.148149 | 1.678231 | 999.0 |
| 3 | 4.757 | -8.10264 | 0.001726 | 0.003036 | -3.950031 | 0.0003 | -2.14868 | 1.677248 | 999.0 |
| 4 | 5.762 | -8.132675 | 0.001726 | 0.002951 | -3.928496 | 0.000302 | -2.149088 | 1.677736 | 999.0 |


In [28]:
# Display DataFrame type.
type(df)

pandas.core.frame.DataFrame

In [29]:
# Check DataFrame shape.
df.shape

(1801, 9)
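The column labels in this file, such as `time, s`, contain commas, so they have to be quoted in the csv header for `pd.read_csv` to treat each one as a single name. The sketch below reads a tiny in-memory csv to show this without needing `data/demo_data.csv`. The three data rows are made-up values for illustration, not readings from the demo file.

```python
import io

import pandas as pd

# Hypothetical csv text; quoted header fields may contain commas.
csv_text = (
    '"time, s","PPT5858, kPa","rpm, rpm"\n'
    '0.5,-2.10,999.0\n'
    '1.5,-2.20,999.0\n'
    '2.5,-2.30,999.0\n'
)

# read_csv accepts any file-like object, not only a path.
demo = pd.read_csv(io.StringIO(csv_text))

print(list(demo.columns))
print(demo.shape)
print(demo.dtypes)
```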

# 6. Make a plot from DataFrame

In [31]:
# Make a plot using the Matplotlib pyplot API.
plt.figure(figsize=[9, 4])  # Make figure.
ax = plt.gca()  # Get current axis.
plt.plot(df['time, s'], df['PPT5858, kPa']) # Plot onto current axis.

# Set axis labels.
ax.set_xlabel('time (s)') 
ax.set_ylabel('PPT5858 (kPa)')

# Add gridlines.
ax.grid()


In [34]:
# Pandas built-in plotting (built on Matplotlib).
df.plot() 


<matplotlib.axes._subplots.AxesSubplot at 0xc8920b8>
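The pyplot calls in this section (`plt.figure`, `plt.gca`, `plt.plot`) can also be written with matplotlib's object-oriented interface, where `plt.subplots` returns the figure and axes directly. The sketch below uses synthetic data in place of the demo DataFrame and selects the non-interactive `Agg` backend so it runs outside a notebook. Both of those choices are assumptions for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit this line in the notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for df['time, s'] and a sensor column.
t = np.linspace(0.0, 10.0, 50)
p = np.sin(t)

# Create the figure and its axes in one call.
fig, ax = plt.subplots(figsize=(9, 4))

# Plot and label through the axes object.
(line,) = ax.plot(t, p)
ax.set_xlabel('time (s)')
ax.set_ylabel('signal')
ax.grid()
```

Holding on to `ax` explicitly avoids relying on whichever axes happens to be current, which helps once a notebook contains several figures.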

In [35]:
# Pandas built-in plotting. Specify keys for plotting.
df.plot(x='time, s', y=['PPT5858, kPa', 'PPT5859, kPa'])
plt.gca().grid()


# 7. Read fast data binary file

In [36]:
# Define a function to read a CGM binary file.
def read_binfile(filename):
    '''
    Reads binary file with standard format used at UC Davis CGM.
    Returns a list [sensor_ID, raw], where sensor_ID is
    a list of sensor Xdcr Location ID values and raw is a numpy
    array of data.
    '''
    
    # Open binary file for reading.
    with open(filename, 'rb') as f:
        line = f.readline().decode()  # Read the first line.
        
        # Read lines until excel config header.
        while line != '[excelconfig]\r\n':
            line = f.readline().decode()
        
        # Skip sensor header line.
        line = f.readline().decode()
        # Read sensor info into list(s).
        sensor_ID = []
        # Read first sensor config line.
        line = f.readline().decode()
        
        # Read lines until line only contains a line break.
        while line != '\r\n':
            # Append the 5th comma separated value to the sensor_ID list.
            sensor_ID.append(line.split(',')[4])
            line = f.readline().decode()
        
        # Read lines until data header.
        while line != '[data]\r\n':
            line = f.readline().decode()
            
        # Read binary data into numpy array.
        raw = np.fromfile(f, dtype='>f')
        raw = raw.astype('float')
        
        # Reshape array.
        N = len(sensor_ID)
        raw = np.reshape(raw, (raw.size // N, N))
        
        # Return a list of outputs.
        return [sensor_ID, raw]
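To see the layout that `read_binfile` walks through, it can help to write a tiny file in that layout and read it back. The sketch below is a round trip under stated assumptions: the header text, the column names on the sensor header line, and the sensor rows are all invented, and only the structure (an `[excelconfig]` block, a blank line, then `[data]` followed by big-endian 32-bit floats) mirrors what the function above expects. `write_demo_binfile` and `read_demo_binfile` are hypothetical helpers, and the reader uses `np.frombuffer` on the remaining bytes rather than `np.fromfile`.

```python
import os
import tempfile

import numpy as np


def write_demo_binfile(path, sensor_names, values):
    """Write a small file in the assumed text-header plus binary layout."""
    with open(path, 'wb') as f:
        f.write(b'[header]\r\n')
        f.write(b'demo file\r\n')
        f.write(b'[excelconfig]\r\n')
        # Sensor header line (column names here are invented).
        f.write(b'c0,c1,c2,c3,Xdcr Location ID,c5\r\n')
        for name in sensor_names:
            # The sensor ID sits in the 5th comma separated field.
            f.write('0,0,0,0,{},0\r\n'.format(name).encode())
        f.write(b'\r\n')
        f.write(b'[data]\r\n')
        # Row-major big-endian float32: one row per sample.
        f.write(np.asarray(values, dtype='>f4').tobytes())


def read_demo_binfile(path):
    """Read the file back, following the same steps as read_binfile."""
    with open(path, 'rb') as f:
        line = f.readline().decode()
        while line != '[excelconfig]\r\n':
            line = f.readline().decode()
        f.readline()  # Skip the sensor header line.
        sensor_ids = []
        line = f.readline().decode()
        while line != '\r\n':
            sensor_ids.append(line.split(',')[4])
            line = f.readline().decode()
        while line != '[data]\r\n':
            line = f.readline().decode()
        # Everything after the [data] line is binary.
        raw = np.frombuffer(f.read(), dtype='>f4').astype(float)
    return sensor_ids, raw.reshape(-1, len(sensor_ids))


demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
values = np.array([[1.0, 2.0], [3.5, -4.25], [0.0, 8.0]])
write_demo_binfile(demo_path, ['LC', 'Tbar'], values)

ids, data = read_demo_binfile(demo_path)
print(ids)
print(data.shape)
```

The reshape to `(samples, sensors)` only makes sense because the values were written one sample row at a time; a file stored sensor-by-sensor would need the transposed shape.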
        

In [37]:
print(np.fromfile.__doc__)

fromfile(file, dtype=float, count=-1, sep='')

    Construct an array from data in a text or binary file.

    A highly efficient way of reading binary data with a known data-type,
    as well as parsing simply formatted text files.  Data written using the
    `tofile` method can be read using this function.

    Parameters
    ----------
    file : file or str
        Open file object or filename.
    dtype : data-type
        Data type of the returned array.
        For binary files, it is used to determine the size and byte-order
        of the items in the file.
    count : int
        Number of items to read. ``-1`` means all items (i.e., the complete
        file).
    sep : str
        Separator between items if file is a text file.
        Empty ("") separator means the file should be treated as binary.
        Spaces (" ") in the separator match zero or more whitespace characters.
        A separator consisting only of spaces must match at least one
        whitespace.

    Se

In [38]:
# Read a binary file.
file_path = 'data/04262017@113846@131514@212.6rpm.bin'
[sensor_ID, raw] = read_binfile(file_path)

In [40]:
# Check shape of raw data array returned from read_binfile.
raw.shape

(256000L, 6L)

In [41]:
# Print the first row of raw.
raw[0]

array([ -4.35389951e-03,   1.37446392e+00,   4.27390633e+01,
         4.21561737e+01,  -1.34481573e+01,   2.36345935e+00])

In [42]:
# Construct DataFrame from numpy array (raw) and list (sensor_ID) read from binary file.
df2 = pd.DataFrame(data=raw, columns=sensor_ID)

In [44]:
# Show first 5 rows of DataFrame.
df2.head()

|   | ICP1 | FB | PPT5884_CENTER | PPT5884_LEFT | LC | Tbar |
|---|------|----|----------------|--------------|----|------|
| 0 | -0.004354 | 1.374464 | 42.739063 | 42.156174 | -13.448157 | 2.363459 |
| 1 | 0.008124 | 2.290807 | 42.746765 | 42.245983 | -13.400112 | 2.169969 |
| 2 | -0.00824 | 1.741001 | 42.696686 | 42.382549 | -13.379796 | 2.122214 |
| 3 | -0.008521 | 2.242998 | 42.690971 | 42.462929 | -13.364147 | 2.198787 |
| 4 | 0.028796 | 1.605542 | 42.728409 | 42.46315 | -13.394073 | 2.165852 |


# 8. Plot data from binary file

In [45]:
# Plot from constructed DataFrame.
df2.plot(y='Tbar')


<matplotlib.axes._subplots.AxesSubplot at 0xdee49b0>

In [46]:
df2.plot(y=['LC', 'Tbar'])


<matplotlib.axes._subplots.AxesSubplot at 0x118122e8>

In [47]:
# Plot using numpy arrays and offset data.
data = raw.transpose()
plt.figure(figsize=(9, 4))
plt.grid()
plt.plot(data[5] - data[5][0])
plt.plot(data[4] - data[4][0])


[<matplotlib.lines.Line2D at 0x13cbb668>]
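Subtracting the first sample, as in `data[5] - data[5][0]`, zeroes each trace at its starting value. With numpy broadcasting the same offset can be applied to every sensor column at once. A minimal sketch with a made-up 3 x 2 array standing in for `raw`:

```python
import numpy as np

# Hypothetical readings: rows are samples, columns are sensors.
raw = np.array([[10.0, -5.0],
                [11.0, -4.5],
                [12.5, -6.0]])

# raw[0] has one value per column; broadcasting subtracts it from every row.
zeroed = raw - raw[0]

# Column-by-column version, as in the cell above.
data = raw.transpose()
first_column_zeroed = data[0] - data[0][0]

print(zeroed)
```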

# 9. Store data from binary file in a dictionary and make a plot

In [48]:
# Construct dictionary with sensor_ID as keys and data arrays as values with dict comprehension.
data_dict = {key: data[idx] for idx, key in enumerate(sensor_ID)}

In [49]:
data_dict

{u'FB': array([ 1.37446392,  2.29080725,  1.74100113, ...,  1.70116019,
         1.69319201,  2.65734458]),
 u'ICP1': array([-0.0043539 ,  0.00812405, -0.00824009, ...,  0.03386421,
         0.02110533,  0.02222905]),
 u'LC': array([-13.44815731, -13.40011215, -13.37979603, ..., -13.86628628,
        -13.81165218, -13.90609455]),
 u'PPT5884_CENTER': array([ 42.73906326,  42.74676514,  42.69668579, ...,  65.87110138,
         65.88224792,  65.94219971]),
 u'PPT5884_LEFT': array([ 42.15617371,  42.24598312,  42.38254929, ...,  50.77403641,
         50.74829865,  50.80138397]),
 u'Tbar': array([ 2.36345935,  2.16996932,  2.12221432, ...,  0.65004313,
         0.78425115,  0.80236512])}
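The dict comprehension pairs each sensor name with the matching row of the transposed array. An equivalent spelling uses `zip`, since iterating over a 2-D numpy array yields its rows. The sketch below uses a small made-up array and two sensor names to show that both forms produce the same mapping.

```python
import numpy as np

# Hypothetical stand-ins for sensor_ID and raw.
sensor_ID = ['LC', 'Tbar']
raw = np.array([[1.0, 2.0],
                [3.0, 4.0],
                [5.0, 6.0]])
data = raw.transpose()  # One row per sensor.

# Index-based pairing, as in the cell above.
by_enumerate = {key: data[idx] for idx, key in enumerate(sensor_ID)}

# zip pairs each name with the corresponding row directly.
by_zip = dict(zip(sensor_ID, data))

print(by_zip['Tbar'])
```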

In [50]:
# Plot with seaborn styling.
import seaborn as sns

In [51]:
# Set seaborn plotting context.
sns.set_context(context='notebook', font_scale=1.2)

In [52]:
# Make a figure using data dictionary.
plt.figure(figsize=[9, 4])
ax = plt.gca()
x_attr = 'FB'
y_attr = 'Tbar'
plt.plot(data_dict[x_attr], data_dict[y_attr])
ax.set_ylabel(y_attr)
ax.set_xlabel(x_attr)


<matplotlib.text.Text at 0x16d703c8>

# 10. Widgets

In [53]:
# Imports.
import ipywidgets as widgets
from IPython.display import display
from random import randint

In [54]:
# Simple widget.
my_button = widgets.Button(description='My Button')
display(my_button)

In [55]:
my_button

Bonjour
Hi
Bonjour
Hi
Hi
Bonjour
Bonjour
Hi
Hi
Hola
Hi


In [56]:
def my_button_callback(button):  # The clicked Button instance is passed in.
    greeting = ('Hi', 'Hola', 'Bonjour')
    rand = randint(0, len(greeting)-1)
    print(greeting[rand])

my_button.on_click(my_button_callback)
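The callback picks a greeting by drawing a random index with `randint`. The standard library's `random.choice` does the same selection in one step. The sketch below shows that alternative without any widget, using a seeded `random.Random` instance so the sequence is repeatable. `pick_greeting` is a hypothetical helper, not part of the notebook.

```python
import random

greetings = ('Hi', 'Hola', 'Bonjour')


def pick_greeting(rng=random):
    """Return one greeting chosen uniformly at random."""
    return rng.choice(greetings)


# A seeded generator gives the same sequence on every run.
rng = random.Random(0)
picked = [pick_greeting(rng) for _ in range(5)]
print(picked)
```

Inside the button callback, the body would reduce to `print(pick_greeting())`.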
    

In [58]:
# Interact example
def myfunc(n):
    '''
    Prints the square of n.
    '''
    print(n**2)

widgets.interact(myfunc, n=(1, 20))

<function __main__.myfunc>

In [59]:
# Plotting widget with our dictionary example.
def update_plot(ax, data, y_attr):
    '''
    Updates plotted data.
    '''
    line = ax.lines[0]
    line.set_ydata(data[y_attr])
    ax.set_ylabel(y_attr)
    ax.relim()
    ax.autoscale()
    plt.draw()

In [60]:
# Initialize figure.
plt.figure(figsize=[9, 4])
ax = plt.gca()
y_attr = tuple(data_dict.keys())[0]
plt.plot(data_dict[y_attr])
ax.set_ylabel(y_attr)
ax.set_xlabel('Index')


<matplotlib.text.Text at 0x19014cc0>

In [61]:
# Widget interaction.
widgets.interact(
    update_plot, 
    ax=widgets.fixed(ax),
    data=widgets.fixed(data_dict),
    y_attr=list(data_dict.keys())
)

<function __main__.update_plot>
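What the widget does on each selection can be checked without `ipywidgets` by calling the update function directly. The sketch below repeats the core of `update_plot` on a two-entry dictionary of made-up data and confirms that the line, the y label, and the axis limits all follow the new key. It selects the non-interactive `Agg` backend and drops the `plt.draw()` call so it runs outside a notebook; both are assumptions for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit this line in the notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data with equal-length arrays, like data_dict.
demo_dict = {
    'A': np.array([0.0, 1.0, 2.0]),
    'B': np.array([10.0, 20.0, 30.0]),
}

fig, ax = plt.subplots()
ax.plot(demo_dict['A'])
ax.set_ylabel('A')


def update_plot(ax, data, y_attr):
    """Swap the y data of the first line and rescale the axes."""
    line = ax.lines[0]
    line.set_ydata(data[y_attr])
    ax.set_ylabel(y_attr)
    ax.relim()      # Recompute data limits from the updated line.
    ax.autoscale()  # Apply the new limits to the view.


# This is the call that interact makes when a new key is selected.
update_plot(ax, demo_dict, 'B')

y_after = ax.lines[0].get_ydata()
y_limits = ax.get_ylim()
print(ax.get_ylabel(), y_limits)
```

Because `set_ydata` keeps the existing x values, this approach relies on every array in the dictionary having the same length.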