# Overview of Jupyter Notebooks

A Jupyter notebook is composed of blocks of `Markdown` documentation or code referred to as cells. Each cell can be selected with a single click. As you progress through this notebook, select a code-containing cell and click the `Run` button on the top toolbar (or press `shift` + `[Enter]`) to execute that line or block of code. The `In [ ]` header to the left of each cell changes to `In [*]` while the code is executing, and then to a number indicating the order in which the cell was executed.

## Executing a Line of Code

Throughout this tutorial we will be using the built-in Python command `print()` to write the output of our example code. The `print()` command can be passed many different inputs to print, each separated by a comma `,`. Try running the following single line of code as a test:

In [0]:
print('Run this line of code to execute a simple print() command.')
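
As a small sketch of the comma-separated form mentioned above, the following passes several inputs to a single `print()` call. The variable names `count`, `price` and `line` are made up for illustration.

```python
# Passing several comma-separated inputs to a single print() call
count = 3
price = 2.5

# print() places a single space between the inputs by default
print('Items:', count, 'Price:', price)

# The same text built by hand, to show what print() writes out
line = ' '.join(['Items:', str(count), 'Price:', str(price)])
print(line)
```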

# Variables

Variables are used in programming languages to store various kinds of data. Python is a dynamically typed programming language, which means that any variable can hold any type of data without its contents being declared ahead of time. A variable name can be composed of letters, numbers and underscores `_`; keep in mind that in Python the variable name **must** begin with a letter or an underscore (never a number).
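
To illustrate dynamic typing, here is a minimal sketch in which one variable holds first a number and then a string. The name `value` is arbitrary, and the built-in `type()` is used only to show what the variable currently contains.

```python
# The same variable name can hold different types of data over time
value = 10
first_type = type(value).__name__
print(value, first_type)

# Reassigning to a string is allowed; no declaration is needed
value = 'ten'
second_type = type(value).__name__
print(value, second_type)
```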

## Numbers

Numbers are defined simply by value assignments (no quotes needed). See below for examples:



In [0]:
a = 1
b = 2.34
c = -5

print(a, b, c)

## Strings

Strings are defined by enclosing a series of characters (letters, numbers and/or symbols) within single quotes `'`; double quotes `"` work identically. See below for examples:

In [0]:
a = 'This is a string.'
b = 'This is a string with numbers 1, 2 and 3'

print(a, b)

Strings can be combined using the `+` operator. For example:

In [0]:
a = 'This is a string'
b = ' that has been combined.'

print(a + b)

Furthermore, part of a string can be replaced using the `replace(old, new)` method. For example:

In [0]:
a = 'This is a str.'

print(a.replace('str', 'string'))
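
One detail worth seeing in action: `replace()` returns a new string and leaves the original untouched. A short sketch, using the made-up names `greeting` and `updated`:

```python
greeting = 'Hello world'

# replace() returns a new string; it does not modify greeting itself
updated = greeting.replace('world', 'Python')

print(greeting)  # Hello world
print(updated)   # Hello Python
```

To keep the change, assign the result back to a variable as done with `updated` here.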

Try replacing parts of the string below:

In [0]:
a = 'This is a test string.'

# Replace 'This' with 'That'

## Lists

Lists are simply an ordered collection (or array) of items such as numbers and/or strings. Lists are defined by enclosing a series of items within square brackets `[]`, separated by commas `,`. For example:


In [0]:
a = [1, 2, 3, 'a', 'b', 'c']

print(a)

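Each individual item within a list can be accessed by an index value enclosed in brackets `[]`. Indexing starts at 0, and a slice `a[start:end]` includes the item at `start` but stops just before `end`. For example: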

In [0]:
a[0] # This is the 1st item in the list
a[1] # This is the 2nd item in the list

a[-1] # This is the last item in the list
a[-2] # This is the 2nd to last item in the list

a[0:3] # This is a sub-list containing the 1st, 2nd and 3rd items in the list
a[3:6] # This is a sub-list containing the 4th, 5th and 6th items in the list
a[3:] # By omitting the second index, Python assumes you want everything until the end of the list
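
The indexing expressions above do not print anything on their own, so here is a small sketch that prints the results. It uses a separate, made-up list called `letters` so that the list `a` is left unchanged.

```python
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]        # 1st item
last = letters[-1]        # last item
middle = letters[1:4]     # items at index 1, 2 and 3 (index 4 is excluded)
from_third = letters[2:]  # everything from index 2 to the end

print(first, last)   # a e
print(middle)        # ['b', 'c', 'd']
print(from_third)    # ['c', 'd', 'e']
```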

What is the expected output of the following commands?

In [0]:
# Example 1 
print(a[-2])

# Example 2
n = 3
print(a[:n])

How would I select the last 3 entries in the list?

In order to add items to a list, use the `append()` method. For example:

In [0]:
a.append('d') # This adds the string 'd' to my existing list

print(a)

In order to remove items from a list, use the `pop()` method with the index of the item you wish to remove. For example:

In [0]:
a.pop(0) # This removes the 1st item in the list

print(a)
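
It can also be useful to know that `pop()` hands back the item it removed. A minimal sketch with a separate, made-up list called `colors`:

```python
colors = ['red', 'green', 'blue']

# pop() removes the item at the given index and returns it
removed = colors.pop(1)

print(removed)  # green
print(colors)   # ['red', 'blue']
```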

How do you remove the last item from the list?

Lists can also be edited directly by assignment into indexed items. For example:

In [0]:
a[0] = 'first entry in list'

print(a)


## Dictionaries

Dictionaries are special Python variables composed of terms and their corresponding definitions (key-value pairs). Each term is either a number or string. Each definition can be either a number, string, list or new dictionary. Dictionaries are defined by enclosing key-value pairs in curly braces `{}`. For example:

In [0]:
a = {'key0': 'value0', 'key1': 1, 2: ['value2']}

Alternatively, for ease of readability, the following definition is identical to the one above:

In [0]:
a = {
    'key0': 'value0',
    'key1': 1,
    2: ['value2']
}

Which keys in the dictionary `a` are composed of strings? Numbers? Which values in the dictionary `a` are composed of strings? Numbers? What type of value is `a[2]`?

In order to access a dictionary value, simply "index" into that variable using the corresponding key. For example:

In [0]:
print(a['key0']) # This will return 'value0'
print(a['key1']) # This will return 1

As with lists, dictionaries can be edited by assignment into indexed items. For example:

In [0]:
a['key0'] = 'new value'
print(a['key0'])

If a dictionary assignment uses a key that does not already exist, the new key-value pair is added to the dictionary.

In [0]:
a['key3'] = 'This is a new key-value pair.'

print(a)

# Arithmetic Operators

Arithmetic operators are simple mathematical operators such as `+`, `-`, `/` and `*`. These can be used as expected in Python, similar to a basic calculator. In addition, two more operators are defined:

* `**`: exponential (e.g. `2 ** 2` is simply 2 squared)
* `%`: modulo / remainder (e.g. `5 % 2` is 1)
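
The operators listed above can be tried out with a quick sketch such as the following. The variable names are arbitrary.

```python
x = 7
y = 2

total = x + y       # 9
difference = x - y  # 5
product = x * y     # 14
quotient = x / y    # 3.5 (division always gives a decimal number)
power = x ** y      # 49
remainder = x % y   # 1

print(total, difference, product, quotient, power, remainder)
```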

What is the result of the following arithmetic?

In [0]:
((4 ** 2) + (1 + 7)) % 3 

# Comparison and Logical Operators

Comparison operators are used to test a binary condition (e.g. a test that results in either True or False). The following comparison operators are defined in Python:

* `>` or `<` : greater than or less than
* `>=` or `<=` : greater than or equal to, or less than or equal to
* `==` or `!=` : equal to or not equal to

What are the results of the following logical comparisons?

In [0]:
# Example 1
print(1 > 2)

# Example 2
print(5 != 6)

# Example 3
print('dog' == 'cat')
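
Comparisons can be combined using the Python keywords `and`, `or` and `not`. These are not shown in the cell above, so treat the following as a supplementary sketch; the variable names are made up for illustration.

```python
temperature = 25

is_warm = temperature > 20  # True
is_hot = temperature > 30   # False

both = is_warm and is_hot   # True only if both conditions are True
either = is_warm or is_hot  # True if at least one condition is True
not_hot = not is_hot        # flips True to False and vice versa

print(both, either, not_hot)  # False True True
```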

A special operator `in` can be used with lists or dictionaries in Python. The operator tests whether a given item is a member of the provided list or, for a dictionary, whether it is one of the dictionary's keys. For example:

In [0]:
a = [1, 2, 3, 4]

print(2 in a)
print(5 in a)
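
For dictionaries, `in` checks the keys rather than the values. A short sketch using a made-up dictionary called `ages`:

```python
ages = {'alice': 30, 'bob': 25}

has_alice = 'alice' in ages       # True: 'alice' is a key
has_30 = 30 in ages               # False: 30 is a value, not a key
has_30_value = 30 in ages.values()  # True: search the values explicitly

print(has_alice, has_30, has_30_value)
```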

## If-Then Statements

If-then statements are a method to control the logical flow of instructions in a programming language. The syntax is simple: if a statement is True, then do [ ... ], else do [ ... ]. For example:

In [0]:
if True:
    print('The statement was True')
else:
    print('The statement was False')

Of course, in the above statement, the condition `True` is hard-coded. In real life, we would use a comparison operator instead. For example:

In [0]:
a = 3
b = 2

if a > b:
    print('a is greater than b')

else:
    print('a is not greater than b')
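
Keep in mind that an `else` branch also runs when the two values are equal. When that case needs its own message, Python's `elif` keyword adds a further test between `if` and `else`. A minimal sketch, with made-up variable names:

```python
left = 2
right = 2

if left > right:
    result = 'left is greater than right'
elif left < right:
    result = 'right is greater than left'
else:
    # Reached only when neither comparison above is True
    result = 'left and right are equal'

print(result)  # left and right are equal
```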

How do I check if the number 3 is in the following list?

In [0]:
my_list = [1, 3, 4, 7, 8]

# What if-then statement would test if the number 3 is in my_list?

# Loops

Aside from `if-then` statements, loops are the other most common way to control the logical flow of a program. A loop is simply a mechanism to repeat certain commands in a predetermined way.

## for loop

A `for` loop is the most common type of looping mechanism. It simply loops a single variable over an iterable (e.g. a list or a dictionary). For example:

In [0]:
numbers = [1, 2, 3, 4]

for num in numbers:
    
    print(num)
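
Looping over a dictionary works the same way, with the loop variable taking each key in turn. A small sketch using a made-up dictionary called `scores`:

```python
scores = {'math': 90, 'art': 75}

visited = []
for subject in scores:
    # subject is the key; scores[subject] looks up the matching value
    print(subject, scores[subject])
    visited.append(subject)
```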

There is a special Python iterable that can be created using the built-in function `range`. The `range` function creates a temporary iterable from the specifications provided (start value, end value, step); the end value itself is not included. For example:

In [0]:
# Print the numbers 0, 1, 2, 3, 4
for i in range(5):
    print(i)
    
# Print the numbers 2, 3, 4
for i in range(2, 5):
    print(i)
    
# Print the numbers 0, 2, 4
for i in range(0, 5, 2):
    print(i)
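
To inspect what a `range` contains without writing a loop, it can be converted to a list. The sketch below also shows a negative step, which counts downwards; the variable names are arbitrary.

```python
evens = list(range(0, 10, 2))
countdown = list(range(5, 0, -1))

print(evens)      # [0, 2, 4, 6, 8]
print(countdown)  # [5, 4, 3, 2, 1]
```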

How would I loop over the following variables?

In [0]:
numbers = [1, 2, 3, 4, 5, 6]

# How do I loop over the first 3 values in the list?

# How do I loop over the last 2 values in the list?

# How do I replace the following loop with the range function instead of the list?
print('Example 3')
for num in numbers:
    print(num)

# Methods

Methods (usually called functions in Python when they are defined on their own like this) are a way of encapsulating code so that it can be reused ("called") at some point in the future. Methods are very useful both to keep track of small units of logic and to avoid copy-pasting code that is repeated throughout your program. Methods are defined using the `def` keyword followed by the name of the method. For example:

In [0]:
def my_method():
    
    numbers = [0, 1, 2, 3, 4, 5]
    for num in numbers:
        if (num % 2) == 0:
            print(num)

To run this method, simply write the name of the method followed by a set of parentheses `()`. What will the output of this method be?

In [0]:
my_method()

The method above performs the exact same set of instructions every time it is called. Often, however, the execution of a method depends on some input parameters. These parameters are defined by additional variable names placed within the parentheses `()` in the method definition. For example:

In [0]:
def find_odd_numbers(numbers):
    
    for num in numbers:
        if (num % 2) == 1:
            print(num)

To run this method, use the same approach as before, but be sure to give your method the input it needs to work!

In [0]:
numbers = [0, 1, 2, 3, 4, 5]
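
The methods above only print their results. As a supplementary sketch, a method can also hand a value back to the caller with the `return` keyword so that it can be stored in a variable. The method name `sum_even_numbers` is hypothetical and chosen only for this example.

```python
def sum_even_numbers(values):

    # Add up every even number found in the provided list
    total = 0
    for value in values:
        if (value % 2) == 0:
            total = total + value

    return total

result = sum_even_numbers([0, 1, 2, 3, 4, 5])
print(result)  # 6
```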

# Libraries and Packages

Many (very smart) teams of programmers have created their own methods to accomplish all sorts of interesting tasks. In fact, the Python ecosystem has grown tremendously in popularity in recent years thanks to the large number of high-quality libraries and packages maintained by developers around the world, especially for data science and machine learning. To reuse this code in your own project, import the appropriate package using the `import` command before calling a particular method. The functions in the package can then be accessed by typing the package name, followed by a period `.`, followed by the method of interest.
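
As a quick sketch of this pattern, the example below imports Python's standard `math` package (chosen here only for illustration) and calls two of its functions using the period notation.

```python
import math

# package name, then a period, then the function of interest
root = math.sqrt(16)
rounded_up = math.ceil(2.1)

print(root)        # 4.0
print(rounded_up)  # 3
```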

## Standard Python Packages

### os

The `os` library contains tools needed to interact with your underlying operating system including methods to create/delete files and folders. This will become a critical skill to master in data science where tasks such as organizing files can be quickly automated.


In [0]:
import os

# Create the folders in the following list
folders_to_create = ['dir1', 'dir2', 'dir3']
for folder in folders_to_create:
    print('Creating folder named: ' + folder)
    os.makedirs(folder, exist_ok=True)

In the above example, the `os` library is used to create a number of folders automatically. Note that if the full folder path is not specified, the folders will be created in the current working directory, i.e. the folder that Python was launched in.
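
To check which folder is current, or to create a folder at an explicit location, a sketch along the following lines can help. It works inside a temporary folder (via the standard `tempfile` package, an addition for this example) so that nothing is left behind, and `example_dir` is a made-up folder name.

```python
import os
import tempfile

# Show the folder that relative paths are resolved against
print('Current working directory:', os.getcwd())

with tempfile.TemporaryDirectory() as base:
    # Build a full path by joining the base folder and the new folder name
    target = os.path.join(base, 'example_dir')
    os.makedirs(target, exist_ok=True)
    created = os.path.isdir(target)
    print('Created:', target, created)

# The temporary folder and its contents are removed after the with block
still_exists = os.path.isdir(target)
print('Still exists:', still_exists)
```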

A useful hint is that typing a period after a package name followed by the `tab` key (for autocomplete) will trigger a dialog box of matching method names. After selecting a method and typing an open parenthesis, hitting `tab` again will bring up the docstring (i.e. hints for using that function). This way of exploring a new package is extraordinarily useful if you are uncertain about the available methods and/or their usage. Try this below (e.g. type `os.` then tab complete). Can you find the method needed to rename a file? Delete a directory?

In [0]:
# Type os. on the line below, then press tab to list the available methods

### glob

The `glob` library contains tools needed to search for files on your computer. Note that the return of a call to `glob.glob()` is a **list**. The syntax is as follows:


In [0]:
import glob

# Search for all files and folders in the current directory:
results = glob.glob('*')
print(results)

# Search for all files and folders in the current directory starting with 'dir'
results = glob.glob('dir*')
print(results)

# Search recursively (current directory and all subfolders) for files and folders starting with 'd':
results = glob.glob('**/d*', recursive=True)
print(results)

# Results is a Python list
for result in results:
    print(result)

How do you accomplish the following tasks?

In [0]:
# How do I find all folders beginning with 'dir'?

# How do I create a loop to print all the results?

# How do I rename all the results with '_new' appended to all the folders? (HINT: use os.rename())

# How do I delete all the folders with '_new' appended? (HINT: use os.removedirs())

## Scientific Libraries

In addition to many standard Python libraries, there are a number of very common scientific libraries that are used in data science.

### numpy

The NumPy library (pronounced like **NUM**ber + **PY**thon) contains methods for common mathematical operations. The library is especially optimized for working with matrices (remember that all images are matrices). The following lines of code will initialize a 2 x 2 matrix with random numbers between 0 and 1:


In [0]:
import numpy as np

random_matrix = np.random.rand(2, 2)
print(random_matrix)

Let us look at some basic NumPy methods:

In [0]:
# Define a new simple 2 x 2 matrix
a = np.array([
    [0, 1],
    [2, 3]
])

# Determine mean of values
print(np.mean(a))

# Determine max/min of values
print(np.min(a))
print(np.max(a))

# Determine the sum of values
print(np.sum(a))

# Determine the mean of columns (axis=0)
print(np.mean(a, axis=0))

# Determine the mean of rows (axis=1)
print(np.mean(a, axis=1))
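
Beyond summary statistics, a few other basics are worth a quick sketch: checking the shape of a matrix, indexing a single entry by `[row, column]`, element-wise arithmetic and transposing. The matrix is stored under the made-up name `m` so that `a` is left unchanged.

```python
import numpy as np

m = np.array([
    [0, 1],
    [2, 3]
])

shape = m.shape    # (rows, columns)
corner = m[1, 0]   # entry in the 2nd row, 1st column
doubled = m * 2    # arithmetic is applied to every entry
transposed = m.T   # rows and columns swapped

print(shape)       # (2, 2)
print(corner)      # 2
print(doubled)
print(transposed)
```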

Please refer to online tutorials for more on NumPy usage. The CS231n course at Stanford has a great overview (see this [link](http://cs231n.github.io/python-numpy-tutorial/#numpy)). Otherwise, feel free to use the tab autocomplete method to explore the different methods of this powerful library.

### pylab, scipy

To facilitate manipulation of images, we will introduce two more libraries, `pylab` and `scipy`. First, let us download an image from the web for testing purposes:


In [0]:
!wget -O digit.png https://cdn-ak.f.st-hatena.com/images/fotolife/w/walkingmask/20160827/20160827030230.png

Now let's load and view the image.

In [0]:
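import pylab

try:
    from scipy.misc import imread  # removed in SciPy 1.2 and later
except ImportError:
    from matplotlib.pyplot import imread  # reader from matplotlib (the library behind pylab)

digit = imread('digit.png')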

# Note that digit is just a simple NumPy matrix
print(digit.shape)
print(np.mean(digit))
print(np.sum(digit))

# Use pylab to view the image
pylab.imshow(digit)
pylab.axis('off')
pylab.show()