## Starting with Python: An Interactive Notebook

*Author: JT Cho ([jonathan.t.cho@gmail.com](mailto:jonathan.t.cho@gmail.com))*

This is an interactive notebook designed to showcase the key features and nuances of **Python 2.x**. 
If you are unfamiliar with how Python works, be sure to read or skim through the following sections!

This content is adapted from Chapter 2 of Joel Grus's "Data Science from Scratch" and "[Learn X in Y minutes](https://learnxinyminutes.com/docs/python/)."

### Using this Notebook
For code cells (marked with `In [ ]`), select the cell and press <kbd>Shift</kbd> + <kbd>Enter</kbd> to evaluate it!

### The Basics

Syntax, arithmetic, logic operations, variables.

In [None]:
# This is a single-line comment. Documenting your code is important!

In [None]:
# You can do typical math with numbers...

print 3 * 7      # 21
print 1 + 1      # 2
print 4.2 + 5.8  # 10.0

In [None]:
# Division is a little different.
# In Python 2, dividing two integers with / rounds the result down to an
# integer (floor division), so -7 / 2 is -4, not -3.

print 10 / 3    # 3

In [None]:
# If either operand is a floating point number, / preserves the fractional part.
# The // operator is the floor division operator: it always rounds down,
# even if the arguments are floats (in which case the result is still a float).

print 7.2 / 3   # 2.4    (floating point)
print 7.2 // 3  # 2.0    (floored, but still a float)

In [None]:
# It is recommended that you use the following import to make the / operator
# always perform floating point division. It replaces the default division
# behavior with the new-and-improved division operator from Python 3.
# Note that evaluating this cell will cause the entire notebook to use the
# redefined /.
from __future__ import division

print 5 / 2     # 2.5

In [None]:
# You can also do some other fancy operations.

print 4 % 2     # 0 (modulus/remainder)
print 2 ** 3    # 8 (exponent)
print (1 + 2)*3 # 9 (grouping with parentheses)

In [None]:
# Booleans are represented as `True` or `False` (capitalized).
# The Boolean operators `and`, `or`, and `not` are lowercase. All of these
# names are case sensitive.

print True and False # False
print True or True   # True
print not True       # False

In [None]:
# All of your standard comparison operations are here, too.

print 2 == 3    # False
print 7 != 4    # True
print 6 < 2     # False
print 2 <= 2    # True
print 13 > 1    # True
print 12 >= 15  # False

In [None]:
# You can chain comparisons.

print 1 < 2 < 3 # True
print 4 < 2 < 3 # False

In [None]:
# Variable definitions are simple. Notice that variables have no declared type.
# Python is dynamically typed, so a single variable `x` could have an
# integer, a string, etc. assigned to it!

x = 2
y = 3
print x + y     # 5

In [None]:
# Note that because Python is dynamically typed, we can change the type
# of a variable by reassigning it (something you cannot do in, say, Java).
# For instance, we can easily change x from an `int` to a `float`
# (or even a `str`)! Keep this in mind when programming in Python,
# as it can potentially lead to bugs!
print type(x)   # <type 'int'>
x = 2.5
print type(x)   # <type 'float'>
print x + y     # 5.5
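
The warning about bugs can be made concrete. In the sketch below, `double_it` is a hypothetical helper that is not part of the original notebook. Because `*` means repetition for strings, passing the wrong type gives a wrong answer rather than an error. Results are stored in variables instead of printed, so the block should behave the same under Python 2 and Python 3.

```python
def double_it(value):
    # `*` is multiplication for numbers but repetition for strings.
    return value * 2

from_number = double_it(4)       # 8
from_string = double_it("4")     # "44": no error, just a surprising result

# An explicit conversion makes the intended type clear.
converted = double_it(int("4"))  # 8
```

Checking or converting types at the boundary of a function is one common way to catch this kind of mistake early.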

### Types and Functions
Strings, lists, sets, dicts, functions, and lambdas.

In [None]:
# Strings are delimited by single or double quotation marks (the quotes 
# must match).

single_quotes = 'cis700'
double_quotes = "rocks"

In [None]:
# Use len(str) to compute the length of a string.

len("foo")      # 3

In [None]:
# Backslashes `\` are used to escape special characters.

tab_string = "\t"
len(tab_string) # 1

In [None]:
# You can create raw strings using `r""` to treat the backslash as a 
# regular character.

not_tab_string = r"\t"
len(not_tab_string)    # 2

In [None]:
# You can create multiline strings using triple double quotes.

multi_line_string = """First line
  second line with space
\tthird line with tab
"""
print multi_line_string

In [None]:
# You can format simple strings using % or .format().

print "%s has gone missing!" % ("Will Byers") 
print "His friends are %s, %s, and %s." % ("Mike", "Dustin", "Lucas") 

print "They live in {}, {}.".format("Hawkins", "Indiana")
print "The chief of police's name is {firstName} {lastName}.".format(firstName="Jim", lastName="Hopper")

In [None]:
# Lists in Python are kind of like Java/C arrays with extra functionality.
# Since Python is dynamically typed, a list is not restricted to containing
# a single type like in Java/C!

integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

# Use len(list) to compute the length of a list.
print len(integer_list)  # 3

# Use `sum(list)` to compute the sum of a list of numeric types.
print sum(integer_list)  # 6

In [None]:
# You can use `range(x)` to generate the list [0, 1, ..., x-1].
# Square brackets [] are used to access the n'th element of a list.
# Lists in Python are 0-indexed (the first element is at index 0).

x = range(10)
print x         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print x[0]      # 0
print x[1]      # 1

# You can index "backwards". The -1'th element is the last element in the 
# list.
print x[-1]     # 9
print x[-2]     # 8

# Lists are mutable. You can change their contents!
x[0] = -1
print x[0]      # -1

In [None]:
# You can use [:] notation to slice a list.

x = range(10)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print x[:3]     # [0, 1, 2]
print x[3:]     # [3, 4, 5, 6, 7, 8, 9]
print x[1:4]    # [1, 2, 3]
print x[1:-1]   # [1, 2, 3, 4, 5, 6, 7, 8]
print x[:]      # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] (copy)

In [None]:
# Use the `in` operator to check list membership.

3 in range(10)  # True

In [None]:
# Use `.extend()` to concatenate lists (modifies the original list).

x = [1, 2, 3]
x.extend([4, 5, 6])
print x         # [1, 2, 3, 4, 5, 6]

# Use + to concatenate lists without modifying the originals.
# It returns a new list.

y = [1, 2, 3]
z = y + [4, 5, 6]
print z         # [1, 2, 3, 4, 5, 6]
print y         # [1, 2, 3]

# Use `.append()` to add a single element to the end of a list.
y.append(4)
print y         # [1, 2, 3, 4]

# You can "unpack" a list into named variables. 
# Note that you need to have the same number of variables on the left as 
# elements in the list.
x, y = [2, 4]
print x         # 2
print y         # 4

# It is convention to use _ as a variable name for variables you want to 
# discard.
_, important = ["foo", "bar"]
print important # "bar"

In [None]:
# Strings are sequences of characters, so indexing, slicing, and
# concatenation work just as they do for lists. Unlike lists, though,
# strings are immutable: hello[0] = "J" raises a TypeError.
hello = "Hello, world!"
print hello[0]     # "H"
print hello[0:5]   # "Hello"
print hello[-1]    # "!"
print hello + " Python is awesome!" # "Hello, world! Python is awesome!"

In [None]:
# We can also concatenate strings and numbers by using the `str`
# function to convert the number to a string.
e = 2.71828
print 'The value of e is approximately ' + str(e)

In [None]:
# Use `def` to define functions.
# Notice how there are no curly braces like in C-style languages.
# Python uses indentation to delimit blocks of code.

def foo():
    print "Hello, world!"
    
foo()                # "Hello, world!"

def add(x, y):
    print x + y

add(2, 4)            # 6

# You can even define functions with default values.

def addWithDefault(x, y=0):
    print x + y
    
addWithDefault(2)    # 2

In [None]:
# Functions are first-class citizens. You can assign a function to a 
# variable and pass it around.

def foo(x):
    return x + 20

mystery = foo

print mystery(80)                   # 100

def apply_to_one(f):
    return f(1)

print apply_to_one(mystery)         # 21

# You can create lambda functions in Python.

print apply_to_one(lambda x: x + 9) # 10

# Instead of assigning a lambda to a variable, you should just use `def`. 
def double(x): return 2 * x

print apply_to_one(double)          # 2

In [None]:
# Tuples are like lists, except you can't modify their contents. 
# They're immutable.
# Trying to change a tuple's contents will raise a TypeError.
# You can specify a tuple by using parentheses (or nothing) 
# instead of square brackets.

x = (1, 2, 3)

try:
    x[0] = 1
except TypeError: 
    print "You can't change the value of a tuple!"
    
# You can use a tuple to return multiple values from functions.

def sum_and_product(x, y):
    return (x + y), (x * y)

print sum_and_product(2, 3)     # (5, 6)
s, p = sum_and_product(4, 5)
print s                         # 9
print p                         # 20

# Tuples and lists can be used for multiple assignments.
x, y = 1, 2
x, y = y, x    # The Pythonic way of swapping variables.

In [None]:
# Dictionaries are Python's key-value data structures,
# and are often used to store structured data.

empty_dict = {}      # Pythonic
empty_dict2 = dict() # Less Pythonic
grades = { "JT": 80, "Trevin": 95, "Brady": 100 }

# You can access the value for a particular key using square brackets [].

print grades["JT"]              # 80

# Python raises a KeyError if you try to access a non-existent key.

try:
    print grades["Zachary"]
except KeyError:
    print "No grade for Zachary!"
    
# Existence of a key can be checked using `in`.

print "Brady" in grades         # True

# Use the .get(key, default_val) dictionary function to safely return a default
# value instead of raising an exception.

print grades.get("Zachary", 0)  # 0
print grades.get("Nobody")      # None

# Use the .keys() and .values() methods to get a list of keys/values.
print grades.keys()
print grades.values()

# Note that using `in` on the key list is much slower than the dict's `in`. 
# Python's dict is an implementation of a hash table. Hashing typically 
# performs much better than a linear list search.

print "Trevin" in grades        # True | Fast in
print "Trevin" in grades.keys() # True | Slow in

In [None]:
# Dictionary keys must be hashable, which for built-in types means immutable.
# You cannot use a list as a key.
# Multi-part keys can be achieved using tuples (or another method,
# e.g. by turning the key into a string).

try:
    bad_dict = {["Will", "21532"]:"Student", ["Jim", "92104"]:"Police Chief"}
except TypeError:
    print "Can't use a mutable type as a key!"

ok_dict = {("Will", "21532"): "Student", ("Jim", "92104"): "Police Chief"}

In [None]:
# If you want your dictionary to have the same initial value for non-existent keys,
# use a defaultdict (e.g. word counts)

from collections import defaultdict

document = ["The", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"]

word_counts = defaultdict(int)      # int() produces 0
for word in document:
    word_counts[word.lower()] += 1
    
coords = defaultdict(lambda: [0,0]) # Default value will be [0,0].

In [None]:
# Sets represent a collection of distinct elements.

cities = set()
cities.add("San Francisco")
cities.add("Philadelphia")
cities.add("New York")
cities.add("Philadelphia")
print cities              # set(['San Francisco', 'New York', 'Philadelphia'])

names = set(["Alice", "Bob", "Charlie", "Alice", "David", "Emily"])
print names               # set(['Charlie', 'Bob', 'Emily', 'Alice', 'David'])

# Checking set membership is a lot faster than list membership.
print "Alice" in names    # True
print "George" in names   # False

### Control Flow and Exceptions

In [None]:
# Python supports if/else blocks.

if 1 > 2:
    message = "if only 1 > 2"
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails, you may use an else (but don't have to)"
    
print message   # when all else fails, you may use an else (but don't have to)

# Python also supports a ternary if-then-else on a single line.
parity = "even" if 10 % 2 == 0 else "odd"
print parity    # even

In [None]:
# Python also has `while` loops.
print "While loop:"
x = 0
while x < 10:
    print x, "is less than 10"
    x += 1

In [None]:
# We'll typically use `for` and `in`, though.
print "\nFor-in loop"
for x in xrange(10):
    print x, "is less than 10"
    
# You can also use `continue` and `break`.
print "\nFor-in loop with break/continue"
for x in xrange(10):
    if x == 3:
        continue    # go immediately to the next iteration
    if x == 5:
        break       # quit the loop entirely
    print x

### Truthiness
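
This heading has no content in the source. As a minimal sketch of what truthiness usually refers to in Python: any value can be used where a Boolean is expected, and a small set of values count as false. The example values and variable names below are illustrative and are not taken from the original notebook.

```python
# Values that Python treats as false in a Boolean context.
falsy_values = [False, None, 0, 0.0, "", [], (), {}, set()]
all_falsy = not any(falsy_values)    # True: none of them is truthy

# Non-zero numbers and non-empty strings/containers are truthy.
truthy_values = [True, 1, -1, "0", [0], {"a": 1}]
all_truthy = all(truthy_values)      # True

# `and` and `or` return one of their operands, not necessarily a bool.
name = ""
display_name = name or "anonymous"   # "anonymous", since "" is falsy
first_char = name and name[0]        # "" (short-circuits, so no IndexError)
```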

### List Comprehensions
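
This heading is also empty in the source. The sketch below assumes the section was meant to cover the usual comprehension forms: transforming a list, filtering it, and building dicts or sets the same way. Dict and set comprehensions assume Python 2.7 or later, and the variable names are illustrative.

```python
numbers = range(5)                                   # 0, 1, 2, 3, 4

# Transform and/or filter a list in a single expression.
squares = [x * x for x in numbers]                   # [0, 1, 4, 9, 16]
even_numbers = [x for x in numbers if x % 2 == 0]    # [0, 2, 4]

# The same syntax with braces builds dicts and sets.
square_dict = {x: x * x for x in numbers}            # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
square_set = {x * x for x in [1, -1]}                # a set containing only 1

# Multiple `for` clauses behave like nested loops; later clauses can
# use variables bound by earlier ones.
increasing_pairs = [(x, y)
                    for x in range(3)
                    for y in range(x + 1, 3)]        # [(0, 1), (0, 2), (1, 2)]
```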

### Generators and Iterators

In [None]:
# Lists can grow very large. Using `range(1000000)` creates an actual list 
# of a million elements!
for i in range(1000000):
    pass

# In Python, a generator is something you can iterate over (e.g. with `for`)
# but whose values are produced only as needed, or lazily.
# In Python 2.x, `xrange(n)` is not strictly a generator, but it is lazy in
# the same way: it produces the sequence 0, 1, 2, ..., n-1 without ever
# building the whole list.
for i in xrange(1000000):
    pass

# A simple lazy generator can also be built as follows:

def lazy_range(n):
    """A lazy version of range."""
    i = 0
    while i < n:
        yield i
        i += 1

# Calling lazy_range(n) does not run the while loop right away; it returns
# a generator object. Each step of the iteration runs the function up to
# the next `yield`, which hands back one element and pauses until the
# following element is requested.
        
print "lazy_range(10)"
for i in lazy_range(10):
    print i

# Generators are useful: they even allow us to create infinite sequences!
# However, you cannot rewind a generator. To get previous elements in a
# sequence, you'd have to create a new generator and iterate through that.

# You can also create generators using a generator expression, i.e. a
# comprehension wrapped in parentheses:

def lazy_evens_below(n):
    return (i for i in xrange(n) if i % 2 == 0)

print "lazy_evens_below(20)"
for i in lazy_evens_below(20):
    print i
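
The two claims above, that generators can be infinite and cannot be rewound, can be sketched as follows. `natural_numbers` is a hypothetical helper introduced here for illustration; `itertools.islice` and the built-in `next` are standard library features. The block avoids `print` and `xrange` so that it should run unchanged on Python 2.6+ and Python 3.

```python
from itertools import islice

def natural_numbers():
    """Yield 0, 1, 2, ... forever."""
    n = 0
    while True:
        yield n
        n += 1

# islice takes a finite prefix of an infinite generator.
first_five = list(islice(natural_numbers(), 5))   # [0, 1, 2, 3, 4]

# A generator only moves forward...
gen = natural_numbers()
a = next(gen)                                     # 0
b = next(gen)                                     # 1

# ...so starting over means creating a new one.
fresh = natural_numbers()
c = next(fresh)                                   # 0

# A finite generator is exhausted after one pass.
evens = (i for i in range(6) if i % 2 == 0)
first_pass = list(evens)                          # [0, 2, 4]
second_pass = list(evens)                         # []
```

Never call `list()` directly on an infinite generator such as `natural_numbers()`; it would not terminate.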

### Functional Programming Tools

In [None]:
# Python supports partial application of functions (loosely related to
# currying) through `functools.partial`.

def exp(base, power):
    return base ** power

from functools import partial
two_to_the = partial(exp, 2)   # equivalent to defining a function: `return exp(2, power)`
print two_to_the(3)            # 8

# You can specify argument names to target later arguments.
square_of = partial(exp, power=2)
print square_of(3)             # 9

# Try to avoid partially applying arguments in the middle of functions, as it can get messy!

In [None]:
# Python also supports `map`, `reduce`, and `filter`.

def double(x):
    return 2 * x

xs = [1, 2, 3, 4]
twice_xs = map(double, xs)           # equivalent to `[double(x) for x in xs]`
print twice_xs                       # [2, 4, 6, 8]

list_doubler = partial(map, double)
print list_doubler(xs)               # [2, 4, 6, 8]

# You can use `map` with multiple-argument functions if you provide multiple lists.
def mult(x, y): return x * y

pairwise_prods = map(mult, [1, 2, 3], [4, 5, 6])
print pairwise_prods                 # [4, 10, 18]

def dot_prod(u, v):
    return sum(map(mult, u, v))

print dot_prod([1, 2, 3], [4, 5, 6]) # 32

In [None]:
# You can use `filter` as shorthand for a list-comprehension `if`.

def is_even(x):
    return x % 2 == 0

xs = [1, 2, 3, 4]
x_evens = filter(is_even, xs)        # equivalent to [x for x in xs if is_even(x)]
print x_evens                        # [2, 4]

list_evener = partial(filter, is_even)
print list_evener(xs)                # [2, 4]

In [None]:
# `reduce` combines the first two elements of a list using some two-argument
# function, then combines that result with the third element, and so forth,
# until a single value remains.

xs = [1, 2, 3, 4]
def mult(x, y): return x * y

x_product = reduce(mult, xs)
print x_product             # 24

list_product = partial(reduce, mult)
print list_product(xs)      # 24
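
One detail the description of `reduce` leaves open is what happens when the list is empty. As a hedged sketch: `reduce` accepts an optional third argument that serves as the starting value, and without it an empty sequence raises a `TypeError`. The variable names are illustrative. The explicit import is unnecessary in Python 2, where `reduce` is a built-in, but it is harmless there and required on Python 3.

```python
from functools import reduce  # available in Python 2.6+ and Python 3

def mult(x, y):
    return x * y

# The third argument is the starting value for the accumulation.
product_with_start = reduce(mult, [2, 3, 4], 10)   # 10 * 2 * 3 * 4 = 240

# With a starting value, an empty list simply returns that value.
product_of_empty = reduce(mult, [], 1)             # 1

# Without one, there is nothing to return for an empty list.
try:
    reduce(mult, [])
    empty_without_start_failed = False
except TypeError:
    empty_without_start_failed = True
```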