# Basics of the Python Programming Language

## Defining Functions

In [140]:
# Defining a function
def add_numbers(x, y):
    return x + y

add_numbers(1,2)

3

Python functions can have default values for parameters. However, all parameters with default values must come after the parameters without defaults in the function definition.

In [141]:
def multiply_numbers(x, y, z=None):
    if z is None:
        return x * y
    else:
        return x * y * z

print(multiply_numbers(1, 2))
print(multiply_numbers(1, 2, 3))

2
6
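
Default values also combine with keyword arguments. The following is a minimal sketch; `scale_numbers` and its parameters are hypothetical names used only for illustration.

```python
# Hypothetical helper: parameters with defaults follow the required ones.
def scale_numbers(x, y, factor=1, offset=0):
    return (x + y) * factor + offset

# Positional call: both defaults are used.
basic = scale_numbers(1, 2)
# Keyword arguments let us override one default and leave the other alone.
shifted = scale_numbers(1, 2, offset=10)
scaled = scale_numbers(1, 2, factor=3)

print(basic, shifted, scaled)
```

Naming the argument at the call site means `offset` can be set without also having to supply `factor`.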


In Python, we can also assign functions to variables:

In [142]:
def subtract_numbers(x, y):
    return x - y

s = subtract_numbers
s(4,2)

2

## Working With Strings

In [143]:
firstName="Chico"
lastName="Rodriguez"

print(firstName + ' ' + lastName) # String concatenation
print(firstName * 3) # Repeat a string
print('Chico' in firstName) # Check if substring is in string

Chico Rodriguez
ChicoChicoChico
True


In [144]:
# Separating substrings according to a separator
firstName = "Chico Miguel Eduardo de la Manteca Rodriguez".split(' ')[0]
lastName = "Chico Miguel Eduardo de la Manteca Rodriguez".split(' ')[-1]
print(firstName)
print(lastName)

Chico
Rodriguez


We can use placeholders in Python strings which can then be populated using the `format` function.

In [145]:
salesRecord = {'price': 3.14, 
               'numItems': 42,
               'person': 'Maria'
              }

salesStatement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(salesStatement.format(salesRecord['person'],
                           salesRecord['numItems'],
                           salesRecord['price'],
                           salesRecord['numItems']*salesRecord['price']))

Maria bought 42 item(s) at a price of 3.14 each for a total of 131.88
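
The same statement can also be sketched with named placeholders, which avoids relying on argument order. This assumes the `salesRecord` keys shown above; the `total` placeholder and the `message` variable are added here purely for illustration.

```python
salesRecord = {'price': 3.14, 'numItems': 42, 'person': 'Maria'}

# Named placeholders are filled from keyword arguments.
salesStatement = ('{person} bought {numItems} item(s) at a price of '
                  '{price} each for a total of {total:.2f}')

# ** unpacks the dictionary into keyword arguments; total is passed separately.
message = salesStatement.format(
    total=salesRecord['numItems'] * salesRecord['price'],
    **salesRecord)
print(message)
```

The `:.2f` format specifier rounds the computed total to two decimal places for display.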


## Working with Dictionaries
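Dictionaries are collections of key:value pairs in which each key is unique. Since Python 3.7 they preserve insertion order; in earlier versions, such as the one used to produce the output below, the iteration order is arbitrary.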

In [146]:
ourDict = {"Chico Rodriguez": "chico.rodriguez@neolimon.org", "Miguel Cervantes": "m.cervantes@amazingwriters.org"}
ourDict['Chico Rodriguez']

'chico.rodriguez@neolimon.org'

In [147]:
# We can add elements to a dictionary using bracket notation
ourDict["Gabo Marquez"] = "g.marquez@amazingwriters.org"
ourDict['Gabo Marquez']

'g.marquez@amazingwriters.org'

In [148]:
# We can iterate over keys and print out values
for name in ourDict:
    print(ourDict[name])

# We can iterate over values
for email in ourDict.values():
    print(email)
    
# Finally, we can also iterate over values and keys using the items function
for name, email in ourDict.items():
    print(name)
    print(email)

m.cervantes@amazingwriters.org
chico.rodriguez@neolimon.org
g.marquez@amazingwriters.org
m.cervantes@amazingwriters.org
chico.rodriguez@neolimon.org
g.marquez@amazingwriters.org
Miguel Cervantes
m.cervantes@amazingwriters.org
Chico Rodriguez
chico.rodriguez@neolimon.org
Gabo Marquez
g.marquez@amazingwriters.org


## Working with .csv Files

In [149]:
import csv

%precision 2

# Creating a list with dictionary elements for mpg
with open('mpg.csv') as csvFile:
    mpg = list(csv.DictReader(csvFile))
    
# Checking the first two elements of the list
print(mpg[:2])

# Number of elements
len(mpg)

# Looking at keys
mpg[0].keys()

[{'': '1', 'trans': 'auto(l5)', 'drv': 'f', 'year': '1999', 'class': 'compact', 'cty': '18', 'model': 'a4', 'hwy': '29', 'displ': '1.8', 'manufacturer': 'audi', 'cyl': '4', 'fl': 'p'}, {'': '2', 'trans': 'manual(m5)', 'drv': 'f', 'year': '1999', 'class': 'compact', 'cty': '21', 'model': 'a4', 'hwy': '29', 'displ': '1.8', 'manufacturer': 'audi', 'cyl': '4', 'fl': 'p'}]


dict_keys(['', 'trans', 'drv', 'year', 'class', 'cty', 'model', 'hwy', 'displ', 'manufacturer', 'cyl', 'fl'])
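
Note that `csv.DictReader` returns every field as a string, which is why numeric columns need converting with `float()` before any arithmetic. A self-contained sketch using an in-memory CSV in place of `mpg.csv` (the two rows reuse values from the output above, with only a few columns kept):

```python
import csv
import io

# In-memory stand-in for mpg.csv with a subset of its columns.
csvText = "manufacturer,model,cty,hwy\naudi,a4,18,29\naudi,a4,21,29\n"

rows = list(csv.DictReader(io.StringIO(csvText)))

# Every field comes back as a string, so convert before doing arithmetic.
print(type(rows[0]['cty']))
avgCty = sum(float(d['cty']) for d in rows) / len(rows)
print(avgCty)
```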

Now, suppose that we want to find the average (arithmetic mean) city mpg across all cars in the given dataset. 

In [150]:
sum(float(d['cty']) for d in mpg) / len(mpg)

16.86

Similarly, we can find the average highway mpg across all cars in the dataset:

In [151]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

23.44

Now, suppose that we want to see the average city mpg grouped by the number of cylinders a car has.

In [152]:
# Gathering unique levels for the number of cylinders
cylinders = set(d['cyl'] for d in mpg)
cylinders

{'4', '5', '6', '8'}

In [153]:
CtyMpgByCyl = [] # An empty list to store our results

for c in cylinders:
    sumMpg = 0
    cylTypeCount = 0
    
    # Iterating through each dictionary element, seeking
    # a match for the number of cylinders.
    for d in mpg:
        if d['cyl'] == c:
            sumMpg += float(d['cty'])
            cylTypeCount += 1
            
    # Appending the result for the current cylinder 
    # to the results list
    CtyMpgByCyl.append((c, sumMpg / cylTypeCount))

# Let's look at the results
print(CtyMpgByCyl)

# Sort and display by lowest to highest number of 
# cylinders (the 0th element)
CtyMpgByCyl.sort(key=lambda x: x[0])
CtyMpgByCyl

[('4', 21.012345679012345), ('5', 20.5), ('6', 16.21518987341772), ('8', 12.571428571428571)]


[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]

Let's say we want to look at average highway mpg according to vehicle class. Just like in the previous example, we iterate over each vehicle class, then iterate over each dictionary.

In [154]:
vehicleClass = set(d['class'] for d in mpg)

HwyMpgByClass = []

for t in vehicleClass:
    sumMpg = 0
    vClassCount = 0
    
    for d in mpg:
        if d['class'] == t:
            sumMpg += float(d['hwy'])
            vClassCount += 1
            
    HwyMpgByClass.append((t, sumMpg / vClassCount))

# This time, we will sort by lowest to  highest 
# average highway mpg (element 1)
HwyMpgByClass.sort(key=lambda x: x[1])
HwyMpgByClass

[('pickup', 16.88),
 ('suv', 18.13),
 ('minivan', 22.36),
 ('2seater', 24.80),
 ('midsize', 27.29),
 ('subcompact', 28.14),
 ('compact', 28.30)]
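
The nested loops above scan the whole dataset once per group. A single-pass alternative can be sketched with a dictionary of running totals. The rows below are a small made-up sample standing in for `mpg`, not values from `mpg.csv`, and `totals` and `hwyByClass` are illustrative names.

```python
from collections import defaultdict

# Made-up sample rows shaped like csv.DictReader output (all values are strings).
sample = [
    {'class': 'compact', 'hwy': '30'},
    {'class': 'compact', 'hwy': '28'},
    {'class': 'suv', 'hwy': '18'},
    {'class': 'suv', 'hwy': '20'},
    {'class': 'pickup', 'hwy': '17'},
]

# Map each class to [running sum, count] in a single pass over the rows.
totals = defaultdict(lambda: [0.0, 0])
for d in sample:
    totals[d['class']][0] += float(d['hwy'])
    totals[d['class']][1] += 1

# Turn the totals into (class, average) tuples sorted by average.
hwyByClass = sorted(((k, s / n) for k, (s, n) in totals.items()),
                    key=lambda x: x[1])
print(hwyByClass)
```

For a dataset this small the difference is negligible, but the single pass avoids rereading every row for every group.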

## Basics of Dates and Times in Python

One of the most common legacy methods for storing date and time is based on the offset from the epoch, which is January 1, 1970. If interested, read more about it [here](https://en.wikipedia.org/wiki/Unix_time).

In Python, we can get the number of seconds elapsed since the epoch using the `time` module.

In [155]:
import datetime as dt
import time as tm

In [156]:
tm.time()

1481429150.71

In [157]:
# This gives us the time stamp in the format:
# (year, month, day, hour, minute, second, microsecond)
dtNow = dt.datetime.fromtimestamp(tm.time())
dtNow

datetime.datetime(2016, 12, 10, 22, 5, 50, 805925)

We can do simple operations on dates using time deltas. For example, we can create a time delta of 100 days and then use it for subtraction and comparisons with date and datetime objects. This is commonly used in data science, particularly when building sliding windows.

In [158]:
dtNow.year, dtNow.month, dtNow.day, dtNow.hour, dtNow.minute, dtNow.second

(2016, 12, 10, 22, 5, 50)

In [159]:
delta = dt.timedelta(days = 100)
delta

datetime.timedelta(100)

In [160]:
today = dt.date.today()

In [161]:
today - delta

datetime.date(2016, 9, 1)

In [162]:
today > today - delta

True
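
As a rough sketch of the sliding-window idea, the snippet below keeps only the records whose date falls within the 100 days before a reference date. The reference date is fixed so the example is reproducible, and the records are made up for illustration.

```python
import datetime as dt

# Fixed reference date (an assumption; in practice this could be dt.date.today()).
reference = dt.date(2016, 12, 10)
window = dt.timedelta(days=100)

# Hypothetical (date, value) records.
records = [
    (dt.date(2016, 6, 1), 5),
    (dt.date(2016, 9, 1), 7),
    (dt.date(2016, 11, 20), 3),
]

# Keep only the records that are no older than the start of the window.
windowStart = reference - window
recent = [(d, v) for d, v in records if d >= windowStart]
print(windowStart, recent)
```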

## Objects and map()

### Objects

In Python, we declare classes using the keyword `class`, and anything indented below it is within the scope of the class. We do not need to declare instance variables ahead of time; we simply begin using them. Class variables, however, can be declared in the class body, and these are shared across all instances of the class.

To define a method, we write it as we would a function. To give a method access to the instance on which it is being invoked, we include `self` as the first parameter in the method signature (a naming convention, not a keyword). Instance variables, which unlike class variables are not shared across instances, are created and referenced by prefixing the name with `self.`.

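Something very important to keep in mind when programming in Python is that Python does not have access modifiers such as `private` or `protected`: every attribute and method of an object is accessible from outside the class.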

The following is an example of a `Person` object definition and instance.

In [163]:
class Person:
    department = 'Hand Wavy Mathematics and Physics'
    
    def set_name(self, new_name):
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location
        
chico = Person()
chico.set_name("Chico Rodriguez")
chico.set_location("Eivissa")

print("{} is currently in {}.".format(chico.name, chico.location))

Chico Rodriguez is currently in Eivissa.
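
A short sketch of the difference between class variables and instance variables, using a pared-down `Person`; the second instance and the replacement department name are made up for illustration.

```python
class Person:
    department = 'Hand Wavy Mathematics and Physics'  # class variable

    def set_name(self, new_name):
        self.name = new_name  # instance variable, created on first assignment

chico = Person()
maria = Person()
chico.set_name('Chico')
maria.set_name('Maria')

# Each instance has its own name, but both see the same department.
print(chico.name, maria.name)
print(chico.department == maria.department)

# Rebinding the class variable is visible through every instance.
Person.department = 'Applied Hand Waving'
print(chico.department, maria.department)
```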


### map() function

The `map()` function is one of the basic building blocks of functional programming in Python. You can read more about `map()` [here](https://docs.python.org/3/library/functions.html#map).

The documentation is as follows:

---------------------------------

`map(function, iterable, ...)`

Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted. For cases where the function inputs are already arranged into argument tuples, see itertools.starmap().

---------------------------------

Here is an example of an application of the `map()` function: suppose we have two lists of prices for the same items from two different stores, and we want to find the minimum we would have to pay if we bought each item from whichever store sells it more cheaply.

In [164]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2)

print(cheapest) # Prints the map object itself (a lazy iterator), not its values

# To print each value, we iterate over the object
for itemPrice in cheapest:
    print(itemPrice)

<map object at 0x7f356c5c73c8>
9.0
11.0
12.34
2.01
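
Because `map()` returns a lazy iterator, its results can be consumed only once. A minimal sketch, reusing the price lists from above (`firstPass` and `secondPass` are illustrative names):

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2)

# list() consumes the iterator and stores the results.
firstPass = list(cheapest)
# The iterator is now exhausted, so a second pass yields nothing.
secondPass = list(cheapest)

print(firstPass)
print(secondPass)
```

If the values are needed more than once, storing them in a list right away is usually the simplest option.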


## Lambdas and List Comprehensions

Lambda functions are anonymous functions: they have no name, consist of a single expression, and are typically passed to higher-order functions.

In [165]:
# The syntax for lambda functions in Python is quite simple
# lambda [list of arguments] : [single expression] 
my_function = lambda a, b, c : a + b

my_function(1, 2, 3)

3
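
Lambdas are most often seen as arguments to other functions. A small sketch with `sorted()` and `map()`; the `people` list is made up for illustration.

```python
# Hypothetical (name, age) pairs.
people = [('Maria', 31), ('Chico', 27), ('Gabo', 45)]

# key= expects a function; the lambda picks the age out of each tuple.
byAge = sorted(people, key=lambda person: person[1])

# A lambda passed to map() to transform each element.
names = list(map(lambda person: person[0].upper(), byAge))

print(byAge)
print(names)
```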

We can use list comprehensions to make sequences from sequences. For example, suppose we would like to make a list of even numbers.

In [166]:
# We can iterate over the first 1000 numbers using a for loop
myList = [] 
for number in range(0,1000):
    if number % 2 == 0:
        myList.append(number)

print(myList)

# We can also create the list using a list comprehension
myList = [number for number in range(0, 1000) if number % 2 == 0]
print(myList)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420,

## NumPy

NumPy is very useful because it lets us work efficiently with arrays and matrices in Python. First, we begin by importing NumPy.

In [167]:
import numpy as np

### Creating Arrays

We can create arrays from existing lists, or we can directly input a list as an argument.

In [168]:
numList = [1, 2, 3]
x = np.array(numList)

x

array([1, 2, 3])

In [169]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

We can also make multidimensional arrays by passing a list of lists. For example, we can create a two by three array by:

In [170]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

In [171]:
m.shape

(2, 3)

For the `arange()` function we pass the start, stop, and the step size of the interval.

In [172]:
n = np.arange(0, 30, 2)
n

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

Suppose that we wanted to convert this array into a three by five array. We can do so using the `reshape()` function.

In [173]:
n = n.reshape(3, 5)
n

array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])

The `linspace()` function is similar to MATLAB's in that we pass the start, end, and the number of points we want the function to generate in that interval.

In [174]:
l = np.linspace(0, 5, 9)
l

array([ 0.   ,  0.625,  1.25 ,  1.875,  2.5  ,  3.125,  3.75 ,  4.375,  5.   ])

To construct a 3x3 array, we can also use the `resize()` method, which, unlike `reshape()`, modifies the array in place:

In [175]:
l.resize(3, 3)
l

array([[ 0.   ,  0.625,  1.25 ],
       [ 1.875,  2.5  ,  3.125],
       [ 3.75 ,  4.375,  5.   ]])
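
A brief sketch of how the two methods differ: `reshape()` returns a reshaped array and leaves the original's shape alone, while `resize()` changes the array in place and returns `None`. The variable names here are illustrative.

```python
import numpy as np

a = np.linspace(0, 5, 9)

# reshape() returns a reshaped array; a itself keeps its shape.
b = a.reshape(3, 3)
print(a.shape, b.shape)

# resize() modifies the array in place and returns None.
c = np.linspace(0, 5, 9)
result = c.resize(3, 3)
print(c.shape, result)
```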

NumPy also includes function shortcuts for common arrays. 

In [176]:
np.zeros((2,3))

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [177]:
np.ones((3,2))

array([[ 1.,  1.],
       [ 1.,  1.],
       [ 1.,  1.]])

In [178]:
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

We can also make a diagonal square matrix with a passed list, as well as two types of repeated arrays (note the function and the output):

In [179]:
np.diag(y)

array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])

In [180]:
np.array([1, 2, 3] * 3)

array([1, 2, 3, 1, 2, 3, 1, 2, 3])

In [181]:
np.repeat([1, 2, 3], 3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

We can also combine arrays and stack them vertically and horizontally:

In [182]:
p = np.ones((2, 3), int)
p

array([[1, 1, 1],
       [1, 1, 1]])

In [183]:
# Stacking vertically
np.vstack([p, 2*p])

array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2]])

In [184]:
# Stacking horizontally
np.hstack([p, 2*p])

array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2]])

### Array Operations

In [185]:
print('x: ', x)
print('y: ', y)

x:  [1 2 3]
y:  [4 5 6]


In [186]:
# Component-wise addition
x + y

array([5, 7, 9])

In [187]:
# Component-wise multiplication
x * y

array([ 4, 10, 18])

In [188]:
# Component-wise division 
x / y

array([ 0.25,  0.4 ,  0.5 ])

In [189]:
# Component-wise subtraction
x - y 

array([-3, -3, -3])

In [190]:
# Squaring each element
x**2

array([1, 4, 9])

In [191]:
# Dot product
x.dot(y)

32

In [192]:
# Creating a 2x3 array of y and its squares
z = np.array([y, y**2])
z

array([[ 4,  5,  6],
       [16, 25, 36]])

In [193]:
z.shape

(2, 3)

In [194]:
# We can take the transpose using '.T'
z.T

array([[ 4, 16],
       [ 5, 25],
       [ 6, 36]])

In [195]:
z.T.shape

(3, 2)

In [196]:
# Using dtype, we can see the type of data the array holds
z.dtype

dtype('int64')

In [197]:
# And we can cast an array to a different type using astype
z = z.astype('f')
z.dtype

dtype('float32')

NumPy provides us with other useful math functions.

In [198]:
a = np.array([-3, -1, 0, 1, 1, 3, 6])
a.sum()

7

In [199]:
a.max()

6

In [200]:
a.min()

-3

In [201]:
# Mean of array
a.mean()

1.00

In [202]:
# Standard deviation 
a.std()

2.67

In [203]:
# To find the index of maximum
a.argmax()

6

In [204]:
# To find the index of minimum
a.argmin()

0

### Indexing and Slicing

In [205]:
a = np.arange(13)**2
a

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144])

In [206]:
a[0], a[4], a[0:3]

(0, 16, array([0, 1, 4]))

In [207]:
# Let's look at the elements at indices 1 through 4 (the stop index 5 is excluded)
a[1:5]

array([ 1,  4,  9, 16])

In [208]:
# We can also work backwards. To look at the last 
# four elements
a[-4:]

array([ 81, 100, 121, 144])

In [209]:
# Here, we look at fifth from last towards the beginning of 
# the array with a step size of negative two
a[-5::-2]

array([64, 36, 16,  4,  0])
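
The general slice form is `a[start:stop:step]`, where `stop` is exclusive and any of the three parts may be omitted. A short sketch on the same array of squares; the variable names are illustrative.

```python
import numpy as np

a = np.arange(13)**2

firstThree = a[:3]        # indices 0, 1, 2
middle = a[1:5]           # indices 1 through 4; index 5 is excluded
everyThird = a[::3]       # indices 0, 3, 6, 9, 12
reversedTail = a[-5::-2]  # start at index 8 and step backwards by 2

print(firstThree, middle, everyThird, reversedTail)
```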

Now, let's see how this extends to two-dimensional arrays.

In [210]:
m = np.arange(36)
m.resize((6, 6))
m

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

In [211]:
m[2, 2]

14

In [212]:
# Fourth row (index 3), fourth to sixth columns (indices 3 to 5)
m[3, 3:6]

array([21, 22, 23])

In [213]:
# First 2 rows, all columns except the last
m[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

In [214]:
# Every second element from the last row
m[-1, ::2]

array([30, 32, 34])

In [215]:
# We can also use bracket notation with conditionals
m[m > 30]

array([31, 32, 33, 34, 35])

In [216]:
# We can also take those elements and assign new values
m[m > 30] = 30
m

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

**Note:** We need to be careful when working with NumPy arrays. Let's see what happens when we take a slice of an array and alter it.

In [217]:
# Now, let's create a new array that is a slice of m
m2 = m[:3, :3]
m2

array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])

In [218]:
# Let us assign 0 to all of these elements
m2[:] = 0
m2

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [219]:
m

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

We notice that by altering the slice of the array we altered the original array. 

**Note:** Slicing a NumPy array returns a view of the original data, not a copy.

To get an independent copy of the data, we must use the `copy()` method.

In [220]:
mCopy = m.copy()
mCopy

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

In [221]:
mCopy[:] = 12
print(mCopy, '\n')
print(m)

[[12 12 12 12 12 12]
 [12 12 12 12 12 12]
 [12 12 12 12 12 12]
 [12 12 12 12 12 12]
 [12 12 12 12 12 12]
 [12 12 12 12 12 12]] 

[[ 0  0  0  3  4  5]
 [ 0  0  0  9 10 11]
 [ 0  0  0 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 30 30 30 30 30]]
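
One way to check whether two arrays share data is `np.shares_memory()`. A small sketch on a fresh 6x6 array; `view` and `independent` are illustrative names.

```python
import numpy as np

m = np.arange(36).reshape(6, 6)

view = m[:3, :3]                # basic slicing returns a view
independent = m[:3, :3].copy()  # copy() allocates new data

print(np.shares_memory(m, view))
print(np.shares_memory(m, independent))

# Writing through the view changes m; writing to the copy does not.
view[:] = 0
independent[:] = 12
print(m[1, 1], m[0, 3])
```

Here `m[1, 1]` lies inside the sliced region, so it is overwritten through the view, while `m[0, 3]` lies outside it and keeps its value.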


### Iterating Over Arrays

In [222]:
randArray = np.random.randint(0, 10, (4,3))
randArray

array([[8, 6, 1],
       [8, 5, 1],
       [5, 5, 5],
       [6, 3, 9]])

In [223]:
# We can iterate by row
for row in randArray:
    print(row)

[8 6 1]
[8 5 1]
[5 5 5]
[6 3 9]


In [224]:
# We can iterate by row index (using len for all rows)
for i in range(len(randArray)):
    print(randArray[i])

[8 6 1]
[8 5 1]
[5 5 5]
[6 3 9]


In [225]:
for i, row in enumerate(randArray):
    print('row', i, 'is', row)

row 0 is [8 6 1]
row 1 is [8 5 1]
row 2 is [5 5 5]
row 3 is [6 3 9]


In [226]:
randArray2 = randArray**2
randArray2

array([[64, 36,  1],
       [64, 25,  1],
       [25, 25, 25],
       [36,  9, 81]])

In [227]:
# If we want to iterate over two arrays, we can use zip
for i, j in zip(randArray, randArray2):
    print(i, '+', j, '=', i + j)

[8 6 1] + [64 36  1] = [72 42  2]
[8 5 1] + [64 25  1] = [72 30  2]
[5 5 5] + [25 25 25] = [30 30 30]
[6 3 9] + [36  9 81] = [42 12 90]
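
The loop above is useful for illustration, but since NumPy arithmetic is component-wise, the same sums can be obtained without an explicit loop. A sketch with a fixed array in place of the random one:

```python
import numpy as np

arr = np.array([[8, 6, 1],
                [8, 5, 1],
                [5, 5, 5],
                [6, 3, 9]])
arrSquared = arr**2

# Row-by-row sums collected with zip.
looped = np.array([i + j for i, j in zip(arr, arrSquared)])

# The same result from a single vectorized expression.
vectorized = arr + arrSquared

print(np.array_equal(looped, vectorized))
```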
