## Comprehensions



Concisely form a new list by iterating over an existing collection, optionally filtering its elements



In [1]:
[expr for val in collection if condition]

This is equivalent to



In [1]:
result = []
for val in collection:
    if condition:
        result.append(expr) 

### List comprehensions



In [1]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

**Let’s say I give you a list saved in a variable: a = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]. Write one line of Python that takes this list a and makes a new list that has only the even elements of this list in it.**



In [5]:
# [expr for val in collection if condition]
a = [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[x for x in a if x % 2 == 0]

[4, 16, 36, 64, 100]

### Dict comprehensions



In [8]:
lengths = {x.upper(): len(x) for x in strings if len(x) > 2}
lengths

{'BAT': 3, 'CAR': 3, 'DOVE': 4, 'PYTHON': 6}

Combine `enumerate` with a dict comprehension to map each value to its index



In [10]:
print(strings)
loc_mapping = {val : index for index, val in enumerate(strings)}
loc_mapping

['a', 'as', 'bat', 'car', 'dove', 'python']


{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

### Nested comprehensions



Suppose we have a list of lists containing some English and Spanish names



In [11]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]
all_data

[['John', 'Emily', 'Michael', 'Mary', 'Steven'],
 ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

Let's get a single list containing all names with two or more e's in them



In [12]:
names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)
print(names_of_interest)

['Steven']


A more concise way is a single nested comprehension



In [13]:
result = [name for names in all_data for name in names
            if name.count('e') >= 2]
result

['Steven']
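
The `for` clauses in a nested comprehension appear in the same order as the equivalent nested loops, with any filter at the end. A small sketch of that ordering, using a made-up list of tuples (`some_tuples` is not part of the notes above):

```python
# Illustrative data, not taken from the notes above
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Nested loops: outer loop first, inner loop second
flattened_loop = []
for tup in some_tuples:
    for x in tup:
        flattened_loop.append(x)

# Same result as a comprehension: the for clauses keep the same order
flattened = [x for tup in some_tuples for x in tup]

print(flattened)
print(flattened == flattened_loop)
```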

## A little more on functions



Let's take the following function



In [15]:
def func():
    a = []
    for i in range(5):
        a.append(i)
    return a
    
a = []
func()
a

[]

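Local versus global scope of variables: the `a` assigned inside `func` is local to the function, so the global `a` stays empty. To use the result, capture the return value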



In [17]:
def func():
    a = []
    for i in range(5):
        a.append(i)
    return a
b = func()
b

[0, 1, 2, 3, 4]
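
The two cells above can be condensed into one sketch. The names `fill_local`, `values`, and `returned` are chosen here for illustration only:

```python
values = []  # global name

def fill_local():
    # This assignment creates a new local name that shadows the global one
    values = []
    for i in range(5):
        values.append(i)
    return values

returned = fill_local()

print(values)    # the global list is untouched
print(returned)  # the local list is only reachable through the return value
```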

### Functions are objects!



-   Python functions can be passed as arguments to other functions
-   Python also supports so-called anonymous (lambda) functions



In [19]:
def short_func(x):
    return x*2

In [20]:
equiv_func = lambda x: x * 2
# outside_func(1, lambda x: x*2)  # outside_func is not defined in this notebook
short_func

<function __main__.short_func(x)>
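
Because functions are ordinary objects, they can be bound to other names and handed to other functions. A minimal sketch, where `apply_to` is a hypothetical helper that is not defined in the notes above:

```python
def short_func(x):
    return x * 2

def apply_to(value, func):
    # Hypothetical helper: call whatever function object is passed in
    return func(value)

alias = short_func                      # a second name for the same function object
doubled = apply_to(1, short_func)       # pass a named function
tripled = apply_to(1, lambda x: x * 3)  # pass an anonymous function

print(alias is short_func, doubled, tripled)
```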

**Let's write a function that can either multiply the contents of a list by 2 or divide them by 3**



In [28]:
def myfunc(lst, opfunc):
    if opfunc.startswith("mult"):
        new_lst = [x*2 for x in lst]
    elif opfunc.startswith("div"):
        new_lst = [x/3 for x in lst]
    else:
        print("Wrong option")
        return None
    return new_lst
#myfunc([1,2,3], "msdadft")

def myfunc2(lst, opfunc):
    new_lst = [opfunc(x) for x in lst]
    return new_lst
#myfunc2([1,2,3], lambda x: 2*x)
list(map(lambda x: x*2, [1,2,3]))

[2, 4, 6]

## Generators



-   What if we wanted to create a sequence and operate on each of its elements?
-   List comprehension?
-   A different and memory-efficient way of creating an iterable is a *generator*
-   Evaluates expression on demand (lazily)



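In [1]:
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2

In [2]:
gen = squares()
# No code in squares() runs until values are requested from the generator
for x in gen:
    print(x, end=' ')

Generating squares from 1 to 100
1 4 9 16 25 36 49 64 81 100 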
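
Laziness is easiest to see by pulling values out one at a time with the built-in `next`. A small sketch, using a generator named `lazy_squares` that is made up for this illustration:

```python
def lazy_squares(n=3):
    # Nothing in this body runs until the first value is requested
    for i in range(1, n + 1):
        yield i ** 2

gen = lazy_squares()
first = next(gen)     # runs the body up to the first yield
rest = list(gen)      # consumes the remaining values
leftover = list(gen)  # the generator is now exhausted

print(first, rest, leftover)
```

A generator can be consumed only once; to iterate again, call the generator function again.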


### Generator expressions



-   Another way to create generators
-   Similar to list comprehensions

What is the difference between these two?



In [3]:
doubles = [2 * n for n in range(50)]
sum(doubles)

2450

In [4]:
doubles = (2 * n for n in range(50))
sum(doubles)

2450
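
Both cells give the same sum, but the list comprehension builds all of its elements in memory first, while the generator expression produces them one at a time and can be consumed only once. A rough sketch of the difference (the names `as_list` and `as_gen` are chosen here, and exact byte counts vary by Python version):

```python
import sys

as_list = [2 * n for n in range(50)]
as_gen = (2 * n for n in range(50))

# The list stores references to every element; the generator keeps only its state
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

first_total = sum(as_gen)   # consumes the generator
second_total = sum(as_gen)  # nothing is left to sum

print(list_size, gen_size)
print(first_total, second_total, sum(as_list))
```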

**Find the maximum of $\frac{n-1}{n+1}$ for $n \in [0,100]$**
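
One possible approach, sketched with a generator expression and assuming $n$ ranges over the integers 0 through 100 (the names `best_value` and `best_n` are chosen here):

```python
# Assumes n takes the integer values 0, 1, ..., 100
best_value = max((n - 1) / (n + 1) for n in range(101))

# Also find which n attains the maximum
best_n = max(range(101), key=lambda n: (n - 1) / (n + 1))

print(best_n, best_value)
```

Since $\frac{n-1}{n+1} = 1 - \frac{2}{n+1}$ increases with $n$, the maximum is reached at the right end of the interval.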



## Modules



-   Modular programming is the process of breaking a large programming task into smaller subtasks
-   Several advantages in modular programming
    -   Simplicity
    -   Maintainability
    -   Reusability
    -   Scoping



### Importing modules



In [29]:
import sys
sys.getallocatedblocks?

### Alternate forms of importing modules



In [1]:
from module_name import module_function

In [1]:
import module_with_long_name as mod

**Write a module that contains a function which multiplies its argument by 2 and import it in your interpreter**
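
A possible sketch of this exercise. The module name `doubler` and the function name `double` are made up here, and the file is written to a temporary directory from Python only so that the example is self-contained; in practice you would save `doubler.py` next to your notebook with a text editor:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny module to disk (hypothetical name: doubler.py)
module_dir = tempfile.mkdtemp()
Path(module_dir, "doubler.py").write_text(
    "def double(x):\n"
    "    return x * 2\n"
)

# Make the directory importable, then import the module as usual
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import doubler

print(doubler.double(21))
```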



## NumPy



> One of the most important foundational packages for numerical computing in Python

-   ndarray, an efficient multidimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities
-   Mathematical functions for fast operations on entire arrays of data without having to write loops
-   Tools for reading/writing array data to disk and working with memory-mapped files
-   Linear algebra, random number generation, and Fourier transform capabilities
-   A C API for connecting NumPy with libraries written in C, C++, or FORTRAN



### NumPy performance



In [1]:
import numpy as np
arr = np.arange(1000000)
lst = list(range(1000000))

Now let's multiply each sequence by 2



In [2]:
%time for _ in range(10): arr2 = arr * 2

CPU times: user 11 ms, sys: 5.65 ms, total: 16.7 ms
Wall time: 15.4 ms


In [3]:
%time for _ in range(10): lst2 = [x * 2 for x in lst]

CPU times: user 494 ms, sys: 139 ms, total: 633 ms
Wall time: 632 ms


### ndarray: A multidimensional array object



-   Fast and flexible container for multidimensional datasets
-   Supports mathematical operations on whole arrays with the same syntax as operations on scalars



In [4]:
data = np.random.randn(2, 3)
data

array([[-0.22687392, -2.71859018,  0.33721233],
       [ 0.10923863,  0.94027854,  0.75212259]])

In [7]:
data * data

array([[0.05147177, 7.39073255, 0.11371216],
       [0.01193308, 0.88412372, 0.56568839]])
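
Arithmetic between an array and a scalar, or between two arrays of the same shape, is applied element by element. A small sketch with a fixed array (`fixed` is a name chosen here so the numbers are reproducible, unlike the random `data` above):

```python
import numpy as np

fixed = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

scaled = fixed * 10      # every element multiplied by the scalar
summed = fixed + fixed   # element-wise addition of two arrays
squared = fixed * fixed  # element-wise product, not a matrix product

print(scaled)
print(summed)
print(squared)
```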

### Creating ndarrays



The easiest way to create an array is to use the `array` function



In [8]:
data = [3, 4, 2, 2]
arr = np.array(data)
arr

array([3, 4, 2, 2])

Nested lists of equal length are converted to multidimensional arrays



In [9]:
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

**Use a list comprehension to create a 4x2 array of 1's**



In [16]:
import numpy as np
np.array([[1 for j in range(2)] for i in range(4)])

array([[1, 1],
       [1, 1],
       [1, 1],
       [1, 1]])

### ndarray attributes



In [15]:
arr2.ndim

2

In [16]:
arr2.shape

(2, 4)

In [17]:
arr2.dtype

dtype('int64')

## Homework



1.  Write a function that calculates the frequency of each number in a list of numbers.
2.  Calculate the sum of only those numbers in a list that are less than 237.

Sample list:



In [1]:
numbers = [
    386, 462, 47, 418, 907, 344, 236, 375, 823, 566, 597, 978, 328, 615, 953, 345,
    399, 162, 758, 219, 918, 237, 412, 566, 826, 248, 866, 950, 626, 949, 687, 217,
    815, 67, 104, 58, 512, 24, 892, 894, 767, 553, 81, 379, 843, 831, 445, 742, 717,
    958, 743, 527
]