# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
* [Comprehensions](#Comprehensions)
	* [Budget Example](#Budget-Example)


# Learning Objectives:

After completion of this module, learners should be able to:

* use comprehensions to replace complex nests of loops & conditionals

# Comprehensions

Various built-in data collections in Python can be constructed using *comprehensions*. A comprehension is a compact way to construct a data collection "all at once" rather than building it up incrementally within a loop.
As an example, consider the cell below, which defines a multi-line string, strips out the commas, and splits the text into a list of words.

In [None]:
the_words = '''We choose to go to the moon in this decade and do the other things, 
not because they are easy, but because they are hard, 
because that goal will serve to organize and measure the best of our energies and skills, 
because that challenge is one that we are willing to accept, one we are unwilling to postpone, 
and one which we intend to win.'''

wordlist=the_words.replace(',','').split()
print(wordlist)

Suppose we wish to construct a new list from `wordlist` that consists only of words that contain the letter `s`. To do this, we initialize an empty list and nest an `if` block within a `for` loop to build up the new list.

In [None]:
# Nested if block within for loop to build new list
newlist = []
for word in wordlist:
    if 's' in word:
        newlist.append(word)
newlist

In [None]:
# A list comprehension equivalent to the loop above
[word for word in wordlist if 's' in word]

The *list comprehension* replaces the `for` loop, the nested `if` block, and the explicit `append` calls used to build up `newlist`. The basic form is

```python
[expression for item in collection if condition]
```

where *`collection`* is some iterable data collection, *`condition`* is a boolean-valued expression that is evaluated for each `item` in *`collection`*, and *`expression`* is a computation that is applied to each `item` that satisfies *`condition`*. This general list comprehension is equivalent to a `for` loop:

```python
new_list = []
for item in collection:
    if condition:
        new_list.append(expression)
```

In [None]:
# expressions can be used to apply transformations
[word.upper() for word in wordlist if 's' in word]

Comprehensions are sometimes confusing for newcomers to Python. Part of their appeal, however, is that they transform a conditional block nested within a loop into an almost colloquial expression, i.e., they read almost like spoken English. Comprehensions are not restricted to lists, either.

* A *set comprehension* is a similar construction used to build up a set:

```python
new_set = {expression for item in collection if condition}
```

* A *dict comprehension* builds a dictionary in a similar way:

```python
new_dict = {key_expression:value_expression for item in collection if condition}
```

where *`key_expression`* and *`value_expression`*  depend on the `item` selected from *`collection`*.

In [None]:
# A set comprehension
{word for word in wordlist if 's' in word}

In [None]:
# A dict comprehension
{pos:word for pos, word in enumerate(wordlist) if 's' in word}
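The difference between the three forms is easiest to see side by side. The sketch below is illustrative only: it uses a short hypothetical word list (`sample_words`, not the speech above) so that it runs on its own and so that one matching word repeats.

```python
# Hypothetical sample data, chosen so that one matching word appears twice
sample_words = ['because', 'they', 'are', 'easy', 'because', 'skills']

# List comprehension: keeps the original order and any duplicates
s_list = [w for w in sample_words if 's' in w]

# Set comprehension: same filter, but duplicates collapse to one element
s_set = {w for w in sample_words if 's' in w}

# Dict comprehension: maps the position of each matching word to the word
s_dict = {pos: w for pos, w in enumerate(sample_words) if 's' in w}

print(s_list)       # ['because', 'easy', 'because', 'skills']
print(len(s_set))   # 3
print(s_dict)       # {0: 'because', 3: 'easy', 4: 'because', 5: 'skills'}
```

The list has four entries while the set has three, because `'because'` is stored only once in a set. The dict keeps both occurrences because their positions, used as keys, differ.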

## Budget Example

Looking back at our budget data structure from above, we can write a number of useful comprehensions. The comprehension below asks the same question as our full loop above: in what years did `edu` and `research` combined exceed 3% of the total federal budget?

In [None]:
federal=[3457079000., 3603059000., 3536951000., 3454647000., 3506089000., 3758577000., 3999467000.]
edu=[ 94169000.,  67584000.,  59605000.,  41882000.,  60917000., 104189000., 69939000.]
research=[11730000., 12434000., 12458000., 12479000., 12011000., 12271000., 12824000.]
soc=[706737000., 730811000., 773290000., 813551000., 850533000., 896294000., 944338000.]
defense=[666703000., 678064000., 650851000., 607795000., 577897000., 567703000., 586479000.]
years=[2010,2011,2012,2013,2014,2015,2016]

budget = []
for year, f, r, s, e, d in zip(years,federal,research,soc,edu,defense):
    # Each year is a dictionary
    budget.append({
        'year':year,
        'federal':f,
        'research':r,
        'soc':s,
        'edu':e,
        'defense':d
    })
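The loop that fills `budget` follows the same append-in-a-loop pattern, so it could also be written as a list comprehension. The sketch below is one possible rewrite, not the notebook's own code: `budget_alt` is a name introduced here for illustration, and the data lists are repeated so that the block runs on its own.

```python
federal = [3457079000., 3603059000., 3536951000., 3454647000., 3506089000., 3758577000., 3999467000.]
edu = [94169000., 67584000., 59605000., 41882000., 60917000., 104189000., 69939000.]
research = [11730000., 12434000., 12458000., 12479000., 12011000., 12271000., 12824000.]
soc = [706737000., 730811000., 773290000., 813551000., 850533000., 896294000., 944338000.]
defense = [666703000., 678064000., 650851000., 607795000., 577897000., 567703000., 586479000.]
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]

# One dictionary per year, built directly inside a list comprehension
budget_alt = [
    {'year': y, 'federal': f, 'research': r, 'soc': s, 'edu': e, 'defense': d}
    for y, f, r, s, e, d in zip(years, federal, research, soc, edu, defense)
]

print(len(budget_alt))   # 7
print(budget_alt[0])
```

Whether this reads better than the explicit loop is a matter of taste. With six fields per dictionary, the loop version is arguably just as clear.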

In [None]:
[year['year'] for year in budget if (year['edu']+year['research'])/year['federal'] > 0.03]

Using nested comprehensions, we can make a `set` of all Subfunction Titles that have exceeded 1% of the federal budget in at least one of the available years.

In [None]:
{subfunc for year in budget 
         for subfunc in (set(year.keys()) - {'year','federal'}) 
         if year[subfunc]/year['federal']>0.01}

The expression `set(year.keys()) - {'year','federal'}` takes the set of keys, removes `year` and `federal`, and returns a new set. The same result could be achieved by adding `and subfunc not in ['year','federal']` to the `if` clause. Note that in Python 3 `set(year.keys())` can be simplified to `year.keys()`: `keys()` returns a set-like view object that supports operators such as `-`, whereas in Python 2 it returns a list.
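A quick check of how the key view behaves under set difference in Python 3 is sketched below. It uses only the 2010 figures from the data above; `fy2010` and `subfuncs` are names introduced here for illustration.

```python
# The 2010 entry, written out by hand so the block runs on its own
fy2010 = {
    'year': 2010,
    'federal': 3457079000.,
    'research': 11730000.,
    'soc': 706737000.,
    'edu': 94169000.,
    'defense': 666703000.,
}

# Set difference applied to the key view itself, without calling set() first
subfuncs = fy2010.keys() - {'year', 'federal'}

print(type(subfuncs))     # the result of the difference is an ordinary set
print(sorted(subfuncs))   # ['defense', 'edu', 'research', 'soc']
```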

In [None]:
{subfunc for year in budget 
         for subfunc in year.keys() 
         if year[subfunc]/year['federal']>0.01 and subfunc not in ['year', 'federal']}

We can easily add more data to our budget data structure by preparing new dictionaries and inserting them into the list. Notice that not all of the dictionaries in the list have the same keys.

In [None]:
fy2000={
    "year":2000,
    'medicare':197113000.,
    'edu':30693000.,
    'federal':1788950000.,
    'research':6167000.,
    'soc':409423000.
}

budget.insert(0,fy2000)

fy2003={
    'year':2003,
    'federal':2159899000.,
    'defense':387136000.,
    'soc':474680000.
}

budget.insert(1,fy2003)
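Because the dictionaries no longer share the same keys, a comprehension that indexes a fixed key such as `year['edu']` would raise a `KeyError` on the 2003 entry. Two possible guards are sketched below on a two-entry stand-in list named `sample_budget` (a name introduced here). Treating a missing category as `0.0` is an assumption about how gaps should be handled, not something the data states.

```python
# Stand-in for the full list: the 2003 entry lacks 'edu' and 'research'
sample_budget = [
    {'year': 2003, 'federal': 2159899000., 'defense': 387136000., 'soc': 474680000.},
    {'year': 2010, 'federal': 3457079000., 'research': 11730000.,
     'soc': 706737000., 'edu': 94169000., 'defense': 666703000.},
]

# Option 1: treat a missing category as zero spending (an assumption)
over_3pct = [
    y['year'] for y in sample_budget
    if (y.get('edu', 0.0) + y.get('research', 0.0)) / y['federal'] > 0.03
]

# Option 2: skip any year that lacks either category
over_3pct_strict = [
    y['year'] for y in sample_budget
    if 'edu' in y and 'research' in y
    and (y['edu'] + y['research']) / y['federal'] > 0.03
]

print(over_3pct)          # [2010]
print(over_3pct_strict)   # [2010]
```

The two options agree here, but they can differ when a year reports only one of the two categories, so the choice depends on what a missing key is taken to mean.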

Without actually having to know all of the keys in the individual dictionaries, we can rerun the same set comprehension and discover that *medicare* has also accounted for more than 1% of the total budget in the provided years.

In [None]:
[year['year'] for year in budget]

In [None]:
{subfunc for year in budget
         for subfunc in (set(year.keys()) - {'year','federal'}) 
         if year[subfunc]/year['federal']>0.01}

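Our nested comprehension also allows for two `if` clauses, one attached to each `for`. Here the first `if` restricts the search to years after 2010, and the second keeps only the Subfunction Titles that exceed 1% of the federal budget.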

In [None]:
{subfunc for year in budget if year['year'] > 2010
         for subfunc in (set(year.keys()) - {'year','federal'}) 
         if year[subfunc]/year['federal']>0.01}
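The clauses of a nested comprehension are evaluated left to right, in the same order they would appear as nested statements. The sketch below spells this out as explicit loops and checks that both forms agree. It runs on a three-entry stand-in list, `sample_budget`, taken from the figures above; that name, `comp`, and `unrolled` are introduced here for illustration.

```python
# Three entries copied from the data above so the block runs on its own
sample_budget = [
    {'year': 2000, 'medicare': 197113000., 'edu': 30693000.,
     'federal': 1788950000., 'research': 6167000., 'soc': 409423000.},
    {'year': 2010, 'federal': 3457079000., 'research': 11730000.,
     'soc': 706737000., 'edu': 94169000., 'defense': 666703000.},
    {'year': 2016, 'federal': 3999467000., 'research': 12824000.,
     'soc': 944338000., 'edu': 69939000., 'defense': 586479000.},
]

# Comprehension form: for / if / for / if, read left to right
comp = {subfunc for year in sample_budget if year['year'] > 2010
                for subfunc in (set(year.keys()) - {'year', 'federal'})
                if year[subfunc] / year['federal'] > 0.01}

# Equivalent explicit loops: each clause becomes one level of nesting
unrolled = set()
for year in sample_budget:
    if year['year'] > 2010:
        for subfunc in set(year.keys()) - {'year', 'federal'}:
            if year[subfunc] / year['federal'] > 0.01:
                unrolled.add(subfunc)

print(sorted(comp))       # ['defense', 'edu', 'soc']
print(comp == unrolled)   # True
```

Only the 2016 entry passes the first `if`, so `medicare` from 2000 never reaches the inner loop.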