# Comprehending Comprehensions 

Based on the PyCon 2023 &#x1F40D; tutorial "Reuven M. Lerner: Comprehending comprehensions" 

&#x1F3AC; https://www.youtube.com/watch?v=qMv1ZD2V1A4 

&#x1F47E; https://github.com/reuven/PyCon-04April-19-comprehensions

## &#x1F4D6; Contents  

1. [What are comprehensions?](#what_comps) <br>

2.  [List comprehensions](#list_comps) <br>
2.1 [When to use a loop vs. a comprehension?](#when_comp) <br>
2.2 [Exercises](#exs) <br>
2.3 [Common mistakes](#mistakes) <br>


3.  [List comprehensions and files](#list_and_files) <br>
3.1 [Exercise: Sum numbers](#ex_sum_nums) <br>
3.2 [Exercise: Shoe Dicts](#ex_shoe_dicts) <br>
3.3 [Counter](#counter) <br>
4. [Set comprehensions](#set_comps) <br>
4.1 [Exercise: Sum unique numbers](#ex_sum_unique_nums) <br>
4.2 [Exercise: Which shells?](#ex_shells) <br>
5. [Dictionary comprehensions](#dict_comps) <br>
5.1 [Exercise: Usernames and shells](#ex_names_shells) <br>
6. [Nested list comprehensions](#nested_comps) <br>
6.1 [Exercise: Movie genres](#ex_movie) <br>
7. [Generator expressions](#gen_exp) <br>
7.1 [What's a generator?](#what_gen) <br>
7.2 [Scope](#scope)

# &#x1F4A1;   What are comprehensions? <a class="anchor" id="what_comps"></a>

In [1]:
# Suppose I have a list of integers 
# and I want to create a list of those integers^2

numbers = list(range(10))
numbers

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]:
output = []

for one_number in numbers:
    output.append(one_number ** 2)
    
output

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [3]:
# Is there an alternative method?
# Yes! List comprehensions

[one_number ** 2 for one_number in numbers]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# &#x1F4CB;  List comprehensions <a id="list_comps"></a>

List comprehensions are easier to write (and understand) if we break them down and write them over multiple lines.

In a comprehension, the &#x1F947; __first__ thing that runs is the loop and the &#x1F948; __second__ thing is the expression.

The result of a list comprehension is a list. We have created a new list, which we can pass as an argument to a function or assign to a variable.

This new list is the result of evaluating our expression on every element of the input list. Hence the output list will have the same number of elements as the input list.

In [4]:
[one_number ** 2             # expression -- can be any python expression!
 for one_number in numbers]  # iteration -- any python loop

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

> A Python **expression** is anything that returns a value, e.g. a function call, a method call or a variable.

# &#x27B0; When to use a loop  vs. a comprehension? <a id="when_comp"></a>

The distinction is between getting a new value back vs. incurring side effects. 

Meaning: if you have an existing list, and you want a new list *and* you can describe a mapping from the first to the second, then use a list comprehension.

But if you are assigning and/or modifying repeatedly, then use a regular `for` loop.
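A minimal sketch of that distinction, using made-up data (`prices`, `doubled` and `totals` are illustrative names, not from the tutorial):

```python
prices = [10, 20, 30]

# Mapping one list onto a new list: a comprehension fits.
doubled = [price * 2 for price in prices]

# Repeatedly modifying existing state: a regular loop fits.
totals = {}
for price in prices:
    bucket = 'cheap' if price < 25 else 'pricey'
    totals[bucket] = totals.get(bucket, 0) + price

print(doubled)
print(totals)
```

The comprehension is evaluated for the value it returns, whereas the loop is run for its effect on `totals`.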

In [5]:
# Suppose I have a list of strings...

mylist = ['abcd', 'ef', 'ghi']

# ...and I want a new string based on mylist with '*' between the elements

'*'.join(mylist)

'abcd*ef*ghi'

In [6]:
# What if I have a list of integers?

```python
mylist = [10, 20, 30]

'*'.join(mylist)
```

Bad news -- we get an error! &#x274C; This is because:

> `.join` expects its argument to be an iterable of strings.
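To see the failure without stopping the notebook, the call can be wrapped in `try`/`except`. This is only a sketch; the variable `message` is introduced here for illustration:

```python
mylist = [10, 20, 30]
message = None

try:
    '*'.join(mylist)           # the elements are ints, not strings
except TypeError as error:
    message = str(error)       # keep the error text so we can inspect it

print(message)
```

The error text names the first offending element, which helps when only some of the items are not strings.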

In [7]:
# We have: a list of integers
# We want: a list of strings
# We can convert one int to one string with str()

mylist = [10, 20, 30]

[str(one_item)
for one_item in mylist]

['10', '20', '30']

In [8]:
'*'.join([str(one_item)
         for one_item in mylist])

'10*20*30'

In [9]:
# Suppose I have a string and
# I want to capitalise the start of each word

s = 'I caught a horse mackerel! Nay!? Yay!'

s.title()

'I Caught A Horse Mackerel! Nay!? Yay!'

In [10]:
# Cool. But what if str.title didn't exist? 
# Could I still do something like this? Yes!

# What if I were to break the string
# into individual words?

# I have: a list of strings
# I want: a list of strings whose first letters are capitalised
# I can use: str.capitalize

[one_word.capitalize()
for one_word in s.split()]

['I', 'Caught', 'A', 'Horse', 'Mackerel!', 'Nay!?', 'Yay!']

In [11]:
' '.join([one_word.capitalize()
            for one_word in s.split()])

'I Caught A Horse Mackerel! Nay!? Yay!'

> Use comprehensions to break big things apart &#x1F4A3; , apply an expression to each element &#x1F3AD; and then put them back together &#x1FA79;. 

> `.split` and `.join` are our best friends! &#x1F496;

# &#x1F4DD; Exercises: <a id="exs"></a>

1. Ask the user to enter a string containing numbers, separated by spaces. Add those numbers together (as integers) and print the result. It's OK to use the built-in `sum` function.

2. Ask the user to enter a string and print the length of the string -- except for whitespace. It's *not* OK to use `str.replace()`.

But first a note on `.strip()`

> `.strip()` removes the whitespace from either side of a string but **not** the whitespace in the middle.

In [12]:
# Example
s= '   z o e   '
s.strip()

'z o e'

In [13]:
# 1. s = input('Enter numbers, separated by whitespace: ').strip()
# This is taking ages to run...improvise!

s = "4 13 25 57 89"

# sum(s)

> `sum(s)` raises a `TypeError` &#x274C; because `sum` starts from $0$ and then tries to compute $0$ $+$ `str`

In [14]:
# What I have: a string of numbers separated by spaces (s.split() gives a list of digit strings)
# What I want: a list of the integers in that string
# I can transform one to the other using int

[int(one_item)
for one_item in s.split()]

[4, 13, 25, 57, 89]

In [15]:
sum([int(one_item)
    for one_item in s.split()])

188

In [16]:
# 2. Find the lengths of the words (not the whitespaces) in the user's input
# still having issues...
# s = input('Enter a sentence: ').strip()

s = "I caught a walker cicada! These guys chirp funny!"

len(s) # how many characters in the entire sentence?

49

In [17]:
# How long is this, if we ignore the whitespace?

# If we use s.split(), we get a list of strings
# without any whitespace

# I have: a list of strings
# I want: the sum of their lengths
# I can apply: len

sum([len(one_word)
    for one_word in s.split()])

41

In [18]:
# SQL analogy

s = "4 13 25 57 89"

[int(one_item)             # expression -- SELECT
for one_item in s.split()] # iteration -- FROM

[4, 13, 25, 57, 89]

# &#x274C; **Common mistake!** <a id="mistakes"></a>

In [19]:
# I have a string with words 
# and I want to print each word with stars around it!

s = "c'est fantastique"

[print(f'*{one_word}*')
for one_word in s.split()]

*c'est*
*fantastique*


[None, None]

The list that a comprehension returns contains the values that the expression returned.

> `print` **always** returns `None`

Here, `print` did its job, but each call returned `None`, so the comprehension produced a list of `None` values.

Don't use `print` inside a comprehension! &#x1F6AB;

In [20]:
# Correct version

[f'*{one_word}*'
for one_word in s.split()]

["*c'est*", '*fantastique*']
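If the goal really is to print each word, two options avoid the list of `None` values. The following is a sketch rather than the tutorial's own code, and the name `starred` is an assumption:

```python
s = "c'est fantastique"

# Option 1: we want the side effect (printing), so use a regular loop.
for one_word in s.split():
    print(f'*{one_word}*')

# Option 2: build the values with a comprehension, then print once.
starred = [f'*{one_word}*' for one_word in s.split()]
print('\n'.join(starred))
```

Both options print the same two lines; the second also leaves a reusable list behind.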

# &#x1F4C1; List comprehensions and files <a class="anchor" id="list_and_files"></a>

In [21]:
# How about iterating over file-like-objects?
# Use this technique with small files only!

#!type linux-etc-passwd.txt
[one_line
for one_line in open('linux-etc-passwd.txt')];

In [22]:
# Can I get the usernames from this password file?
# Yes!

# Each record contains fields
# fields are separated by ':'
# the first field is the username

[one_line.split(':')[0]
for one_line in open('linux-etc-passwd.txt')];

In [23]:
# SQL analogy extended

[one_line.split(':')[0]                      # expression -- SELECT
for one_line in open('linux-etc-passwd.txt') # iteration -- FROM
if ':' in one_line];                         # condition -- WHERE
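Calling `open` inside the comprehension never closes the file explicitly. A hedged sketch of the same SELECT / FROM / WHERE pattern inside a `with` block follows; `io.StringIO` and the made-up records in `sample` stand in for `linux-etc-passwd.txt` so the example runs on its own:

```python
import io

# Hypothetical stand-in for the password file (made-up records).
sample = (
    "# a comment line\n"
    "root:x:0:0:root:/root:/bin/bash\n"
    "\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)

with io.StringIO(sample) as infile:
    usernames = [one_line.split(':')[0]   # expression -- SELECT
                 for one_line in infile   # iteration  -- FROM
                 if ':' in one_line]      # condition  -- WHERE

print(usernames)
```

With a real file, `with open('linux-etc-passwd.txt') as infile:` would take the place of the `io.StringIO` line.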

# &#x1F4DD; Exercise: Sum numbers <a id="ex_sum_nums"></a>

Use a comprehension to read through `nums.txt`, and sum the numbers it contains.

Each line of the file contains either zero integers or one integer. The integer may have whitespace before and/or after it. 

In [24]:
!type nums.txt

5
	10     
	20
  	3
		   	20        

 25


In [25]:
# I have: a file whose lines (strings) contain numbers
# I want: a list of numbers
# transform from a string to an int with int()

sum([int(one_line)                      
    for one_line in open('nums.txt')
    if one_line.strip()])   # an empty string is falsy, so blank lines are skipped

83

In [26]:
# Solution 2

sum([int(one_line)
    for one_line in open('nums.txt')
    if one_line.strip().isdigit()]) # removes whitespace from the edges and asks
                                    # "Hey string, do you contain only digits?"
                                    # returns False for an empty string

83
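The two filters are not interchangeable. The sketch below uses a made-up list of lines (not `nums.txt`) to show one difference: `str.isdigit` rejects a leading minus sign, so negative numbers are silently dropped by the second filter.

```python
# Hypothetical input lines, including a negative number and a blank line.
lines = ['5\n', '\t10   \n', '\n', ' -3\n', ' 25\n']

# Filter 1: keep every line that is non-empty after stripping.
non_blank = [int(one_line)
             for one_line in lines
             if one_line.strip()]

# Filter 2: keep only lines made up entirely of digits.
digits_only = [int(one_line)
               for one_line in lines
               if one_line.strip().isdigit()]

print(sum(non_blank), sum(digits_only))
```

The first filter would instead raise a `ValueError` on a line such as `'abc'`, which the second filter skips, so the right choice depends on what the file may contain.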

In [27]:
# I want to know how many vowels are in a string
# How can I use comprehension for this?

s = 'I caught a zebra turkeyfish! No gobbling those spines!'

sum([1
    for one_character in s
    if one_character in 'aeiou'])

15

# &#x1F4DD; Exercise: Shoe dicts  <a id="ex_shoe_dicts"></a>

`shoe-data.txt` contains 100 lines. Each line contains three fields: brand, colour and size. Fields are separated by tabs `('\t')`.

Use a list comprehension to turn this file into a list of dictionaries. Each line should be turned into a dict whose keys are `brand`, `color` and `size`. The values can remain strings i.e. don't worry about the shoe sizes.

I recommend that you write an external function that takes a string as input and returns a dict, then invoke this in your comprehension.

The result will be a list like

```python 
[
    {'brand' : 'Hoka',
    'color' : 'peach',
    'size' : '38'},
    ...
]
```

In [28]:
# A simple (not working) approach
# !type shoe-data.txt

def line_to_dict(one_line):
    return one_line.strip().split('\t')

[line_to_dict(one_line)
for one_line in open('shoe-data.txt')];

In [29]:
# A simple (working) approach

def line_to_dict(one_line):
    fields = one_line.strip().split('\t')
    return {'brand' : fields[0],
           'color' : fields[1],
           'size' : fields[2]}

[line_to_dict(one_line)
for one_line in open('shoe-data.txt')];

In [30]:
# Improve using unpacking

def line_to_dict(one_line):
    brand, color, size = one_line.strip().split('\t')
    return {'brand' : brand,
           'color' : color,
           'size' : size}

[line_to_dict(one_line)
for one_line in open('shoe-data.txt')];

In [31]:
# I want to retrieve all of the IP addresses from mini-access-log.txt

# !type mini-access-log.txt

[one_line.split()[0]
for one_line in open('mini-access-log.txt')];

# &#129518; Counter <a id="counter"></a>

In [32]:
# How many times did each IP address access my server?
# Use Counter

from collections import Counter

In [33]:
# The WRONG way to use Counter
# is as a cheap default dict

c = Counter()
c['a'] += 5
c['b'] += 3
c

Counter({'a': 5, 'b': 3})

In [34]:
# The CORRECT way to use Counter is to 
# initialise it with an iterable

# Counter will count how many times each element
# of that iterable exists. 
# Each element becomes a key and
# the number of times becomes the value

c = Counter([one_line.split()[0]
            for one_line in open('mini-access-log.txt')])
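The same idea on a small in-memory list makes the counts easy to check by eye. The addresses in `hits` are invented for illustration:

```python
from collections import Counter

# Hypothetical list of IP addresses, one per request.
hits = ['10.0.0.1', '10.0.0.2', '10.0.0.1',
        '10.0.0.3', '10.0.0.1', '10.0.0.2']

c = Counter(hits)              # each element becomes a key, its count the value
top_two = c.most_common(2)     # the two most frequent elements
missing = c['192.168.0.1']     # a key that was never seen counts as 0

print(top_two, missing)
```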

In [35]:
# Counter inherits from dict, so we can use dict methods

for key, value in c.items():
    print(f'{key} : {value}')

67.218.116.165 : 2
66.249.71.65 : 3
65.55.106.183 : 2
66.249.65.12 : 32
65.55.106.131 : 2
65.55.106.186 : 2
74.52.245.146 : 2
66.249.65.43 : 3
65.55.207.25 : 2
65.55.207.94 : 2
65.55.207.71 : 1
98.242.170.241 : 1
66.249.65.38 : 100
65.55.207.126 : 2
82.34.9.20 : 2
65.55.106.155 : 2
65.55.207.77 : 2
208.80.193.28 : 1
89.248.172.58 : 22
67.195.112.35 : 16
65.55.207.50 : 3
65.55.215.75 : 2


In [36]:
# Plot a "histogram"...
for key, value in c.items():
    print(f'{key:18} : {value * "x"}')

67.218.116.165     : xx
66.249.71.65       : xxx
65.55.106.183      : xx
66.249.65.12       : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
65.55.106.131      : xx
65.55.106.186      : xx
74.52.245.146      : xx
66.249.65.43       : xxx
65.55.207.25       : xx
65.55.207.94       : xx
65.55.207.71       : x
98.242.170.241     : x
66.249.65.38       : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
65.55.207.126      : xx
82.34.9.20         : xx
65.55.106.155      : xx
65.55.207.77       : xx
208.80.193.28      : x
89.248.172.58      : xxxxxxxxxxxxxxxxxxxxxx
67.195.112.35      : xxxxxxxxxxxxxxxx
65.55.207.50       : xxx
65.55.215.75       : xx


In [37]:
# most_common is a Counter method that returns
# a list of (element, count) tuples, ordered from the
# most common to the least common
c.most_common()

[('66.249.65.38', 100),
 ('66.249.65.12', 32),
 ('89.248.172.58', 22),
 ('67.195.112.35', 16),
 ('66.249.71.65', 3),
 ('66.249.65.43', 3),
 ('65.55.207.50', 3),
 ('67.218.116.165', 2),
 ('65.55.106.183', 2),
 ('65.55.106.131', 2),
 ('65.55.106.186', 2),
 ('74.52.245.146', 2),
 ('65.55.207.25', 2),
 ('65.55.207.94', 2),
 ('65.55.207.126', 2),
 ('82.34.9.20', 2),
 ('65.55.106.155', 2),
 ('65.55.207.77', 2),
 ('65.55.215.75', 2),
 ('65.55.207.71', 1),
 ('98.242.170.241', 1),
 ('208.80.193.28', 1)]

In [38]:
# I can invoke most_common with an argument n to
# get only the n most common elements

c.most_common(5)

[('66.249.65.38', 100),
 ('66.249.65.12', 32),
 ('89.248.172.58', 22),
 ('67.195.112.35', 16),
 ('66.249.71.65', 3)]

In [39]:
usernames = [one_line.split(':')[0]                      
            for one_line in open('linux-etc-passwd.txt') 
            if ':' in one_line] 

In [40]:
usernames

['root',
 'daemon',
 'bin',
 'sys',
 'sync',
 'games',
 'man',
 'lp',
 'mail',
 'news',
 'uucp',
 'proxy',
 'www-data',
 'backup',
 'list',
 'irc',
 'gnats',
 'nobody',
 'syslog',
 'messagebus',
 'landscape',
 'jci',
 'sshd',
 'user',
 'reuven',
 'postfix',
 'colord',
 'postgres',
 'dovecot',
 'dovenull',
 'postgrey',
 'debian-spamd',
 'memcache',
 'genadi',
 'shira',
 'atara',
 'shikma',
 'amotz',
 'mysql',
 'clamav',
 'amavis',
 'opendkim',
 'gitlab-redis',
 'gitlab-psql',
 'git',
 'opendmarc',
 'dkim-milter-python',
 'deploy',
 'redis']

In [41]:
# We can now search for usernames in this list using "in"

'news' in usernames

True

In [42]:
'daemon' in usernames

True

In [43]:
'zoe' in usernames

False

# &#x1F374; Set comprehensions  <a id="set_comps"></a>

In [44]:
# Lists are slow to search,  
# so maybe I should use a different data structure... 

# What searches faster? Sets.
# Sets: guarantee uniqueness in their members, searching is 
# fast and all elements are hashable -- just like dict keys

usernames = set([one_line.split(':')[0]                      
            for one_line in open('linux-etc-passwd.txt') 
            if ':' in one_line]) 

In [45]:
type(usernames)

set

In [46]:
# So we can use a set comprehension!

# Set comprehensions look almost identical to list comprehensions
# but use {} instead of []

usernames = {one_line.split(':')[0]                      
            for one_line in open('linux-etc-passwd.txt') 
            if ':' in one_line}

In [47]:
usernames;

In [48]:
type(usernames)

set

In [49]:
'daemon' in usernames

True

# &#x1F4DD; Exercise: Sum unique numbers<a id="ex_sum_unique_nums"></a>

1. Ask the user to enter numbers, separated by whitespace
2. Print their sum, but count each number *once* only

Example:

    Enter numbers: 1 2 3 4 5 6
    Total is 21

In [50]:
s = input('Enter numbers: ').strip()

sum({int(one_item)
    for one_item in s.split()})

Enter numbers: 1 2 3 4 5 6


21

Creating a set takes longer &#x23F2; than creating a list because every element has to be hashed. But once a set exists, searching it is much faster.

> Use set comprehensions when you're searching lots of times &#x1F50D;  

> Use set comprehensions to answer the question: "How many unique blah blah blah...?" &#129518;
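A rough way to check the search-speed claim is `timeit`. This is a sketch with made-up usernames; the actual timings depend on the machine, so none are quoted here:

```python
import timeit

# Hypothetical data: 10,000 usernames, as a list and as a set.
names_list = [f'user{i}' for i in range(10_000)]
names_set = set(names_list)

target = 'user9999'   # last element: the worst case for the list search

list_seconds = timeit.timeit(lambda: target in names_list, number=200)
set_seconds = timeit.timeit(lambda: target in names_set, number=200)

print(f'list: {list_seconds:.6f}s  set: {set_seconds:.6f}s')
```

The list search has to walk through the elements one by one, while the set search goes straight to the slot given by the element's hash.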

# &#x1F4DD;  Exercise: Which shells?<a id="ex_shells"></a>

Read through `linux-etc-passwd.txt` and find the different shells &#x1F41A; that are used on the system.

In [51]:
# Anti-solution

```python
{
    one_line.split(':')
    for one_line in open('linux-etc-passwd.txt')
    if ':' in one_line
}
```

Why doesn't this work? &#x274C;

Because `.split()` always returns a **list** of strings, and a set cannot &#x1F6AB; contain **unhashable** objects such as lists.

> You can't use a list as a dictionary key &#8756; you can't use a list as an element in a set
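The failure, and one possible workaround, can be sketched on a single made-up record. Converting each list to a tuple is an illustration of hashability, not the tutorial's solution:

```python
line = 'root:x:0:0:root:/root:/bin/bash'   # hypothetical record
message = None

try:
    {line.split(':')}                # a set containing one list
except TypeError as error:
    message = str(error)             # lists cannot be hashed

# Tuples of strings are hashable, so they can be set members.
records = {tuple(line.split(':'))}

print(message)
print(records)
```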

In [52]:
# Solution
# Strategy: write it as a list comprehension then change [] to {}

{
    one_line.split(':')[-1].strip()
    for one_line in open('linux-etc-passwd.txt')
    if ':' in one_line
}

{'/bin/bash',
 '/bin/false',
 '/bin/nologin',
 '/bin/sh',
 '/bin/sync',
 '/usr/sbin/nologin'}

In [53]:
# I have a string with some words

# I want to create a dict wherein each word is the key
# and the word length is the value

s = "I caught a squid! Oh no I squidn't!"

# we can invoke dict() on a list of tuples
# and get back a dictionary!

dict([(one_word, len(one_word)) 
    for one_word in s.split()])

{'I': 1, 'caught': 6, 'a': 1, 'squid!': 6, 'Oh': 2, 'no': 2, "squidn't!": 9}

This works &#x1F197; - but we can do better. Enter dictionary comprehensions!

# &#x1F4D5; Dictionary comprehensions <a id="dict_comps"></a>

When writing a dict comprehension, we use `{}` just as with set comprehensions but now we have **two** expressions in the first line. These two expressions are separated by a colon `:` and this colon `:` is how python &#x1F40D; knows it's a dict, not a set.

In [54]:
{ one_word : len(one_word) # key : value expression
 for one_word in s.split()
}

{'I': 1, 'caught': 6, 'a': 1, 'squid!': 6, 'Oh': 2, 'no': 2, "squidn't!": 9}
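The colon is the only thing that separates the two forms, as this small sketch shows (the names `lengths_set` and `lengths_dict` are illustrative):

```python
words = 'I caught a squid'.split()

lengths_set = {len(one_word) for one_word in words}              # no colon: set
lengths_dict = {one_word: len(one_word) for one_word in words}   # colon: dict

print(type(lengths_set), type(lengths_dict))
print(type({}))   # empty braces give a dict, not a set
```

Because dict keys are unique, a word that appears more than once in the input ends up as a single key.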

In [55]:
# I'm going to quickly create a small config file
# with name=value on each line

with open('myconfig.txt','w') as outfile:  # `with` is a context manager and hides two method calls -- magic methods
    # outfile.__enter__() 
    for index, one_character in enumerate('abcd', 1):
        outfile.write(f'{one_character}={index}\n')
    # outfile.__exit__() 

In [56]:
!type myconfig.txt

a=1
b=2
c=3
d=4


In [57]:
# I can use a dict comprehension to 
# read this file into a dict!

{ one_line.split('=')[0] : one_line.split('=')[-1].strip()
    for one_line in open('myconfig.txt')
}

{'a': '1', 'b': '2', 'c': '3', 'd': '4'}

# &#x1F4DD; Exercise: Usernames and shells <a id="ex_names_shells"></a>

Use a dict comprehension to create a dict, in which the keys &#x1F511; are usernames and the values are the shells &#x1F41A; associated with those usernames in `linux-etc-passwd.txt`.

In [58]:
# Start with a list-tuple combo

# [(one_line.split(':')[0], one_line.split(':')[-1].strip())                      
# for one_line in open('linux-etc-passwd.txt') 
# if ':' in one_line]

{ 
    one_line.split(':')[0] : one_line.split(':')[-1].strip()                      
    for one_line in open('linux-etc-passwd.txt') 
    if ':' in one_line
};

In [60]:
# Walrus solution 

{ 
    fields[0] : fields[-1].strip()                      
    for one_line in open('linux-etc-passwd.txt') 
    if ':' in one_line and (fields := one_line.split(':')) # need parentheses around second condition
};
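The walrus version can be tried without the file. The sketch below uses a made-up list of lines in place of `linux-etc-passwd.txt`:

```python
# Hypothetical lines standing in for the password file.
lines = ['root:x:0:0:root:/root:/bin/bash\n',
         '# comment\n',
         'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n']

shells = {
    fields[0]: fields[-1].strip()
    for one_line in lines
    # split once, keep the result in `fields`, and reuse it above
    if ':' in one_line and (fields := one_line.split(':'))
}

print(shells)
```

The assignment sits at the end of the `and` so that it only runs for lines containing a colon, and a non-empty list is truthy, so it never filters anything out by itself.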

> Summary: use a comp when I have an **iterable**, I want an **iterable** and I can define a **mapping** &#x1F30D; between them

In [61]:
# How about a list of lists, where inner lists contain integers...?

mylist = [[1, 2, 3],
         [11, 12, 13, 14],
         [21, 22, 23, 24, 25],
         [31, 32, 33, 34, 35, 36]]

mylist

[[1, 2, 3], [11, 12, 13, 14], [21, 22, 23, 24, 25], [31, 32, 33, 34, 35, 36]]

In [62]:
# how can I sum the integers in this nested list?
# first guess: sum! (bad guess)

```python
sum(mylist)
```

In [63]:
# guess 2: use a comprehension!

[one_sublist
for one_sublist in mylist]

[[1, 2, 3], [11, 12, 13, 14], [21, 22, 23, 24, 25], [31, 32, 33, 34, 35, 36]]

But the result is identical to the input! We've run into the limit stated earlier: a comprehension with a single `for` yields at most one element per element of its input, and here each element is a whole sublist.

So, how can we get more?

# &#x1F423; Nested list comprehensions <a id="nested_comps"></a>

In [64]:
# nested list comprehensions!

([one_number                        # third
    for one_sublist in mylist       # executed first
    for one_number in one_sublist]) # second 

[1, 2, 3, 11, 12, 13, 14, 21, 22, 23, 24, 25, 31, 32, 33, 34, 35, 36]
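The order of the `for` clauses matches the order of the equivalent nested loops, read top to bottom. A short sketch on a shortened list (`flat_loop` and `flat_comp` are illustrative names):

```python
mylist = [[1, 2, 3],
          [11, 12, 13, 14]]

# Nested loops: outer loop first, inner loop second.
flat_loop = []
for one_sublist in mylist:
    for one_number in one_sublist:
        flat_loop.append(one_number)

# Nested comprehension: same clauses, in the same order.
flat_comp = [one_number
             for one_sublist in mylist
             for one_number in one_sublist]

print(flat_loop == flat_comp)
```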

In [65]:
sum([one_number                        
    for one_sublist in mylist       
    for one_number in one_sublist])

372

In [66]:
# can we use if statements in nested comps? Of course!

[one_number                        
    for one_sublist in mylist
    if len(one_sublist) > 3    # only long sublists
    for one_number in one_sublist
    if one_number % 2]         # only if it's odd

[11, 13, 21, 23, 25, 31, 33, 35]

In [67]:
sum([one_number                        
    for one_sublist in mylist
    if len(one_sublist) > 3 
    for one_number in one_sublist
    if one_number % 2])

192

In [68]:
numbers = [10, 20, 30, 40, 45, 50, 55, 60, 70]

[one_number
 for one_number in numbers 
 if one_number > 40    # more than one `if` is allowed, however many `for`s there are
 if one_number % 2]    # multiple `if` lines are and-ed together

[45, 55]

> Use nested comprehensions to unpack &#x1F4E6; complex data structures

In [69]:
# download the movies.dat file from here:
# https://files.lerner.co.il/advanced-exercises-files.zip

In [70]:
#!type movies.dat

# &#x1F4DD; Exercise: Movie genres <a id="ex_movie"></a>

Goal: Find out what the 5 most popular movie genres are in `movies.dat`

1. Use a nested comprehension to read through the file, find the appropriate fields and lines, and then use `Counter` to find the most common genres.

If a movie has more than one genre, each should be counted once.

Hint: You'll want to hand `Counter` a list of genres from the file. 

In [71]:
# Getting a UnicodeDecodeError so edited movies.dat

c = Counter([one_genre                        
             for one_line in open('movies.dat')        
             for one_genre in one_line.strip().split('::')[-1].split('|')])

c.most_common(5)

[('Drama', 1086),
 ('Comedy', 839),
 ('Action', 357),
 ('Thriller', 340),
 ('Romance', 313)]

# &#x26A1;&#x1F3ED; Generator expression  <a id="gen_exp"></a>

In [72]:
# does () = tuple comprehension?

(one_number ** 2
 for one_number in range(10))

<generator object <genexpr> at 0x00000134B79C6400>

Nope -- a generator!

# &#x1F4A1;  What's a generator? <a id="what_gen"></a>


A generator is an object in Python that knows how to behave inside a `for` loop -- because it's iterable.

The point &#x1F4CC; of a generator is that it doesn't return all of its elements at once. Rather, it returns only one at a time.

A generator expression works like a list comprehension except that, instead of returning a list with all of its elements, it returns a generator object. This object can be put into a `for` loop (or any other iterable context), and it will only run the expression when it is asked to, typically once per iteration.
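The "one at a time" behaviour can be observed by asking for values manually with the built-in `next`. A minimal sketch:

```python
g = (one_number ** 2
     for one_number in range(3))

first = next(g)      # runs the expression once: 0
second = next(g)     # and once more: 1
rest = list(g)       # whatever is left: [4]
leftover = list(g)   # the generator is now exhausted: []

print(first, second, rest, leftover)
```

A generator can be consumed only once; to iterate again, a new generator has to be created.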

In [73]:
g = (one_number**2                # 3. runs one iteration, returns one_number**2 then goes to sleep
     for one_number in range(10))

In [74]:
for one_item in g: # 1. For loop says to g "hey -- are you iterable?" g says yes
    print(one_item) # 2. For loop says "give me your next thing g"

0
1
4
9
16
25
36
49
64
81


> Useful when we don't want everything returned at once &#x1F4BE; 
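One way to see the memory difference is `sys.getsizeof`, which reports the size of the container object itself. This is a sketch; exact byte counts vary between Python versions, so none are quoted:

```python
import sys

as_list = [n ** 2 for n in range(100_000)]   # all values exist right now
as_gen = (n ** 2 for n in range(100_000))    # values are produced on demand

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size, gen_size)

# Both give the same total when consumed.
list_total = sum(as_list)
gen_total = sum(as_gen)
```

The list has to hold a reference to every element, while the generator only holds the state it needs to produce the next one.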

In [75]:
mylist = [10, 20, 30]

# we could use a list comprehension...
'*'.join([str(one_item)
         for one_item in mylist])

'10*20*30'

In [76]:
# ...or a generator expression
'*'.join((str(one_item)
         for one_item in mylist))

'10*20*30'

In [77]:
# generator expression -- with only one set of parentheses!
'*'.join(str(one_item)
         for one_item in mylist)

'10*20*30'

In [78]:
# Use case: read words from a file

# nested gen expression
g = (one_word 
    for one_line in open('wcfile.txt')
    for one_word in one_line.split())

total = 0
for w in g:
    total += len(w) # total the lengths of all words, no spaces
    
total

132

In [79]:
# You can define a generator with a "generator function"
# You can use yield to return a 
# value without exiting the function

def mygen():
    yield 10
    yield 20
    yield 30
    
g = mygen()

In [80]:
g

<generator object mygen at 0x00000134B685B3D0>

In [81]:
list(g) # list() iterates over g, just as a for loop would, and collects the values

[10, 20, 30]

In [82]:
# You can do the same thing
# by writing a function that returns a generator expression

def mygen():
    return (one_number
           for one_number in [10, 20, 30])

In [83]:
mygen()

<generator object mygen.<locals>.<genexpr> at 0x00000134B79C7030>

# &#x1F52C; Scope <a id="scope"></a>

In [85]:
# what is x?

x = 100

for one_number in range(5):
    x = one_number * 3

print(x) 

12


Why? Because `x` and `one_number` are both in the global &#x1F30D; scope -- no function def = no new scope

In [86]:
# what is x?

x = 100

print([x * 3
      for x in range(5)])

x

[0, 3, 6, 9, 12]


100

> A comprehension's loop variable doesn't leak into the outer &#x1F30C; scope
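There is one exception worth knowing: a name assigned with the walrus operator `:=` inside a comprehension is bound in the enclosing scope (Python 3.8+). A sketch, with `demo` and `last` as made-up names:

```python
def demo():
    x = 100

    # The loop variable x lives only inside the comprehension.
    tripled = [x * 3 for x in range(5)]

    # A walrus target, by contrast, is bound in demo's own scope.
    doubled = [(last := n * 2) for n in range(5)]

    return x, last, tripled, doubled

print(demo())
```

After both comprehensions have run, `x` still holds its original value, while `last` holds the final value assigned inside the second comprehension.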

Let's look behind the scenes (BTS) &#x1F575;

In [87]:
def regular_loop():
    x =100
    
    for one_number in range(5):
        x = one_number * 3

In [88]:
import dis # disassembler for Python

In [89]:
dis.dis(regular_loop)

  1           0 RESUME                   0

  2           2 LOAD_CONST               1 (100)
              4 STORE_FAST               0 (x)

  4           6 LOAD_GLOBAL              1 (NULL + range)
             18 LOAD_CONST               2 (5)
             20 PRECALL                  1
             24 CALL                     1
             34 GET_ITER
        >>   36 FOR_ITER                 7 (to 52)
             38 STORE_FAST               1 (one_number)

  5          40 LOAD_FAST                1 (one_number)
             42 LOAD_CONST               3 (3)
             44 BINARY_OP                5 (*)
             48 STORE_FAST               0 (x)
             50 JUMP_BACKWARD            8 (to 36)

  4     >>   52 LOAD_CONST               0 (None)
             54 RETURN_VALUE


> `STORE_FAST` means `x` and `one_number` are **local** &#x1F3E0; variables


Compare to the comprehension version:

In [90]:
def comp_loop():
    x =100
    
    print([x * 3
          for x in range(5)])

In [91]:
dis.dis(comp_loop)

  1           0 RESUME                   0

  2           2 LOAD_CONST               1 (100)
              4 STORE_FAST               0 (x)

  4           6 LOAD_GLOBAL              1 (NULL + print)
             18 LOAD_CONST               2 (<code object <listcomp> at 0x00000134B7AA4B90, file "C:\Users\zoeda\AppData\Local\Temp\ipykernel_19608\3917933206.py", line 4>)
             20 MAKE_FUNCTION            0

  5          22 LOAD_GLOBAL              3 (NULL + range)
             34 LOAD_CONST               3 (5)
             36 PRECALL                  1
             40 CALL                     1

  4          50 GET_ITER
             52 PRECALL                  0
             56 CALL                     0
             66 PRECALL                  1
             70 CALL                     1
             80 POP_TOP
             82 LOAD_CONST               0 (None)
             84 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x00000134B7AA4B90, file "C:\Users\zoeda\AppData\L

&#x1f92f; ...


```python
             18 LOAD_CONST               2 (<code object <listcomp> at 0x0000011F39B6C100, file "C:\Users\zoeda\AppData\Local\Temp\ipykernel_11260\3917933206.py", line 4>)
             20 MAKE_FUNCTION            0
```


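What is happening here? When we define a list comprehension, Python &#x1F40D; creates a secret &#x1f92b; function behind the scenes, which has its own local variables. The compiled code for this secret function is stored &#x1FA79; as a constant on the enclosing function's code object (the `LOAD_CONST` above), and `MAKE_FUNCTION` turns it into a callable. The `PRECALL` opcodes suggest this output comes from Python 3.11; from Python 3.12 onwards comprehensions are compiled inline, although the loop variable still does not leak.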

This is our secret function:
```python

Disassembly of <code object <listcomp> at 0x0000011F39B6C100, file      "C:\Users\zoeda\AppData\Local\Temp\ipykernel_11260\3917933206.py", line 4>:
  4           0 RESUME                   0
              2 BUILD_LIST               0
              4 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                 7 (to 22)

  5           8 STORE_FAST               1 (x)

  4          10 LOAD_FAST                1 (x)
             12 LOAD_CONST               0 (3)
             14 BINARY_OP                5 (*)
             18 LIST_APPEND              2
             20 JUMP_BACKWARD            8 (to 6)
        >>   22 RETURN_VALUE
```

Note the `STORE_FAST` for `x` => `x` is local to the comprehension

# The End!

In [92]:
from IPython.display import Image
Image(url= 'https://media.giphy.com/media/g19pV2LZKPaj8BjwwJ/giphy.gif')