Item 32 Consider Generator Expressions for Large List Comprehensions

Things to Remember
- List comprehensions can cause problems for large inputs by using too much memory.
- Generator expressions avoid memory issues by producing outputs one at a time as iterators.
- Generator expressions can be composed by passing the iterator from one generator expression into the for subexpression of another.
- Generator expressions execute very quickly when chained together and are memory efficient.    


Problem with list comprehensions
- They create a new list instance in memory containing one item for each value in the input sequence.
- For large inputs this can consume a significant amount of memory and may even crash the program.

In [None]:
# the full list of line lengths is built in memory before anything is printed
value = [len(x) for x in open('my_file.txt')]
print(value)
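
To get a rough feel for the cost, the sketch below compares the size of the list built from a small input and from a large one. The in-memory lines are invented stand-ins for `my_file.txt`, and `sys.getsizeof` counts only the list's own storage (not the integers it refers to), so the numbers are a lower bound and will vary between Python versions.

```python
import sys

# Hypothetical stand-ins for a small and a large file: lists of text lines.
small_input = ['line\n'] * 10
large_input = ['line\n'] * 100_000

small_lengths = [len(x) for x in small_input]
large_lengths = [len(x) for x in large_input]

# The list object itself grows with the number of input lines.
small_size = sys.getsizeof(small_lengths)
large_size = sys.getsizeof(large_lengths)
print(small_size, large_size)
```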

In [None]:
# solution - using generator expressions
# - put list-comprehension-like syntax between () characters
#   to create a generator expression
# - the generator expression immediately evaluates to an
#   iterator; no lines are read until the iterator is advanced
it = (len(x) for x in open('my_file.txt'))
print(it)


In [None]:
print(next(it))
print(next(it))
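
The laziness can be checked without a real file. This is a minimal sketch that assumes `io.StringIO` is an acceptable stand-in for `my_file.txt`; the sample text is made up. Each `next` call consumes exactly one line, and whatever has not been requested is still waiting in the source.

```python
import io

# Hypothetical stand-in for my_file.txt: three lines of sample text.
fake_file = io.StringIO('alpha\nbe\ngamma!\n')

lengths = (len(x) for x in fake_file)

first = next(lengths)    # consumes only 'alpha\n'
second = next(lengths)   # consumes only 'be\n'

# The third line was never requested, so it is still unread in the source.
remaining = fake_file.read()
print(first, second, repr(remaining))
```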

In [None]:
# you can compose multiple generator expressions together
roots = ((x, x**0.5) for x in it)
# - when this iterator is advanced the 
#   interior iterator is also advanced
print(next(roots))
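
A self-contained version of the same composition, again using an invented `io.StringIO` source in place of `my_file.txt`. The line lengths are chosen to be perfect squares so the roots are easy to read. Draining the outer iterator also drains the inner one.

```python
import io

# Hypothetical stand-in for my_file.txt; the lines have lengths 4 and 9.
fake_file = io.StringIO('abc\nabcdefgh\n')

lengths = (len(x) for x in fake_file)
roots = ((x, x**0.5) for x in lengths)

# Pulling from roots pulls from lengths, which pulls from the file.
pairs = list(roots)
print(pairs)

# The inner iterator was advanced along with the outer one.
leftover = next(lengths, None)
print(leftover)
```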

Benefits of composing multiple generator expressions

- Advancing the outer iterator creates a domino effect of looping, evaluating conditional expressions, and passing around inputs and outputs.
- All of this is done as memory-efficiently as possible.
- Chaining is a great choice when you need to compose functionality that operates on a large stream of input.
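
The domino effect can be made visible by recording which stage handles which value. In this sketch, `trace` and `events` are hypothetical helpers added only for illustration. Each value passes through the filter and then the squaring stage before the next value is pulled, so no intermediate list is ever built.

```python
events = []  # records the order in which each stage handles a value


def trace(stage, value):
    """Hypothetical helper: log the stage and value, then pass the value on."""
    events.append((stage, value))
    return value


numbers = range(1, 5)
# First stage: keep even numbers only (the conditional expression).
evens = (trace('filter', n) for n in numbers if n % 2 == 0)
# Second stage: square whatever the first stage lets through.
squares = (trace('square', n * n) for n in evens)

result = list(squares)
print(result)
for event in events:
    print(event)
```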

Only gotcha:
- The iterators returned by generator expressions are stateful, so take care not to use them more than once (see Item 31 for more info).
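
A short sketch of what "stateful" means in practice. The input lines and the `make_lengths` helper are invented for illustration; building a fresh generator per pass is one possible workaround, not the only one.

```python
lines = ['a\n', 'bb\n', 'ccc\n']  # hypothetical sample input

lengths = (len(x) for x in lines)
first_pass = list(lengths)    # consumes the iterator
second_pass = list(lengths)   # already exhausted, so nothing is left
print(first_pass, second_pass)


def make_lengths():
    """Hypothetical helper: return a fresh generator on every call."""
    return (len(x) for x in lines)


# Each call yields an independent iterator, so both passes see all values.
fresh_first = list(make_lengths())
fresh_second = list(make_lengths())
print(fresh_first, fresh_second)
```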

