# Generators and Map-Reduce
### Generator 
In computer science, a generator is a routine that can be used to control the iteration behaviour of a loop. All generators are also iterators.[1] A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator. 
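
The contrast between returning a whole array and yielding values one at a time can be sketched as follows. This is only an illustration: the names `squares_list` and `squares_gen` are made up for this example and do not come from the original text.

```python
import sys
from typing import Iterator, List

def squares_list(n: int) -> List[int]:
    """Build the complete list of squares before returning it."""
    return [i * i for i in range(n)]

def squares_gen(n: int) -> Iterator[int]:
    """Yield squares one at a time; nothing is computed until requested."""
    for i in range(n):
        yield i * i

as_list = squares_list(100_000)
as_gen = squares_gen(100_000)

# The list holds every value; the generator object only holds its paused state.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True

# The first few values are available without computing the rest.
first_three = [next(as_gen) for _ in range(3)]
print(first_three)  # [0, 1, 4]

# A generator is its own iterator.
print(iter(as_gen) is as_gen)  # True
```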

An example from Wikipedia for learning purposes:

In [19]:
from typing import Iterator
# The Iterator[int] return annotation indicates that the function returns an iterator that yields integers.
def countfrom(n: int) -> Iterator[int]:
    while True:
        yield n
        n += 1

# Example use: printing out the integers from 10 to 15.
# Note that this iteration terminates normally, despite
# countfrom() being written as an infinite loop.

for i in countfrom(10):
    if i <= 15:
        print(i)
    else:
        break

10
11
12
13
14
15


In [17]:
squares = (n * n for n in countfrom(2))

for j in squares:
    if j <= 20:
        print(j)
    else:
        break

4
9
16


In [18]:

# Another generator, which produces prime numbers indefinitely as needed.
import itertools

def primes() -> Iterator[int]:
    """Generate prime numbers indefinitely as needed."""
    yield 2
    n = 3
    p = []
    while True:
        # If dividing n by all the numbers in p, up to and including sqrt(n),
        # produces a non-zero remainder then n is prime.
        if all(n % f > 0 for f in itertools.takewhile(lambda f: f * f <= n, p)):
            yield n
            p.append(n)
        n += 2

prime_generator = primes()
for _ in range(5):
    print(next(prime_generator))

2
3
5
7
11


In [13]:
# Check whether a number is prime by searching for it among the first `number` primes
number = 17
prime_generator = primes()
is_prime = number in itertools.islice(prime_generator, number)
print(is_prime)

True
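
The check above generates the first `number` primes, which is more than it needs. One possible alternative, sketched here under the assumption that the primes arrive in increasing order, is to stop as soon as the generated primes exceed the number being tested. The helper name `is_prime` is hypothetical, and `primes()` is repeated so the block runs on its own.

```python
import itertools
from typing import Iterator

def primes() -> Iterator[int]:
    """Generate prime numbers indefinitely (same logic as the cell above)."""
    yield 2
    n = 3
    found = []
    while True:
        if all(n % f > 0 for f in itertools.takewhile(lambda f: f * f <= n, found)):
            yield n
            found.append(n)
        n += 2

def is_prime(number: int) -> bool:
    """Consume primes only while they are <= number, then test membership."""
    candidates = itertools.takewhile(lambda p: p <= number, primes())
    return number in candidates

print(is_prime(17), is_prime(18), is_prime(1))  # True False False
```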


In [14]:
# make a list of prime numbers
prime_generator = primes()
prime_list = [next(prime_generator) for _ in range(15)]
prime_list

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

### Maps
map() offers an alternative approach rooted in functional programming. You pass in a function and an iterable, and map() returns a lazy iterator (a map object). Iterating over that object, for example with list(), yields the result of applying the function to each element of the iterable.

In [29]:
txns = [1.09, 23.56, 57.84, 4.56, 6.78]
TAX_RATE = .08
def get_price_with_tax(txn):
    return txn * (1 + TAX_RATE)
final_prices = map(get_price_with_tax, txns)
list(final_prices)

[1.1772000000000002, 25.4448, 62.467200000000005, 4.9248, 7.322400000000001]
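
Because a map object is an iterator, it computes values on demand and can be consumed only once. A small sketch of that behaviour, using made-up sample values (`sample_txns` is not from the original notebook):

```python
def get_price_with_tax(txn: float, tax_rate: float = 0.08) -> float:
    return txn * (1 + tax_rate)

sample_txns = [1.00, 2.00, 3.00]  # hypothetical sample data
prices = map(get_price_with_tax, sample_txns)

print(iter(prices) is prices)  # True: a map object is its own iterator

first = next(prices)   # computes only the first element
rest = list(prices)    # consumes the remaining two elements
again = list(prices)   # the iterator is now exhausted

print(len(rest), again)  # 2 []
```

If the results are needed more than once, store them in a list first.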

The same computation can also be written as a list comprehension.

In [30]:
txns = [1.09, 23.56, 57.84, 4.56, 6.78]
TAX_RATE = .08

final_price = [txn * (1 + TAX_RATE) for txn in txns]
final_price

[1.1772000000000002, 25.4448, 62.467200000000005, 4.9248, 7.322400000000001]
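
The notebook title mentions map-reduce, but only the map step appears above. As a rough sketch of the reduce step, `functools.reduce` can fold the mapped values into a single result. The data and the flat 10-cent fee are invented for this example, and integer cents are used to keep the arithmetic exact.

```python
from functools import reduce
import operator

prices_in_cents = [109, 2356, 5784, 456, 678]  # hypothetical sample data

# Map step: add a flat 10-cent fee to every price (lazy, nothing computed yet).
with_fee = map(lambda cents: cents + 10, prices_in_cents)

# Reduce step: fold the mapped values into one total, starting from 0.
total = reduce(operator.add, with_fee, 0)
print(total)  # 9433
```

For a plain sum, the built-in `sum()` does the same job; `reduce` is more useful when the combining function is not simple addition.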

### List Comprehensions vs. For Loops
List comprehensions are not always the fastest option. They are fastest when the goal is to build a list: in that case, repeatedly calling append() inside a for loop takes more time than a list comprehension. For more information, see:

https://towardsdatascience.com/list-comprehensions-vs-for-loops-it-is-not-what-you-think-34071d4d8207

In [20]:
import time
iterations = 100000000
start = time.time()
mylist = []
for i in range(iterations):
    mylist.append(i+1)
end = time.time()
print(end - start)

start = time.time()
mylist = [i+1 for i in range(iterations)]
end = time.time()
print(end - start)

16.440829515457153
10.11851978302002
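
A single `time.time()` measurement can be noisy. A possibly more reliable way to repeat this comparison is the standard-library `timeit` module, sketched below. The size `N` is deliberately much smaller than in the cell above so the sketch runs quickly, and the function names are made up for the example. Timings depend on the machine, so no expected numbers are shown.

```python
import timeit

N = 100_000  # hypothetical size, far smaller than `iterations` above

def with_loop():
    out = []
    for i in range(N):
        out.append(i + 1)
    return out

def with_comprehension():
    return [i + 1 for i in range(N)]

# Both approaches must build the same list for the comparison to be fair.
same_result = with_loop() == with_comprehension()

# Take the best of several repeats to reduce the effect of background noise.
loop_time = min(timeit.repeat(with_loop, number=10, repeat=3))
comp_time = min(timeit.repeat(with_comprehension, number=10, repeat=3))
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```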


However, when one only wants to perform some computation (or call an independent function multiple times) and does not need the resulting list, the list comprehension is slightly slower, because it still builds a list that is then thrown away.

In [23]:
start = time.time()
for i in range(iterations):
    i+1
end = time.time()
print(end - start)

start = time.time()
[i+1 for i in range(iterations)]
end = time.time()
print(end - start)

8.955711364746094
9.692799091339111


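Do not try to build a NumPy array by appending inside a for loop. np.append returns a new copy of the array on every call, so the total work grows quadratically with the number of elements. It may take years :).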

In [None]:
# this takes forever
import numpy as np
myarray = np.array([])
for i in range(iterations):
    myarray = np.append(myarray, i+1)
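
Some common alternatives to repeated `np.append` calls are sketched below: preallocating the array, converting a list once at the end, or letting NumPy generate the values directly. The size `n` is a small hypothetical value chosen so the sketch runs quickly.

```python
import numpy as np

n = 10_000  # hypothetical small size

# Option 1: allocate the array once, then fill it in place.
filled = np.empty(n, dtype=np.int64)
for i in range(n):
    filled[i] = i + 1

# Option 2: build a Python list first and convert it once.
from_list = np.array([i + 1 for i in range(n)], dtype=np.int64)

# Option 3: let NumPy generate the values without a Python loop.
direct = np.arange(1, n + 1, dtype=np.int64)

all_equal = np.array_equal(filled, direct) and np.array_equal(from_list, direct)
print(all_equal)  # True
```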

What’s faster than a for loop or a list comprehension? Array computations! For numerical work on large data, element-by-element for loops, list comprehensions, and .apply() in pandas are usually much slower than array (vectorized) computations, so prefer array computations whenever the operation can be expressed that way.

In [24]:
start = time.time()
# Built-in constructors run in C; note this yields 0..iterations-1 rather than 1..iterations
mylist = list(range(iterations))
end = time.time()
print(end - start)

3.1701016426086426
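
For an actual array computation, the `i + 1` operation from the earlier cells can be written as a single NumPy expression. This is a minimal sketch with a hypothetical size `n`, and it does not reproduce the timings above.

```python
import numpy as np

n = 1_000_000  # hypothetical size

values = np.arange(n, dtype=np.int64)

# One vectorized expression replaces the Python-level loop over `i + 1`.
plus_one = values + 1

print(plus_one[:3].tolist(), int(plus_one[-1]))  # [1, 2, 3] 1000000

# Sanity check against the closed form 1 + 2 + ... + n.
print(int(plus_one.sum()) == n * (n + 1) // 2)  # True
```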


A comparison from ChatGPT:

List comprehensions:

1. Use list comprehensions when you want to create a new list by applying a transformation or filtering to an existing iterable.
2. List comprehensions are concise and provide a compact way to generate lists in a single line of code.
3. They are often used when you need to perform a simple transformation or filtering operation on every element of an iterable.
4. List comprehensions can be more readable and expressive than writing equivalent loops.

Loops (for/while loops):

1. Use loops when you need to repeatedly execute a block of code for a specific number of iterations or until a certain condition is met.
2. Loops are more flexible and can handle complex control flow situations that cannot be easily expressed with list comprehensions.
3. They are suitable when you need to perform more complex operations that involve multiple statements or conditions.
4. Loops can also be used to iterate over an iterable without necessarily creating a new list.

In general, list comprehensions are preferred when the task involves transforming or filtering an iterable to create a new list. They are concise, readable, and often more efficient. However, if the task requires complex logic, multiple statements, or control flow that cannot be easily expressed in a single line, using a loop is more appropriate.
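
The two cases described above can be illustrated side by side. The data and the threshold of 20 are invented for this sketch.

```python
readings = [3, -1, 8, 0, 12, -7, 5]  # hypothetical sample data

# Simple transform plus filter: a list comprehension reads well.
doubled_positives = [2 * r for r in readings if r > 0]
print(doubled_positives)  # [6, 16, 24, 10]

# Several statements, running state, and an early exit: a loop is clearer.
running_total = 0
accepted = []
for r in readings:
    if r < 0:
        continue              # skip negative readings
    running_total += r
    if running_total > 20:
        break                 # stop once the total passes the threshold
    accepted.append(r)
print(accepted, running_total)  # [3, 8, 0] 23
```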