# Introduction

# Iterables

An **iterable** is an object capable of returning **its members one by one**. In other words, an iterable is anything that you can loop over with a *for* loop in Python.

## Sequences

Sequences are a very common type of iterable. Some examples of built-in sequence types are **lists, strings, and tuples**.

In [102]:
numbers = [10, 12, 15, 18, 20]
fruits = ["pineapple", "apple", "lemon", "strawberry", "orange", "kiwi"]
message = "I love Python ❤️"

for num in numbers:
    print(num)

10
12
15
18
20


### Extracting Subset Sequence

You can use indexes to get one or more elements from a sequence. In Python, indexes start from 0, so the first element of a list has index 0. We can also use **negative** indexes to access elements: the last element in the sequence has index **-1**, the one before it has index **-2**, and so on. Python also supports slicing, which extracts multiple elements from a sequence. We can use it like this: sliceable[start_index:end_index:step].


1. The **start_index** is the index where the slice begins. The element at this index is **included** in the result. The default is 0 when the step is positive and the index of the last element when the step is negative.
2. The **end_index** is the index where the slice stops. The element at this index is **not included** in the result. The default is the length of the sequence when the step is positive. When the step is negative, the default is the position just before the first element, so omitting it returns everything from the start index back to the beginning of the sequence.
3. The **step** is the amount by which the index changes; the default value is 1. If we set a **negative value** for the step, we move backward.
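
To see which values Python fills in for omitted slice parts, one option is the built-in `slice.indices()` method, which resolves a slice against a given length. The sketch below reuses the `fruits` list from this notebook; the names `forward` and `backward` are only illustrative.

```python
fruits = ["pineapple", "apple", "lemon", "strawberry", "orange", "kiwi"]

# slice.indices(length) returns the concrete (start, stop, step) for that length
forward = slice(None, None, 1).indices(len(fruits))
backward = slice(None, None, -1).indices(len(fruits))

print(forward)   # (0, 6, 1)
print(backward)  # (5, -1, -1)

# Feeding the resolved values to range() visits the same elements as the slice
print([fruits[i] for i in range(*backward)] == fruits[::-1])  # True
```

Note that the stop value of -1 in the backward case is a range-style bound meaning "just before index 0". Writing fruits[5:-1:-1] literally gives an empty list, because a -1 typed inside a slice refers to the last element.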


**[Index Access]** Sequences support efficient element access using integer indices via the **\__getitem\__()** special method (indexing) and define a **\__len\__()** method that returns the length of the sequence.

In [103]:
print(numbers[0])
print(fruits[2])
print(message[-2])

10
lemon
❤


**[Slicing Access]** Also, we can use the slicing technique on them.

In [104]:
# Slicing the sequences
print(f"numbers[:2] -> {numbers[:2]}")
print(f"numbers[0:2] -> {numbers[0:2]}")
print(f"numbers[0:4:2] -> {numbers[0:4:2]}")
print(f"numbers[4:0:-1] -> {numbers[4:0:-1]}")
print(f"fruits[1:] -> {fruits[1:]}")
print(f"fruits[::] -> {fruits[::]}")
print(f"fruits[0:2] -> {fruits[0:2]}")
print(f"fruits[-2:-1] -> {fruits[-2:-1]}")
print(f"fruits[3:] -> {fruits[3:]}")
print(f"fruits[:4] -> {fruits[:4]}")
print(f"fruits[:]: {fruits[:]}")
print(f"fruits[::-1] -> {fruits[::-1]}")
print(f"fruits[::-2] -> {fruits[::-2]}")
print(f"fruits[::2] -> {fruits[::2]}")

# Understanding some default values
print(f"fruits[0:6:1] -> {fruits[0:6:1]}")
print(f"fruits[-1:-7:-1] -> {fruits[-1:-7:-1]}")

numbers[:2] -> [10, 12]
numbers[0:2] -> [10, 12]
numbers[0:4:2] -> [10, 15]
numbers[4:0:-1] -> [20, 18, 15, 12]
fruits[1:] -> ['apple', 'lemon', 'strawberry', 'orange', 'kiwi']
fruits[::] -> ['pineapple', 'apple', 'lemon', 'strawberry', 'orange', 'kiwi']
fruits[0:2] -> ['pineapple', 'apple']
fruits[-2:-1] -> ['orange']
fruits[3:] -> ['strawberry', 'orange', 'kiwi']
fruits[:4] -> ['pineapple', 'apple', 'lemon', 'strawberry']
fruits[:]: ['pineapple', 'apple', 'lemon', 'strawberry', 'orange', 'kiwi']
fruits[::-1] -> ['kiwi', 'orange', 'strawberry', 'lemon', 'apple', 'pineapple']
fruits[::-2] -> ['kiwi', 'strawberry', 'apple']
fruits[::2] -> ['pineapple', 'lemon', 'orange']
fruits[0:6:1] -> ['pineapple', 'apple', 'lemon', 'strawberry', 'orange', 'kiwi']
fruits[-1:-7:-1] -> ['kiwi', 'orange', 'strawberry', 'lemon', 'apple', 'pineapple']


## Other Iterables
Many things in Python are iterables, but not all of them are sequences. **Dictionaries, file objects, sets, and generators** are all iterables, but none of them is a sequence.

In [105]:
my_set = {2, 3, 5}
my_dict = {"name": "Ventsislav", "age": 24}
# my_file = open("*.*")
squares = (n**2 for n in my_set)
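
One way to check the distinction between iterables and sequences is with the abstract base classes in the standard library's `collections.abc` module. This is a small sketch; the `samples` and `report` names are made up for illustration.

```python
from collections.abc import Iterable, Sequence

samples = {
    "list": [1, 2],
    "str": "ab",
    "tuple": (1, 2),
    "set": {1, 2},
    "dict": {"a": 1},
    "generator": (n for n in range(2)),
}

# Map each type name to (is it iterable?, is it a sequence?)
report = {
    name: (isinstance(obj, Iterable), isinstance(obj, Sequence))
    for name, obj in samples.items()
}

for name, (is_iterable, is_sequence) in report.items():
    print(f"{name}: iterable={is_iterable}, sequence={is_sequence}")
```

All six objects report as iterable, but only the list, string, and tuple report as sequences.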

## Python’s for loops don’t use indices

By default, a for loop does not use indices to iterate over an object. If the data is a sequence, you can use the following snippet to generate the indices yourself and access each element by index.

In [106]:
index = 0
numbers = [1, 2, 3, 4, 5]
while index < len(numbers):
    print(numbers[index])
    index += 1

1
2
3
4
5


However, what about the non-sequence objects? They don’t support indexing, so this approach will not work for them.

In [107]:
index = 0
numbers = {1, 2, 3, 4, 5} # a set, rather than a sequence
while index < len(numbers):
    print(numbers[index])
    index += 1

TypeError: 'set' object is not subscriptable

Hmm, but how does Python’s for loop work on these iterables then? We can see that it works with sets.

In [108]:
numbers = {1, 2, 3, 4, 5}
for number in numbers:
    print(number)

1
2
3
4
5


# Iterators

An iterator is an object representing a stream of data. You can create an iterator object by applying the **iter()** built-in function to an **iterable**.

In [109]:
numbers = [10, 12, 15, 18, 20]
fruits = ("apple", "pineapple", "blueberry")
message = "I love Python ❤️"

print(iter(numbers))
print(iter(fruits))
print(iter(message))

<list_iterator object at 0x00000234B85455C8>
<tuple_iterator object at 0x00000234B8545108>
<str_iterator object at 0x00000234B8545E48>


### How does an Iterator work?

You can use an iterator to manually loop over the iterable it came from. 

1. Apply the **iter()** built-in function to an **iterable** to create an iterator.
2. Pass the iterator repeatedly to the built-in function **next()** to get successive items in the stream.
3. Once you have consumed an item from the iterator, it’s gone. When no more data is available, a **StopIteration** exception is raised.

In [110]:
values = [10, 20, 30]
iterator = iter(values) # Step 1, create an iterator.
print(next(iterator)) # Step 2. Get next element from the iterator.
print(next(iterator))
print(next(iterator))
print(next(iterator)) # Step 3. A StopIteration is raised because all elements have been consumed.

10
20
30


StopIteration: 
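
If the exception is not wanted, `next()` also accepts a second argument that is returned when the iterator is exhausted. The sketch below uses a hypothetical `_MISSING` sentinel object so that a legitimate value such as `None` in the data is not mistaken for the end of the stream.

```python
values = [10, 20, 30]
iterator = iter(values)
_MISSING = object()  # hypothetical sentinel; any unique object works

collected = []
while True:
    # The second argument is returned instead of raising StopIteration
    item = next(iterator, _MISSING)
    if item is _MISSING:
        break
    collected.append(item)

print(collected)  # [10, 20, 30]
```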

### Additional Notes on Iterators

#### The Relationship and Difference between Iterable and Iterator

1. An **iterable** is something you can loop over.
2. An **iterator** is an object representing a stream of data. It does the iterating over an **iterable**.
3. Additionally, in Python, the iterators are also iterables which act as their own iterators.
4. However, the difference is that iterators don’t have some of the features that some iterables have. **They don’t have length and can’t be indexed.**
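
Points 3 and 4 can be checked directly. In this short sketch, calling `iter()` on an iterator hands back the same object, while calling it on a list creates a fresh iterator each time.

```python
numbers = [100, 200, 300]
iterator = iter(numbers)

# An iterator is its own iterator: iter() returns the very same object
is_own_iterator = iter(iterator) is iterator
print(is_own_iterator)  # True

# A list is iterable but not an iterator: each iter() call builds a new one
makes_fresh_iterators = iter(numbers) is not iter(numbers)
print(makes_fresh_iterators)  # True

# Because the iterator is consumed as it goes, a second pass finds nothing
first_pass = list(iterator)
second_pass = list(iterator)
print(first_pass)   # [100, 200, 300]
print(second_pass)  # []
```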





In [111]:
numbers = [100, 200, 300]
iterator = iter(numbers)
print(len(iterator)) ## There is no length of an iterator.

TypeError: object of type 'list_iterator' has no len()

In [112]:
numbers = [100, 200, 300]
iterator = iter(numbers)
print(iterator[0]) ## Iterator cannot be indexed

TypeError: 'list_iterator' object is not subscriptable

#### Iterators are lazy

Iterators allow us to both work with and create **lazy iterables** that don’t do any work until we ask them for their next item.

Because of their laziness, iterators help us deal with the following two scenarios:

1. Infinitely long iterables
2. Loading and iterating over a huge amount of data piece by piece, which saves a lot of memory and CPU time. This matters especially when there is not enough memory to hold the entire dataset, because an iterator can give us the next item each time we ask for it.
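
For the first scenario, the standard library's `itertools.count` produces an endless stream of integers, and `itertools.islice` takes a finite piece of it. This is only a minimal sketch; the `evens` and `first_five` names are illustrative.

```python
from itertools import count, islice

# count() never ends, but nothing is computed until an item is requested
evens = (n for n in count(start=0) if n % 2 == 0)

# islice() pulls just the first five items from the infinite stream
first_five = list(islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]

# The stream remembers its position and continues from there
print(next(evens))  # 10
```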

Many people use Python to solve Data Science problems. In some cases, the data you work with can be very large, and we can’t load all of it into memory.
The solution is to load the **data in chunks**, perform the desired operations on each chunk, discard the chunk, and load the next chunk of data. In other words, we need an iterator. We can achieve this with the *read_csv* function in pandas by specifying the *chunksize* parameter.

**Example: Loading Large DataSets**

In [113]:
import pandas as pd

# Initialize an empty dictionary
counts_dict = {}

# Iterate over the file chunk by chunk
for chunk in pd.read_csv("data/iris.csv", chunksize = 10):
    # Iterate over the "Species" column in DataFrame
    for entry in chunk["Species"]:
        if entry in counts_dict.keys():
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1

# Print the populated dictionary
print(counts_dict)

{'Iris-setosa': 50, 'Iris-versicolor': 50, 'Iris-virginica': 50}
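
The same lazy counting idea can be sketched with only the standard library. The example below is an assumption-laden stand-in, not the notebook's data: a small in-memory CSV string (`csv_text`) replaces the iris file so the code runs on its own, and `collections.Counter` replaces the manual dictionary bookkeeping.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a large CSV file on disk
csv_text = "Species\nIris-setosa\nIris-setosa\nIris-virginica\n"

# csv.DictReader is an iterator: it yields one row at a time
reader = csv.DictReader(io.StringIO(csv_text))

# The generator expression feeds rows to Counter without building a list first
counts = Counter(row["Species"] for row in reader)
print(dict(counts))  # {'Iris-setosa': 2, 'Iris-virginica': 1}
```

With a real file, `io.StringIO(csv_text)` would be replaced by an open file object, which is itself read lazily line by line.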


### Iterators are everywhere
We have seen some examples of iterators. Moreover, many of Python’s built-ins are iterators or return them, including:

1. **enumerate**
2. **reversed**
3. **zip**
4. **map**
5. **filter**
6. **open** (the file object it returns is an iterator)



In [114]:
## enumerate example
fruits = ("apple", "pineapple", "blueberry")
iterator = enumerate(fruits)
print(type(iterator))
print(next(iterator))

for index, ele in enumerate(fruits):
    print(index, ele)

<class 'enumerate'>
(0, 'apple')
0 apple
1 pineapple
2 blueberry


In [115]:
# Reversed examples
fruits = ("apple", "pineapple", "blueberry")
iterator = reversed(fruits)
print(type(iterator))
print(next(iterator))
for ele in reversed(fruits):
    print(ele)

<class 'reversed'>
blueberry
blueberry
pineapple
apple


In [116]:
# Zip example
numbers = [1, 2, 3]
squares = [1, 4, 9]
iterator = zip(numbers, squares)
print(type(iterator))
print(next(iterator))
print(next(iterator))

for n, s in zip(numbers, squares):
    print(n, s)

<class 'zip'>
(1, 1)
(2, 4)
1 1
2 4
3 9


In [117]:
# Map examples

numbers = [1, 2, 3, 4, 5]
squared = map(lambda x: x**2, numbers)
print(type(squared))
print(next(squared))
print(next(squared))

<class 'map'>
1
4


In [118]:
# Filter Example

numbers = [-1, -2, 3, -4, 5]
positive = filter(lambda x: x > 0, numbers)
print(type(positive))
print(next(positive))

<class 'filter'>
3


In [119]:
# Moreover, the file objects in Python are also iterators.
file = open("data/example.txt")
print(type(file))
print(next(file))
print(next(file))
print(next(file))
file.close()

<class '_io.TextIOWrapper'>
da

a

b



In [120]:
# We can also iterate over the key-value pairs of a Python dictionary using the items() method.
# Note: items() returns a view object, which is iterable but is not itself an iterator.
my_dict = {"name": "Ventsislav", "age": 24}
iterator = my_dict.items()
print(type(iterator))
for key, item in iterator:
    print(key, item)

<class 'dict_items'>
name Ventsislav
age 24


In [121]:
def custom_for_loop(iterable, action_to_do):
    iterator = iter(iterable) # Step 1
    done_looping = False
    while not done_looping:
        try:
            item = next(iterator) # Step 2
        except StopIteration:
            done_looping = True # Step 4
        else:
            action_to_do(item) # Step 3

In [122]:
# Let’s try to use this function with a set of numbers and the print built-in function.

numbers = {1, 2, 3, 4, 5}
custom_for_loop(numbers, print)

1
2
3
4
5


We can see that the function we’ve defined works very well with sets, which are not sequences. This time we can pass **any iterable** and it will work. 

Under the hood, all forms of **looping over iterables** in Python work this way.

## Creating a custom iterator by defining a Class

## Defining an iterator the conventional way

In some cases, we may want to create a **custom** iterator. We can do that by defining a class that has **\__init\__**, **\__next\__**, and **\__iter\__** methods. Let’s try to create a custom iterator class that generates numbers between a min value and a max value.

In [123]:
class generate_numbers:
    def __init__(self, min_value, max_value):
        self.current = min_value
        self.high = max_value

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.high:
            raise StopIteration
        else:
            self.current += 1
            return self.current - 1

In [124]:
numbers = generate_numbers(40, 50)
print(type(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))

<class '__main__.generate_numbers'>
40
41
42


## Generator Functions and Generator Expressions

Usually, we use a generator function or generator expression when we want to create a custom iterator. They are simpler to use and need less code to achieve the same result.

### Generator Functions

**[Definition]** A function which returns a generator iterator. It looks like a normal function except that it contains **yield** expressions for producing a series of values usable in a **for-loop** or that can be retrieved one at a time with the **next()** function.

Now, we can try to re-create our custom iterator using a generator function.

In [125]:
def generate_numbers(min_value, max_value):
    while min_value <= max_value:  # inclusive upper bound, like the class version
        yield min_value
        min_value += 1

In [126]:
numbers = generate_numbers(10, 20)
print(type(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))

<class 'generator'>
10
11
12


The yield expression is what separates a generator function from a normal function. It lets us take advantage of the iterator’s laziness: values are produced only when they are requested.

**[Definition]** Each *yield* temporarily suspends processing, remembering the execution state (including local variables and pending try-statements). When the generator iterator resumes, it picks up where it left off (in contrast to functions which start fresh on every invocation).
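
The suspend-and-resume behavior can be made visible by recording what the generator does. In this sketch, the `countdown` function and the `events` list are hypothetical names used only for illustration.

```python
events = []  # records what the generator body does, in order

def countdown(start):
    events.append("started")
    while start > 0:
        events.append(f"yielding {start}")
        yield start          # execution pauses here until the next request
        start -= 1
    events.append("finished")

gen = countdown(2)
print(events)      # [] because calling the function runs none of its body

first = next(gen)  # runs up to the first yield
print(first, events)   # 2 ['started', 'yielding 2']

second = next(gen)     # resumes after the yield with start still remembered
remaining = list(gen)  # runs to the end and finds nothing more to yield
print(second, remaining, events)
```

The final `events` list shows that the body ran exactly once, in pieces, rather than restarting on every `next()` call.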

### Generator Expressions

**Generator expressions** are very similar to **list comprehensions**. Just like list comprehensions, generator expressions are concise. In most cases, they are written in one line of code.

**[Definition]** An expression that returns an iterator. It looks like a normal expression followed by a *for* clause defining a loop variable, range, and an optional *if* clause.

The **general formula** is:  (*output expression* **for** *iterator variable* **in** *iterable*)

Let’s see how we can define a simple generator expression.

In [127]:
numbers = [1, 2, 3, 4, 5]
squares = (number**2 for number in numbers)
print(type(squares))
print(next(squares))
print(next(squares))
print(next(squares))

<class 'generator'>
1
4
9


We can also add a **conditional expression** on the iterable. We can do it like this:

In [128]:
numbers = [1, 2, 3, 4, 5]
squares = (number**2 for number in numbers if number % 2 == 0)
print(type(squares))
print(list(squares))

<class 'generator'>
[4, 16]


There can be **multiple conditional expressions** on the **iterable** for more complex filtering.

In [129]:
numbers = [1, 2, 3, 4, 5]
squares = (number**2 for number in numbers if number % 2 == 0 if number % 4 == 0)
print(type(squares))
print(list(squares))

<class 'generator'>
[16]


Also, we can add an **if-else clause** on the **output expression** like this:

In [130]:
numbers = [1, 2, 3, 4, 5]
result = ("even" if number % 2 == 0 else "odd" for number in numbers)
print(type(result))
print(list(result))

<class 'generator'>
['odd', 'even', 'odd', 'even', 'odd']


# Understanding how Python’s for loop works

Under the hood, Python’s for loop uses **iterators**. Now that we know what iterables and iterators are and how to use them, we can define a function that loops through an iterable **without using a for loop**. The custom_for_loop function shown earlier follows this procedure:
1. Create an iterator from the given iterable
2. Repeatedly get the next item from the iterator
3. Execute the wanted action
4. Stop looping if we get a StopIteration exception while trying to get the next item
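
To observe these steps inside a real for loop, one option is an iterator that logs each special-method call. The `TracingIterator` class below is a hypothetical helper written for this sketch, not part of Python.

```python
class TracingIterator:
    """Iterator over a copy of `items` that records which methods get called."""

    def __init__(self, items):
        self._items = list(items)
        self._position = 0
        self.calls = []

    def __iter__(self):
        self.calls.append("__iter__")
        return self

    def __next__(self):
        self.calls.append("__next__")
        if self._position >= len(self._items):
            raise StopIteration  # tells the for loop to stop
        item = self._items[self._position]
        self._position += 1
        return item


tracer = TracingIterator(["a", "b"])
seen = []
for item in tracer:
    seen.append(item)

print(seen)          # ['a', 'b']
print(tracer.calls)  # ['__iter__', '__next__', '__next__', '__next__']
```

The log shows one `__iter__` call followed by three `__next__` calls for two items: the third call is the one that raises StopIteration and ends the loop.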

# Comprehension

Comprehensions in Python provide a short and concise way to construct new collections (such as lists, sets, and dictionaries) from iterables that have already been defined. This idea is borrowed from the functional programming language Haskell. Python supports the following 4 types of comprehensions:

1. List comprehension
2. Set comprehension
3. Dictionary Comprehensions
4. Generator comprehension


The general form is:

```
( | { | [    expression for expr1 in sequence1
             if condition1
             for expr2 in sequence2
             if condition2
             ...
             for exprN in sequenceN
             if conditionN    ] | } | )
```
To put it another way, comprehensions are equivalent to the following Python code:
```
for expr1 in sequence1:
    if not (condition1):
        continue   # Skip this element
    for expr2 in sequence2:
        if not (condition2):
            continue   # Skip this element
        ...
        for exprN in sequenceN:
            if not (conditionN):
                continue   # Skip this element

            # Output the value of
            # the expression.
```
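
As a concrete instance of this equivalence, the sketch below builds the same list twice, once with a comprehension that has two for clauses and two conditions, and once with the expanded loops. The variable names and the conditions are arbitrary choices for illustration.

```python
# Comprehension with two for clauses, each followed by a condition
pairs = [(x, y) for x in range(4) if x % 2 == 0 for y in range(3) if y != x]

# The same logic written as nested loops with "continue" for skipped elements
loop_pairs = []
for x in range(4):
    if not (x % 2 == 0):
        continue  # skip odd x
    for y in range(3):
        if not (y != x):
            continue  # skip pairs where both values are equal
        loop_pairs.append((x, y))

print(pairs)                # [(0, 1), (0, 2), (2, 0), (2, 1)]
print(pairs == loop_pairs)  # True
```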



## List comprehension

List Comprehensions provide an elegant way to create new lists. The following is the basic structure of a list comprehension:

output_list = **[output_exp for var in input_list if (var satisfies this condition)]**

Note that the if condition is optional. A list comprehension can also contain multiple for clauses, and comprehensions can be nested inside one another.

In [131]:
set_list = {1, 2, 3, 4, 5}
new_list = [i + 1 for i in set_list]
print(f"new_list: {type(new_list)} -> {new_list}")

new_list: <class 'list'> -> [2, 3, 4, 5, 6]
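
The example above has no condition and a single for clause. The following sketch, using a made-up `matrix`, shows a comprehension with two for clauses plus a condition, and a comprehension nested inside another one.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

# Two for clauses and a condition: flatten the rows, keeping even values only
flat_evens = [value for row in matrix for value in row if value % 2 == 0]
print(flat_evens)  # [2, 4, 6]

# A comprehension nested inside another: swap rows and columns
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```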


## Dictionary Comprehensions

Extending the idea of list comprehensions, we can also create a dictionary using dictionary comprehensions. The basic structure of a dictionary comprehension looks like below.

output_dict = **{key:value for (key, value) in iterable if (key, value satisfy this condition)}**

In [132]:
set_list = {1, 2, 3, 4, 5}
dict_using_comp = {var:var ** 3 for var in set_list if var % 2 != 0}
print(f"[1] dict_using_comp: {type(dict_using_comp)} -> {dict_using_comp}")


state = ['Gujarat', 'Maharashtra', 'Rajasthan']
capital = ['Gandhinagar', 'Mumbai', 'Jaipur']
dict_using_comp = {key:value for (key, value) in zip(state, capital)} 
print(f"[2] dict_using_comp: {type(dict_using_comp)} -> {dict_using_comp}")

[1] dict_using_comp: <class 'dict'> -> {1: 1, 3: 27, 5: 125}
[2] dict_using_comp: <class 'dict'> -> {'Gujarat': 'Gandhinagar', 'Maharashtra': 'Mumbai', 'Rajasthan': 'Jaipur'}


## Set Comprehensions
Set comprehensions are very similar to list comprehensions. The differences are that set comprehensions use curly brackets { } and produce a set, so duplicate values are dropped.

output_set = **{output_exp for var in input_list if (var satisfies this condition)}**

In [133]:
set_list = {1, 2, 3, 4, 5}
new_set_list = {i + 1 for i in set_list}
print(f"new_set_list: {type(new_set_list)} -> {new_set_list}")

new_set_list: <class 'set'> -> {2, 3, 4, 5, 6}
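
Because the result is a set, repeated values collapse into one. A short sketch with an illustrative `words` list compares the two kinds of comprehension on the same input.

```python
words = ["apple", "lemon", "kiwi", "pear"]

# The list comprehension keeps every result, duplicates included
length_list = [len(word) for word in words]
print(length_list)  # [5, 5, 4, 4]

# The set comprehension keeps each distinct result once
length_set = {len(word) for word in words}
print(length_set == {4, 5})  # True
```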


## Generator Comprehensions

Generator comprehensions are very similar to list comprehensions. One difference between them is that generator comprehensions use parentheses whereas list comprehensions use square brackets. The major difference is that generators don’t allocate memory for the whole list. Instead, they generate each value one by one, which is why they are memory efficient.

output_generator = **(output_exp for var in input_list if (var satisfies this condition))**

Let’s look at the following example to understand generator comprehension:

In [134]:
set_list = {1, 2, 3, 4, 5}
new_list_generator = (i + 1 for i in set_list)
print(f"new_list_generator: {type(new_list_generator)} -> {next(new_list_generator)}")

new_list_generator: <class 'generator'> -> 2
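
The memory difference can be roughly illustrated with `sys.getsizeof`. This is only a sketch: `getsizeof` measures the container object itself rather than the integers it refers to, and the exact byte counts vary between Python versions, so the example compares the sizes instead of printing them.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]  # all values are stored up front
squares_gen = (i * i for i in range(n))   # values are produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Both produce the same values; the generator just yields them one at a time
print(sum(squares_gen) == sum(squares_list))  # True
```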


# Summary
1. List comprehensions provide us with a simple way to create a list based on some iterable.
2. Comprehensions are often more efficient than an equivalent for loop.
3. We can use conditional statements in the comprehensions.
4. Comprehensions are a good alternative to the built-in map and filter functions.
5. We can have nested comprehensions.
6. In Python, we have also dictionary comprehensions and set comprehensions.
7. Generator expressions are preferable when we work with an infinite stream or a very large amount of data.
8. An iterable is something you can loop over.
9. Sequences are a very common type of iterable.
10. Many things in Python are iterables, but not all of them are sequences.
11. An iterator is an object representing a stream of data. It does the iterating over an iterable. You can use an iterator to get the next value or to loop over it. Once you have looped over an iterator, it has no more values to give.
12. Iterators use the lazy evaluation approach.
13. Many built-in classes in Python are iterators.
14. A generator function is a function which returns an iterator.
15. A generator expression is an expression that returns an iterator.