## For loop & List comprehension

We have studied how to create and modify some built-in sequences (list, tuple, range). In practice, when we have a sequence of items, we usually need to perform some operation on every element, or filter the sequence to find the subset of elements that matches our needs. A `while` loop can do this, but it requires managing an index by hand and is harder to read. Luckily, Python has us covered with the `for` loop and its clean syntax. Building on the `for` loop, Python offers list comprehensions, which further reduce the number of lines to read while remaining reasonably readable.

### Prelude

Before diving into `for` loops, let's start with a problem that requires working with every element of a sequence: word counting. To count words, we need to split the text at the word separator, which is usually a single space `" "`. In Python, the `str` class offers the `.split()` method, which splits a string by a given delimiter. Here the delimiter is a single space `" "`, so we come up with the following snippet:

In [17]:
s = """
This is an example paragraph. This is another sentence.
"""

words = s.split(" ")
word_count = len(words)
word_count

9

Does it still work properly when a jammed spacebar registers the space character several times?

In [19]:
s = """
This is an example paragraph. This is    another sentence.
"""

words = s.split(" ")
word_count = len(words)
word_count

12

In [21]:
"is    another".split(" ")

['is', '', '', '', 'another']

Apparently not. Splitting `"is    another"` by `" "` yields the list `['is', '', '', '', 'another']`: every extra space produces an empty string, so there are 3 redundant empty strings that we need to discard. Fortunately, Python offers a rather neat way to clean them up: calling `.split()` without any argument, which splits on runs of whitespace and drops the empty strings. Take a look at the updated code:

In [3]:
s = """
This is an example paragraph. This is    another sentence.
"""

words = s.split()
word_count = len(words)
word_count

9
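
To see what the argument-free form does beyond spaces, here is a minimal sketch. The string `messy` and the variable names are illustrative and not part of the lesson's data; the point is only that `.split()` treats tabs and newlines as separators too, and ignores leading and trailing whitespace.

```python
messy = "  tabs\tand\nnewlines   too  "  # illustrative string, not from the lesson

# Splitting on a single space keeps empty strings and leaves \t and \n inside words
naive_words = messy.split(" ")

# Splitting without an argument breaks on any run of whitespace
clean_words = messy.split()

print(len(naive_words), len(clean_words))
print(clean_words)  # ['tabs', 'and', 'newlines', 'too']
```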

Recall that a string is an immutable sequence of characters, so, as with tuples, we can call `len()` on a string and perform indexing and slicing.

In [5]:
# word length
word = "Mine"
len(word)

4
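
Indexing and slicing on a string work the same way as on a tuple. The short sketch below reuses the word `"Mine"`; the attempted assignment at the end is only there to illustrate that strings, like tuples, cannot be modified in place.

```python
word = "Mine"

print(word[0])    # M   (first character)
print(word[-1])   # e   (last character)
print(word[1:3])  # in  (characters at index 1 and 2)

# Like a tuple, a string does not support item assignment
try:
    word[0] = "F"
except TypeError as error:
    print("cannot modify a string in place:", error)
```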

### The for-loop

Let's update our problem a little bit. Assume that every word of 2 characters or fewer is a **stop word**, which does not contribute much to the overall meaning of our text. Our job now is to get rid of every such word.

A naive approach would be to:
- Iterate over every word in the list of words
  - If a word has more than 2 characters: note it down

The `for` loop in Python expresses exactly this procedure. Have a look at the syntax:
```
for <element> in <collection>:
    # ... do sth with element
```

Now let's reproduce our procedure using Python:

In [9]:
# need to get only words that are more than 2 characters
s = """
This is an example paragraph. This is    another sentence.
"""

words = s.split()
non_stopwords = []
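# print(word)  # the loop has not assigned `word` yet: in a fresh session this raises
#              # NameError, and in a notebook it prints a stale value from an earlier cell

for word in words:  # each element of `words` is assigned to `word` in turn
    if len(word) > 2:
        non_stopwords.append(word)

print(non_stopwords)
print(word)  # still available after the loop: holds the last element

['This', 'example', 'paragraph.', 'This', 'another', 'sentence.']
sentence.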


Unlike many other languages, Python keeps the loop variable available after the `for` loop has finished. Once the loop has run over a non-empty collection, the loop variable holds the last element.
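
A small sketch of this behaviour, including the edge case of an empty collection, where the loop body never runs and the variable is never assigned. The names `item` and `leftover` are illustrative.

```python
for item in ["a", "b", "c"]:
    pass  # nothing to do; we only care about `item` afterwards
print(item)  # c

# Looping over an empty collection never assigns the loop variable
for leftover in []:
    pass

try:
    print(leftover)
    leftover_defined = True
except NameError:
    leftover_defined = False
    print("`leftover` was never assigned")
```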

A humorous comparison of Python loops with the route to success in real life:

While - Measure of success => Used when only a condition to break the loop is known

For - Steps to success => Used when you go through a collection
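
The difference can be sketched side by side. Both loops below visit the same elements; the list `fruits` and the variable names are illustrative assumptions, not part of the lesson's data.

```python
fruits = ["apple", "banana", "cherry"]  # illustrative data

# while: we only state the stopping condition and manage the index ourselves
visited_while = []
i = 0
while i < len(fruits):
    visited_while.append(fruits[i])
    i += 1

# for: Python hands us each element of the collection in turn
visited_for = []
for fruit in fruits:
    visited_for.append(fruit)

print(visited_while == visited_for)  # True
```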

**Challenge**: Given a list of integers, find the maximum value.

In [10]:
numbers = [20, 1, 4, 8, 100, 150, 24, 18]
# tuple: (20, 1, 4, 8, 100, 150, 24, 18)

# find the max value of `numbers`
# use the first element as the initial candidate
max_value = numbers[0] 
# for each `number` in `numbers`
# if `number` > `max_value`: max_value = number
for number in numbers:
    if number > max_value:
        max_value = number

print("max_value:", max_value)

max_value: 150


### `enumerate`

Sometimes we also need to work with the indices of the elements in a sequence. Python provides `enumerate()` for exactly that. It is used as follows:

In [22]:
# enumerate() returns an iterator of (index, element) tuples; list() collects them for printing
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

print(list(enumerate(numbers)))

[(0, 20), (1, 1), (2, 4), (3, 8), (4, 100), (5, 150), (6, 24), (7, 18)]
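
Two details are worth a quick sketch: `enumerate()` produces its pairs lazily, one at a time, and it accepts an optional `start` argument for counting from a number other than 0. The list `letters` is illustrative.

```python
letters = ["a", "b", "c"]  # illustrative data

# The pairs are produced on demand; next() pulls the first one
pairs = enumerate(letters)
first_pair = next(pairs)
print(first_pair)  # (0, 'a')

# `start` changes the first index
numbered = list(enumerate(letters, start=1))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```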


**Challenge**: Given a list of integers, find the maximum value and its index.

In [4]:
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

# find max value & index of max value of `numbers`
# introducing `enumerate`
max_value = numbers[0]
max_value_index = 0
for index_and_number in enumerate(numbers):
    index = index_and_number[0]
    number = index_and_number[1]
    if number > max_value:
        max_value = number
        max_value_index = index

print("max_value:", max_value)
print("max_value found at index:", max_value_index)


max_value: 150
max_value found at index: 5


***Destructuring:*** Python provides a neat way to assign the elements of a list or tuple to variables in a single statement, called tuple/list destructuring (Python's own documentation calls it *unpacking*). It is used as follows:

In [6]:
# tuple destructuring
t = (1, 2)
t1, t2 = t
print(t1)
print(t2)
# list destructuring
l = [1, 2]
l1, l2 = l
print(l1)
print(l2)

# tuple assignment: the right-hand side `1, 2` is itself a tuple
t1, t2 = 1, 2

1
2
1
2
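
Two consequences of destructuring, sketched under the assumption that plain integers are enough to illustrate them: two variables can be swapped without a temporary, and the number of names on the left must match the number of elements on the right.

```python
a, b = 1, 2
a, b = b, a  # swap: the right-hand side tuple is built before assignment
print(a, b)  # 2 1

# A mismatch between names and elements raises ValueError
try:
    x, y = (1, 2, 3)
    unpacked = True
except ValueError as error:
    unpacked = False
    print("unpacking failed:", error)
```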


Applying the destructuring syntax to the earlier `enumerate` loop yields an even shorter version:

In [8]:
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

# find max value & index of max value of `numbers`
# now destructuring each (index, number) pair directly in the `for` statement
max_value = numbers[0]
max_value_index = 0
for index, number in enumerate(numbers):
    # index, number = index_and_number
    if number > max_value:
        max_value = number
        max_value_index = index

print("max_value:", max_value)
print("max_value found at index:", max_value_index)

# bonus: max() in Python
max_value = max(numbers)
print("max value by max() function:", max_value)

max_value: 150
max_value found at index: 5
max value by max() function: 150


***Bear in mind:*** Whenever you need both the index and the element, reach for `enumerate` rather than tracking the index by hand.
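
For comparison, here is a sketch of the manual alternative, looping over `range(len(numbers))` and indexing into the list, next to the `enumerate` version. Both collect the same pairs; the names `by_range` and `by_enumerate` are illustrative.

```python
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

# Manual index: works, but the element has to be looked up each time
by_range = []
for index in range(len(numbers)):
    by_range.append((index, numbers[index]))

# enumerate: index and element arrive together
by_enumerate = []
for index, number in enumerate(numbers):
    by_enumerate.append((index, number))

print(by_range == by_enumerate)  # True
```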

### List comprehension

When you need to build a list from the elements of a collection, a list comprehension is a shorter syntax for the equivalent `for` loop.

Syntax: `[<do sth with element> for <element> in <collection> if <condition>]`

Or: `list(<do sth with element> for <element> in <collection> if <condition>)`, which passes a generator expression to `list()`

Output: a new list holding the result of `<do sth with element>` for every element of the original collection that satisfies `<condition>`. The `if <condition>` part is optional.
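
As a minimal sketch of the three parts of the syntax working together, the example below transforms and filters in a single expression, then drops the filter. The short word list and the names `shouted` and `lengths` are illustrative.

```python
words = ["This", "is", "an", "example"]  # illustrative data

# Transform (upper-case) and filter (longer than 2 characters) in one expression
shouted = [word.upper() for word in words if len(word) > 2]
print(shouted)  # ['THIS', 'EXAMPLE']

# Without the `if` part, every element is kept
lengths = [len(word) for word in words]
print(lengths)  # [4, 2, 2, 7]
```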

In [14]:
s = """
This is an example paragraph. This is    another sentence.
"""

words = s.split()

# non_stopwords = []
# for word in words: # each element in `words` is iterated and stored in `word`
#     if len(word) > 2:
#         non_stopwords.append(word)
words_copy = [word for word in words if len(word) > 2]
words_copy = list(word for word in words if len(word) > 2)
print(words)
print(words_copy)

# equivalence:
words_copy = []
for word in words:
    if len(word) > 2:
        words_copy.append(word)
print(words_copy)


['This', 'is', 'an', 'example', 'paragraph.', 'This', 'is', 'another', 'sentence.']
['This', 'example', 'paragraph.', 'This', 'another', 'sentence.']
['This', 'example', 'paragraph.', 'This', 'another', 'sentence.']


**Challenge:** Compute the square of every number in the given list.

In [15]:
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

# using plain for-loop
squared_numbers = []
for number in numbers:
    squared = number ** 2
    squared_numbers.append(squared)
print(squared_numbers)

[400, 1, 16, 64, 10000, 22500, 576, 324]


In [16]:
numbers = [20, 1, 4, 8, 100, 150, 24, 18]
# using list-comprehension
squared_numbers = [number ** 2 for number in numbers]
print(squared_numbers)

[400, 1, 16, 64, 10000, 22500, 576, 324]
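
A comprehension can also combine the pieces seen so far. The sketch below adds a filter to the squaring example and uses `enumerate` with destructuring inside a comprehension; the threshold of 20 is an arbitrary choice for illustration.

```python
numbers = [20, 1, 4, 8, 100, 150, 24, 18]

# Square only the numbers greater than 20
big_squares = [number ** 2 for number in numbers if number > 20]
print(big_squares)  # [10000, 22500, 576]

# Collect the indices of those numbers using enumerate
big_indices = [index for index, number in enumerate(numbers) if number > 20]
print(big_indices)  # [4, 5, 6]
```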


### Challenge

We have worked with `list` extensively in the lecture session, so the homework applies the `for` loop to `tuple` and `range` as well. Good luck!

1. Using a `for` loop, print the first 20 natural numbers (1 to 20). Hint: use `range()`
2. Using a `for` loop, create a list of the first n numbers of the Fibonacci sequence (n is an arbitrary int value, n > 2).
   ***Hint:** You may use the following snippet to get started:*
   ```
   fibonacci_numbers = [1, 1]
   for index in range(2, n):
       # index - 1 and index - 2 are the 2 previous indices
       ...
   ```
3. Using a `for` loop, find the factorial of n (arbitrary int value, n > 2). Hint: also use `range()`
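
As a warm-up that does not give the answers away, here is a brief sketch of how `range()` and tuples behave in a `for` loop. The tuple `point` and the variable `total` are illustrative.

```python
# range(stop), range(start, stop) and range(start, stop, step); `stop` is excluded
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 6)))       # [2, 3, 4, 5]
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]

# A tuple is iterated exactly like a list
point = (3, 4, 5)
total = 0
for coordinate in point:
    total += coordinate
print(total)  # 12
```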