***
## 3.3 Numpy - Iterating Over Arrays

***
### Python3.1 Numpy Introduction
### Python3.2 Numpy DataTypes, Functions, and Random Module
### Python3.3 Numpy Iterating Over Arrays
### Python3.4 Numpy Manipulating Arrays
### Python3.5 Numpy Operations
### Python3.6 Numpy File Input and Output and Data Processing
### Python3.7 Numpy-Sort, Argsort, Nonzero, and Extract Functions
### Python3.8 Numpy BreakoutGroupExercises
### Python3.8 Numpy BreakoutGroupExercises - Solutions
***

***
## Table of Contents
### 1. `for` loop
### 2. `enumerate()` Function
### 3. `zip()` Function and `itertools.zip_longest()` Function
***

### Iterating Over Array Elements

Generally, we want to avoid iterating over the elements of arrays whenever we can. In an interpreted language like Python (or MATLAB), explicit element-by-element loops are much slower than vectorized operations, which run the loop in compiled code.
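
As a rough illustration, the sketch below squares every element once with an explicit loop and once with a single vectorized expression, then checks that both give the same result. The array `values` and the helper `square_with_loop` are names made up for this example. Timings are left out on purpose because they depend on the machine.

```python
import numpy as np

values = np.arange(10)  # hypothetical sample data: 0, 1, ..., 9

def square_with_loop(arr):
    """Square each element one at a time in a Python-level loop."""
    result = np.empty_like(arr)
    for i in range(len(arr)):
        result[i] = arr[i] ** 2
    return result

looped = square_with_loop(values)
vectorized = values ** 2  # one expression; the loop runs inside NumPy

print(np.array_equal(looped, vectorized))  # True
```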

However, sometimes iterations are unavoidable. For such cases, the Python `for` loop is the most convenient way to iterate over an array:

### 1. For Loop

In [38]:
import numpy as np

v = np.array([1,2,3,4])
for element in v:
    print(element)

1
2
3
4


In [39]:
M = np.array([[1,2], [3,4]])

for row in M:
    print("row = ", row)
    
    for element in row:
        print(element)

row =  [1 2]
1
2
row =  [3 4]
3
4
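
When only the elements matter and the row structure does not, the nested loop can be replaced by a single loop over `M.flat`, an attribute of NumPy arrays that iterates over every element in row-major order. This is a small sketch, and the name `flat_elements` is just for illustration.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

# M.flat is a 1-D iterator over all elements of the array
flat_elements = [int(element) for element in M.flat]

print(flat_elements)  # [1, 2, 3, 4]
```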


### 2. `enumerate()` function is a built-in function available with Python. 

- The `enumerate()` function adds a counter to each item of an iterable object and returns an enumerate object.

- Signature: `enumerate(iterable, start=0)`. The optional `start` argument sets the first value of the counter.

- The output from enumerate: (0, item_1), (1, item_2), (2, item_3), … (n-1, item_n)
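
The counter does not have to begin at 0. As a short sketch, the second argument of `enumerate()`, named `start`, shifts the first counter value (the variable name `numbered` is just for illustration):

```python
mylist = ['A', 'B', 'C', 'D']

# start=1 makes the counter begin at 1 instead of the default 0
numbered = list(enumerate(mylist, start=1))

print(numbered)  # [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
```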


In [40]:
# Example:
mylist = ['A', 'B', 'C', 'D']
e_list = enumerate(mylist)
print(list(e_list))

[(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]


When we need to iterate over each element of an array and modify its elements, it is convenient to use the enumerate function to obtain both the element and its index in the for loop:

In [41]:
M = np.array([[1,2], [3,4]])

for row_idx, row in enumerate(M):
    print("row_idx", row_idx, "row", row)
    
    for col_idx, element in enumerate(row):
        print("col_idx", col_idx, "element", element)
       
        # update the matrix M: square each element
        M[row_idx, col_idx] = element ** 2

row_idx 0 row [1 2]
col_idx 0 element 1
col_idx 1 element 2
row_idx 1 row [3 4]
col_idx 0 element 3
col_idx 1 element 4
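
For arrays, NumPy also provides `np.ndenumerate()`, which yields the full index tuple together with each element, so the two nested `enumerate()` loops can be written as one loop. The following is a sketch of the same squaring update. In practice the vectorized `M ** 2` would usually be preferred.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

# np.ndenumerate yields ((row_idx, col_idx), element) pairs
for (row_idx, col_idx), element in np.ndenumerate(M):
    M[row_idx, col_idx] = element ** 2  # square each element in place

print(M.tolist())  # [[1, 4], [9, 16]]
```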


In [42]:
x=np.random.randint(0,12,(4,4))
x

array([[ 6, 10, 11,  7],
       [10,  8,  3,  7],
       [ 4,  5,  2,  8],
       [10,  8,  9,  7]])

In [43]:
for row in x:
    print(row)

[ 6 10 11  7]
[10  8  3  7]
[4 5 2 8]
[10  8  9  7]


In [44]:
for i in range(len(x)):
    print(x[i])

[ 6 10 11  7]
[10  8  3  7]
[4 5 2 8]
[10  8  9  7]


In [45]:
# enumerate yields each row together with its index
for i, row in enumerate(x):
    print('row', i, 'is', row )

row 0 is [ 6 10 11  7]
row 1 is [10  8  3  7]
row 2 is [4 5 2 8]
row 3 is [10  8  9  7]


In [46]:
x

array([[ 6, 10, 11,  7],
       [10,  8,  3,  7],
       [ 4,  5,  2,  8],
       [10,  8,  9,  7]])

In [47]:
x2=x**2
x2

array([[ 36, 100, 121,  49],
       [100,  64,   9,  49],
       [ 16,  25,   4,  64],
       [100,  64,  81,  49]], dtype=int32)

### 3. zip() Function creates an iterator that will aggregate elements from two or more iterables. 

[Builtins Link](https://docs.python.org/3/library/builtins.html)

Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. The iterator stops when the shortest input iterable is exhausted. With a single iterable argument, it returns an iterator of 1-tuples. With no arguments, it returns an empty iterator. 
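
The three cases described above can be checked directly. This is a minimal sketch, and the variable names are chosen only for illustration:

```python
# One iterable: zip() yields 1-tuples
single = list(zip([1, 2, 3]))

# No arguments: zip() yields nothing
empty = list(zip())

# Unequal lengths: zip() stops at the shortest iterable
shortest = list(zip('ab', [10, 20, 30]))

print(single)    # [(1,), (2,), (3,)]
print(empty)     # []
print(shortest)  # [('a', 10), ('b', 20)]
```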

In [48]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 ...

#### zip() function is defined as `zip(*iterables)`

- The function takes in iterables as arguments and returns an iterator. This iterator generates a series of tuples containing elements from each iterable. `zip()` can accept any type of iterable, such as files, lists, tuples, dictionaries, sets, and so on.

- Step 1: call `zip(numbers, letters)` to create an iterator that produces tuples of the form `(x, y)`. Note that `zip()` returns an iterator, not a list.
- Step 2: to obtain a list, pass the iterator to `list()`, which consumes it.

In [49]:
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']
zipped = zip(numbers, letters)  
zipped  # Holds an iterator object

<zip at 0x1ee3e590740>

In [50]:
type(zipped)

zip

In [51]:
list(zipped)

[(1, 'a'), (2, 'b'), (3, 'c')]

If you're working with sequences like lists, tuples, or strings, then your iterables are guaranteed to be evaluated from left to right. This means that the resulting list of tuples will take the form [(numbers[0], letters[0]), (numbers[1], letters[1]), ..., (numbers[n-1], letters[n-1])], where n is the length of the shorter sequence.

However, for unordered iterables (like sets), the pairing may not be what you expect:

In [52]:
s1 = {2, 3, 1}
s2 = {'b', 'a', 'c'}
list(zip(s1, s2))

[(1, 'c'), (2, 'a'), (3, 'b')]

Note: in this example, s1 and s2 are set objects, which don't keep their elements in any particular order. The tuples returned by zip() therefore pair the elements in an arbitrary order that can differ between runs. Keep this in mind if you use zip() with unordered iterables like sets, and do not rely on the pairing.
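
One possible way to get a reproducible pairing, assuming the elements can be sorted, is to sort each set before zipping. This is only a sketch of that idea:

```python
s1 = {2, 3, 1}
s2 = {'b', 'a', 'c'}

# sorted() returns a list with a well-defined order,
# so the pairing no longer depends on the sets' internal order
pairs = list(zip(sorted(s1), sorted(s2)))

print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```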

In [53]:
# Example with three iterables:
integers = [1, 2, 3]
letters = ['a', 'b', 'c']
floats = [4.0, 5.0, 6.0]
zipped = zip(integers, letters, floats)  # Three input iterables
list(zipped)

[(1, 'a', 4.0), (2, 'b', 5.0), (3, 'c', 6.0)]

In [54]:
#Passing Arguments of Unequal Length:
# When working with the Python zip() function, it’s important to pay attention to the length of the iterables. 
# It’s possible that the iterables you pass in as arguments aren’t the same length.
# The number of tuples that zip() produces equals the length of the shortest iterable. The remaining elements of any longer iterable are ignored, as the following example shows.
list(zip(range(5), range(100)))

[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
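
If silently dropping elements would hide a bug, Python 3.10 and later accept `strict=True`, which makes `zip()` raise a `ValueError` when the lengths differ. The sketch below assumes nothing about your Python version: on older interpreters it falls back to a manual length check. The names `short`, `longer`, and `mismatch_detected` are illustrative.

```python
import sys

short = [1, 2, 3]
longer = [10, 20, 30, 40]

if sys.version_info >= (3, 10):
    try:
        list(zip(short, longer, strict=True))
        mismatch_detected = False
    except ValueError:
        # raised because the iterables have different lengths
        mismatch_detected = True
else:
    # fallback for older interpreters: compare the lengths by hand
    mismatch_detected = len(short) != len(longer)

print(mismatch_detected)  # True
```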

#### itertools.zip_longest() Function
If trailing or unmatched values are important, then use itertools.zip_longest() instead of zip(). With this function, the missing values will be replaced with whatever you pass to the fillvalue argument (defaults to None). The iteration will continue until the longest iterable is exhausted:

In [55]:
from itertools import zip_longest
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']
longest = range(5)
zipped = zip_longest(numbers, letters, longest, fillvalue='whatever')
list(zipped)


[(1, 'a', 0),
 (2, 'b', 1),
 (3, 'c', 2),
 ('whatever', 'whatever', 3),
 ('whatever', 'whatever', 4)]

Note: itertools.zip_longest() yields five tuples with elements from numbers, letters, and longest. The iteration stops only when longest is exhausted. The missing elements from numbers and letters are filled with the string 'whatever', which is the value passed as fillvalue.
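
When `fillvalue` is omitted, the missing positions are filled with `None`. A short sketch with two lists of unequal length (the name `padded` is just for illustration):

```python
from itertools import zip_longest

numbers = [1, 2, 3]
letters = ['a', 'b']

# No fillvalue given, so the missing letter becomes None
padded = list(zip_longest(numbers, letters))

print(padded)  # [(1, 'a'), (2, 'b'), (3, None)]
```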

#### Looping Over Multiple Iterables
Looping over multiple iterables is one of the most common use cases for Python's zip() function. If you need to iterate through multiple lists, tuples, or any other sequences, zip() is usually the natural tool. This section shows how to use zip() to iterate through multiple iterables at the same time.

#### Traversing Lists in Parallel

In [56]:
letters = ['a', 'b', 'c']
numbers = [0, 1, 2]
for l, n in zip(letters, numbers):
    print(f'Letter: {l}')
    print(f'Number: {n}')

Letter: a
Number: 0
Letter: b
Number: 1
Letter: c
Number: 2


In [57]:
# iterate through more than two iterables in a single for loop with three input iterables:

letters = ['a', 'b', 'c']
numbers = [0, 1, 2]
operators = ['*', '/', '+']
for l, n, o in zip(letters, numbers, operators):
    print(f'Letter: {l}')
    print(f'Number: {n}')
    print(f'Operator: {o}')

Letter: a
Number: 0
Operator: *
Letter: b
Number: 1
Operator: /
Letter: c
Number: 2
Operator: +


Note: In this example, `zip()` takes three iterables and returns an iterator that generates 3-item tuples. This lets you iterate through all three iterables in one go. There's no restriction on the number of iterables you can pass to Python's zip() function.

In [58]:
x

array([[ 6, 10, 11,  7],
       [10,  8,  3,  7],
       [ 4,  5,  2,  8],
       [10,  8,  9,  7]])

In [59]:
x2

array([[ 36, 100, 121,  49],
       [100,  64,   9,  49],
       [ 16,  25,   4,  64],
       [100,  64,  81,  49]], dtype=int32)

In [60]:
# zip lets us iterate through both arrays row by row
for i, j in zip(x,x2):
    print(i, '+', j, '=', i+j)

[ 6 10 11  7] + [ 36 100 121  49] = [ 42 110 132  56]
[10  8  3  7] + [100  64   9  49] = [110  72  12  56]
[4 5 2 8] + [16 25  4 64] = [20 30  6 72]
[10  8  9  7] + [100  64  81  49] = [110  72  90  56]
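
For plain arithmetic like this, the row-by-row loop is mainly a teaching device: adding the two arrays directly gives the same numbers. The sketch below uses two small stand-in arrays, `a` and `b`, rather than the random `x` and `x2` above, so that the result is reproducible.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])  # small stand-in for x
b = a ** 2                      # small stand-in for x2

# Row-by-row sums collected from a zip() loop
row_sums = np.array([row_a + row_b for row_a, row_b in zip(a, b)])

# The vectorized sum produces the same array
print(np.array_equal(row_sums, a + b))  # True
```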


In [61]:
# Traversing Dictionaries in Parallel:
# Since Python 3.7, dictionaries preserve insertion order, meaning they keep their elements in the order in which they were added. Thanks to this, you can use zip() to iterate through multiple dictionaries in a safe and coherent way, provided the keys were inserted in the same order:

dict_one = {'name': 'John', 'last_name': 'Doe', 'job': 'Python Consultant'}
dict_two = {'name': 'Jane', 'last_name': 'Doe', 'job': 'Community Manager'}
for (k1, v1), (k2, v2) in zip(dict_one.items(), dict_two.items()):
    print(k1, '->', v1)
    print(k2, '->', v2)


name -> John
name -> Jane
last_name -> Doe
last_name -> Doe
job -> Python Consultant
job -> Community Manager


In [62]:
dict_one.items()

dict_items([('name', 'John'), ('last_name', 'Doe'), ('job', 'Python Consultant')])

Note: this loop iterates through dict_one and dict_two in parallel. zip() generates tuples with the items from both dictionaries, and the loop header unpacks each tuple to give access to the key and value of both dictionaries at the same time.
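
Pairing items with zip() lines up the dictionaries by position, so it only matches keys when both dictionaries were filled in the same order. If that is not guaranteed, one alternative (a sketch, not the only option) is to loop over the keys of one dictionary and look them up in the other. Here `dict_two` is deliberately written with a different key order, and `paired` is an illustrative name.

```python
dict_one = {'name': 'John', 'last_name': 'Doe', 'job': 'Python Consultant'}
dict_two = {'job': 'Community Manager', 'name': 'Jane', 'last_name': 'Doe'}

# Match values by key instead of by position
paired = {key: (dict_one[key], dict_two[key]) for key in dict_one}

print(paired['name'])  # ('John', 'Jane')
print(paired['job'])   # ('Python Consultant', 'Community Manager')
```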

#### Unzip a List of Tuples Into Independent Sequences by Using `zip()` Along With the Unpacking Operator `*`

In [63]:
# a list of tuples containing some kind of mixed data
# Then, use the unpacking operator * to unzip the data and create two different lists (numbers and letters)
pairs = [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
numbers, letters = zip(*pairs)
numbers

(1, 2, 3, 4)

In [64]:
letters

('a', 'b', 'c', 'd')
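
The same `zip(*...)` idiom transposes a list of rows into a list of columns, which connects it to the `.T` attribute of NumPy arrays. A small sketch, where `rows` and `columns` are illustrative names:

```python
import numpy as np

rows = [(1, 2, 3), (4, 5, 6)]

# Unpacking the rows into zip() groups the i-th entries together
columns = list(zip(*rows))

print(columns)  # [(1, 4), (2, 5), (3, 6)]

# NumPy's transpose gives the same layout
print(np.array_equal(np.array(rows).T, np.array(columns)))  # True
```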

#### Sorting in Parallel: combine two lists and sort them at the same time by using `zip()` along with `.sort()` as follows:

In [65]:
letters = ['b', 'a', 'd', 'c']
numbers = [2, 4, 3, 1]
data1 = list(zip(letters, numbers))
data1

[('b', 2), ('a', 4), ('d', 3), ('c', 1)]

In [66]:
data1.sort()  # Sort by letters
data1

[('a', 4), ('b', 2), ('c', 1), ('d', 3)]

In [67]:
data2 = list(zip(numbers, letters))
data2

[(2, 'b'), (4, 'a'), (3, 'd'), (1, 'c')]

In [68]:
data2.sort()  # Sort by numbers
data2


[(1, 'c'), (2, 'b'), (3, 'd'), (4, 'a')]

#### You can also use `sorted()` and `zip()` together to achieve a similar result:

In [69]:
letters = ['b', 'a', 'd', 'c']
numbers = [2, 4, 3, 1]
data = sorted(zip(letters, numbers))  # Sort by letters
data


[('a', 4), ('b', 2), ('c', 1), ('d', 3)]
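
To sort by the second list without swapping the argument order, `sorted()` accepts a `key` function. The sketch below sorts the pairs by number and then unzips them back into two tuples. The names `by_number`, `sorted_letters`, and `sorted_numbers` are illustrative.

```python
letters = ['b', 'a', 'd', 'c']
numbers = [2, 4, 3, 1]

# key picks the second item of each pair as the sort criterion
by_number = sorted(zip(letters, numbers), key=lambda pair: pair[1])
print(by_number)  # [('c', 1), ('b', 2), ('d', 3), ('a', 4)]

# Unzip the sorted pairs back into separate sequences
sorted_letters, sorted_numbers = zip(*by_number)
print(sorted_letters)  # ('c', 'b', 'd', 'a')
print(sorted_numbers)  # (1, 2, 3, 4)
```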

#### Calculating in Pairs: use the Python zip() function to make some quick calculations

In [70]:
total_sales = [52000.00, 51000.00, 48000.00]
prod_cost = [46800.00, 45900.00, 43200.00]
for sales, costs in zip(total_sales, prod_cost):
    profit = sales - costs
    print(f'Total profit: {profit}')


Total profit: 5200.0
Total profit: 5100.0
Total profit: 4800.0
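
Since this section is about NumPy, it is worth noting that the same calculation needs no loop once the lists are converted to arrays. A sketch of the vectorized version:

```python
import numpy as np

total_sales = np.array([52000.00, 51000.00, 48000.00])
prod_cost = np.array([46800.00, 45900.00, 43200.00])

# Element-wise subtraction replaces the zip() loop
profit = total_sales - prod_cost

print(profit.tolist())  # [5200.0, 5100.0, 4800.0]
```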


#### Building Dictionaries

Python's dictionaries are a very useful data structure. Sometimes, you might need to build a dictionary from two different but closely related sequences. A convenient way to achieve this is to use `dict()` and `zip()` together.

In [71]:
fields = ['name', 'last_name', 'age', 'job']
values = ['John', 'Doe', '45', 'Python Developer']
a_dict = dict(zip(fields, values))
a_dict

{'name': 'John', 'last_name': 'Doe', 'age': '45', 'job': 'Python Developer'}
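
If some values need converting while the dictionary is built, a dictionary comprehension over `zip()` is one option. The sketch below, with the illustrative name `typed_dict`, turns the `'age'` string into an integer and leaves the other values unchanged.

```python
fields = ['name', 'last_name', 'age', 'job']
values = ['John', 'Doe', '45', 'Python Developer']

# Convert only the 'age' value to int while pairing fields with values
typed_dict = {
    field: (int(value) if field == 'age' else value)
    for field, value in zip(fields, values)
}

print(typed_dict['age'])  # 45
```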

#### Updating an existing dictionary by combining `zip()` with `dict.update()`
Suppose that John changes his job and you need to update the dictionary. You can do something like the following:

In [72]:
new_job = ['Python Consultant']
field = ['job']
a_dict.update(zip(field, new_job))
a_dict


{'name': 'John', 'last_name': 'Doe', 'age': '45', 'job': 'Python Consultant'}

## Further reading

- http://numpy.scipy.org
- http://scipy.org/Tentative_NumPy_Tutorial
- http://scipy.org/NumPy_for_Matlab_Users - A Numpy guide for MATLAB users.

#### Note: The course materials are developed mainly based on personal experience and contributions from the Python learning community
Reference books:
- Learning Python, 5th Edition, by Mark Lutz
- Python Data Science Handbook, by Jake VanderPlas
- Python for Data Analysis, by Wes McKinney

Copyright ©2023 Mei Najim. All rights reserved. 