## Software (Python) Engineering Notebook 2

The Software (Python) Engineering Notebook 2 mainly serves as a condensed course for incoming developers who know Python but do not have much experience writing classes.

It also contains a few recommended ways to pack and unpack data and to work with for loops. In general, these recommendations will help new developers write cleaner code.

In [1]:
import itertools as iter
import numpy as np
import time

### 1. What is Class in Python? (Part 1)

If a function is a code template for creating operations in Python, then a class is a code template for creating objects.

What are Objects then? 

An object in Python is like a container for <span style="color:red">data (attributes) and functions (methods) related to that object.</span>

The object itself will represent a logical building block for the larger codebase.

Let's see this in code below

In [2]:
class Node:
    """
    A Node class to represent random values and a pointer to the next node in the chain.

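    Attributes:
    -----------
    node_id:
        Index number for the node based on order
    value:
        NumPy array of 5 random integers in [0, 5)
    next:
        Next Node in the chain (None if this is the last node)
    
    Methods:
    --------
    sum_value():
        Returns the sum of the array stored in `value`
    method_inherit():
        Returns a message used later to demonstrate inheritance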
    """
    def __init__(self, node_id: int=0, next=None):
        self.node_id = node_id
        self.next = next
        self.value = np.random.randint(5, size=5)

    def sum_value(self):
        return np.sum(self.value)
    
    def method_inherit(self):
        return f"I can still activate {self.method_inherit.__name__}"

The Node class represents a template for creating multiple nodes. Each node has the same attributes and methods but can hold different attribute values.

The current attributes for the Node class are as follows:

* node_id
* next
* value

When an object is initialized, it stores the default attribute values, such as node_id = 0 and next = None, unless other values are passed in. The ```value``` attribute stores the output of ```np.random.randint(5, size=5)```, an array of five random integers from 0 to 4.

The object also has a method, ```sum_value```, which sums the array stored in its ```value``` attribute.

```self``` refers to the object itself.

Note: Ignore method_inherit for now.
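To make the role of ```self``` more concrete, here is a minimal sketch using a hypothetical ```Counter``` class (it is not part of this notebook). It illustrates that calling a method on an instance is equivalent to calling it on the class and passing the instance in as ```self```, and that each instance keeps its own attribute values.

```python
class Counter:
    """Hypothetical example class: counts how many times it was incremented."""

    def __init__(self, start: int = 0):
        # `self` is the instance that is being initialised
        self.count = start

    def increment(self) -> int:
        self.count += 1
        return self.count


counter_a = Counter()
counter_b = Counter(start=10)

counter_a.increment()          # usual form
Counter.increment(counter_a)   # explicit form: the instance is passed as `self`

print(counter_a.count)  # 2
print(counter_b.count)  # 10, counter_b has its own state
```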

In [3]:
node_A = Node(node_id=1)
node_B = Node(node_id=2)

print(f"Node id: {node_A.node_id} contains array {node_A.value} and its sum is {node_A.sum_value()}")
print(f"Node id: {node_B.node_id} contains array {node_B.value} and its sum is {node_B.sum_value()}")

Node id: 1 contains array [0 2 1 2 3] and its sum is 8
Node id: 2 contains array [1 1 0 4 4] and its sum is 10


As you can see from the example above, we can create multiple instances of the Node class with different attribute values.

##### 1.1.0 Choosing between Function and Class

Now that you understand what a Class is, should you still write simple functions without a class?

In theory, the delineation between functions and classes is clear: functions carry out operations, and classes encapsulate logic. However, as programmers, we often find ourselves in a dilemma over when to use functions and when to use classes.

Here are several questions that can be asked and used as guidelines:

* <span style="color:red">Do I need to store state?</span>
* <span style="color:red">Are my functions frequently calling functions inside a proposed class?</span>
* <span style="color:red">Should I be using a utils.py instead to collate utility functions?</span>
* <span style="color:red">What is the administrative overhead of creating a class?</span>
* <span style="color:red">Is writing a bunch of functions easier than writing a class?</span>
* <span style="color:red">Am I building different things out of a common base?</span>
* <span style="color:red">Would I end up with memory issues due to the number of instances and states stored?</span>

At the end of the day, it depends on the individual engineer's assessment of what works best for the problem at hand.
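As a rough illustration of the first question (do I need to store state?), the sketch below writes the same averaging logic twice. The names ```running_mean``` and ```RunningMean``` are made up for this example; treat it as one possible way to frame the choice, not a rule.

```python
def running_mean(values):
    """Stateless: everything needed is passed in, nothing is remembered."""
    return sum(values) / len(values)


class RunningMean:
    """Stateful: remembers the total and the count between calls."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value: float) -> float:
        self.total += value
        self.count += 1
        return self.total / self.count


# Function: the caller has to keep the full list around
print(running_mean([2, 4, 6]))  # 4.0

# Class: values arrive one at a time and the object keeps the state
tracker = RunningMean()
for v in (2, 4, 6):
    latest = tracker.update(v)
print(latest)  # 4.0
```

If the values are always available together, the function is likely enough; the class starts to pay off when state has to survive between calls.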

##### 1.2.0 Subclassing or Class Inheritance and To Super Init or Not?

What is Class Inheritance?

Class inheritance is the technique that allows us to write another class that takes on the parent class's attributes and methods. We do this when we want to extend the parent class's functionality.

Inheritance itself is declared in the class statement, ```class Child(Parent):```. Calling ```super().__init__()``` inside the child's ```__init__``` additionally runs the parent's initializer, so that the parent's instance attributes are set on the new object.

The next cell shows how this can be done with ```Node```, using ```super().__init__()``` to create a new class, ```class ChainNodes(Node):```.

<span style="color:red">Docstrings are removed so that the entire class fits in one view and is easier to explain in the next few cells.</span>

In [4]:
class ChainNodes(Node):

    def __init__(self):
        super().__init__()

    def chain_nodes(self, max_depth:int):

        def _chain_recursive(node, max_depth=max_depth):
            if node.node_id == max_depth:
                return                
            
            new_node = Node(node_id=node.node_id + 1)
            node.next = new_node
            
            _chain_recursive(new_node, max_depth=max_depth)

        _chain_recursive(self)
            

    def visualize(self):

        def _visualize_recursive(node):
            if node is None:
                return
            
            print(f"Node {node.node_id}: Value = {node.value}")
            
            _visualize_recursive(node.next)
        
        _visualize_recursive(self)

```
class ChainNodes(Node):
    
    def __init__(self):
        super().__init__()
```

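The above is the portion of the code that makes ```ChainNodes``` inherit from ```Node```. The class statement ```class ChainNodes(Node)``` gives it the methods of ```Node```, and ```super().__init__()``` runs the initializer of ```Node``` so that each instance also gets the attributes ```node_id```, ```next``` and ```value```. See the next example below.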

In [5]:
chain = ChainNodes()
print(chain.node_id)
print(chain.next)
print(chain.value)
print(chain.sum_value())

0
None
[1 4 3 0 2]
10


You can see that ```chain``` has inherited the Node class's attributes as well as the method ```sum_value```.
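The same pattern also applies when the parent's ```__init__``` takes arguments or when the child wants to extend a parent method. The following is a minimal sketch with two hypothetical classes, ```Animal``` and ```Dog```, which are not part of this notebook.

```python
class Animal:
    def __init__(self, name: str):
        self.name = name

    def describe(self) -> str:
        return f"{self.name} is an animal"


class Dog(Animal):
    def __init__(self, name: str, tricks: int = 0):
        # Run the parent's initialiser so that `self.name` gets set
        super().__init__(name)
        self.tricks = tricks

    def describe(self) -> str:
        # Extend (rather than replace) the parent's behaviour
        return f"{super().describe()} that knows {self.tricks} tricks"


dog = Dog("Rex", tricks=3)
print(dog.describe())           # Rex is an animal that knows 3 tricks
print(isinstance(dog, Animal))  # True
```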

Now, it is time to talk about the ```next``` attribute.

The ```next``` value is None because it is meant as a pointer to the next Node, which will be created by the ```chain_nodes``` method. This method recursively creates a node and points the previous node's ```next``` at it until ```node_id``` reaches ```max_depth```.

```
    def chain_nodes(self, max_depth:int):

        def _chain_recursive(node, max_depth=max_depth):
            if node.node_id == max_depth:
                return                
            
            new_node = Node(node_id=node.node_id + 1)
            node.next = new_node
            
            _chain_recursive(new_node, max_depth=max_depth)

        _chain_recursive(self)
```
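So the ChainNodes class extends the Node class by being able to create multiple Node objects and connect them. See the ```print``` statements below to better understand what attributes have been stored. (```visualize``` prints each node and returns nothing, which is why wrapping it in ```print``` also prints ```None```.)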
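One thing to keep in mind with the recursive version is that every node adds a stack frame, and CPython limits recursion depth (the default limit is typically 1000). The sketch below is an iterative alternative under stated assumptions: ```SimpleNode``` is a stand-in for ```Node``` without the NumPy array, and ```chain_iterative``` and ```collect_ids``` are hypothetical helper names. It also uses ```<``` rather than ```==``` as the stopping condition, so a ```max_depth``` smaller than the starting id simply adds nothing.

```python
class SimpleNode:
    """Stand-in for the notebook's Node class, without the NumPy value."""

    def __init__(self, node_id: int = 0, next=None):
        self.node_id = node_id
        self.next = next


def chain_iterative(head: SimpleNode, max_depth: int) -> None:
    """Append nodes after `head` until the last node_id equals max_depth."""
    node = head
    while node.node_id < max_depth:
        node.next = SimpleNode(node_id=node.node_id + 1)
        node = node.next


def collect_ids(head: SimpleNode) -> list:
    """Walk the chain and return the node ids in order."""
    ids = []
    node = head
    while node is not None:
        ids.append(node.node_id)
        node = node.next
    return ids


head = SimpleNode()
chain_iterative(head, max_depth=3)
print(collect_ids(head))  # [0, 1, 2, 3]
```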

In [6]:
chain = ChainNodes()
chain.chain_nodes(max_depth=3)
print(chain.visualize())
print(chain.node_id)
print(chain.value)
print(chain.sum_value())
print(chain.next.node_id)
print(chain.next.value)
print(chain.next.sum_value())

Node 0: Value = [3 2 0 1 3]
Node 1: Value = [4 0 0 3 4]
Node 2: Value = [0 2 3 3 1]
Node 3: Value = [3 3 3 4 3]
None
0
[3 2 0 1 3]
9
1
[4 0 0 3 4]
11


(Sidetrack) This node structure is different from a tree node or neural net node structure. However, conceptually they are very similar. You should be able to see how a class can keep common logic in one place while other classes hold the logic that differs.

<span style="color:red">Although super().__init__() is very powerful, it is not mandatory when writing a class.</span>

See the implementation below for a different way of implementing the above logic. There is no ```super().__init__()```, and a generator expression is used instead of recursion.

In [7]:
class LinkNodes(Node):

    def __init__(self):
        self.head=None

    def link_nodes(self, max_depth:int):
        # generator expression: nodes are created lazily, one at a time
        generator=(Node(node_id=i) for i in range(max_depth+1))

        curr_node = next(generator)
        self.head = curr_node

        for node in generator:
            curr_node.next = node
            curr_node = node

    @staticmethod
    def visualize(node):

        def _visualize_recursive(node):
            if node is None:
                return
            
            print(f"Node {node.node_id}: Value = {node.value}")
            
            _visualize_recursive(node.next)
        
        _visualize_recursive(node)

In [8]:
chain = LinkNodes()

try:
    chain.node_id
except Exception as e:
    print(e)

try:
    print(chain.method_inherit())
except Exception as e:
    print(e)

try:
    print(chain.sum_value())
except Exception as e:
    print(e)

'LinkNodes' object has no attribute 'node_id'
I can still activate method_inherit
'LinkNodes' object has no attribute 'value'


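The difference here is that ```chain``` does not have the Node's instance attributes, because ```Node.__init__``` was never run, but it still inherits the Node's methods. ```sum_value``` is inherited too; it fails only because it reads the missing ```value``` attribute.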

But why would we want to write something like that? 

It depends on the logic of the problem statement.

Let's see the below demo.

In [9]:
chain.link_nodes(max_depth=3)
print(chain.visualize(chain.head))

Node 0: Value = [2 0 2 1 1]
Node 1: Value = [1 4 4 4 2]
Node 2: Value = [0 2 4 4 0]
Node 3: Value = [2 3 0 3 1]
None


As you can see now, the first node is stored at ```chain.head``` instead of ```chain``` itself. But why would we do something like this? In our example, we shouldn't, because we are simply trying to chain up nodes.

However, what if we are trying to chain up nodes that belong to a Tree? A Tree is not a node, so the tree should not inherit the Node class's attributes. In this case, we can store the first node of the tree as the root node in one of the tree's attributes instead.

This will encapsulate the logic of a Node and the logic of a Tree better.
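A minimal sketch of the Tree idea, assuming two hypothetical classes ```TreeNode``` and ```Tree``` that are not defined elsewhere in this notebook. Here the tree does not subclass the node at all; it only holds a reference to its root.

```python
class TreeNode:
    """Hypothetical node: holds a value and references to child nodes."""

    def __init__(self, value):
        self.value = value
        self.children = []


class Tree:
    """A Tree *has* a root node; it is not itself a node (composition)."""

    def __init__(self, root_value):
        self.root = TreeNode(root_value)

    def size(self) -> int:
        # Count nodes with an explicit stack instead of recursion
        count, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            count += 1
            stack.extend(node.children)
        return count


tree = Tree("root")
tree.root.children.append(TreeNode("left"))
tree.root.children.append(TreeNode("right"))

print(tree.size())             # 3
print(hasattr(tree, "value"))  # False, the Tree carries no node attributes
```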

<span style="color:red">Note: The method within ```LinkNodes``` has changed from recursion to a generator expression. This is not due to the class requirements, but rather a demonstration that the same logic can be written in different ways, and that we should be open to trying different implementations to improve our code quality.</span>

### 2. Packing and Unpacking in Python

One of the most common things we do in coding is packing and unpacking variables. In this section, we will look at some ways to pack and unpack variables in Python efficiently.

In [10]:
x = np.random.rand(2, 5)

def numpy_layer(x, input_size, output_size):
    weight = np.random.rand(output_size, input_size)
    bias = np.random.rand(output_size)
    return np.dot(x, weight.T) + bias

mylist = [x, 5, 3]
numpy_layer(*mylist)

array([[1.41948438, 0.7583901 , 1.54077774],
       [1.83883162, 1.31490136, 2.21907052]])

The " * " in the above example unpacks the arguments into a function call: [x, 5, 3] is unpacked into x, input_size and output_size.

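We can also pack an unspecified number of positional arguments into a function using " * ", as shown below. Inside ```get_mean```, ```args``` is a tuple holding every positional argument passed in, so the function accepts any number of values.

In [11]:
def get_mean(*args):
    # args is a tuple of all positional arguments, e.g. (1, 2, 4, 5, 6)
    return np.mean(args)

values = [1,2,4,5,6]
# unpack the list so that each value becomes its own argument
get_mean(*values)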

np.float64(3.6)

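For keyword arguments, we can use ```**kwargs``` instead. In the function definition, ```**kwargs``` packs the keyword arguments into a dictionary; at the call site, ```**dict_A``` unpacks a dictionary into keyword arguments.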

In [12]:
def get_total_salary(**kwargs):
    
    total_salary = 0

    for _, arg in kwargs.items():
        total_salary += arg
    
    return total_salary

dict_A = {"chris":1000,"Tommy":10500}
print(get_total_salary(**dict_A))

11500
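The two forms can be combined in one signature. The sketch below uses a hypothetical function, ```describe_call```, purely to show what Python packs into ```args``` and ```kwargs```, and that the same symbols unpack at the call site.

```python
def describe_call(*args, **kwargs):
    """Return what Python packed into args (a tuple) and kwargs (a dict)."""
    return args, kwargs


args, kwargs = describe_call(1, 2, name="Alice", role="Engineer")
print(args)    # (1, 2)
print(kwargs)  # {'name': 'Alice', 'role': 'Engineer'}

# At the call site the same symbols unpack instead of pack
positional = [1, 2]
named = {"name": "Alice", "role": "Engineer"}
assert describe_call(*positional, **named) == (args, kwargs)
```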


You can also unpack variables neatly without indexing a list or tuple multiple times.

In [13]:
words = ('walk','the','dog')
verb = words[0]
nonstop = words[1]
noun = words[2]

# Neater way
verb, nonstop, noun = words
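Unpacking on the left-hand side of an assignment also supports a starred name, nested targets, and swapping. A short sketch with made-up values:

```python
# A starred name collects whatever is left over into a list
first, *middle, last = [10, 20, 30, 40, 50]
print(first)   # 10
print(middle)  # [20, 30, 40]
print(last)    # 50

# Nested structures can be unpacked in one statement as well
name, (x, y) = ("origin", (0, 0))
print(name, x, y)  # origin 0 0

# Swapping two variables uses the same packing/unpacking mechanism
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1
```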

```zip``` in Python is a useful function that takes multiple iterables and pairs up their values by index into one iterable object.

In [14]:
a = ("Alice","Bernard","Cathy")
b = ("Tan","Goh","Wong")

# names is the iterable object combining a and b
names = zip(a,b)

a, b, c = names
print(a)
print(b)
print(c)

('Alice', 'Tan')
('Bernard', 'Goh')
('Cathy', 'Wong')


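It can also help you iterate over two dictionaries at the same time. Note that ```zip``` pairs items by position, so this is only coherent when both dictionaries have the same keys in the same insertion order; ```zip``` also stops at the shorter input.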

In [15]:
# Unpacking dictionary items in parallel
dict_a ={"first_name":"Bernard","last_name":"Goh","role":"AI Engineer"}
dict_b ={"first_name":"Alice","last_name":"Tan","role":"Data Scientist"}

for x, y in zip(dict_a.items(),dict_b.items()):
    print(x)
    print(y)

('first_name', 'Bernard')
('first_name', 'Alice')
('last_name', 'Goh')
('last_name', 'Tan')
('role', 'AI Engineer')
('role', 'Data Scientist')
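Because ```zip``` pairs by position and stops at the shortest input, mismatched inputs can be dropped silently. The sketch below, using made-up data, shows that behaviour, the padding alternative ```itertools.zip_longest```, and a key-based pairing for dictionaries that does not depend on insertion order. (On Python 3.10 and later, ```zip(..., strict=True)``` raises an error on a length mismatch instead; it is left out of the code so the example also runs on older versions.)

```python
from itertools import zip_longest

names = ["Alice", "Bernard", "Cathy"]
scores = [90, 85]  # one item short

# zip() silently drops the unmatched "Cathy"
print(list(zip(names, scores)))
# [('Alice', 90), ('Bernard', 85)]

# zip_longest() pads the shorter iterable instead
print(list(zip_longest(names, scores, fillvalue=None)))
# [('Alice', 90), ('Bernard', 85), ('Cathy', None)]

# For dictionaries, looking up by key avoids relying on insertion order
dict_a = {"first_name": "Bernard", "role": "AI Engineer"}
dict_b = {"role": "Data Scientist", "first_name": "Alice"}
paired = {key: (dict_a[key], dict_b[key]) for key in dict_a.keys() & dict_b.keys()}
print(paired["role"])  # ('AI Engineer', 'Data Scientist')
```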


### 3. Looping beyond For Loops: Itertools, Map and List Comprehension

##### 3.1.0 Itertools: using Product()

```itertools.product()``` computes the Cartesian product of the input iterables. The logic is equivalent to nested for loops.

So product of A & B is the same as:

```
mylist = []
for x in A:
    for y in B:
        mylist.append((x,y))
```

See the example below

In [16]:
a = np.ones((5,1))
b = np.ones((5,1))
c = np.ones((5,1))

num = np.zeros((5,1))

for x, y, z in iter.product(a,b,c):
    num += x + y + z

print(num)

[[375.]
 [375.]
 [375.]
 [375.]
 [375.]]
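In the example above, the three arrays of five rows give 5 × 5 × 5 = 125 combinations, and each combination adds 1 + 1 + 1 = 3, which is where 375 comes from. The equivalence with nested loops can be checked directly on small made-up inputs; this sketch imports ```itertools``` under its own name rather than the ```iter``` alias used in the notebook.

```python
import itertools

A = [1, 2]
B = ["x", "y", "z"]

# Nested-loop version
nested = []
for x in A:
    for y in B:
        nested.append((x, y))

# product() yields the same pairs in the same order
assert nested == list(itertools.product(A, B))
print(len(nested))  # 6, i.e. len(A) * len(B)

# `repeat` takes the product of an iterable with itself
print(list(itertools.product([0, 1], repeat=2)))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
```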


##### 3.2.0 Map: Using map() for readability

There are some cases where using ```map()``` reads better than a for loop or a list comprehension.

<span style="color:red">Note that these are very opinionated guidelines on how to write loops. Please refer to your Lead for their opinions.</span>

In [17]:
a = [1,2,3]
b = [5,6,7]
c = [4,8,9]

def process_arrays(x,y,z):
    return x+y-z

# the loop is easy to read but longer
nums =[]
for x in range(len(a)):
    num=process_arrays(a[x],b[x],c[x])
    nums.append(num)
print(nums)

# list comprehension is nice but harder to read (subjective)
list_nums = [process_arrays(a[x],b[x],c[x]) for x in range(len(a))]
print(list_nums)

# map in this case will be neater
map_nums=list(map(process_arrays, a, b, c))
print(map_nums)


[2, 0, 1]
[2, 0, 1]
[2, 0, 1]
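Two further points may help when weighing these options. First, much of the noise in the loop and comprehension versions comes from indexing, which ```zip``` removes. Second, ```map``` returns a lazy iterator, so nothing is computed until it is consumed. A small sketch reusing the same inputs:

```python
def process_arrays(x, y, z):
    return x + y - z


a = [1, 2, 3]
b = [5, 6, 7]
c = [4, 8, 9]

# zip() removes the index bookkeeping from the comprehension
zipped_nums = [process_arrays(x, y, z) for x, y, z in zip(a, b, c)]
print(zipped_nums)  # [2, 0, 1]

# map() is lazy: nothing is computed until the result is consumed
lazy = map(process_arrays, a, b, c)
print(next(lazy))   # 2
print(list(lazy))   # [0, 1], the remaining items
```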


##### 3.3.0 Using List Comprehension for Loops

A list comprehension is generally a better way to write a for loop if there aren't many operations in each iteration.

Take the function ```square``` below: if we want to ```square``` every value in a list, a list comprehension works very well.

In [18]:
data = list(range(1000))

def square(x):
    return x*x

x_squared = [square(value) for value in data]

print(x_squared)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, ... (output truncated)

Beyond readability, let's see if there is any performance difference between a normal for loop, map and list comprehension. Writing fast and efficient code is part of becoming a better developer.

In [19]:
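def timeit_decorator(func):
    """Print how long a single call to func takes."""
    def wrapper(*args,**kwargs):
        # perf_counter is a monotonic, high-resolution clock meant for timing
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        time_taken = end_time - start_time
        print(f"{func.__name__} took {time_taken:.6f} seconds to execute.")
        return result
    return wrapper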

@timeit_decorator
def test_loop(data):
    result = []
    for item in data:
        result.append(square(item))
    return None

@timeit_decorator
def test_map(data):
    list(map(square,data))
    return None

@timeit_decorator
def test_list(data):
    [square(item) for item in data]
    return None

print(test_loop(data))
print(test_map(data))
print(test_list(data))

test_loop took 0.000060 seconds to execute.
None
test_map took 0.000054 seconds to execute.
None
test_list took 0.000045 seconds to execute.
None
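A single run that lasts tens of microseconds is easily dominated by noise, so differences this small should be read with caution. One way to get steadier numbers is the standard library's ```timeit``` module, which repeats each call many times. The sketch below is one possible setup; the function names ```with_loop```, ```with_map``` and ```with_comprehension``` and the repeat counts are arbitrary choices for illustration, and the results will vary by machine and Python version.

```python
import timeit


def square(x):
    return x * x


data = list(range(1000))


def with_loop():
    result = []
    for item in data:
        result.append(square(item))
    return result


def with_map():
    return list(map(square, data))


def with_comprehension():
    return [square(item) for item in data]


# All three must agree before their speed is compared
assert with_loop() == with_map() == with_comprehension()

for func in (with_loop, with_map, with_comprehension):
    # Best of 5 repeats, 200 calls each
    best = min(timeit.repeat(func, number=200, repeat=5))
    print(f"{func.__name__}: {best / 200 * 1e6:.1f} microseconds per call")
```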


##### 3.4.0 Using Dictionary Comprehension to build Dictionary

If you have two lists, one of keys and one of values, that you want to turn into a dictionary, a dictionary comprehension can be an option.

In [20]:
# Dictionary Comprehension
staffs = ["Alice","Bernard","Cathy"]
emails = ["alice@email.com","bernard@email.com","cathy@email.com"]

staff_email_dict = {staff: email for staff, email in zip(staffs, emails)}
print(staff_email_dict)

{'Alice': 'alice@email.com', 'Bernard': 'bernard@email.com', 'Cathy': 'cathy@email.com'}
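When the keys and values need no processing, ```dict(zip(...))``` builds the same mapping more directly; a comprehension becomes more useful once keys or values have to be transformed or filtered. A short sketch using the same lists:

```python
staffs = ["Alice", "Bernard", "Cathy"]
emails = ["alice@email.com", "bernard@email.com", "cathy@email.com"]

# No transformation needed: dict(zip(...)) is enough
direct = dict(zip(staffs, emails))
print(direct["Cathy"])  # cathy@email.com

# Transforming the keys
lowercase_keys = {staff.lower(): email for staff, email in zip(staffs, emails)}
print(sorted(lowercase_keys))  # ['alice', 'bernard', 'cathy']

# Filtering with a condition
starts_with_b = {s: e for s, e in zip(staffs, emails) if s.startswith("B")}
print(starts_with_b)  # {'Bernard': 'bernard@email.com'}
```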


### 4. Practice Exercises