# COMS W3231 Intermediate Computing in Python
## Introduction to Python Software Engineering

**Date**: January 29, 2025\
Daniel Bauer (original notes on Top-Down design by Jan Janak)

---

## Software Engineering vs. Programming

Software Engineering is concerned with the design, implementation, testing, and maintenance of software systems. Programming/Coding focuses only on the implementation part.

You are probably reasonably good programmers at this point. A main goal of this course is to teach you how to be a software engineer.

"[Software Engineering is] the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software."—IEEE Standard Glossary of Software Engineering Terminology _In human terms: it is a methodology for systematic software development._

Here are some aspects of the general software engineering workflow we will discuss in this course: 
* Think about the **design goals** of the project: what will it do, how will the end user use it, and what are its limitations?
* **Software Design:**
    * Modularization. Identifying separate components that can be implemented independently. Specify how the components fit together.
    * Top-Down Program Design.  
    * Object Oriented Design. Abstract Data Types. Design Patterns.
* **Implementation:**
    * Implement each component, adhering to the design specification.
    * Documentation, Debugging
* **Testing**
* **Deployment and maintenance**



### Top-down Program Design (a.k.a. Wishful Programming)

**Top-down Design**: Top-down program design is a particular software engineering strategy. The core idea is to divide the program up into individual components (functions, objects) that operate independently through a clearly defined interface. 

Initially, these components act like black boxes -- we wish/assume they were already implemented and work correctly. Then, we repeat the design process for each black box, potentially decomposing it into other components.

This is a form of **abstraction**: Initially, we do not have to worry about **how** each component is implemented. We just have to understand **what** it is supposed to do. 

Let's see an example of how we can put this definition into practice. We will design a non-trivial program. We will use functions to divide a larger, intimidating problem into manageable chunks that could be implemented incrementally and independently.

### Example: The Game of Pig

Pig is a two-player dice game. Players take turns to roll a single die as many times as they want. They add each result to a running total. But if they roll 1, they lose the gained score in this run and their turn is over. The first player to score 100 or more points wins.

#### Rules
A player repeatedly rolls a die until a 1 is rolled or they choose to "hold":
  * If the player rolls a 1, they score nothing in the run.
  * If the player rolls any other number, it is added to their running total, and the player continues.
  * If the player "holds", their total for this run is added to their score, and it becomes the other player's turn.
  * The first player to score 100 or more points wins.

Let's consider how we can implement the game in Python. The objective is for a single user to play against the computer interactively. We will start by *wishing* Python had a built-in function to play the game! Of course, we will have to implement it ourselves. Thinking this through (without programming) will guide us through the process.

We wish Python had a function called pig that we could run:
```python
pig()
```

The function does not exist, so let's consider how to implement it at a high level. Maybe we want this function to do three things:
  1. Display the rules
  2. Play the game
  3. Congratulate the winner
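
As a rough sketch of this "wishing" step, the three items above can be written as a top-level function whose helpers are placeholder stubs. The stub bodies and the `call_log` list are illustrative assumptions, used here only to show that the overall structure can be run before any helper is really implemented:

```python
call_log = []  # illustrative: records which stubs were called, in order

def display_rules():
    call_log.append("display_rules")  # stub: the real version prints the rules

def play_game():
    call_log.append("play_game")  # stub: the real version runs the turns

def congratulate_winner():
    call_log.append("congratulate_winner")  # stub: the real version prints a message

def pig():
    # The top-level design: three black boxes called in order.
    display_rules()
    play_game()
    congratulate_winner()

pig()
print(call_log)
```

Each stub can then be replaced by a real implementation, one at a time, without changing `pig` itself.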

Displaying the rules is simple. We know how to do that with just basic Python:
```python
def display_rules():
    print("Welcome to the Game of Pig!")
    print("Roll the die as many times as you want or hold")
    print("If you roll one, you lose the score from the current turn")
    print("The first player to score 100 or more wins!")
```

Congratulating the winner is also simple:
```python
def congratulate_winner():
    print("Congratulations!")
```

Now the function to **Play the game**. This will be a little harder, but we can apply top-down design to this function as well. What does the function do at a high level?
  1. Player 1 takes turn
  2. Player 2 takes turn
  3. Repeat steps 1 and 2 until somebody wins
  4. Announce winner

Let's go ahead and refine it. The repetition calls for a `while` or `for` loop:
```python
while (nobody has won):
    Player 1 takes turn
    Player 2 takes turn

Announce winner
```

Let's refine the `while` expression to make it look more like something Python can understand. "Nobody has won" means neither player has scored 100 or more. We probably need some variables to keep track of the scores. Never mind that the variables do not exist yet:
```python
while (player1_score < 100 and player2_score < 100):
    Player 1 takes turn
    Player 2 takes turn

Announce winner
```

There is a catch in the body of the `while` loop now. What if player 1 reaches 100? Player 2 should not get another turn in that case:
```python
while (player1_score < 100 and player2_score < 100):
    Player 1 takes turn
    if player1_score < 100:
        Player 2 takes turn

Announce winner
```
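
One way to check this loop logic before the turn functions exist is to pass them in as parameters and substitute trivial stand-ins. This is only a sketch: the parameters `player_1_turn`, `player_2_turn`, and `target`, and the returned tuple, are assumptions made for illustration and differ from the final program developed in these notes.

```python
def play_game(player_1_turn, player_2_turn, target=100):
    """Run the main loop; each turn function returns the points gained that turn."""
    player1_score = 0
    player2_score = 0
    while player1_score < target and player2_score < target:
        player1_score += player_1_turn()
        if player1_score < target:  # player 2 only plays if player 1 has not won
            player2_score += player_2_turn()
    return player1_score, player2_score

# Stand-in turns: player 1 always gains 30 points, player 2 always gains 25.
print(play_game(lambda: 30, lambda: 25))  # (120, 75)
```

Player 1 reaches 120 on the fourth round, so player 2 stops at 75 and does not get a fourth turn.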

Now the function **Play the game** seems almost right, except for "Player 1 takes turn", "Player 2 takes turn", and "Announce winner". Let's move one layer lower and work on those functions now.

**Player 1 takes turn** needs to perform the following:
  1. Roll the die
  2. Show the result
  3. If roll == 1 then set turn_score to zero and end turn
  4. Else add the roll to turn_score and ask if the player wants to roll again
  5. Repeat until roll == 1 or the player holds

Just like before, we can refine this into a more Pythonic-looking pseudo-code:

```python
turn_score = 0
while (player wants to roll again and roll != 1):
    roll the die
    show the result
    if roll == 1:
        turn_score = 0
        end the turn
    else:
        turn_score += roll
        ask player if roll again
return turn_score
```

Getting close. We need to provide the functions "roll the die" and "show the result", but that's simple with built-in modules:
```python
import random

def roll_the_die():
    return random.randint(1, 6)

def show_the_result(value):
    print(f"Rolled {value}")
```

**Player 2 takes turn**: Player 2 will be the computer. The implementation here will be slightly different because the computer must decide whether to roll or hold. How does the computer decide? The simplest strategy is to let the computer roll until it reaches a certain score, say, 20. Then it holds. Not the best, but for our example, it will do:
```python
turn_score = 0
while (turn_score < 20 and roll != 1):
    roll the die
    show the result
    if roll == 1:
        turn_score = 0
        end the turn
    else:
        turn_score += roll
return turn_score
```
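
To get a feel for how the hold threshold affects the computer's results, the strategy above can be simulated many times. The following is a hedged sketch, not part of the game itself: the function names `computer_turn` and `average_turn_score`, the `roll` parameter, and the trial count are all assumptions introduced for this experiment.

```python
import random

def computer_turn(threshold=20, roll=lambda: random.randint(1, 6)):
    """Roll until the turn score reaches the threshold or a 1 is rolled."""
    turn_score = 0
    while turn_score < threshold:
        value = roll()
        if value == 1:
            return 0  # rolling a 1 loses everything gained this turn
        turn_score += value
    return turn_score

def average_turn_score(threshold, trials=10_000, seed=0):
    """Estimate the average points per turn for a given hold threshold."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    roll = lambda: rng.randint(1, 6)
    total = sum(computer_turn(threshold, roll) for _ in range(trials))
    return total / trials

for threshold in (10, 20, 30):
    print(threshold, average_turn_score(threshold))
```

Passing the die in as a parameter also makes the turn logic easy to test with a predictable sequence of rolls.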

Now we are ready to sit down and create the Python program. We simply take the functions we have developed above and refine them until the program works. This should not take much work anymore, and it should be mostly mechanical.

In [None]:
import random

def roll_the_die():
    return random.randint(1, 6)

def show_the_result(value):
    print(f"Rolled {value}")

def display_rules():
    print("Welcome to the Game of Pig!")
    print("Roll the die as many times as you want or hold")
    print("If you roll one, you lose the score from the current turn")
    print("The first player to score 100 or more wins!")
    print("")

def congratulate_winner():
    print("Congratulations!")

def player_1_turn():
    print("Your turn")
    turn_score = 0
    again = 'y'
    roll = 0
    while (again == 'y' and roll != 1):
        roll = roll_the_die()
        show_the_result(roll)
        if roll == 1:
            turn_score = 0
            print(f"Your turn score is {turn_score}")
            break
        else:
            turn_score += roll
            print(f"Your turn score is {turn_score}")
            again = input("Roll again? (y/n) ").strip().lower()
    return turn_score    

def player_2_turn():
    print("Computer's turn")
    turn_score = 0
    roll = 0

    while (turn_score < 20 and roll != 1):
        roll = roll_the_die()
        show_the_result(roll)
        if roll == 1:
            turn_score = 0
            break
        else:
            turn_score += roll
            print(f"Computer's turn score is {turn_score}")
    return turn_score    

def announce_winner(player1_score):
    if player1_score >= 100:
        print("User wins!")
    else:
        print("Computer wins!")

def play_game():
    player1_score = 0
    player2_score = 0
    while (player1_score < 100 and player2_score < 100):
        player1_score += player_1_turn()
        if player1_score < 100:
            player2_score += player_2_turn()

    announce_winner(player1_score)

def pig():
    display_rules()
    play_game()
    congratulate_winner()

pig()

We started with a rough idea and incrementally refined the *pseudo-code* into something that resembled a Python program. We then sat down and typed the Python program based on the pseudo-code. Following this approach while working on more complex problems might save you time. Thinking ahead and expressing ideas in pseudo-code is a helpful approach. Start by assuming Python has all the functions that you need. Iteratively implement what is missing, applying the same design process to the components.

## OOP Design

Let's apply the idea of **abstraction** we saw with functions above to Object Oriented Programming. In fact, the concept of abstraction is essential to object-oriented programming!

Similar to how functions provide an abstraction over components of the code/functionality, objects provide an abstraction over data.

Recall the three core aspects of Object Oriented Programming:

* **Encapsulation**
    * Objects bundle together data (data fields/instance variables) and functionality (methods). We can interact with the data by calling methods. When interacting with an object, we do not need to fully understand *how* the data is stored and how the methods are implemented, but only *what* is stored, and what the methods are supposed to do. 
* **Polymorphism**
    * Allows objects of different classes to be treated as interchangeable. In Python, this usually means that these classes all implement certain required functionality. This is known as **Duck Typing** (*if it looks like a duck and quacks like a duck, it's a duck*). In Python, we can implement certain special methods so that objects work with built-in Python syntax (for example, comparisons with <, >, ==).
* **Inheritance**
    * Classes can be related to each other through inheritance. A child class/subclass inherits the functionality of the parent class/superclass and may extend this functionality. Different subclasses of the same parent class share the functionality they inherit from it.
 
Together, these mechanisms make it possible to design programs using abstraction, treating objects as black boxes and "wishing" that they provide certain functionality.
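
The following minimal sketch illustrates the three ideas together. All class names (`Animal`, `Dog`, `Cat`, `Robot`) and methods are hypothetical and chosen only for illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name  # encapsulation: data lives with the methods that use it

    def describe(self):
        # Relies on a sound() method that each subclass is expected to provide.
        return f"{self.name} says {self.sound()}"

class Dog(Animal):  # inheritance: Dog reuses __init__ and describe from Animal
    def sound(self):
        return "Woof"

class Cat(Animal):
    def sound(self):
        return "Meow"

class Robot:  # not related to Animal at all
    def describe(self):
        return "Robot says beep"

# Polymorphism / duck typing: anything with a describe() method works here.
descriptions = [thing.describe() for thing in [Dog("Rex"), Cat("Tom"), Robot()]]
print(descriptions)
```

The list comprehension never checks the type of each object; it only "wishes" that `describe()` exists.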

### Example 1: Reconsider Binary Search 

Binary search works by making repeated comparisons between the key (the target we are looking for) and the candidates in the list. *The <, >, and == operations are the only functionality we require of the objects in the list*.

In [3]:
def binary_search(arr, x):
    left = 0
    right = len(arr) - 1

    mid = (left + right) // 2
    while left <= right and arr[mid] != x:
        if arr[mid] > x:  # x must be located in the first half
            # equivalent to arr[mid].__gt__(x)
            right = mid - 1
        elif arr[mid] < x:  # x must be located in the second half
            left = mid + 1

        mid = (left + right) // 2

    if left <= right:
        return mid
    else:
        raise ValueError("Not found.")


class Customer:

    def __init__(self, customer_id, name): 
        self.customer_id = customer_id
        self.name = name
        # could add other information about the customer here

    def get_name(self): # "getter" method ensures encapsulation
        return self.name

    def get_id(self): 
        return self.customer_id

    def __gt__(self, other): 
        if not isinstance(other, Customer):
            raise TypeError(f"cannot compare Customer to {type(other).__name__}")
        return self.customer_id > other.get_id()
    
    def __lt__(self, other): 
        if not isinstance(other, Customer):
            raise TypeError(f"cannot compare Customer to {type(other).__name__}")
        return self.customer_id < other.get_id()

    def __eq__(self, other): 
        if not isinstance(other, Customer):
            raise TypeError(f"cannot compare Customer to {type(other).__name__}")
        return self.customer_id == other.get_id()



c1 = Customer(1,"Marge")
c2 = Customer(4,"Homer")
c3 = Customer(8,"Bart")
c4 = Customer(11,"Lisa")
c5 = Customer(32,"Maggie")

arr = [c1, c2, c3, c4, c5]

found_idx = binary_search(arr,c4)
print(found_idx)

print(arr[found_idx].get_name())


3
Lisa
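
The same principle applies to library code. As a sketch, the standard library's `bisect` module can search a sorted list of user-defined objects as long as they support `<`. The `Version` class below is a hypothetical example, not part of these notes.

```python
import bisect

class Version:
    """Hypothetical class ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __lt__(self, other):  # used by bisect to compare elements
        return self._key() < other._key()

    def __eq__(self, other):  # used below to confirm an exact match
        return self._key() == other._key()

versions = [Version(1, 0), Version(1, 4), Version(2, 0), Version(3, 1)]
target = Version(2, 0)

idx = bisect.bisect_left(versions, target)  # leftmost position where target fits
found = idx < len(versions) and versions[idx] == target
print(idx, found)  # 2 True
```

`bisect_left` knows nothing about versions; it only relies on the comparison the class provides.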


### Example 2: Iterators and the Range class

Recall the range function used for iteration. 

```python
for i in range(10):
    ...
```

`range` is actually a type, and `range(10)` creates an instance of this type, a range object.

In [6]:
r = range(3)
r

range(0, 3)

We have no idea how the range class is implemented, but we do know we can use it for iteration, just like other types of objects (so-called iterables). 



In [130]:
for i in r: 
    print(i)

0
1
2


In [8]:
for i in [12, 4, 1, 2]:
    print(i)

12
4
1
2


The reason why you can iterate over range objects and lists is that these types both implement the Iterable/Iterator protocol -- they are both iterables. This is a kind of OOP *design pattern*. 

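**Iterator:** An iterator is any object that supports the following method

* `__next__()` - return the next item in the iteration. If no more items are left, raise a `StopIteration` exception.

(Python's full iterator protocol also expects an iterator to provide an `__iter__()` method that returns the iterator itself, so that an iterator can be used directly in a `for` loop. The examples below omit this for simplicity.)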

**Iterable:** An iterable is any object that supports the following method

* `__iter__()` - return a new iterator.

For example

In [12]:
range(3).__iter__()

<range_iterator at 0x111d9ab80>

In [14]:
[1,2,3].__iter__()

<list_iterator at 0x104415c30>

In [16]:
iter([1,2,3]) # built-in function iter just calls the special __iter__ method

<list_iterator at 0x111d99870>

In [34]:
it = range(3).__iter__() #iter(range(3))

In [36]:
it

<range_iterator at 0x112102520>

In [20]:
next(it) # built-in function next just calls the special __next__ method

0

In [22]:
next(it)

1

In [24]:
next(it)

2

In [26]:
next(it)

StopIteration: 
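
The sequence of calls above is essentially what a `for` loop does behind the scenes. The sketch below spells this out by hand; the helper name `manual_for_each` and its `action` parameter are made up for illustration.

```python
def manual_for_each(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    it = iter(iterable)        # ask the iterable for a fresh iterator
    while True:
        try:
            item = next(it)    # request the next item
        except StopIteration:  # the iterator signals that it is exhausted
            break
        action(item)

collected = []
manual_for_each(range(3), collected.append)
print(collected)  # [0, 1, 2]
```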

Question: Why separate iterators from iterables? 

* Iterating requires keeping track of the current state of the iteration, but we don't want to change the state/data of the iterable.
* Consider what happens if you want to iterate over the same iterable multiple times.

In [28]:
x = [1,2,3]
for i in x: 
    for j in x: 
        print(i,j)

1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
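
The nested loops work because each `for` loop requests its own iterator from the list. A small sketch makes the independent state visible (the variable names are illustrative):

```python
x = [1, 2, 3]

it_a = iter(x)  # first iterator over x
it_b = iter(x)  # second, independent iterator over the same list

first_two = (next(it_a), next(it_a))  # advances only it_a
start_of_b = next(it_b)               # it_b still starts at the beginning

print(first_two, start_of_b)  # (1, 2) 1
print(x)                      # the list itself is unchanged: [1, 2, 3]

# Calling iter() on an iterator returns that same iterator, not a fresh one.
print(iter(it_a) is it_a)  # True
```

If the list stored the iteration position itself, the inner loop would disturb the outer one.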


**Implementing your own iterable:** We will implement our own Range class.

In [64]:
class RangeIterator:
    
    def __init__(self, start, stop): 
        self.items_in_range = []
        current = start
        while current < stop: 
            self.items_in_range.append(current)
            current += 1

        self.progress = 0

    def __next__(self): # RangeIterator is an iterator, so it implements __next__

        if self.progress >= len(self.items_in_range):
            raise StopIteration
        
        item = self.items_in_range[self.progress]
        self.progress += 1
        return item
        
class Range:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self): # Range is an iterable, so it has to implement __iter__
        return RangeIterator(self.start, self.stop)
        

In [66]:
r = Range(0,5)

In [68]:
it = iter(r)

In [70]:
next(it)

0

In [72]:
next(it)

1

In [76]:
for i in Range(0,5): 
    print(i)

0
1
2
3
4


Note that this implementation works! Again, in order to use the new Range class, we do not have to worry about **how** it is implemented, only **what** it is supposed to do.

A closer look reveals that the implementation of the Range class is quite inefficient, at least in terms of memory use. There is no reason why we have to store all elements in the range in a list. Instead, we can just keep track of the current element. Each time `next` is called, we advance the current element until the stop position is reached. This is an example of *lazy evaluation*.

In [92]:
class RangeIterator:
    
    def __init__(self, start, stop): 
        self.current = start
        self.stop = stop
    
    def __next__(self): # RangeIterator is an iterator, so it implements __next__

        if self.current >= self.stop:
            raise StopIteration
        
        item = self.current    #return self.current++ would work in Java, but not in Python
        self.current += 1
        return item
            

class Range:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        return RangeIterator(self.start, self.stop)

In [94]:
for i in Range(0,5):
    print(i)

0
1
2
3
4
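
Python's generator functions offer a more compact way to get the same lazy behavior without writing a separate iterator class. The sketch below is an alternative, not the approach used in these notes, and the class name `GeneratorRange` is hypothetical.

```python
class GeneratorRange:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        # Because this method contains `yield`, each call returns a new
        # generator object, which is an iterator with its own state.
        current = self.start
        while current < self.stop:
            yield current  # pause here until the next item is requested
            current += 1

r = GeneratorRange(0, 5)
print(list(r))  # [0, 1, 2, 3, 4]

# Each loop gets its own generator, so nested iteration works as expected.
small = GeneratorRange(0, 2)
pairs = [(i, j) for i in small for j in small]
print(pairs)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

As with the hand-written iterator, no list of all elements is ever built; values are produced one at a time on demand.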
