In [None]:
from cs103 import *
# call help with a 'library' (in quotes) to get info about it
help('cs103')

# CPSC 103 - Systematic Program Design
# Module 03 Day 1
Rik Blok, with thanks to Prof. Giulia Toti

---

# Reminders
- Wed: Module 2 (HtDF): Code Review
- Wed: Module 2 (HtDF): Tutorial
- Mon: Module 3 (HtDD): Worksheet
- Mon: Module 4: Pre-Lecture Assignment
- Mon: Project: Team Registration (only required if you're working with a partner)

See also the [course calendar](https://canvas.ubc.ca/calendar?include_contexts=course_106343) (**[v.gd/6KJtbx](https://v.gd/6KJtbx)**).

---

# Your progress so far

#### Module 1: 
- You learned the basics of Python syntax and how to use Jupyter notebooks to run Python code
- You gained an understanding of variables and functions

#### Module 2:
- You learned the How to Design Functions (HtDF) recipe, which allows you to write clear and well-structured functions

---

# Module learning goals

By the end of this module, you will be able to:

- Use the How to Design Data (HtDD) recipe to design data definitions. 
- Identify problem domain information that should be represented as simple atomic data, intervals, enumerations, and optionals. 
- Use the Data Driven Templates recipe to generate templates for functions operating on data of a user-defined type. 
- Use the How to Design Functions (HtDF) recipe to design functions operating on data of a user-defined type. 

---

# Modeling information

- We need to connect the data in our program with information from our problem (and vice versa)
- The HtDD recipe helps us represent the information in the problem as data in our program, using the most suitable format
- Then the HtDF recipe helps us design functions that solve the problem using this data

<div style="width: 100%">
    <a title="How to Design Programs, Second Edition" href="https://htdp.org/2022-8-7/Book/part_one.html#%28counter._%28figure._fig~3adata-info%29%29"><img alt="From information to data, and back" src="https://raw.github.students.cs.ubc.ca/rikblok/public/main/CPSC103/module03-from-info-to-data-and-back.gif?token=GHSAT0AAAAAAAAAEO3CWUGP5S5CF7GR2YEWY65MZDA"></a>
</div>

---

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Information and data
Our program is storing the following data in a variable: `100`.  How could this data be interpreted as information in a relevant problem domain?

<!-- formatting: add two spaces at end of line to force linebreak -->

A. 100 students in a course  
B. The width of a 100×100 pixel image  
C. $100 in a bank account  
D. All of the above  
E. None of the above  

<details>

- The problem domain guides the representation of information as data
- Data without domain knowledge is meaningless!
- The programmer's job is to bridge the divide so the user can work with *information* and the program can work with *data*

</details>

---

# Primitive vs. non-primitive data

#### Primitive Python data
Data types built into Python, with no problem-domain meaning attached. These are also called *atomic non-distinct*.

Examples: `int`, `float`, `str`, `bool`

#### Glossary
- *Primitive*: Built into Python
- *Atomic*: Can't be broken down into smaller pieces
- *Distinct*: A specific value (e.g., `False`, `-3`, or `'hi'`)  
- *Non-distinct*: Can take on more than one value (e.g., `bool`)


#### Non-primitive data
Data types we define ourselves, built from primitive data.

Examples: `age`, `height`, `name`, `grade`, ...

**Non-primitive data imparts some information from the problem into Python data, enriching it with some meaning.**
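As a rough illustration of the difference, the sketch below stores the same value twice: once as a bare primitive and once through a named type. The names `Age` and `A_VOTER` are hypothetical, chosen only for this example.

```python
# Primitive data: Python has no idea what this number means.
x = 19

# Non-primitive data: a named type built on a primitive one.
# (Age and A_VOTER are hypothetical names used only in this sketch.)
Age = int  # in range [0, ...)
# interp. a person's age in years

A_VOTER = 19

# Python still stores a plain int; the name is what carries the meaning
# for the person reading the program.
print(Age is int)
print(type(A_VOTER) is int)
```

Both values are identical to Python. Only the name and the interpretation comment tell a reader that the second one is an age.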

---

# How to Design Data recipe (HtDD)

The HtDD recipe consists of the following steps: 
1. Definition 
2. Interpretation 
3. Examples 
4. Template 

There are several types of data, and there is *no firm rule* for when to use a particular one... it should just fit your problem.

---

# How to Design Data recipe (HtDD)

1. **Definition:** the line that tells Python the name of the new type. It is like the signature in HtDF.
2. **Interpretation:** describes what the new data type represents. It is like the purpose in HtDF.
3. **Examples:** show how to form data of this type, usually including special cases.
4. **Template:** a one-parameter function that shows how a function acting on this data should operate.

---

# HtDD naming conventions

1. **Data** type names use `UpperCamelCase` [[Wikipedia](https://en.wikipedia.org/wiki/Camel_case)].  E.g.,
```python
MovieTitle = ...
```
2. **Interpretation** starts with `# interp.`.  E.g.,
```python
# interp. Stores the name of a movie.
```
3. **Examples** are in `ALL_CAPS` [[Wikipedia](https://en.wikipedia.org/wiki/All_caps)] and often use an abbreviated type name followed by a number.  E.g.,
```python
MT1 = ...
MT2 = ...
MT_FAVOURITE = ...
```
4. **Template** function names start with `fn_for_`, followed by the type name in `snake_case` [[Wikipedia](https://en.wikipedia.org/wiki/Snake_case)].  
Parameter names are in lowercase, usually an abbreviation of the data type name.  E.g.,
```python
def fn_for_movie_title(mt: MovieTitle) -> ...:
```
5. (**Variables and functions** - for consistency, let's also use `snake_case` for variables, functions, and parameters, moving forward)

See also the course [Style Guide](https://canvas.ubc.ca/courses/106343/pages/style-guide?module_item_id=5186602).

#### Benefits:
- Provides additional information about the use of an identifier
- Reduces ambiguity
- Promotes code sharing and re-use

---

# Data types

#### Now:
- Simple atomic data
- Interval
- Enumeration
- Optional

#### Later:
- Compound data (Module 4)
- Arbitrary-sized (Module 5)

---

# Simple atomic data

When the information to be represented is itself atomic in form. Usually these are just the primitive data with a better name and description. 

```python
Temperature = float 
# interp. the air temperature in degrees Celsius

T1 = 0.0 
T2 = -24.5 
 
@typecheck 
# template based on Atomic Non-Distinct 
def fn_for_temperature(t: Temperature) -> ...: 
    return ...(t) 
```
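A minimal sketch of a function designed from this data definition might look like the following. The function name `is_freezing` and its threshold are assumptions made for illustration, and the course's `@typecheck` decorator is left out so the sketch runs without the `cs103` library.

```python
Temperature = float
# interp. the air temperature in degrees Celsius

# is_freezing is a hypothetical function, used only to illustrate the recipe.
def is_freezing(t: Temperature) -> bool:
    """
    Returns True if t is at or below 0 degrees Celsius, otherwise False.
    """
    # return True  # stub
    # template from Temperature
    return t <= 0.0

print(is_freezing(-24.5))
print(is_freezing(0.0))
print(is_freezing(18.0))
```

The body follows the shape of the template, `return ...(t)`, with the `...` replaced by a comparison on `t`.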

---

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Simple atomic
Which of the following are examples of information that would be best represented with **simple atomic** data types?  Select **ALL** that apply.

<!-- formatting: add two spaces at end of line to force linebreak -->

A. The temperature of liquid water at standard atmospheric pressure  
B. Allergies that a patient has  
C. The name of a book  
D. A blood type, such as A, B, AB, or O  
E. A phone number  

<details>
    
A. Under these conditions water is liquid between 0°C and 100°C  
B. A patient might not have any allergies  
C. Might be any string  
D. Humans have four distinct blood types  
E. Phone numbers can vary in number of digits and may contain special codes, such as `*` and `#`.  When presented they may also contain `-`, `+`, or other symbols   

</details>

---

# Interval

When the information to be represented is numbers within a certain range. 

```python
Time = int # in range[0, 86400) 
# interp. seconds since midnight

T_MIDNIGHT = 0 
T_ONE_AM = 3600 
T_NOON = 43200 
T_END_OF_DAY = 86399

@typecheck 
# Template based on Atomic Non-Distinct 
def fn_for_time(t: Time) -> ...: 
    return ...(t)
```

- Range is just a comment for the programmer, not enforced by Python
- Square and round bracket notation borrowed from math
- Recall: `[]` means the endpoint is *included* and `()` means it is *excluded*
- Use `...` to indicate no limit, e.g., `WaterDepth = float # in range (...,0]`
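Because Python does not enforce the range, a program that wants a check has to write one itself. The sketch below shows one possible way to do that, along with a function designed from the `Time` template. Both `is_valid_time` and `hour_of_day` are hypothetical names, and `@typecheck` is omitted so the code runs without the `cs103` library.

```python
Time = int  # in range [0, 86400)
# interp. seconds since midnight

# Hypothetical helper: the range comment is not enforced by Python,
# so this check spells it out explicitly.
def is_valid_time(t: Time) -> bool:
    """
    Returns True if t lies in the range [0, 86400), otherwise False.
    """
    # template from Time
    return 0 <= t < 86400  # 0 is included, 86400 is excluded

# Hypothetical function designed from the Time template.
def hour_of_day(t: Time) -> int:
    """
    Returns the hour (0 to 23) in which time t falls.
    """
    # template from Time
    return t // 3600

print(is_valid_time(86399))
print(is_valid_time(86400))
print(hour_of_day(43200))
```

Note how the square bracket maps to `<=` and the round bracket maps to `<` in the check.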

---

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Interval
Which of the following are examples of information that would be best represented with **interval** data types?  Select **ALL** that apply.

<!-- formatting: add two spaces at end of line to force linebreak -->

A. The name of someone's sibling  
B. A percentage score on a test  
C. A temperature in Celsius  
D. Whether a user is logged in  
E. The wavelength of a visible photon  

<details>
    
A. Someone might not have a sibling  
B. A number between 0 and 100 (inclusive)  
C. The temperature at the North pole?  Or on the surface of the sun?  Should we limit the range?  
D. Either they are or aren't 😉  
E. Our eyes can detect light between 400nm and 700nm  

</details>

---

# Enumeration

When the information to be represented consists of a fixed number of distinct values. 

```python
from enum import Enum 

Rock = Enum('Rock', ['ig', 'me', 'se']) 
# interp. a rock is either igneous ('ig'), metamorphic ('me'), or sedimentary ('se')

# examples are redundant for enumerations

@typecheck 
# Template based on Enumeration (3 cases) 
def fn_for_rock(r: Rock) -> ...: 
    if r == Rock.ig: 
        return ... 
    elif r == Rock.me: 
        return ... 
    elif r == Rock.se: 
        return ... 
```

- Python treats the cases in the definition as the only allowed distinct values, accessed with "dot notation" instead of strings
- Advantage over strings: Restricts allowed values
```python
my_rock = Rock.ig          # distinct Enum values allowed (good!)
my_rock = "kryptonite"     # any string allowed by Python (bad!)
my_rock = Rock.kryptonite  # will produce error (good!)
```
- Examples are redundant for enumerations, so just write the comment `# examples are redundant for enumerations`
- Use one branch per option in the template, all separated with `elif` (don't use `else` to catch other cases)
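As a sketch of how the enumeration template might be filled in, the function below returns the full name of each rock type. The name `rock_name` is hypothetical, and `@typecheck` is omitted so the code runs without the `cs103` library. The last few lines show the error Python raises for a value that is not part of the enumeration.

```python
from enum import Enum

Rock = Enum('Rock', ['ig', 'me', 'se'])
# interp. a rock is either igneous ('ig'), metamorphic ('me'), or sedimentary ('se')

# rock_name is a hypothetical function, used only to illustrate the template.
def rock_name(r: Rock) -> str:
    """
    Returns the full English name of the rock type r.
    """
    # template from Rock
    if r == Rock.ig:
        return 'igneous'
    elif r == Rock.me:
        return 'metamorphic'
    elif r == Rock.se:
        return 'sedimentary'

print(rock_name(Rock.me))

# Asking for a value that was never defined raises an AttributeError.
try:
    my_rock = Rock.kryptonite
except AttributeError:
    print('Rock has no value named kryptonite')
```

Each branch of the template produced exactly one line of the final function, which is why the template has one branch per option.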

---

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Enumeration
Which of the following are examples of information that would be best represented with **enumeration** data types?  Select **ALL** that apply.

<!-- formatting: add two spaces at end of line to force linebreak -->

A. An individual's emergency contact  
B. The day of the week  
C. How much money there is in a bank account  
D. A music genre played by a streaming channel  
E. The number of pages in a book  

<details>
    
A. They may not have an emergency contact  
B. Seven distinct values  
C. Could be positive 🙂 or negative 😢  
D. A channel typically plays music from several genres  
E. A book has at least one page  

</details>

---

# Optional

When the information to be represented is well-represented by another form of data (often simple atomic or interval) except for one special case. 

```python
from typing import Optional 

Countdown = Optional[int] # in range[0, 10] 
# interp. a countdown that has not started yet (None), 
# or is counting down from 10 to 0 

C0 = None 
C1 = 10 
C2 = 7 
C3 = 0 

@typecheck 
# Template based on Optional 
def fn_for_countdown(c: Countdown) -> ...: 
    if c is None: 
        return ... 
    else: 
        return ...(c) 
```
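One way the optional template could be filled in is sketched below. The function name `describe_countdown` and the strings it returns are assumptions made for illustration, and `@typecheck` is omitted so the code runs without the `cs103` library.

```python
from typing import Optional

Countdown = Optional[int]  # in range [0, 10]
# interp. a countdown that has not started yet (None),
# or is counting down from 10 to 0

# describe_countdown is a hypothetical function, used only to illustrate the template.
def describe_countdown(c: Countdown) -> str:
    """
    Returns a short description of the state of countdown c.
    """
    # template from Countdown
    if c is None:
        return 'not started'
    else:
        return 'counting: ' + str(c)

print(describe_countdown(None))
print(describe_countdown(7))
print(describe_countdown(0))
```

The special case (`None`) gets its own branch, and the remaining branch can safely treat `c` as an ordinary `int`. Note that `0` is a valid countdown value and is handled by the `else` branch, not the `None` branch.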

---

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Optional
Which of the following are examples of information that would be best represented with **optional** data types?  Select **ALL** that apply.

<!-- formatting: add two spaces at end of line to force linebreak -->

A. A person's job title  
B. A season of the year  
C. A user's middle name  
D. The age of a voter in a BC election  
E. A description of the weather  

<details>
    
A. They might not have a job  
B. There are four seasons  
C. Not everyone has a middle name  
D. You must be at least 19 to vote in a BC election  
E. Weather can be highly variable!  Don't restrict your description  


</details>

---

# Templates

#### Data template
This is a one-parameter function that shows how a function operating on this data should operate. 

#### How to use with function template
When designing a function with the HtDF recipe, in Step 3 ("Template"), use the data template instead of writing your own:
- Comment out the body of the stub, as usual
- Then **copy the body of the template from the data definition to the function**
- If there is no data definition, just copy all parameters (as we did in HtDF for atomic non-distinct data) 

References: [How to Design Data](https://canvas.ubc.ca/courses/106343/modules/items/5186606) and [Data Driven Templates](https://canvas.ubc.ca/courses/106343/modules/items/5186607) modules on Canvas
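The steps above can be sketched as follows, reusing the `Time` interval from earlier in this module. The function `is_morning` is a hypothetical example, and `@typecheck` is omitted so the code runs without the `cs103` library.

```python
Time = int  # in range [0, 86400)
# interp. seconds since midnight

# The data template from the data definition, shown as a comment for reference:
# def fn_for_time(t: Time) -> ...:
#     return ...(t)

# is_morning is a hypothetical function, used only to illustrate the steps.
def is_morning(t: Time) -> bool:
    """
    Returns True if t is before noon, otherwise False.
    """
    # return True  # stub, commented out once the template is copied
    # template from Time
    # return ...(t)  # body copied from fn_for_time, then filled in below
    return t < 43200

print(is_morning(3600))
print(is_morning(43200))
```

The commented lines record each stage of the recipe, so a reader can see where the final body came from.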

---

## "Standing" problem

**Problem:** Design a function that takes a standing (SD for "standing deferred",
AUD for "audit", and W for "withdraw") and determines whether the
student is still working on the course where they earned that
standing.

To do this, we'll need a data definition for a standing, first.

<img style="float: right; width:10%" src="https://lthub.ubc.ca/files/2020/07/iClicker-Cloud-Logo.png">

# iClicker question: Data definition
How do we represent "Standing" in a way that is understandable and meaningful in Python?

<!-- formatting: add two spaces at end of line to force linebreak -->

A. Simple atomic  
B. Interval  
C. Enumeration  
D. Optional  
E. Something else

## "Standing" solution

**Problem:** Design a function that takes a standing (SD for "standing deferred",
AUD for "audit", and W for "withdraw") and determines whether the
student is still working on the course where they earned that
standing.

Note:
- You don't need to memorize the library for the data definition.  Just look it up!
```python
# Simple atomic doesn't require a library
# Interval doesn't require a library
from enum import Enum        # Enumeration
from typing import Optional  # Optional
```

<details><summary>▶ Sample solution (For later.  Don't peek if you want to learn 🙂)</summary>
    
```python

from enum import Enum

Standing = Enum('Standing', ['SD', 'AUD', 'W'])
# interp. the standing of a student in a course, which is either SD for
# "standing deferred", AUD for "audit", or W for "withdraw"

# examples are redundant for enumeration

@typecheck
# Template based on Enumeration (3 cases)
def fn_for_standing(s: Standing) -> ...:
    if s == Standing.SD:
        return ...
    elif s == Standing.AUD:
        return ...
    elif s == Standing.W:
        return ...
    
```
    
</details>

In [None]:
Standing = ... # TODO!



Now we can design – using the HtDF recipe – the function that takes a standing and "determines whether the student is still working on the course where they earned that standing."

Notice that the "Template" step in the HtDF recipe changes from **writing** a template to instead **copying** a template.

In [None]:
@typecheck
def still_working(s: Standing) -> ...:
    """
    TODO!
    """
    return 0  # INCORRECT stub

start_testing()
#expect(still_working(TODO), TODO)
summary()

# Re-using our "Standing" data definition
A single data definition usually gets used for many different functions in your program, but we often only have time for one in class, tutorial, and assignments. Let's do a second design here!

**Problem:** Design a function that takes a standing (as above) and returns an English explanation of what the standing means.

We already have the data definition, which guides our function design. Indeed, the designed function is very similar to the previous one. Finding where it's *different* may tell you a lot about why examples and templates are useful!

In [None]:
@typecheck
def describe_standing(s: Standing) -> ...:
    """
    returns an English description of s
    """
    return 0  # INCORRECT stub

start_testing()
# We've gone ahead and filled in the test cases already to help move us along a bit!
# The HtDD recipe tells us we should have one test for every value in the Standing enumeration!
expect(describe_standing(Standing.SD), "Standing Deferred: awaiting completion of some additional requirement")
expect(describe_standing(Standing.AUD), "Auditing: sat in on the course for credit, but not for a grade")
expect(describe_standing(Standing.W), "Withdrawn: Withdrew from the course after the add/drop deadline")
summary()

Well done! Now, write a call to `describe_standing`:

In [None]:
# Call describe_standing