# Lesson 06 Reference - Designing Data

Why do we program? We program to solve real-world problems.

Solving real-world problems depends on appropriately _modelling_ the problem and the relevant data associated with it.

The better the data model fits the problem, the easier the problem is to solve.

### Three quotes from fancy people on this topic:

#### From [Linus Torvalds](https://en.wikipedia.org/wiki/Linus_Torvalds) (creator of the Linux operating system and the git version control application)
> "Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

#### From [Rob Pike](https://users.ece.utexas.edu/~adnan/pike.html) in his "Rules of Programming" (Rule 5)
> Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

#### From [Fred Brooks](https://en.wikipedia.org/wiki/The_Mythical_Man-Month) in his book, "The Mythical Man-Month"
> Representation is the essence of computer programming.
>
> ...
>
> Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious.


## Three basic types of data:
1. Atomic - Single pieces of data (e.g. `int`, `float`, `str`, `bool`)
2. Collection - Arrays of data (e.g. `list`, `tuple`, `set`)
3. Compound - Meaningful combinations of different data types (e.g. `dataclass`, `NamedTuple`, custom classes)



## Design Template for Atomic Data

```python
DataType: <type> # Interp. represents something about a thing...explain what the values in datatype could be or what their range might be - type names should be TitleCase (convention for python custom types and classes)
    
# Examples - variable names for examples should be all caps (convention for python constants)
DT0 = <an example value>
DT1 = <an example value>
```

**An example:**
```python
AMFreq: int # A Canadian AM radio frequency in kHz. Valid AM frequencies are between 540 and 1170 kHz and step in increments of 10.

#Examples
AMF0 = 1060
AMF1 = 540
AMF2 = 1170
```
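
The interpretation comment states the rule for a valid value, and that rule can also be written as a small check. This is a minimal sketch: the function name `is_valid_am_freq` is hypothetical and not part of the lesson's template, and the limits are taken directly from the comment in the example.

```python
def is_valid_am_freq(freq: int) -> bool:
    """
    Returns True if 'freq' fits the AMFreq interpretation:
    between 540 and 1170 kHz (inclusive), in steps of 10.
    """
    return 540 <= freq <= 1170 and freq % 10 == 0


print(is_valid_am_freq(1060))  # True
print(is_valid_am_freq(545))   # False, not a multiple of 10
print(is_valid_am_freq(1200))  # False, above the upper limit
```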

#### Using `Optional`
```python
from typing import Optional # First, import it from 'typing' module in the standard library

BatteryLevel: Optional[float] # Interp. represents the power of a battery in percent. None represents no battery present.

BL_1 = 12.3
BL_2 = 100.0
BL_3 = None

# 'bl' is a BatteryLevel (written out, since the annotation above does not define the name)
def check_battery_present(bl: Optional[float]) -> bool:
    """
    Returns True if the battery is present.
    Returns False otherwise.
    """
    return bl is not None
```
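
Checking for `None` before using the value is the usual pattern with `Optional`. A minimal sketch, assuming a hypothetical `describe_battery` function that is not part of the lesson:

```python
from typing import Optional


def describe_battery(bl: Optional[float]) -> str:
    """
    Returns a short description of a battery level.
    'bl' is a percentage, or None if no battery is present.
    """
    if bl is None:
        return "No battery present"
    return f"Battery at {bl}%"


print(describe_battery(12.3))  # Battery at 12.3%
print(describe_battery(None))  # No battery present
```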

# Compound Data

## 1. Data definition recipe for `dict` as a _compound_ type

Choose a `dict` for a compound data definition when any of the following conditions apply:

1. You do not know _what the attributes are_ that you will need to store as keys
2. You do not know _how many attributes_ you will need to store as keys
3. You do not know the _shape_ of your data (e.g. nested/tree/hierarchical data)


```python
from typing import Dict
MyCompoundType: Dict

# Interp. Represents a ... 
# keys represent ...
# values represent ...

# Examples
MCT1 = {<key1a>: <value1a>, <key1b>: <value1b>, ...}
MCT2 = {<key2a>: <value2a>, <key2b>: <value2b>, ...}
```

### An example

```python
from typing import Dict
WoodMaterial: Dict

# Interp. A dict to represent a structural wood material as defined in CSA O86-14
# species is a str which can be one of "D. Fir", "SPF", "Hem-Fir", or "Northern"
# grade is a str which can be one of "SS", "No. 1/2", "No. 3/Stud"
# E is the elastic modulus in MPa
# Refer to Table 6.3.1A for more information

WMD1 = {"species": "D. Fir", "grade": "SS", "E": 12500}
WMD2 = {"species": "Northern", "grade": "No. 3/Stud", "E": 6500}
WMD3 = {"species": "SPF", "grade": "No. 1/2", "E": 9500}
```
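
Because a `dict` does not fix its keys in advance, attributes can be read by key and new ones can be added later. A short sketch using values from the example; the `"treated"` key is a made-up attribute used only for illustration.

```python
wmd = {"species": "D. Fir", "grade": "SS", "E": 12500}

# Read an attribute by its key
print(wmd["E"])  # 12500

# Add an attribute that was not known when the dict was created
# ("treated" is hypothetical, for illustration only)
wmd["treated"] = False

# List whichever attributes the dict currently holds
print(list(wmd.keys()))  # ['species', 'grade', 'E', 'treated']
```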

## 2. Data definition recipe for `dataclass` (_compound_ type)

```python
from dataclasses import dataclass

@dataclass
class ClassName:
    attribute_a: <type>
    attribute_b: <type>
    attribute_c: <type>
    ...
```

### An example

```python
from dataclasses import dataclass

@dataclass # this thing is called a "decorator". Just go with it for now.
class WoodMaterial:
    species: str
    grade: str
    E: int
    
# Interp. A class to represent a structural wood material as defined in CSA O86-14
# species is a str which can be one of "D. Fir", "SPF", "Hem-Fir", or "Northern"
# grade is a str which can be one of "SS", "No. 1/2", "No. 3/Stud"
# E is the elastic modulus in MPa
# Refer to Table 6.3.1A for more information

# Examples
WM1 = WoodMaterial("D. Fir", "SS", 12500)
WM2 = WoodMaterial("Northern", "No. 3/Stud", 6500)
WM3 = WoodMaterial("SPF", "No. 1/2", 9500)
```
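
A `dataclass` instance stores each attribute under a fixed name, so values are read with dot notation rather than by key. The sketch below repeats the class so that it runs on its own. It also shows the readable `repr` and the equality comparison, both of which `@dataclass` generates by default.

```python
from dataclasses import dataclass


@dataclass
class WoodMaterial:
    species: str
    grade: str
    E: int


wm = WoodMaterial("D. Fir", "SS", 12500)

# Attributes are read with dot notation
print(wm.species)  # D. Fir
print(wm.E)        # 12500

# @dataclass generates a readable repr
print(wm)  # WoodMaterial(species='D. Fir', grade='SS', E=12500)

# ...and an equality check that compares attribute values
print(wm == WoodMaterial("D. Fir", "SS", 12500))  # True
```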

# Collections

## Design Template for Collection Data

```python
CollectionType: <type>[<subtype>] # Interp. represents some collection of types. Explain if there are any special rules or parameters for your collection (e.g. cannot be empty, or can be up to a certain length) - type names should be TitleCase (convention for Python custom types and classes)
    
# Examples - variable names for examples should be ALL CAPS (convention for python constants)
CT0 = <an example value>
CT1 = <an example value>
```

**An example:**

```python
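from typing import List

# AMFreq and the examples AMF0, AMF1, AMF2 come from the atomic data example
AMFreqList: List[AMFreq] # Interp. A list of AM radio frequencies. Can be empty.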
    
# Examples
AMFL0 = []
AMFL1 = [AMF0, AMF1, AMF2]
```
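
The template lines in this recipe are written as annotations, which record the intended type but do not bind names like `AMFreq` or `AMFreqList` to anything at runtime. If the names need to be reused inside other annotations, one option (an assumption here, not the lesson's convention) is to define them as type aliases with `=`. The function `count_above` is hypothetical.

```python
from typing import List

AMFreq = int               # type alias: an AM radio frequency in kHz
AMFreqList = List[AMFreq]  # type alias: a list of AM frequencies, can be empty

AMFL0: AMFreqList = []
AMFL1: AMFreqList = [1060, 540, 1170]


def count_above(freqs: AMFreqList, limit: AMFreq) -> int:
    """
    Returns the number of frequencies in 'freqs' greater than 'limit'.
    """
    return sum(1 for freq in freqs if freq > limit)


print(count_above(AMFL0, 1000))  # 0
print(count_above(AMFL1, 1000))  # 2
```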

**Another example, with `dict`:**

```python
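from typing import Dict

# AMStation is a str naming a station (e.g. "CFRZ"); AMS0 and AMS2 are example AMStation values
AMFreqDict: Dict[AMStation, AMFreq] # Interp. A dictionary of AM radio station names and their frequencies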
    
# Examples
AMFD0 = {}
AMFD1 = {AMS0: AMF0, "CFRZ": AMF1, AMS2: AMF2}
```

# Additional Python data types language reference

## Reference guide for other Python built-in collection types: `tuple`, `dict`, `set`

## `tuple`: Tuples (immutable list)

A `list` allows us to store any kind of data and access data by its position in the list via indexing. A list allows us to _append_, _insert_, and _remove_ values from it. In other words, it is _mutable_ because we can mutate it, change it.

A `tuple` is just like a list in that it is an array of data and we can access data by its position in the tuple via indexing. However, we cannot _append_, _insert_, or _remove_ values from the tuple after it is created. In other words, it is _immutable_.

**Why on earth would anyone use a `tuple` when they can use a `list`?** 

The reason to use a `tuple` instead of a `list` is that tuples are _hashable_ (as long as every item inside them is also hashable). Because a tuple's contents cannot change, Python can compute a fixed hash value from them, and that value lets the tuple be used as a look-up key for other data. _See `dict`, below_.
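
As a sketch of what being hashable allows, a `tuple` can serve as a dictionary key where a `list` cannot. The grid coordinates and load values here are made up for illustration.

```python
# Keys are (x, y) grid coordinates stored as tuples
point_loads = {(0, 0): 10.5, (0, 3): 22.0, (4, 3): 7.25}

print(point_loads[(0, 3)])  # 22.0

# A list holding the same data cannot be used as a key
try:
    bad_key = {[0, 0]: 10.5}
except TypeError as err:
    print("TypeError:", err)
```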

### How to make a `tuple`

Use parentheses (instead of square brackets) to make a tuple.

```python
my_list = ["cat", 123, "bat", 45.3]
my_tuple = ("cat", 123, "bat", 45.3)
```

### Methods

Because tuples are immutable, they have only two methods:

* `.count(item)` Returns the number of times that `item` occurs in the tuple
* `.index(item)` Returns the index of the first occurrence of `item` in the tuple
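
A short sketch of both methods, along with what happens on an attempt to change a tuple:

```python
my_tuple = ("cat", 123, "bat", 45.3, "bat")

print(my_tuple.count("bat"))  # 2
print(my_tuple.index("bat"))  # 2, the position of the first "bat"
print(my_tuple[0])            # cat

# Attempting to change an item raises a TypeError
try:
    my_tuple[0] = "dog"
except TypeError as err:
    print("TypeError:", err)
```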

### Iterating

Just like a list:

```python
my_tuple = ("cat", 123, "bat", 45.3)
for item in my_tuple:
    print(item)
```

## `set`: Unordered collection of unique items

A `list` can have multiple copies of the same data. So can a `tuple`.

A `set` can only have one copy of any given item.

e.g.
```python
my_list = ["cat", "bat", "bat", "hat"]
my_set = {"cat", "bat", "bat", "hat"} # Braces
```

The `my_set` above, when evaluated, will contain only three items (`"cat"`, `"bat"`, and `"hat"`, in no particular order) even though `"bat"` was written twice.

### How to make a `set`

There are two ways to make a `set`:

```python
# First way, use braces {} (but without colons, just each item separated by commas)
my_set = {"cat", "bat", "bat", "hat", "hat"}

# Second way, using the `set()` constructor on a list or tuple
my_list = ["cat", "bat", "bat", "hat"]
my_set = set(my_list)
```
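
One detail worth knowing when creating sets: empty braces `{}` create an empty `dict`, not an empty `set`. A short sketch:

```python
my_list = ["cat", "bat", "bat", "hat"]
my_set = set(my_list)

# Duplicates are dropped
print(len(my_list))  # 4
print(len(my_set))   # 3

# Empty braces make a dict; use set() for an empty set
print(type({}))     # <class 'dict'>
print(type(set()))  # <class 'set'>

# Membership checks use 'in'
print("bat" in my_set)  # True
```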

### Iterating over `set`

A `set` can be iterated over directly with a `for` loop, just like a `list`. However, since `set`s are unordered, you cannot index into them (e.g. `my_set[0]` raises a `TypeError`) and you should not rely on the order in which the items come out.

```python
for item in my_set:
    print(item)
```

Note: there is no way of knowing what order the items in a set will come out in. To guarantee an order, use `sorted()`, which returns a new sorted `list` (the items must be comparable with each other, e.g. all `str`):

```python
sorted_items = sorted(my_set)
for item in sorted_items:
    print(item)
```

### Special properties of sets

Working with sets can make your life easy, especially for certain tasks. 

Here is an excellent article on the features of sets: [Sets on RealPython.com](https://realpython.com/python-sets/)
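
Among those features are the operations for comparing two sets. The sketch below uses made-up collections of load-case names; the operators `|`, `&`, and `-` are built into Python's `set`.

```python
beam_loads = {"DL", "LL", "SL"}
column_loads = {"DL", "LL", "WL"}

# Union: items in either set
print(sorted(beam_loads | column_loads))  # ['DL', 'LL', 'SL', 'WL']

# Intersection: items in both sets
print(sorted(beam_loads & column_loads))  # ['DL', 'LL']

# Difference: items in the first set but not the second
print(sorted(beam_loads - column_loads))  # ['SL']
```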


## `dict`: Dictionaries (key: value pairs)

A `dict` is a unique kind of type in Python, and Python itself relies heavily on dictionaries internally (for example, to store the names defined in a module and the attributes of most objects).

A `dict` is a collection, like `list` and `tuple`, except instead of storing single values separated by commas, it stores _pairs_ separated by commas. Each pair is a _key_ and a _value_ separated by a colon.

e.g. 

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

wood_moduli = {"D. Fir": 19200, "SPF No. 1": 16300, "SPF No. 2": 16300, "Hem/Fir": 17600}
```

Dictionaries are similar to sets: dictionary keys are unique just as items in a set are unique. 

### How to make a `dict`

There are multiple ways:

```python
# First way, using braces {}
my_dict = {"item 1": 53, "item 2": "cat", "item 3": [23, 53]}

# Another way, using keyword arguments (keys must be valid Python names, so no spaces)
my_dict = dict(item_1=53, item_2="cat", item_3=[23, 53])

# Yet another way, using a list of tuple pairs
my_dict = dict([("item 1", 53), ("item 2", "cat"), ("item 3", [23, 53])])
```

### Dictionary keys

Dictionary keys have to be _hashable_. In practice, this means they need to be one of the _immutable_ data types (for a `tuple` or `NamedTuple`, every item inside it must also be hashable), e.g.

* `bool`
* `int`
* `float`
* `str`
* `tuple`
* `NamedTuple`

A `list` _cannot_ be a dictionary key. It is mutable.

A `dict` _cannot_ be a dictionary key. It is also mutable.

When a key is stored, Python computes a hash value from the key's contents and uses it to locate the value later. An incoming look-up key is hashed the same way and then checked for equality against the stored key. If a key could change after being stored, its hash would no longer match and it would not be a reliable item to look up against.
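
The built-in `hash()` function is one way to check whether a value can act as a key. A minimal sketch, assuming a hypothetical helper named `is_hashable`:

```python
def is_hashable(value) -> bool:
    """
    Returns True if 'value' can be hashed (and so used as a dictionary key).
    Returns False otherwise.
    """
    try:
        hash(value)
    except TypeError:
        return False
    return True


print(is_hashable("SPF"))               # True
print(is_hashable(("SPF", "No. 1/2")))  # True
print(is_hashable([1, 2]))              # False
print(is_hashable({"a": 1}))            # False
print(is_hashable(("SPF", [1, 2])))     # False, the tuple contains a list
```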


### Indexing

Like lists and tuples, you can access the data in a dictionary with indexing. With lists and tuples, you use the integer position of the item. With dictionaries, you use the _key_ as your index.

e.g.

```python
my_loads["DL"]

wood_moduli["SPF No. 2"]
```
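
Indexing with a key that is not in the dictionary raises a `KeyError`, and assigning to a key adds or replaces its value. A short sketch using the `my_loads` values; the `"EL"` and `"XL"` keys are made up for illustration.

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

print(my_loads["DL"])  # 20

# Assigning to an existing key replaces its value
my_loads["DL"] = 25

# Assigning to a new key adds a new pair
my_loads["EL"] = 15

# Looking up a missing key raises KeyError
try:
    my_loads["XL"]
except KeyError as err:
    print("KeyError:", err)

print(my_loads)  # {'DL': 25, 'LL': 50, 'SL': 30, 'WL': 20, 'EL': 15}
```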

### Methods (also, see _Iterating_ below)

* **`.update(item)`** Adds the key: value pairs from `item` (usually another dictionary) to the dictionary. If a key already exists, its value is overwritten. (Remember this one)
* **`.get(key, [fallback])`** Attempt to get the value for `key` from the dictionary. If `key` is not in the dictionary, return the `fallback` value, or `None` if no `fallback` is given. (Remember this one, too)
* [Others (see this outside documentation)](https://www.w3schools.com/python/python_ref_dictionary.asp)
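
A short sketch of `.update()` and `.get()` in use, with made-up load values:

```python
my_loads = {"DL": 20, "LL": 50}

# .update() overwrites "LL" and adds "SL"
my_loads.update({"LL": 60, "SL": 30})
print(my_loads)  # {'DL': 20, 'LL': 60, 'SL': 30}

# .get() returns the value when the key exists
print(my_loads.get("DL"))  # 20

# ...the fallback when it does not
print(my_loads.get("WL", 0))  # 0

# ...and None when no fallback is given
print(my_loads.get("WL"))  # None
```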


### Iterating

There are three ways of iterating over a dictionary, each using its own method:
1. By keys
2. By values
3. By both keys and values ("items")

#### Iterating by keys

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

for key in my_loads.keys():
    print(key)
```

#### Iterating by values

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

for value in my_loads.values():
    print(value)
```

#### Iterating by keys and values (items)

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

for key, value in my_loads.items():
    print(key)
    print(value)
```
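
Two related details, shown as a short sketch: iterating over the dictionary itself (without calling a method) yields its keys, and `.values()` is convenient when only the numbers matter, such as when totalling them.

```python
my_loads = {"DL": 20, "LL": 50, "SL": 30, "WL": 20}

# Iterating over the dict itself yields its keys
keys = []
for key in my_loads:
    keys.append(key)
print(keys)  # ['DL', 'LL', 'SL', 'WL']

# Add up the values
total = 0
for value in my_loads.values():
    total += value
print(total)  # 120
```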