# Sum Types

Remember when I said, "Pure functions are my favorite part of functional programming"? Well, [sum types](https://en.wikipedia.org/wiki/Tagged_union) are a close second.

A "sum" type is the opposite of a "product" type. This Python object is an example of a _product_ type:

```python
man.studies_finance = True
man.has_trust_fund = False
```

The total number of combinations a `man` can have is `4`, the _product_ of `2 * 2`:

|studies_finance|has_trust_fund|
|---|---|
|True|True|
|True|False|
|False|True|
|False|False|

If we add a third attribute, perhaps a `has_blue_eyes` boolean, the total number of possibilities multiplies again, to `8`!

|studies_finance|has_trust_fund|has_blue_eyes|
|---|---|---|
|True|True|True|
|True|True|False|
|True|False|True|
|True|False|False|
|False|True|True|
|False|True|False|
|False|False|True|
|False|False|False|
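
To double-check the counting, here's a small sketch that enumerates every combination with `itertools.product`. The helper name `count_combinations` is made up for illustration; it isn't part of the lesson.

```python
from itertools import product


def count_combinations(num_bool_attrs):
    # Each boolean attribute contributes 2 possibilities,
    # so the total is 2 * 2 * ... (a product)
    return len(list(product([True, False], repeat=num_bool_attrs)))


print(count_combinations(2))  # 4
print(count_combinations(3))  # 8

# The rows of the two-attribute table, in the same order as above
for studies_finance, has_trust_fund in product([True, False], repeat=2):
    print(studies_finance, has_trust_fund)
```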

But let's pretend that we live in a world where there are _really_ only [three types of people](https://www.youtube.com/watch?v=tEt0IuQJX2o) that our program cares about:

1. Dateable
2. Undateable
3. Maybe dateable

We can _reduce_ the number of cases our code needs to handle by using an (admittedly fake, Pythonic) sum type with only 3 possible _types_:

```python
class Person:
    def __init__(self, name):
        self.name = name

class Dateable(Person):
    pass

class MaybeDateable(Person):
    pass

class Undateable(Person):
    pass
```

Then we can use the [isinstance](https://docs.python.org/3/library/functions.html#isinstance) built-in function to check if a `Person` is an instance of one of the subclasses. It's a clunky way to represent sum types, but hey, it's Python.

```python
def respond_to_text(guy_at_bar):
    if isinstance(guy_at_bar, Dateable):
        return f"Hey {guy_at_bar.name}, I'd love to go out with you!"
    elif isinstance(guy_at_bar, MaybeDateable):
        return f"Hey {guy_at_bar.name}, I'm busy but let's hang out sometime later."
    elif isinstance(guy_at_bar, Undateable):
        return "Have you tried being rich?"
    else:
        raise ValueError("invalid person type")
```

## Sum Types

As opposed to product types, which can have many (often infinite) combinations, sum types have a _fixed_ number of possible values. To be clear: **Python doesn't really support sum types**. We have to use a workaround and invent our own little system and enforce it ourselves.
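
Here's a minimal sketch of what "enforce it ourselves" looks like. It repeats the `Person` classes from above so it can run on its own; the `VARIANTS` tuple and the `is_valid_person` helper are made-up names, not part of the lesson.

```python
class Person:
    def __init__(self, name):
        self.name = name


class Dateable(Person):
    pass


class MaybeDateable(Person):
    pass


class Undateable(Person):
    pass


# Our hand-rolled "sum type": the only variants we agree to allow
VARIANTS = (Dateable, MaybeDateable, Undateable)


def is_valid_person(person):
    # isinstance accepts a tuple of classes and returns True
    # if the object is an instance of any of them
    return isinstance(person, VARIANTS)


print(is_valid_person(Dateable("Sam")))  # True
# Python happily lets us build a plain Person, which is none of the variants
print(is_valid_person(Person("Alex")))  # False
```

Nothing in the language stops someone from creating that plain `Person`, which is why `respond_to_text` still needs its `else` branch.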

## Assignment

Whenever a document is parsed by Doc2Doc, it can either succeed or fail. In functional programming, we often represent errors as data (e.g. the `ParseError` class) rather than by `raise`ing exceptions, because exceptions are side effects. _(This isn't standard Python practice, but it's useful to understand from an FP perspective)_

**Complete the `Parsed` and `ParseError` subclasses.**

- `Parsed` represents success. It should accept a `doc_name` string and a `text` string and save them as properties of the same name.
- `ParseError` represents failure. It should accept a `doc_name` string and an `err` string and save them as properties of the same name.

The test suite uses the `isinstance` function to see if an error occurred based on the class type.

```python
class MaybeParsed:
    pass


# don't touch above this line


class Parsed(MaybeParsed):
    def __init__(self, doc_name, text):
        self.doc_name = doc_name
        self.text = text


class ParseError(MaybeParsed):
    def __init__(self, doc_name, err):
        self.doc_name = doc_name
        self.err = err
```


```python
run_cases = [
    Parsed("why_fp.txt", "Because we're better than everyone else"),
    ParseError("why_fp.docx", "Can't handle weird windows files"),
]

submit_cases = run_cases + [
    Parsed("why_fp.md", "Because we're better than everyone else"),
    ParseError("why_fp.pdf", "Can't handle weird adobe files"),
]


def test(obj):
    print("---------------------------------")
    print(f"Testing properties of {obj.doc_name}...")
    if isinstance(obj, Parsed):
        if not obj.text:
            print(f"Expecting .text to be non-empty")
            print("Fail")
            return False
        if not obj.doc_name:
            print(f"Expecting .doc_name to be non-empty")
            print("Fail")
            return False
    elif isinstance(obj, ParseError):
        if not obj.err:
            print(f"Expecting .err to be non-empty")
            print("Fail")
            return False
        if not obj.doc_name:
            print(f"Expecting .doc_name to be non-empty")
            print("Fail")
            return False
    else:
        raise ValueError(f"unknown class type for: {obj}")
    print("Pass")
    return True


def main():
    passed = 0
    failed = 0
    skipped = len(submit_cases) - len(test_cases)
    for test_case in test_cases:
        correct = test(test_case)
        if correct:
            passed += 1
        else:
            failed += 1
    if failed == 0:
        print("============= PASS ==============")
    else:
        print("============= FAIL ==============")
    if skipped > 0:
        print(f"{passed} passed, {failed} failed, {skipped} skipped")
    else:
        print(f"{passed} passed, {failed} failed")


test_cases = submit_cases
if "__RUN__" in globals():
    test_cases = run_cases

main()
```


```
---------------------------------
Testing properties of why_fp.txt...
Pass
---------------------------------
Testing properties of why_fp.docx...
Pass
---------------------------------
Testing properties of why_fp.md...
Pass
---------------------------------
Testing properties of why_fp.pdf...
Pass
4 passed, 0 failed
```
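
As a rough sketch of "errors as data", here's how a caller might consume a `MaybeParsed` value. The `parse` and `summarize` functions, and the rule that only `.txt` files succeed, are invented for illustration; Doc2Doc's real parsing logic isn't shown in this lesson.

```python
class MaybeParsed:
    pass


class Parsed(MaybeParsed):
    def __init__(self, doc_name, text):
        self.doc_name = doc_name
        self.text = text


class ParseError(MaybeParsed):
    def __init__(self, doc_name, err):
        self.doc_name = doc_name
        self.err = err


def parse(doc_name, raw):
    # Hypothetical rule: pretend we only know how to parse .txt files
    if not doc_name.endswith(".txt"):
        return ParseError(doc_name, "unsupported file type")
    return Parsed(doc_name, raw.strip())


def summarize(result):
    # The caller decides what to do based on the variant it received
    if isinstance(result, Parsed):
        return f"{result.doc_name}: {len(result.text)} characters"
    if isinstance(result, ParseError):
        return f"{result.doc_name}: failed ({result.err})"
    raise ValueError("unknown result type")


print(summarize(parse("notes.txt", "  hello  ")))  # notes.txt: 5 characters
print(summarize(parse("notes.docx", "hello")))  # notes.docx: failed (unsupported file type)
```

Notice that `parse` never raises for a bad file. The failure travels back to the caller as an ordinary return value.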


# Enums

Doing the admittedly weird `class` and `isinstance()` thing works, but it turns out there's a better way in some cases. If you're trying to represent a fixed set of values (but not store additional data within them), [enums](https://docs.python.org/3/library/enum.html) are the way to go.

Let's say we have a `Color` variable that we want to restrict to only three possible values:

- `RED`
- `GREEN`
- `BLUE`

We could use a plain old string to represent these values, but that's annoying because we have to remember all the "valid" values and defensively check for invalid ones all over our codebase. Instead, we can use an `Enum`:

```python
from enum import Enum

Color = Enum('Color', ['RED', 'GREEN', 'BLUE'])
print(Color.RED)  # this works, prints 'Color.RED'
print(Color.TEAL) # this raises an exception
```

Now `Color` is a sum type! _At least, as close as we can get in Python._

There are a few benefits:

1. A "Color" can only be `RED`, `GREEN`, or `BLUE`. If you try to use `Color.TEAL`, Python raises an exception.
2. There is a central place to see the "valid" values for a `Color`.
3. Each "Color" has a "name" (e.g. `RED`) and a "value" (e.g. `1`). With the syntax above, the values are integers assigned automatically, starting at `1`. Integers are generally smaller than strings, although the bigger win here is correctness rather than performance.
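
A quick sketch of those three points in action, using the same `Color` enum. Everything used here (`name`, `value`, iteration, lookup by value) is part of the standard `enum` module.

```python
from enum import Enum

Color = Enum("Color", ["RED", "GREEN", "BLUE"])

# 1. Invalid members raise an AttributeError
try:
    Color.TEAL
except AttributeError:
    print("TEAL is not a Color")

# 2. The enum itself is the central list of valid values
print([color.name for color in Color])  # ['RED', 'GREEN', 'BLUE']

# 3. Each member has a name and a value
print(Color.RED.name)  # RED
print(Color.RED.value)  # 1

# A member can also be looked up by its value
print(Color(2) is Color.GREEN)  # True
```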

## Assignment

Create an `Enum` called `Doctype` with values:

- PDF
- TXT
- DOCX
- MD
- HTML

## Exercise and Solution
```python
from enum import Enum

Doctype = Enum('Doctype', ['PDF', 'TXT', 'DOCX', 'MD', 'HTML'])
```

# Sum Types

Unfortunately, Python does _not_ support sum types as well as some [statically typed](https://developer.mozilla.org/en-US/docs/Glossary/Static_typing) languages do.

Python [does not enforce](https://docs.python.org/3/library/typing.html) your types before your code runs. That's why we need this line here to `raise` an `Exception` if a color is invalid:

```python
def color_to_hex(color):
    if color == Color.GREEN:
        return '#00FF00'
    elif color == Color.BLUE:
        return '#0000FF'
    elif color == Color.RED:
        return '#FF0000'
    # handle the case where the color is invalid
    raise Exception('unknown color')
```

In a language like [Rust](https://www.rust-lang.org/) we could write the same thing like this:

```rust
fn color_to_hex(color: Color) -> String {
    match color {
        Color::Green => "#00FF00".to_string(),
        Color::Blue => "#0000FF".to_string(),
        Color::Red => "#FF0000".to_string(),
    }
}
```

Notice how there isn't any case for an unknown value? That's because a Rust `Color` can't be anything other than one of its variants, and the code will fail to compile (a step that happens before the code runs at all) if the `match` doesn't handle every one of them. **This static enforcement is a huge benefit of sum types**, and it's a shame we can't get that in Python.
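
Python itself won't give us this, but an external static type checker (mypy and pyright are two examples) can get part of the way there. The sketch below is an illustration under a few assumptions rather than part of the lesson: it writes `Color` with class syntax, compares with `is`, and defines its own `assert_never` helper so it also runs on older Python versions (Python 3.11 added an equivalent `typing.assert_never`). If a branch is missing, a type checker should complain about the `assert_never` call before the code runs. At runtime, the helper simply raises.

```python
from enum import Enum
from typing import NoReturn


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


def assert_never(value: NoReturn) -> NoReturn:
    # Only reached at runtime if a case was missed
    raise AssertionError(f"unhandled value: {value!r}")


def color_to_hex(color: Color) -> str:
    if color is Color.GREEN:
        return "#00FF00"
    elif color is Color.BLUE:
        return "#0000FF"
    elif color is Color.RED:
        return "#FF0000"
    # Every member is handled above, so a type checker treats this
    # line as unreachable. Delete a branch and it should be flagged.
    assert_never(color)


print(color_to_hex(Color.RED))  # #FF0000
```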

# Match

Let's take another look at our example [Enum](https://docs.python.org/3/library/enum.html) from the previous lesson:

```python
Color = Enum("Color", ["RED", "GREEN", "BLUE"])
```

## Working With Enums

Python has a `match` statement (added in Python 3.10) that tends to be a lot cleaner than a series of `if/elif/else` statements when we're working with a fixed set of possible values (like a sum type, or more specifically an enum):

```python
def get_hex(color):
    match color:
        case Color.RED:
            return "#FF0000"
        case Color.GREEN:
            return "#00FF00"
        case Color.BLUE:
            return "#0000FF"

        # default case
        # (invalid Color)
        case _:
            return "#FFFFFF"
```

If you have _two_ values to match, you can use a `tuple`:

```python
def get_hex(color, shade):
    match (color, shade):
        case (Color.RED, Shade.LIGHT):
            return "#FFAAAA"
        case (Color.RED, Shade.DARK):
            return "#AA0000"
        case (Color.GREEN, Shade.LIGHT):
            return "#AAFFAA"
        case (Color.GREEN, Shade.DARK):
            return "#00AA00"
        case (Color.BLUE, Shade.LIGHT):
            return "#AAAAFF"
        case (Color.BLUE, Shade.DARK):
            return "#0000AA"

        # default case
        # (invalid combination)
        case _:
            return "#FFFFFF"
```

The value we want to compare is set after the `match` keyword, which is then compared against different cases/patterns. If a match is found, the code in the block is executed.

## Assignment

Complete the `convert_format` function. Using the enum `DocFormat`, it should support 3 types of conversions and raise an exception for anything else:

1. [ ] From `MD` to `HTML`:
    - Assume the content is a single `h1` tag in markdown syntax - it's a single string representing a line. Replace the leading `#` with an `<h1>` and add a `</h1>` to the end.
    - `# This is a heading` -> `<h1>This is a heading</h1>`
2. [ ] From `TXT` to `PDF`:
    - Simply add a `[PDF]` tag to the beginning and end of the content. Notice the spaces between `[PDF]` tags and the content:
    - `This is some text` -> `[PDF] This is some text [PDF]`
3. [ ] From `HTML` to `MD`:
    - Replace any `<h1>` tags with `#` and remove any `</h1>` tags.
    - `<h1>This is a heading</h1>` -> `# This is a heading`
4. [ ] Any other conversion:
    - If the combination of formats isn't one of the three above, raise an `Exception` with the message `invalid type`.

## Solution
```python
from enum import Enum


class DocFormat(Enum):
    PDF = 1
    TXT = 2
    MD = 3
    HTML = 4


# don't touch above this line


def convert_format(content, from_format, to_format):
    match (from_format, to_format):
        case (DocFormat.MD, DocFormat.HTML):
            # count=1 so only the leading "# " is replaced
            return content.replace("# ", "<h1>", 1) + "</h1>"

        case (DocFormat.TXT, DocFormat.PDF):
            return "[PDF] " + content + " [PDF]"

        case (DocFormat.HTML, DocFormat.MD):
            return content.replace("<h1>", "# ").replace("</h1>", "")

        case _:
            raise Exception("invalid type")

```

```python
# from main import *

try:
    DocFormat.MD and DocFormat.HTML and DocFormat.PDF and DocFormat.TXT
except Exception as error:
    print(f"Error: Missing attribute {error} from enum")

    class DocFormat(Enum):
        PDF = None
        TXT = None
        MD = None
        HTML = None


run_cases = [
    ("# Hello, world!", DocFormat.MD, DocFormat.HTML, "<h1>Hello, world!</h1>"),
    (
        "This is plain text.",
        DocFormat.TXT,
        DocFormat.PDF,
        "[PDF] This is plain text. [PDF]",
    ),
]

submit_cases = run_cases + [
    ("<h1>Title</h1>", DocFormat.HTML, DocFormat.MD, "# Title"),
    ("Something wicked", DocFormat.TXT, None, "invalid type"),
]


def test(content, from_format, to_format, expected_output):
    print("---------------------------------")
    print(f"Converting from {from_format} to {to_format}...")
    print(f"Content: {content}")
    print(f"Expected: {expected_output}")
    try:
        result = convert_format(content, from_format, to_format)
    except Exception as e:
        result = str(e)
    print(f"Actual: {result}")
    if result == expected_output:
        print("Pass")
        return True
    print("Fail")
    return False


def main():
    passed = 0
    failed = 0
    skipped = len(submit_cases) - len(test_cases)
    for test_case in test_cases:
        correct = test(*test_case)
        if correct:
            passed += 1
        else:
            failed += 1
    if failed == 0:
        print("============= PASS ==============")
    else:
        print("============= FAIL ==============")
    if skipped > 0:
        print(f"{passed} passed, {failed} failed, {skipped} skipped")
    else:
        print(f"{passed} passed, {failed} failed")


test_cases = submit_cases
if "__RUN__" in globals():
    test_cases = run_cases

main()
```


```
---------------------------------
Converting from DocFormat.MD to DocFormat.HTML...
Content: # Hello, world!
Expected: <h1>Hello, world!</h1>
Actual: <h1>Hello, world!</h1>
Pass
---------------------------------
Converting from DocFormat.TXT to DocFormat.PDF...
Content: This is plain text.
Expected: [PDF] This is plain text. [PDF]
Actual: [PDF] This is plain text. [PDF]
Pass
---------------------------------
Converting from DocFormat.HTML to DocFormat.MD...
Content: <h1>Title</h1>
Expected: # Title
Actual: # Title
Pass
---------------------------------
Converting from DocFormat.TXT to None...
Content: Something wicked
Expected: invalid type
Actual: invalid type
Pass
4 passed, 0 failed
```
