<img src="https://www.assistancebeyondcrisis.org.au/wp-content/uploads/2017/05/logo-deloitte.png" width="150" align = "right">

# AnalyticsU - Python 201

# Intermediate Python Concepts

This course covers several intermediate-level Python concepts that extend to multiple domains including analytics and data science:

1. **Comprehensions (list, set, dict)** - creating new data structures with idiomatic Python code
2. **Object-oriented programming (OOP)** - a programming paradigm that represents things as objects

These are sometimes referred to as **native** Python concepts because they are built into the Python interpreter or provided by the Python Standard Library, rather than being specific to a third-party Python package.

# Prerequisites: Imports

You'll need to access the following imports throughout this lesson:

In [None]:
import bisect
import statistics
from collections import Counter
from datetime import date, timedelta
from math import radians, asin, cos, isclose, sin, sqrt
from typing import Container, Optional

import pandas as pd

The first several lines here import from Python's Standard Library.  These are the "[batteries included](https://docs.python.org/3/library/)" part of any Python distribution.

The last import, `pandas`, is a third-party library.

_Note_: To make your code more readable in a large codebase, it is good practice to **organize import statements** in line with [Python's PEP 8](https://www.python.org/dev/peps/pep-0008/#imports) style guide.  Generally, that means using imports in the following order:

1. Standard library imports (such as `datetime` or `bisect`)
2. Related third party imports (such as `pandas`)
3. Local application/library specific imports (not shown above)

...with a blank line between each group.

_Troubleshooting_: If you're seeing `ModuleNotFoundError: No module named 'pandas'`, you will need to [install Pandas](https://pandas.pydata.org/docs/getting_started/install.html) from the command line through `conda` or `pip`.

# Part 1: Comprehensions

In AnalyticsU Python101 and/or the [Python tutorial](https://docs.python.org/3/tutorial/), you were introduced to elementary data structures such as `list`, `tuple`, `set`, and `dict`.

You can [iterate](https://docs.python.org/3/library/stdtypes.html#typeiter) over the elements of these data structures using [control flow](https://docs.python.org/3/tutorial/controlflow.html) such as `for` and `while`, optionally appending or adding to a new data structure as a result. 

An alternative and sometimes more idiomatic way to iterate is to use a [comprehension](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions).  These come in several forms:

| Type | Example |
| ---- | ------- |
| `list` comprehension | `[i ** 2 for i in range(5)]` |
| `set` comprehension | `{i[0].casefold() for i in ("sampson", "Shelby", "pat")}` |
| `dict` comprehension | `{name: len(name) for name in ("sampson", "Shelby", "pat")}` |

_Note_: The term before "comprehension" here refers to the data structure that is being _formed_, not to the data structure that is being iterated over.  In other words, a _list comprehension_ uses brackets `[ ... ]` to form a new list, and could be iterating over another `list`, a `dict`, or something else.
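As a quick illustration of the note above, the sketch below builds each of the three structures from a different kind of input. The names `names` and `ages` and their contents are made up for this example.

```python
names = ("sampson", "Shelby", "pat")
ages = {"sampson": 31, "Shelby": 27, "pat": 45}  # hypothetical data

# A list comprehension that iterates over a dict (iterating a dict yields its keys)
upper_names = [name.upper() for name in ages]

# A set comprehension that iterates over a tuple; duplicate letters collapse
first_letters = {name[0].casefold() for name in names}

# A dict comprehension that iterates over a tuple
name_lengths = {name: len(name) for name in names}

print(upper_names)    # ['SAMPSON', 'SHELBY', 'PAT']
print(first_letters)  # {'s', 'p'} (the display order of a set may vary)
print(name_lengths)   # {'sampson': 7, 'Shelby': 6, 'pat': 3}
```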

## List Comprehensions

Square each number in `range(5)`:

In [None]:
print([i ** 2 for i in range(5)])

Iterate over a `tuple` rather than `range` object:

In [None]:
print([i ** 2 for i in (2, 4, 8, 16)])

Find only even numbers over the range `[0, 19]`:

In [None]:
print([i for i in range(20) if i % 2 == 0])
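A trailing `if`, as used above, _filters_ elements out of the result. That is different from a conditional expression placed at the front of the comprehension, which keeps every element but _transforms_ it. The sketch below contrasts the two, using a made-up `readings` list:

```python
readings = [3, -1, 4, -5, 9]  # hypothetical data

# A trailing `if` filters: negative readings are dropped entirely
positives = [r for r in readings if r >= 0]

# A conditional expression transforms: every reading is kept, negatives become 0
clipped = [r if r >= 0 else 0 for r in readings]

print(positives)  # [3, 4, 9]
print(clipped)    # [3, 0, 4, 0, 9]
```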

Define a `list` of addresses:

In [None]:
addresses = [
    "Fort Meade, MD",
    "Baltimore, MD",
    "Fort Worth, TX",
    "Culpeper, VA",
    "Boise, ID",
    "Baltimore, MD",
]

Using a traditional `for`-loop, create a new `list` that contains only cities in `MD`:

In [None]:
only_maryland = []
for a in addresses:
    if a.endswith("MD"):
        only_maryland.append(a)

print(only_maryland)

The above can be condensed into a `list` comprehension:

In [None]:
only_maryland_comp = [a for a in addresses if a.endswith("MD")]

print(only_maryland_comp)

The input may be more involved, such as a `dict` mapping student names to vectors of grades:

In [None]:
grades = {
    "tom": [98.7, 94.2, 89.0],
    "luke": [85.7, 83.0, 89.0],
    "jenn": [99.1, 99.2, 100.0],
}

You can find the unioned, flat list of grades by iterating over the dictionary `values` with a [**nested** `list` comprehension](https://docs.python.org/3/tutorial/datastructures.html#nested-list-comprehensions):

In [None]:
all_grades = [g for v in grades.values() for g in v]
print(all_grades)
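The order of the `for` clauses in a nested comprehension can be surprising at first. One way to read it is to write out the equivalent nested `for` loops, which appear in the same left-to-right order. The name `all_grades_loop` below is introduced only for this comparison:

```python
grades = {
    "tom": [98.7, 94.2, 89.0],
    "luke": [85.7, 83.0, 89.0],
    "jenn": [99.1, 99.2, 100.0],
}

# Loop form of: [g for v in grades.values() for g in v]
all_grades_loop = []
for v in grades.values():  # the first `for` clause is the outer loop
    for g in v:            # the second `for` clause is the inner loop
        all_grades_loop.append(g)

all_grades = [g for v in grades.values() for g in v]
print(all_grades_loop == all_grades)  # True
```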

## Exercises: Comprehensions (Part 1)

### Exercise 1

**<span style="color:red">Challenge:</span>** Produce a `list` variable called `lengths` that contains the length of each word in `sentence`, in the order the words appear.

In [None]:
sentence = "I love learning Python and applying it to solve complex problems"

**<span style="color:green">Add your answer here:</span>**

> Hint: you can use `sentence.split()` to split the `str` into a `list`, and then iterate over that list


In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---

lengths = "?"

**<span style="color:blue">Check your answer:</span>**

In [None]:
assert lengths == [1, 4, 8, 6, 3, 8, 2, 2, 5, 7, 8], lengths

### Exercise 2

**<span style="color:red">Challenge:</span>** Given `sentence`, create a new sentence that contains only the words from `sentence` that **do not start with a vowel** (_a, e, i, o, u_), case-insensitive.

The new sentence should be a `str` variable named `new_sentence`.

In [None]:
sentence = """\
I love learning Python and applying it to solve complex problems.
Lists can be troublesome but I can handle it with list comprehensions in Python."""

**<span style="color:green">Add your answer here:</span>**

> Hint: You can use the `str.startswith()` method to determine if a string starts with a given letter. If you pass a list or tuple of letters, the method will evaluate whether the string starts with any of those characters. Example:
>
> ```python
> >>> "who".startswith(("a", "e", "i", "o", "u"))
> False
> ```

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



**<span style="color:blue">Check your answer:</span>**

In [None]:
assert new_sentence == 'love learning Python to solve complex problems. Lists can be troublesome but can handle with list comprehensions Python.'

### Exercise 3

**<span style="color:red">Challenge:</span>** Write a function named `divisible()` that takes two arguments, `test_num` and `max_num`, in that order. The function should return a `list` of all numbers in `range(max_num)` (that is, from 0 up to but not including `max_num`) that are divisible by `test_num`.

Extra credit: use a `list` comprehension inside the definition of `divisible()`.

Example:

```python
>>> divisible(7, 62)  # numbers less than 62 that are divisible by 7.
[0, 7, 14, 21, 28, 35, 42, 49, 56]
```

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---

def divisible(test_num: int, max_num: int) -> list:
    raise NotImplementedError

**<span style="color:blue">Check your answer:</span>**

In [None]:
assert "divisible" in globals(), "must define divisible() function"
test = divisible(7, 62)

assert test == [0, 7, 14, 21, 28, 35, 42, 49, 56]

### Exercise 4

**<span style="color:red">Challenge:</span>** Write another list comprehension that checks each value in the list returned by `divisible(7, 62)` from Exercise 3 and states whether it is even or odd.

The result should be a `list` of `str` named `even_odd` that contains the strings `"Even"` or `"Odd"` based on whether the corresponding number is even or odd.

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



**<span style="color:blue">Check your answer:</span>**

In [None]:
assert even_odd == ['Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even']

### Exercise 5
**<span style="color:red">Challenge:</span>** Given a sequence of numbers, `seq`, find the sum of elements that are **greater than 20**. The output should be a variable called `total` that is an `int`.

In [None]:
seq = [-21, 4, 15, 21, 25, 78, 19, 4]

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



Alternative answer, using a traditional `for` loop:

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



**<span style="color:blue">Check your answer:</span>**

In [None]:
assert total == 124

<br>

## Set Comprehensions

Can you find the set of _unique cities_ in Maryland from `addresses`?

In [None]:
addresses = [
    "Fort Meade, MD",
    "Baltimore, MD",
    "Fort Worth, TX",
    "Culpeper, VA",
    "Boise, ID",
    "Baltimore, MD",
]

In [None]:
only_maryland_cities = set()
for a in addresses:
    if a.endswith("MD"):
        city = a.partition(",")[0]
        only_maryland_cities.add(city)

print(only_maryland_cities)

You can achieve this in one line of code with a **set comprehension**, which looks like a list comprehension except that it will remove duplicate entries from the result:

In [None]:
{a.partition(",")[0] for a in addresses if a.endswith("MD")}

_Note_: Unlike a `list` or `tuple`, a `set` has no concept of sortedness, and is most commonly used for fast **membership testing**.
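To make the note above concrete, here is a small sketch of membership testing against the set of Maryland cities, followed by one way to get a predictable order when you need it (by sorting the set into a `list`). The variable name `maryland_cities` is introduced here for illustration.

```python
addresses = [
    "Fort Meade, MD",
    "Baltimore, MD",
    "Fort Worth, TX",
    "Culpeper, VA",
    "Boise, ID",
    "Baltimore, MD",
]

maryland_cities = {a.partition(",")[0] for a in addresses if a.endswith("MD")}

# Membership testing with `in`
print("Baltimore" in maryland_cities)  # True
print("Boise" in maryland_cities)      # False

# sorted() accepts any iterable and always returns a list
print(sorted(maryland_cities))  # ['Baltimore', 'Fort Meade']
```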

<br>

## Dict Comprehensions

A related form is a **dict comprehension**:

In [None]:
import statistics

grades = [
    ("tom", [98.7, 94.2, 89.0]),
    ("luke", [85.7, 83.0, 89.0]),
    ("jenn", [99.1, 99.2, 100.0]),
]

Here is how you could use a `for` loop to find each student's mean grade:

In [None]:
avg_grades = {}
for student, gradeset in grades:
    avg_grades[student] = round(statistics.mean(gradeset), 2)
    
print(avg_grades)

And here is the same result, but with a `dict` comprehension:

In [None]:
{student: round(statistics.mean(gradeset), 2) for student, gradeset in grades}

### Bonus: Generator Expressions

Related to comprehensions are [generator expressions](https://www.python.org/dev/peps/pep-0289/).  These let you avoid creating an intermediate `list` object in memory if you only need to extract a particular data point from it:

In [None]:
stockdata = [
    {"ticker": "GE", "pct_chg": -2.0},
    {"ticker": "GE", "pct_chg": 2.1},
    {"ticker": "INTC", "pct_chg": 0.1},
    {"ticker": "INTC", "pct_chg": 0.3},
    {"ticker": "INTC", "pct_chg": -2.9},
]

max_ge_increase = max(row["pct_chg"] for row in stockdata if row["ticker"] == "GE")
print(max_ge_increase)

As a second example, you can reuse `all_grades` from above to find the count of grades by 10-percent bands:

In [None]:
from collections import Counter

all_grades = [98.7, 94.2, 89.0, 85.7, 83.0, 89.0, 99.1, 99.2, 100.0]
print(Counter(i // 10 * 10 for i in all_grades))
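The sketch below compares a list comprehension with the equivalent generator expression. It assumes CPython, where `sys.getsizeof()` reports the size of the container object itself; the variable names are made up for this example. It also shows one caveat: a generator can only be consumed once.

```python
import sys

squares_list = [i ** 2 for i in range(100_000)]  # builds every item up front
squares_gen = (i ** 2 for i in range(100_000))   # produces items on demand

# The generator object stays small no matter how many items it will yield
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Both can feed an aggregating function such as sum()
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)  # True

# The generator is now exhausted, so a second pass yields nothing
leftover = sum(squares_gen)
print(leftover)  # 0
```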

## Exercises: Comprehensions (Part 2)

### Exercise 6

**<span style="color:red">Challenge:</span>** Given the variable `sentence` below, produce a dictionary (`dict`) called `len_words` that shows the **length of each word in the sentence**.

- The keys should be `str` containing individual words
- The values should be `int` containing the length of that word

In [None]:
sentence = "I love learning Python and applying it to solve complex problems"

> Hint: the `.split()` method splits a `str` into a `list` based on a separator. With no argument, it splits on any run of whitespace; you can also pass a specific separator string.

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



**<span style="color:blue">Check your answer:</span>**

In [None]:
assert len_words == {
    'I': 1,
    'love': 4,
    'learning': 8,
    'Python': 6,
    'and': 3,
    'applying': 8,
    'it': 2,
    'to': 2,
    'solve': 5,
    'complex': 7,
    'problems': 8
}

### Exercise 7

**<span style="color:red">Challenge:</span>** Using a **dictionary comprehension**, construct a dictionary (`dict`) named `dict_check` that displays whether numbers 1 through 10 (inclusive) are 'Odd' or 'Even':

```python
{1: 'Odd',
 2: 'Even',
 3: 'Odd',
 4: 'Even',
 5: 'Odd',
 6: 'Even',
 7: 'Odd',
 8: 'Even',
 9: 'Odd',
 10: 'Even'}
```

> Hint: you can iterate over `range(1, 11)` within the comprehension.

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---



**<span style="color:blue">Check your answer:</span>**

In [None]:
assert dict_check == {
    1: 'Odd',
    2: 'Even',
    3: 'Odd',
    4: 'Even',
    5: 'Odd',
    6: 'Even',
    7: 'Odd',
    8: 'Even',
    9: 'Odd',
    10: 'Even'
}

### Exercise 8
You have a bag containing magnets. Each magnet contains an individual letter of the alphabet, such as `'a'` or `'b'`.

**<span style="color:red">Challenge:</span>** Write a function `can_you_spell()` that returns `True` if a person's name can be spelled using letters from the bag, _without replacement_, and `False` otherwise.

`name` is the person's name, such as `"lynn"`, while `bag` is a list of characters representing letters in the bag, such as `["y", "n", "p", "g", "l"]`.

**<span style="color:green">Add your answer here:</span>**

In [None]:
# --- INSERT ANSWER BELOW THIS LINE ---

def can_you_spell(name: str, bag: Container[str]) -> bool:
    raise NotImplementedError

> Note: `Container` is more broad than a `list`. This denotes that, for example, a `tuple[str]` would also be an acceptable input.

**<span style="color:blue">Check your answer:</span>**

In [None]:
assert can_you_spell("lynn", ["y", "n", "p", "g", "n", "l"])

In [None]:
assert not can_you_spell("lynn", ["y", "n", "p", "g", "l"])

In [None]:
assert not can_you_spell("lynn", ["n", "o", "p", "e"])

<br>

## Part 2: Object-Oriented Programming (OOP)

One way to help define **object-oriented programming** is to contrast it with what it is _not_.

**Functional programming** (as contrasted with object-oriented programming) is a programming style that breaks programs down into functions that take inputs and produce outputs, rely heavily on **immutable** data structures, and do not change the state of their arguments.

### Functional Programming

An example of functional programming is to decompose the calculation of [sample standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) into its parts:

\begin{equation*}
s = {\sqrt {{\frac {1}{N-1}}\sum _{i=1}^{N}\left(x_{i}-{\bar {x}}\right)^{2}}}
\end{equation*}

where:

- $s$ is the resulting sample standard deviation
- ${\bar {x}}$ is the sample mean
- $x_{i}$ represents each observation
- $N$ is the number of observations
- $\sum _{i=1}^{N}\left(x_{i}-{\bar {x}}\right)^{2}$ is the **sum of squared deviations**

Here's the 'messy' way of calculating `s`:

In [None]:
import math

data = (727.7, 1086.5, 1091.0, 1361.3, 1490.5, 1956.1)
N = len(data)
mean = sum(data) / len(data)
sum_of_sqdev = 0
for obs in data:
    sum_of_sqdev += ((obs - mean) ** 2)
s = math.sqrt(sum_of_sqdev / (N - 1))
print(s)

Here's how you can break this down into Python functions:

In [None]:
from math import sqrt

def sample_mean(seq: tuple) -> float:
    """Compute the arithmetic mean of sequence `seq`."""
    return sum(seq) / len(seq)

def squared_deviations(seq, mean) -> list:
    """Derive sequence of square deviations from `mean`, for each `i` in seq."""
    return [(i - mean) ** 2 for i in seq]

def sample_stdev(seq) -> float:
    """Compute sample standard deviation given a sequence of numbers."""
    mean = sample_mean(seq)
    devs = squared_deviations(seq, mean)
    N = len(seq)
    result = sqrt(sum(devs) / (N - 1))
    return result

You've now added these three functions to the **global namespace**, but the variables defined within each function body (such as `devs = ...`) are **local** (internal) to those functions.

In [None]:
# A sequence (sample) of observations
data = (727.7, 1086.5, 1091.0, 1361.3, 1490.5, 1956.1)

# Assign the standard deviation to the variable sd
sd = sample_stdev(data)
print(f"Sample standard deviation of metabolic rate data: {sd:.2f}")

The `data` object is unchanged after being passed as a parameter to `sample_stdev`.
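As a sanity check, you can compare the result against `statistics.stdev()` from the Standard Library, which also computes the _sample_ standard deviation. The sketch below condenses the three functions above into one so that it runs on its own; the `snapshot` variable is introduced only to confirm that the input is unchanged.

```python
import statistics
from math import isclose, sqrt

def sample_stdev(seq) -> float:
    """Condensed version of the functions above, for a self-contained check."""
    mean = sum(seq) / len(seq)
    return sqrt(sum((i - mean) ** 2 for i in seq) / (len(seq) - 1))

data = (727.7, 1086.5, 1091.0, 1361.3, 1490.5, 1956.1)
snapshot = list(data)  # contents of `data` before the call

sd = sample_stdev(data)

# Compare with a tolerance, because floating-point results may differ slightly
print(isclose(sd, statistics.stdev(data)))  # True

# The input holds the same values after the call
print(list(data) == snapshot)  # True
```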

> Python passes arguments neither by reference nor by value, but by assignment. For more detail, see [Pass by Reference in Python](https://realpython.com/python-pass-by-reference/)
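A minimal sketch of what "by assignment" means in practice: rebinding a parameter name inside a function leaves the caller's object alone, while mutating the object in place is visible to the caller. The functions `rebind()` and `mutate()` are hypothetical names used only for this illustration.

```python
def rebind(seq):
    # `seq + [99]` builds a new list; the local name now points to it,
    # but the caller's list is untouched
    seq = seq + [99]
    return seq

def mutate(seq):
    # append() changes the very object the caller passed in
    seq.append(99)
    return seq

original = [1, 2, 3]

rebind(original)
print(original)  # [1, 2, 3]

mutate(original)
print(original)  # [1, 2, 3, 99]
```

A function written in the functional style behaves like `rebind()`: it returns a new value rather than altering its argument.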

### What Makes Functional Programming "Functional"?

- A program is broken into smaller, concise functions as individual building blocks.
- Functions don't attempt to change the state of (mutate) their arguments.

### The Functional Style: More Reading

Beyond hand-written functions such as `sample_stdev()`, Python has other features that let you write in a functional style, such as:

- [Built-in functions](https://docs.python.org/3/library/functions.html) such as `map()` and `filter()`
- Libraries and modules such as [`itertools`](https://docs.python.org/3/library/itertools.html) and [`functools`](https://docs.python.org/3/library/functools.html)
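Here is a brief, non-exhaustive sketch of the tools listed above, applied to a made-up list of integers. Each call returns a new value and leaves `numbers` unchanged.

```python
from functools import reduce
from itertools import accumulate

numbers = [1, 2, 3, 4, 5, 6]  # hypothetical data

# map() applies a function to every element
squares = list(map(lambda x: x ** 2, numbers))

# filter() keeps only the elements for which the function returns True
evens = list(filter(lambda x: x % 2 == 0, numbers))

# functools.reduce() folds the sequence into a single value
product = reduce(lambda a, b: a * b, numbers)

# itertools.accumulate() yields running totals
running_totals = list(accumulate(numbers))

print(squares)         # [1, 4, 9, 16, 25, 36]
print(evens)           # [2, 4, 6]
print(product)         # 720
print(running_totals)  # [1, 3, 6, 10, 15, 21]
```

In many cases, a comprehension such as `[x ** 2 for x in numbers]` is an equally valid and often more readable alternative to `map()` and `filter()`.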

Further reading:

- **docs.python.org**: [Functional HOWTO](https://docs.python.org/3/howto/functional.html)
- **realpython.com**: [Functional Programming in Python](https://realpython.com/courses/functional-programming-python/)

### Intro to OOP

Object-oriented programming: a way to **structure** your code by **bundling** related properties and behaviors into individual objects. ([source](https://realpython.com/python3-object-oriented-programming/))

One prominent feature of OOP is a **class**.

A class is like a **blueprint**.  You **instantiate** a class to make individual **instances** of the blueprint.

Let's get down to it and create an `Employee` **class**.

What should this class do for us?

- You should be able to create **instances** of the class to **hold data** for individual employees (level, name, home office, etc)
- It should let you associate the employee with a timesheet
- It should provide you some **instance methods** that derive additional insights about that employee

You define a `class` once:

In [None]:
class Employee:

    valid_levels = (
        "analyst",
        "consultant",
        "senior consultant",
        "manager",
        "senior manager",
        "partner",
        "principal",
        "managing director",
    )

    def __init__(self, lastname: str, firstname: str, level: str):
        self.lastname = lastname
        self.firstname = firstname
        level = level.casefold()
        if level not in self.valid_levels:
            raise ValueError(f"Invalid level: {level}")
        self.level = level

    def __str__(self):
        """Let str(x) return a useful string representation of the Employee."""
        return f"{self.lastname}, {self.firstname} [{self.__class__.__name__} - {self.level}]"

    def is_ppmd(self) -> bool:
        """Is this person a PPMD? True if so, False otherwise."""
        return self.level in ("partner", "principal", "managing director")

You **instantiate** a class to make individual **instances** of the blueprint.

In [None]:
employee_1 = Employee(lastname="Sears", firstname="Brandon", level="manager")
employee_2 = Employee(lastname="Gallagher", firstname="Layne", level="partner")

print(employee_1)
print(employee_2)

In [None]:
print(employee_1.is_ppmd())

In [None]:
print(employee_2.is_ppmd())

So far our `Employee` class mainly holds data. (First name, last name, level.)

We can also add additional **methods** to the class that, for example, could be used to build a timesheet system:

In [None]:
import bisect
from datetime import date, timedelta
from typing import Optional

import pandas as pd


class Employee:

    valid_levels = (
        "analyst",
        "consultant",
        "senior consultant",
        "manager",
        "senior manager",
        "partner",
        "principal",
        "managing director",
    )

    def promote(self) -> Optional[str]:
        """Give Employee a promotion and *return* their new level.
        
        If they can't be promoted any further, do nothing and return None.
        """
        position_index = self.valid_levels.index(self.level)
        if position_index == len(self.valid_levels) - 1:
            # Could not promote further. Time for vacation
            return None
        self.level = self.valid_levels[position_index + 1]
        return self.level

    def __init__(self, lastname, firstname, level):
        self.lastname = lastname
        self.firstname = firstname
        level = level.casefold()
        if level not in self.valid_levels:
            raise ValueError(f"Invalid level: {level}")
        self.level = level
        
        self._time_table = []

    def add_time_entry(self, dt: date, wbs_code: str, hours: float):
        """Add a single timesheet row for this Employee."""
        # Use bisect.insort_left() to maintain sortedness by (date, wbs_code)
        bisect.insort_left(self._time_table, (dt, wbs_code, hours))
        return self
    
    def timesheet(self, since: Optional[date] = None, until: Optional[date] = None) -> pd.DataFrame:
        """Generate a timesheet as a Pandas DataFrame."""
        df = pd.DataFrame(self._time_table, columns=["dt", "wbs_code", "hours"])
        pretty_frame = df.pivot_table(index="wbs_code", columns="dt", values="hours").fillna(0)
        return pretty_frame.loc[:, since: until]

    def current_week_hours(self) -> dict:
        """Summarize current-week hours per WBS code."""
        since = self.most_recent_sunday()
        return self.timesheet(since=since).sum(axis=1).to_dict()

    @staticmethod
    def most_recent_sunday() -> date:
        """Find the most recent Sunday that falls on or before today."""
        today = date.today()
        while today.weekday() != 6:
            today = today - timedelta(days=1)
        return today

    @property
    def is_ppmd(self) -> bool:
        """Is this person a PPMD?"""
        return self.level in ("partner", "principal", "managing director")
    
    def __str__(self):
        """Let str(x) return a useful string representation of the Employee."""
        return f"{self.lastname}, {self.firstname} [{self.__class__.__name__} - {self.level}]"

In [None]:
emp = Employee("Loite", "Del", level="Senior Manager")
print(emp)

In [None]:
emp.level

In [None]:
emp.is_ppmd

Del has been promoted:

In [None]:
# Alter some internal state for `emp` and return the resulting new level
emp.promote()

In [None]:
emp.level

In [None]:
emp.is_ppmd
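Note that the expanded `Employee` class declares `is_ppmd` with `@property` and `most_recent_sunday` with `@staticmethod`. The sketch below shows how each kind of member is accessed, using a stripped-down, hypothetical `Badge` class (it is not part of the course code):

```python
from datetime import date

class Badge:  # hypothetical class, for illustration only
    def __init__(self, level: str):
        self.level = level

    @property
    def is_ppmd(self) -> bool:
        # Computed on each access; read as `badge.is_ppmd`, without parentheses
        return self.level in ("partner", "principal", "managing director")

    def describe(self) -> str:
        # A regular instance method; called as `badge.describe()`
        return f"level={self.level}, ppmd={self.is_ppmd}"

    @staticmethod
    def is_weekend(d: date) -> bool:
        # A static method uses neither the instance nor the class
        return d.weekday() >= 5

badge = Badge("partner")
print(badge.is_ppmd)     # True
print(badge.describe())  # level=partner, ppmd=True
print(Badge.is_weekend(date(2020, 8, 16)))  # True (a Sunday)

# Calling a property like a method fails, because the attribute is already a bool
try:
    badge.is_ppmd()
except TypeError as e:
    print(f"TypeError: {e}")
```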

Now Del needs to record some timesheet entries:

In [None]:
entries = [
    {"dt": date(2020, 8, 20), "wbs_code": "pto", "hours": 8.0},
    {"dt": date(2020, 8, 1), "wbs_code": "ced", "hours": 2.0},
    {"dt": date(2020, 8, 20), "wbs_code": "xyz", "hours": 9.0},
    {"dt": date(2020, 8, 17), "wbs_code": "ced", "hours": 2.0},
    {"dt": date(2020, 8, 17), "wbs_code": "abc", "hours": 11.5},
    {"dt": date(2020, 8, 18), "wbs_code": "gaa", "hours": 1.0},
    {"dt": date(2020, 8, 18), "wbs_code": "xyz", "hours": 7.0},
    {"dt": date(2020, 8, 19), "wbs_code": "ced", "hours": 2.0},
    {"dt": date(2020, 8, 16), "wbs_code": "ced", "hours": 2.0}
]
for e in entries:
    emp.add_time_entry(**e)

In [None]:
emp.timesheet()

In [None]:
emp.timesheet(since=date(2020, 8, 17))

In [None]:
emp.add_time_entry(dt=date(2021, 1, 22), wbs_code="pto", hours=8.0)

In [None]:
emp.current_week_hours()

One concept that this example illustrates is [**composition**](https://realpython.com/inheritance-composition-python/): each `Employee` holds a `_time_table` list representing rows in a timesheet.
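The `_time_table` list stays sorted because `add_time_entry()` inserts each row with `bisect.insort_left()`. Here is a standalone sketch of that behavior, using a plain list named `time_table` and a few of the rows from above. Tuples compare element by element, so rows sort by date first and then by WBS code.

```python
import bisect
from datetime import date

time_table = []
rows = [
    (date(2020, 8, 20), "pto", 8.0),
    (date(2020, 8, 1), "ced", 2.0),
    (date(2020, 8, 17), "ced", 2.0),
    (date(2020, 8, 17), "abc", 11.5),
]

# Each insertion places the row at its sorted position
for row in rows:
    bisect.insort_left(time_table, row)

for dt, wbs_code, hours in time_table:
    print(dt.isoformat(), wbs_code, hours)
# 2020-08-01 ced 2.0
# 2020-08-17 abc 11.5
# 2020-08-17 ced 2.0
# 2020-08-20 pto 8.0
```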

### Inheritance

The `Employee` class above is narrow-minded in that it only accounts for Deloitte's Traditionalist track levels.

We can create separate classes for `Traditionalist`, `Specialist`, and others through **inheritance**.

`Employee` becomes the **base class**.  `Traditionalist` and `Specialist` are the **child classes** that **inherit** from `Employee`.  `Employee` defines pieces that are common to its subclasses.  These will be inherited unless they are overridden in the subclass:

In [None]:
class Traditionalist(Employee):
    valid_levels = (
        "analyst",
        "consultant",
        "senior consultant",
        "manager",
        "senior manager",
        "partner",
        "principal",
        "managing director",      
    )


class Specialist(Employee):
    valid_levels = (
        "analyst",
        "specialist senior",
        "specialist master",
        "specialist leader",
        "managing director",
    )

In [None]:
joe = Specialist("Smith", "Joe", "specialist master")
print(joe)

In [None]:
try:
    jane = Traditionalist("Doe", "Jane", "specialist leader")
except Exception as e:
    print(e)

Inherited methods still behave the same:

In [None]:
joe.is_ppmd

In [None]:
joe.add_time_entry(dt=date(2020, 8, 20), wbs_code="hgi", hours=9.25).timesheet()
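The subclasses above override only a class attribute (`valid_levels`). A subclass can also override a _method_ and still reuse the parent's version through `super()`. The sketch below uses two hypothetical classes, `Worker` and `Contractor`, that are not part of the course code:

```python
class Worker:  # hypothetical base class
    valid_levels = ("junior", "senior")

    def __init__(self, name: str, level: str):
        # `self.valid_levels` resolves to the subclass attribute when one is defined
        if level not in self.valid_levels:
            raise ValueError(f"Invalid level: {level}")
        self.name = name
        self.level = level

    def describe(self) -> str:
        return f"{self.name} ({self.level})"


class Contractor(Worker):  # hypothetical child class
    valid_levels = ("hourly", "retainer")  # overrides the class attribute

    def describe(self) -> str:
        # Overrides the method, but delegates to the parent for the shared part
        return "Contractor: " + super().describe()


c = Contractor("Ada", "hourly")
print(c.describe())           # Contractor: Ada (hourly)
print(isinstance(c, Worker))  # True

try:
    Contractor("Bo", "senior")  # valid for Worker, but not for Contractor
except ValueError as e:
    print(e)  # Invalid level: senior
```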

### OOP Example 2: Representing Geography

In this section, you'll continue with another exercise in OOP, but switch to building a new `Coordinates` class that represents a pair of geographical _(latitude, longitude)_ coordinates.

In [None]:
from math import radians, asin, cos, sin, sqrt

class Coordinates:

    def __init__(self, lat: float, lng: float):
        """Make a new pair of coordinates.
        
        lat and lng: decimal degrees.
        """
        self.lat = lat
        self.lng = lng
        
        # Functions from `math` expect coordinates expressed in radians, not degrees
        self._phi = radians(lat)
        self._lambda = radians(lng)

    def __str__(self):
        """Make a human-readable string representation of the coordinates pair."""
        return f"<{self.lat}, {self.lng}>"

What are some features of modelling a coordinate pair with a `Coordinates` class?

- **Data encapsulation** and **namespacing**: You no longer have a bunch of individual variables floating around.  Each `Coordinates` instance gets its own `.lat`, `.lng` (degrees) and `._phi`, `._lambda` (radians).
- **Extensibility**: You can add additional functionality just by defining new methods.

## Exercises: OOP

### Exercise 9:

Below you'll see a `Coordinates` class that represents a single pair of geographic coordinates.

**<span style="color:red">Challenge:</span>** Implement the `Coordinates.distance_from()` **instance method** to determine the distance from one `Coordinates` instance to another, in kilometers.

Use the Haversine formula and trigonometric functions from the [`math`](https://docs.python.org/3/library/math.html) module:

\begin{equation*}
d = 2r \arcsin \sqrt{\sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos{\phi_1} \cos{\phi_2} \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}
\end{equation*}

where:

- $\phi_1$ and $\lambda_1$ are latitude and longitude for Point 1, respectively
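- $\phi_2$ and $\lambda_2$ are latitude and longitude for Point 2, respectively
- $r$ is the radius of the sphere (here, the Earth)
- $d$ is the resulting distance, in the same units as $r$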

Hints:

- The trigonometric functions you need have already been imported from the `math` module. For example, _arcsin_ is `asin`.
- Remember to `return` a float value from your `.distance_from()` method.
- `R = 6371` (radius of the earth in KM) is a constant that you'll need in the calculation; it is already provided below.

**<span style="color:green">Add your answer here</span>** (by adding Python code to the `.distance_from()` method body):

In [None]:
from math import radians, asin, cos, sin, sqrt

class Coordinates:

    def __init__(self, lat: float, lng: float):
        """Make a new pair of coordinates."""
        self.lat = lat
        self.lng = lng
        
        # Functions from `math` expect coordinates expressed in radians, not degrees
        self._phi = radians(lat)
        self._lambda = radians(lng)

    def __str__(self):
        """Make a human-readable string representation of the coordinates pair."""
        return f"<{self.lat}, {self.lng}>"

    def distance_from(self, other):
        """Approximate distance in KM from one Coordinate to `other`."""
        R = 6371  # Radius of Earth in KM
        raise NotImplementedError("TODO: Write me!")

**<span style="color:blue">Check your answer:</span>**

In [None]:
import math

c1 = Coordinates(38.8977559, -77.0704521)
c2 = Coordinates(34.9201086, -95.6922305)


assert hasattr(Coordinates, "distance_from") and callable(Coordinates.distance_from), "Must define .distance_from()"
try:
    assert math.isclose(c1.distance_from(c2), 1713, abs_tol=5.0), "Distance off by greater than 5 km"
except TypeError as e:
    raise AssertionError("Did you forget to add a parameter to .distance_from()?") from e

### Exercise 10

Below you'll see a `Coordinates` class that represents a single pair of geographic coordinates.

**<span style="color:red">Challenge:</span>** Implement the `Coordinates.from_string()` **classmethod** to let a user form a new coordinates pair from a `str` representing a coordinates pair.

Allow the input coordinates to be separated by:

- a comma
- whitespace
- or a combination of the two

Examples:

```python
c1 = Coordinates.from_string("38.8977559, -77.0704521")
c2 = Coordinates.from_string("38.8977559,-77.0704521")
```
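If alternative constructors are new to you, the sketch below shows the general `@classmethod` pattern on an unrelated, hypothetical `Duration` class. The method receives the class itself as `cls`, parses the input, and returns `cls(...)`. It is not a solution to this exercise.

```python
class Duration:  # hypothetical class, for illustration only
    def __init__(self, minutes: int):
        self.minutes = minutes

    @classmethod
    def from_hhmm(cls, text: str):
        """Build a Duration from a string such as '1:30'."""
        hours, minutes = text.split(":")
        # Calling `cls(...)` rather than `Duration(...)` also works for subclasses
        return cls(int(hours) * 60 + int(minutes))


d = Duration.from_hhmm("1:30")
print(d.minutes)  # 90
```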

**<span style="color:green">Add your answer here</span>** (by adding Python code to the `.from_string()` classmethod body):

In [None]:
from math import radians

class Coordinates:

    def __init__(self, lat: float, lng: float):
        """Make a new pair of coordinates."""
        self.lat = lat
        self.lng = lng
        
        # Functions from `math` expect coordinates expressed in radians, not degrees
        self._phi = radians(lat)
        self._lambda = radians(lng)

    def __str__(self):
        """Make a human-readable string representation of the coordinates pair."""
        return f"<{self.lat}, {self.lng}>"

    @classmethod
    def from_string(cls, coords: str):
        """Parse a string into a Coordinates object.
        
        coords: a str representing the coordinates pair, such as '38.8977559, -77.0704521'
        """
        # Allow the input string to be separated by a comma, whitespace, or combination of the two
        raise NotImplementedError("TODO: Write me!")

**<span style="color:blue">Check your answer:</span>**

In [None]:
try:
    import numpy as np
except ModuleNotFoundError as e:
    raise AssertionError("Looks like you're missing Numpy to check this solution") from e

assert hasattr(Coordinates, "from_string") and callable(Coordinates.from_string), "Must implement .from_string()"
c1 = Coordinates(38.8977559, -77.0704521)
c2 = Coordinates.from_string("38.8977559, -77.0704521")
c3 = Coordinates.from_string("38.8977559 -77.0704521")
c4 = Coordinates.from_string("38.8977559,-77.0704521")

assert np.ptp([c1.lat, c2.lat, c3.lat, c4.lat]) < 0.0001
assert np.ptp([c1.lng, c2.lng, c3.lng, c4.lng]) < 0.0001

## Conclusion

Here's a summary of what you covered in this tutorial:

- `list`, `set`, and `dict` comprehensions: Create new data structures through concise **Pythonic** syntax.
- Object-oriented programming: Objects **encapsulate data** and **provide functionality**.

## More Resources

Interested in diving deeper?  Here are some places to start:

- **docs.python.org**: [List comprehensions](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions)
- **attrs.org**: [`attrs` - classes without boilerplate](https://www.attrs.org/en/stable/)
- **realpython.com**: [Object-oriented programming (OOP)](https://realpython.com/search?q=oop)
- **wikipedia.org**: [Haversine formula](https://en.wikipedia.org/wiki/Haversine_formula)
- **wikipedia.org**: [Standard deviation](https://en.wikipedia.org/wiki/Standard_deviation)
- **docs.python.org**: [PEP 8, Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/)