## Data-Oriented Programming and AsyncIO

References

- https://towardsdatascience.com/data-oriented-programming-with-python-ef478c43a874/
- https://medium.com/velotio-perspectives/an-introduction-to-asynchronous-programming-in-python-af0189a88bbb
- https://realpython.com/python-async-features/

## Data-Oriented

A recap of *Data-Oriented Programming* by Yehonathan Sharvit (published in 2022). The book uses JavaScript and Java for its examples; this treatment uses Python.

![DOD Book Cover](gfx/dod-book-cover.png)

Python is a hybrid language: it supports both object-oriented programming (OOP) and functional programming (FP).

### Principles are language-agnostic

1. Separate code from data in a way that the code resides in functions whose behavior does not depend on data that is encapsulated in the function’s context.
2. Data is represented with generic data structures, such as maps (or dictionaries) and arrays (or lists).
3. Data should never change! Instead of mutating data, a new version of it is created.
4. The expected shape of data (its schema) is represented as (meta) data that is kept separately from the main data representation.

In [None]:
# Principle #1
from dataclasses import dataclass

# A natural way of adhering to this principle in Python is to use top-level functions (for code)
# and data classes that only have fields (for data).

@dataclass     # <- a decorator (here, a callable that takes the class and returns it);
               # it adds generated special methods such as __init__() and __repr__() to the class
class AuthorData:
    """Class for keeping track of an author in the system"""

    first_name: str
    last_name: str
    n_books: int

# The code that deals with full name calculation is separate from the code that deals with the creation of author data.
def calculate_name(first_name: str, last_name: str) -> str:
    return f"{first_name} {last_name}"

author_data = AuthorData("Isaac", "Asimov", 500)
calculate_name(author_data.first_name, author_data.last_name)

In [None]:
# Principle #2
# the "class" defines a schema, or how information is organized
class FullName:
    def __init__(self, first_name, last_name, suffix):
        self.first_name = first_name
        self.last_name = last_name
        self.suffix = suffix

# The misspelled keyword "fist_name" raises a TypeError right away: the class acts as a schema
obj = FullName(fist_name="Jane", last_name="Doe", suffix="II")

In [None]:
# Principle #2
# using a generic data structure is easier, but can lead to errors

# The existence of data schema at a class level makes it easy to discover the expected data shape.
# When data is represented with generic data structures, data schema is not part of the
# data representation.

names = []
names.append({"first_name": "Jane", "last_name": "Doe", "suffix": "III"})
names.append({"first_name": "Isaac", "last_name": "Asimov"})
names.append({"fist_name": "John", "last_name": "Smith"}) # error, "fist_name" should be "first_name"

print(f"{names[2].get('first_name')} {names[2].get('last_name')}")
# With no schema, the generic data structure fails silently: "None Smith" is printed

In [None]:
# Principle #3
from dataclasses import dataclass

# Immutable built-in types include int, float, complex, bool, str, bytes, tuple,
# frozenset and range. Note that dict, list and set are mutable.
@dataclass(frozen=True)
class AuthorData:
    """Class for keeping track of an author in the system"""

    first_name: str
    last_name: str
    n_books: int
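
A frozen data class rejects attribute assignment, so a "change" has to produce a new object. The sketch below illustrates this with `dataclasses.replace`; the class name `FrozenAuthor` is a stand-in chosen for this example so that it does not clash with `AuthorData` above.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class FrozenAuthor:
    """Hypothetical immutable record, mirroring AuthorData."""

    first_name: str
    last_name: str
    n_books: int


author = FrozenAuthor("Isaac", "Asimov", 500)

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    author.n_books = 501
    mutation_error = None
except FrozenInstanceError as exc:
    mutation_error = type(exc).__name__

# Instead of mutating, create a new version with one field changed.
updated = replace(author, n_books=501)

print(mutation_error)                  # FrozenInstanceError
print(author.n_books, updated.n_books) # 500 501
```

The original `author` is untouched, which is what Principle #3 asks for.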

In [None]:
# Principle #3
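# A list is mutable: the default list is created once, when the function is defined,
# and is then shared across calls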
def append_to_list(el, ls=[]):
    ls.append(el)
    return ls

print(append_to_list(1))   # [1]
print(append_to_list(2))   # [1, 2]
print(append_to_list(3))   # [1, 2, 3]

In [None]:
# Principle #3
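# Fix: use None as the default and create a fresh list on each call
# (the list itself is still mutable, but it is no longer shared between calls)
def append_to_list(el, ls=None):
    if ls is None:
        ls = []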
    ls.append(el)
    return ls

print(append_to_list(1))   # [1]
print(append_to_list(2))   # [2]

### Free concurrency safety

When mutable data is shared between threads, race conditions can occur. Immutable data can be shared safely, because no thread can change it after it has been created.
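
As a minimal sketch of this idea, the example below shares one frozen value between several worker threads. Each worker returns a new value instead of modifying the shared one, so no lock is needed. The names `Counter` and `incremented` are hypothetical and chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Counter:
    """Hypothetical immutable value shared between threads."""

    value: int


def incremented(counter: Counter, by: int) -> Counter:
    # Return a new Counter; the shared one is never modified.
    return replace(counter, value=counter.value + by)


shared = Counter(0)

# Every thread reads the same immutable object; map() keeps input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda by: incremented(shared, by), range(1, 5)))

totals = [c.value for c in results]
print(shared.value)  # 0
print(totals)        # [1, 2, 3, 4]
```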

In [None]:
# Principle #3
# A list is mutable and a tuple is immutable. When both are extended with +=,
# the list keeps its identity, whereas a brand new tuple with a different identity is created.
# (The id values shown below are examples and differ from run to run.)
list1 = [1, 2, 3]
tuple1 = (1, 2, 3)

print(id(list1))   # 1859329589504
print(id(tuple1))  # 1859328732288

list1 += [4, 5]
tuple1 += (4, 5)

print(id(list1))   # 1859329589504 (identity did not change)
print(id(tuple1))  # 1859329720944 (identity changed)
# Copying the contents of an immutable object into a new object on every modification
# costs additional memory and CPU time, especially for very large collections.

## JSON Schema

```python
schema = {
    "type": "object",
    "required": ["first_name", "last_name"],
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "books": {"type": "integer"},
    },
}
```
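
To illustrate Principle #4, here is a minimal sketch of checking a plain dictionary against a schema kept separately from the data. The `validate` function and the `author_schema` name are hypothetical helpers written for this example; they handle only `required` and two JSON Schema type names. A real project would more likely use a dedicated validation library.

```python
# Hypothetical schema, written with JSON Schema type names.
author_schema = {
    "required": ["first_name", "last_name"],
    "properties": {
        "first_name": {"type": "string"},
        "last_name": {"type": "string"},
        "books": {"type": "integer"},
    },
}

# Mapping from the JSON Schema type names used above to Python types.
TYPE_MAP = {"string": str, "integer": int}


def validate(data, schema):
    """Return a list of error messages; an empty list means the data is valid."""
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in data and not isinstance(data[key], TYPE_MAP[rule["type"]]):
            errors.append(f"wrong type for field: {key}")
    return errors


good = validate({"first_name": "Isaac", "last_name": "Asimov", "books": 500}, author_schema)
typo = validate({"fist_name": "John", "last_name": "Smith"}, author_schema)

print(good)  # []
print(typo)  # ['missing required field: first_name']
```

With a schema in place, the misspelled `fist_name` key is reported instead of silently producing `None`.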

## Asynchronous Programming

Asynchronous programming is a style of concurrent programming in which a unit of work is allowed to run separately from the main flow of the application, which is notified when the work completes or fails.

![Async Programming](gfx/async-programming.png)

## How Does Python Do Multiple Things at Once?

![Programming Models](gfx/programming-models.png)

With asynchronous code, the OS does not participate in the scheduling. As far as the OS is concerned, there is one process with a single thread inside it, yet the program can make progress on multiple tasks by switching between them whenever one of them is waiting.
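
The sketch below illustrates the single-thread claim: two coroutines interleave, yet both report the same thread id. The names `worker` and `main` are chosen for this example only.

```python
import asyncio
import threading


async def worker(name, log):
    # Record the thread id before and after yielding to the event loop.
    log.append((name, "start", threading.get_ident()))
    await asyncio.sleep(0.01)  # yields control so the other worker can run
    log.append((name, "end", threading.get_ident()))


async def main():
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


# In a notebook an event loop is already running; use `log = await main()` there.
log = asyncio.run(main())

thread_ids = {thread_id for _, _, thread_id in log}
print(len(thread_ids))                      # 1
print([phase for _, phase, _ in log[:2]])   # ['start', 'start']
```

Both workers start before either one finishes, which shows the interleaving, and the set of thread ids has a single element.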

## AsyncIO

- asyncio is the concurrency module added to the standard library in Python 3.4. It uses coroutines and futures to simplify asynchronous code and make it almost as readable as synchronous code, since there are no callbacks to chain.

- asyncio uses different constructs: event loops, coroutines and futures.
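
A minimal sketch of the three constructs working together is shown below. The coroutine `fetch_number` is a hypothetical stand-in for real I/O and uses `asyncio.sleep` to simulate waiting.

```python
import asyncio


async def fetch_number(delay, value):
    # A coroutine: it pauses at `await` and lets the event loop run other work.
    await asyncio.sleep(delay)
    return value


async def main():
    loop = asyncio.get_running_loop()  # the event loop driving this coroutine

    # A future: a placeholder for a result that will be set later.
    future = loop.create_future()
    loop.call_later(0.01, future.set_result, "done")

    # Run two coroutines concurrently; results come back in argument order.
    numbers = await asyncio.gather(fetch_number(0.02, 1), fetch_number(0.01, 2))
    status = await future
    return numbers, status


# In a notebook an event loop is already running; use `await main()` there.
numbers, status = asyncio.run(main())
print(numbers, status)  # [1, 2] done
```

`asyncio.gather` returns results in the order the coroutines were passed in, not the order in which they finished.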