`advanced_classes.ipynb` [5-Oct-2021] is provided to NHS England under licence from Faculty Science Ltd.

# Dataclasses

Python 3.7 introduced the concept of a data class. Data classes are best suited to classes whose main job is to store data rather than to hold lots of complex logic. Their main benefit is that they remove the need to write a lot of boilerplate code.

First we need to import dataclass from the dataclasses module.

In [None]:
from dataclasses import dataclass 
from typing import List

A dataclass can be defined as follows. Note that type hints are required: only annotated attributes become fields.

In [None]:
@dataclass
class Person:
    name: str
    age: int
    friends: List[str]

What we get for "free"

### Initialisation:

In [None]:
person_1 = Person("Alice", 23, ["Bob", "Peter"])
person_2 = Person("Alice", 23, ["Bob", "Peter"])
person_3 = Person("Sam", 23, ["Bob", "Peter"])

### Representation:

In [None]:
print(person_1)

### Comparison:

In [None]:
print(person_1==person_2)
print(person_1==person_3)
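
To make the saved boilerplate concrete, here is a rough sketch of what we would have to write by hand to get similar behaviour. ManualPerson is a hypothetical name used only for this illustration, and the methods that dataclass actually generates differ in detail from this sketch.

```python
from typing import List


class ManualPerson:
    """Hand-written stand-in for the Person dataclass (hypothetical, for illustration)."""

    def __init__(self, name: str, age: int, friends: List[str]) -> None:
        self.name = name
        self.age = age
        self.friends = friends

    def __repr__(self) -> str:
        # Mimics the style of the representation a dataclass generates
        return (
            f"ManualPerson(name={self.name!r}, age={self.age!r}, "
            f"friends={self.friends!r})"
        )

    def __eq__(self, other: object) -> bool:
        # Only compare against objects of the same class, field by field
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.name, self.age, self.friends) == (
            other.name,
            other.age,
            other.friends,
        )


manual_1 = ManualPerson("Alice", 23, ["Bob", "Peter"])
manual_2 = ManualPerson("Alice", 23, ["Bob", "Peter"])
manual_3 = ManualPerson("Sam", 23, ["Bob", "Peter"])

print(manual_1)
print(manual_1 == manual_2)
print(manual_1 == manual_3)
```

Every new field would need adding in three places here, whereas the dataclass version only needs one extra line.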

We can easily add a couple of extra features we might want:

### Ordering:

In [None]:
@dataclass(order=True)
class Person:
    name: str
    age: int
    friends: List[str]

In [None]:
person_1 = Person("Alice", 23, ["Bob", "Peter"])
person_2 = Person("Alice", 23, ["Bob", "Peter"])
person_3 = Person("Sam", 21, ["Bob", "Peter"])

In [None]:
print(person_1 < person_3)

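This can be somewhat confusing. The comparison is done field by field, in the order the fields are declared, using the ordering of the underlying data type. As "Sam" sorts after "Alice" ("S" is greater than "A"), person_1 < person_3 is True. The age and friends fields are only consulted if the names are equal.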
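
One way to picture this is that each object is compared as if it were a tuple of its fields, in declaration order. The sketch below uses astuple from the dataclasses module to make that visible; the variable names are chosen only for this illustration.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(order=True)
class Person:
    name: str
    age: int
    friends: List[str]


alice = Person("Alice", 23, ["Bob", "Peter"])
sam = Person("Sam", 21, ["Bob", "Peter"])
young_alice = Person("Alice", 21, ["Bob", "Peter"])

# The dataclass comparison agrees with comparing tuples of the fields
print(astuple(alice) < astuple(sam))
print(alice < sam)

# The names are equal here, so the age field decides the result
print(young_alice < alice)

# Ordering also means a list of people can be sorted directly
people = sorted([sam, alice, young_alice])
print([(person.name, person.age) for person in people])
```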

### Frozen:

In [None]:
@dataclass(frozen=True)
class Person:
    name: str
    age: int
    friends: List[str]

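The frozen argument makes the dataclass immutable: once an object has been created, its fields cannot be reassigned. This is useful for keeping data consistent in an application. Trying to change a field, as in the next cell, raises a FrozenInstanceError.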

In [None]:
person_1 = Person("Alice", 23, ["Bob", "Peter"])
person_1.name = "Betty"
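
The sketch below catches the error so the behaviour can be inspected without stopping the notebook. It also shows a caveat worth knowing: frozen only prevents fields from being reassigned, so a mutable field such as a list can still be changed in place. If that matters, one option (a suggestion, not something the example above requires) is to store a tuple instead.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import List


@dataclass(frozen=True)
class Person:
    name: str
    age: int
    friends: List[str]


person_1 = Person("Alice", 23, ["Bob", "Peter"])

try:
    person_1.name = "Betty"
except FrozenInstanceError as error:
    print(f"Assignment blocked: {error}")

# The name is unchanged
print(person_1.name)

# frozen does not protect the contents of a mutable field
person_1.friends.append("Sam")
print(person_1.friends)
```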

## Default fields

We can set default values in data classes similarly to how we did for ordinary classes.

In [None]:
from dataclasses import field

@dataclass
class Person:
    name: str = "name"
    age: int = 18
    friends: List[str] = field(default_factory=list)

Note that for mutable data types we have to use the field function from the dataclasses module. Its default_factory argument is called once for each new object, which avoids the problem we saw before of all objects sharing the same list.

In [None]:
dummy_person = Person()
print(dummy_person)
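
As a quick check of both points, the sketch below first tries to use a plain list as a default, which dataclass rejects with a ValueError when the class is defined, and then confirms that default_factory gives each object its own list. BadPerson is a hypothetical class that exists only to trigger the error.

```python
from dataclasses import dataclass, field
from typing import List

try:

    @dataclass
    class BadPerson:  # hypothetical class, only here to trigger the error
        friends: List[str] = []

except ValueError as error:
    print(f"Rejected: {error}")


@dataclass
class Person:
    name: str = "name"
    age: int = 18
    friends: List[str] = field(default_factory=list)


first = Person()
second = Person()

# Changing one person's list leaves the other person's list alone
first.friends.append("Bob")
print(first.friends)
print(second.friends)
```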

## Adding Methods

Methods are added in the same way as for ordinary classes.

In [None]:
@dataclass
class Person:
    name: str = "name"
    age: int = 18
    friends: List[str] = field(default_factory=list)
    
    def print_name(self):
        print(self.name)

In [None]:
dummy_person = Person()
dummy_person.print_name()

## Design idea

A good idea for designing your code is to write immutable data classes and functions that act on them. This way the functions are "pure": they only act on the data they are given and have no "side effects", which makes them much easier to test and reason about.

In [None]:
def add_friends(person: Person, new_friends: List[str]) -> Person:
    all_friends = person.friends + new_friends
    return Person(person.name, person.age, all_friends)

In [None]:
dummy_person = Person()
print(dummy_person)
new_person = add_friends(dummy_person, ["Bob"])
print(new_person)

This is a more functional approach to programming: the idea is to encode the logic in "pure" functions rather than in classes. Because pure functions do not depend on shared mutable state, it is often easier to scale the code to distributed systems.
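
Putting the two ideas together, here is a minimal sketch of the same function acting on a frozen version of Person. It uses replace from the dataclasses module, which builds a new object with selected fields changed; making Person frozen here is an assumption of this sketch rather than something the cells above do.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class Person:
    name: str = "name"
    age: int = 18
    friends: List[str] = field(default_factory=list)


def add_friends(person: Person, new_friends: List[str]) -> Person:
    # Build a new list and a new Person; the input is left untouched
    all_friends = person.friends + new_friends
    return replace(person, friends=all_friends)


dummy_person = Person()
new_person = add_friends(dummy_person, ["Bob"])

# A pure function is easy to test: check the output and that the input is unchanged
assert new_person == Person("name", 18, ["Bob"])
assert dummy_person == Person()
print(dummy_person)
print(new_person)
```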

# Abstract Classes

The idea behind abstract classes is to provide an interface for other developers to be able to extend the code but leave core functionality alone.

In [None]:
class MeanDataProcessor:
    
    def __init__(self, data: List[float]):
        self.data = data
    
    def compute(self):
        return sum(self.data)/len(self.data)


In [None]:
def display_statistics(data_processor: MeanDataProcessor):
    processed_data = data_processor.compute()
    print(processed_data)

In [None]:
def main():
    data = [1,2,3,4]
    data_processor = MeanDataProcessor(data)
    display_statistics(data_processor)

main()

Now the display_statistics function doesn't need to know anything about MeanDataProcessor's implementation; it only needs to know that it has a method called compute. We can generalise this function to work on any DataProcessor-style class, provided it has a compute method. To do so, we rewrite display_statistics to take a generic DataProcessor class that promises to always have this method. That way we can extend the behaviour without ever needing to rewrite the function. To do this we use abstract classes.

In [None]:
from abc import ABC, abstractmethod

class DataProcessor(ABC):
    
    def __init__(self, data: List[float]):
        self.data = data
    
    @abstractmethod
    def compute(self):
        pass

We need to import two objects from the abc module: ABC, which is the base class to inherit from, and the decorator abstractmethod, which marks the methods that subclasses promise to implement.

You can think of this class as a template or an interface. As such we can't actually create objects from it, only inherit from it. Trying to create one raises a TypeError:

In [None]:
data_processor = DataProcessor()
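
The same protection applies to subclasses that forget to keep the promise. In the sketch below, IncompleteProcessor is a hypothetical subclass that does not implement compute, so Python refuses to create objects from it as well.

```python
from abc import ABC, abstractmethod
from typing import List


class DataProcessor(ABC):

    def __init__(self, data: List[float]):
        self.data = data

    @abstractmethod
    def compute(self):
        pass


class IncompleteProcessor(DataProcessor):
    """Hypothetical subclass that forgets to implement compute."""


failed = []
for processor_class in (DataProcessor, IncompleteProcessor):
    try:
        processor_class([1, 2, 3, 4])
    except TypeError as error:
        # Both classes still have an unimplemented abstract method
        failed.append(processor_class.__name__)
        print(f"{processor_class.__name__}: {error}")
```

The error appears when the object is created, not later when compute is called, so a missing method is caught early.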

Let's reimplement MeanDataProcessor:

In [None]:
class MeanDataProcessor(DataProcessor):
    def compute(self):
        return sum(self.data)/len(self.data)        

In [None]:
data = [1,2,3,4]
data_processor = MeanDataProcessor(data)
data_processor.compute()

This still works, so we can now update the type hint on display_statistics():

In [None]:
def display_statistics(data_processor: DataProcessor):
    processed_data = data_processor.compute()
    print(processed_data)

In [None]:
def main():
    data = [1,2,3,4]
    data_processor = MeanDataProcessor(data)
    display_statistics(data_processor)

main()

We can now change the behaviour of main without having to update display_statistics, as we know that whatever object we pass to it will have a compute method!

In [None]:
class MaxDataProcessor(DataProcessor):
    def compute(self):
        return max(self.data)

def main():
    data = [1,2,3,4]
    data_processor = MaxDataProcessor(data)
    display_statistics(data_processor)

main()
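
To show how far this goes, the sketch below adds two more processors and runs them through the unchanged display_statistics function in a loop. MedianDataProcessor and RangeDataProcessor are hypothetical additions for this illustration; median comes from the standard statistics module.

```python
from abc import ABC, abstractmethod
from statistics import median
from typing import List


class DataProcessor(ABC):

    def __init__(self, data: List[float]):
        self.data = data

    @abstractmethod
    def compute(self):
        pass


class MedianDataProcessor(DataProcessor):
    """Hypothetical processor returning the median of the data."""

    def compute(self):
        return median(self.data)


class RangeDataProcessor(DataProcessor):
    """Hypothetical processor returning max minus min."""

    def compute(self):
        return max(self.data) - min(self.data)


def display_statistics(data_processor: DataProcessor):
    processed_data = data_processor.compute()
    print(processed_data)


data = [1, 2, 3, 4]
for processor_class in (MedianDataProcessor, RangeDataProcessor):
    # display_statistics is never edited, only the object passed in changes
    display_statistics(processor_class(data))
```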

The above example may seem too minimal to be useful. Hopefully the next example will show why the approach is worthwhile.

## The strategy pattern

This example was heavily inspired by https://www.youtube.com/channel/UCVhQ2NnY5Rskt6UjCUkJ_DA. I recommend his channel if you want to learn more about advanced software design patterns!

In this example we'll set up a trading bot using abstract classes and the strategy pattern. A typical trading flow will look like:
- Connect to the exchange
- Fetch market data
- Determine how to act on the market data
- Close the connection to the exchange

We can imagine that the only step we might want to alter is how to act on the market data.

First we set up a template of what a trading strategy should look like. Given market data should we buy or should we sell?

In [None]:
from abc import ABC, abstractmethod

class TradingStrategy(ABC):
    
    @abstractmethod
    def should_buy(self, prices: List[float]) -> bool:
        pass
    
    @abstractmethod
    def should_sell(self, prices: List[float]) -> bool:
        pass
    

We also need an exchange object to interact with. Typically this would form an interface with the actual exchange, but we'll just mock it out for this example.

In [None]:
class ExchangeConnectionError(Exception):
    """Custom exception raised if we request prices but aren't connected to the exchange"""
    pass

class Exchange:
    """Simulates an exchange"""
    def __init__(self) -> None:
        self.connected = False
        
    def connect(self) -> None:
        self.connected = True
        print("Connection established")
    
    def disconnect(self) -> None:
        self.connected = False
        print("Connection closed")
    
    def check_connection(self):
        if not self.connected:
            raise ExchangeConnectionError()
    
    def get_market_data(self) -> List[float]:
        self.check_connection()
        return [3.4,3.5,3.2,3.6,3.1,3.4]
    
    def buy(self, amount: float) -> None:
        self.check_connection()
        print(f"You bought {amount}.")
    
    def sell(self, amount: float) -> None:
        self.check_connection()
        print(f"You sold {amount}.")
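
The sketch below exercises the connection check using a trimmed copy of the mock exchange (the buy and sell methods and the connection messages are left out to keep it short). It also wraps the work in try/finally, which is one possible way, not something the example requires, to make sure the connection is closed even if an error occurs part-way through.

```python
from typing import List


class ExchangeConnectionError(Exception):
    """Raised if we request prices but aren't connected to the exchange"""


class Exchange:
    """Trimmed copy of the mock exchange, for illustration only"""

    def __init__(self) -> None:
        self.connected = False

    def connect(self) -> None:
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False

    def check_connection(self) -> None:
        if not self.connected:
            raise ExchangeConnectionError()

    def get_market_data(self) -> List[float]:
        self.check_connection()
        return [3.4, 3.5, 3.2, 3.6, 3.1, 3.4]


exchange = Exchange()

# Asking for prices before connecting raises our custom exception
try:
    exchange.get_market_data()
except ExchangeConnectionError:
    print("Not connected yet, so no prices")

exchange.connect()
try:
    prices = exchange.get_market_data()
    print(prices)
finally:
    # Runs whether or not the code above raised, so we always disconnect
    exchange.disconnect()
```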
    

Now let's build out the bot:

In [None]:
class TradingBot:
    
    def __init__(self, exchange: Exchange, trading_strategy: TradingStrategy) -> None:
        self.exchange = exchange
        self.trading_strategy = trading_strategy
              
    def run(self) -> None:
        prices = self.exchange.get_market_data()
        should_buy = self.trading_strategy.should_buy(prices)
        should_sell = self.trading_strategy.should_sell(prices)
        if should_buy:
            self.exchange.buy(10)
        elif should_sell:
            self.exchange.sell(10)
        else:
            print("Hold position")
        

Now let's build a trading strategy. Here's a strategy that seems sensible but doesn't really work in the real world: compute the average of the market data, then buy if the last price is above the average and sell otherwise.

In [None]:
class BetterThanAverage(TradingStrategy):
    
    @staticmethod
    def _compute_average(prices: List[float]):
        return sum(prices)/len(prices)
    
    def should_buy(self, prices: List[float]) -> bool:
        return prices[-1] > self._compute_average(prices)
    
    def should_sell(self, prices: List[float]) -> bool:
        return prices[-1] <= self._compute_average(prices)
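
To see what this strategy decides for the mock market data, the arithmetic can be worked through directly. This is just the same calculation written out step by step, using the prices returned by the mock exchange.

```python
prices = [3.4, 3.5, 3.2, 3.6, 3.1, 3.4]

# The prices sum to 20.2, so the average is 20.2 / 6, roughly 3.367
average = sum(prices) / len(prices)
last_price = prices[-1]

print(f"average = {average:.3f}, last price = {last_price}")

# The last price (3.4) is above the average, so the strategy says buy
decision = "buy" if last_price > average else "sell"
print(decision)
```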

Now let's put all of this together in a main function:

In [None]:
def main() -> None:
    exchange = Exchange()
    exchange.connect()
    
    trading_strategy = BetterThanAverage()
    
    bot = TradingBot(exchange, trading_strategy)
    bot.run()
    
    exchange.disconnect()

main()

This seems like a lot of work, but now if we want to change the trading strategy we only have to write a new class and update main, rather than checking all the code that deals with buying and selling on the exchange. This is good because we can heavily test that code and then never touch it again!

Let's add a new trading strategy that simply buys when the last price is above a given value and sells otherwise:

In [None]:
from dataclasses import dataclass 

@dataclass
class AboveAValue(TradingStrategy):
    buy_value: float = 3.0
    
    def should_buy(self, prices: List[float]) -> bool:
        return prices[-1] > self.buy_value
    
    def should_sell(self, prices: List[float]) -> bool:
        return prices[-1] <= self.buy_value

In [None]:
def main() -> None:
    exchange = Exchange()
    exchange.connect()
    
    trading_strategy = AboveAValue(3.6)
    
    bot = TradingBot(exchange, trading_strategy)
    bot.run()
    
    exchange.disconnect()

main()

Barely any work!
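
Because every strategy shares the same interface, strategies can also be compared side by side on the same data. The sketch below leaves out the exchange and the bot to stay short; decide is a hypothetical helper that mirrors the branching in TradingBot.run.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


class TradingStrategy(ABC):

    @abstractmethod
    def should_buy(self, prices: List[float]) -> bool:
        pass

    @abstractmethod
    def should_sell(self, prices: List[float]) -> bool:
        pass


class BetterThanAverage(TradingStrategy):

    @staticmethod
    def _compute_average(prices: List[float]) -> float:
        return sum(prices) / len(prices)

    def should_buy(self, prices: List[float]) -> bool:
        return prices[-1] > self._compute_average(prices)

    def should_sell(self, prices: List[float]) -> bool:
        return prices[-1] <= self._compute_average(prices)


@dataclass
class AboveAValue(TradingStrategy):
    buy_value: float = 3.0

    def should_buy(self, prices: List[float]) -> bool:
        return prices[-1] > self.buy_value

    def should_sell(self, prices: List[float]) -> bool:
        return prices[-1] <= self.buy_value


def decide(strategy: TradingStrategy, prices: List[float]) -> str:
    """Hypothetical helper mirroring the branching in TradingBot.run"""
    if strategy.should_buy(prices):
        return "buy"
    if strategy.should_sell(prices):
        return "sell"
    return "hold"


prices = [3.4, 3.5, 3.2, 3.6, 3.1, 3.4]
strategies = [BetterThanAverage(), AboveAValue(3.6), AboveAValue()]

for strategy in strategies:
    # The loop never needs to know which concrete strategy it is holding
    print(type(strategy).__name__, decide(strategy, prices))
```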