
# Custom types (Classes)

In every program we work with data (and the types that formalize it) and with functions that process it. Certain functions operate only on certain types, so they are closely tied to those types. For example, concatenation or character replacement is closely tied to the text type; these operations don't make much sense for floating-point numbers. It is therefore logical to bind these functions more closely to the type.

In Python this binding is expressed by attaching functions to types. We refer to such a function by writing the type name, then a dot, then the function name. A function bound to a type in this way is called a method.



In [None]:
int.bit_count # this is a "method", a function bound to the int type

If you run the code block above (`int.bit_count` needs Python 3.10 or newer), you'll see that Python itself calls it a method. The data itself is called an object. So `int.bit_count` is a function defined for objects (data) of type `int`.

A type can have not only functions but also data bound to it. (Of course, in Python functions are also data, so one could say a type has all kinds of data attached, including functions.)

Such data bound to a type is called an attribute. (Strictly speaking, it is the name that is called the attribute, not the data, but that's splitting hairs.)

In [None]:
int.denominator # this is, for example, an attribute bound to the int type
int.imag # and this is another one

We have also seen that concrete data items (often called instances) inherit these attributes and methods from the type, so we can use them on the instances as well:

In [None]:
(42).imag # now we finally find out the imaginary part of 42...!

Of course, usually we attach a label and use it like this:

In [None]:
n_val = 42
n_val.denominator # now we also find out its denominator!

In [None]:
n_val.bit_count() # call a method on it!

Here, however, there is a small twist with methods: when we call a method on the data (the instance), the rule is that the method receives the data itself as the first parameter.

So in the example above `int.bit_count` received `n_val` as its first argument; in other words, the following happened:

In [None]:
int.bit_count(n_val)
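The same equivalence holds for other types as well, not only for `int`. A small sketch with `str` (the names `word`, `via_instance` and `via_type` are only illustrative):

```python
word = "peony"

# Calling the method on the instance...
via_instance = word.upper()

# ...is the same as calling it on the type and passing the instance explicitly.
via_type = str.upper(word)

print(via_instance, via_type, via_instance == via_type)
```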

### Custom Types

We can create custom types ourselves with the `class` keyword! In the same way we can give them attributes and methods, which every instance of that type will inherit.

The format is super simple, it looks like this:

In [None]:
class CustomType:
  custom_attr = 42

In [None]:
CustomType.custom_attr

Now we have a custom type that even has an attribute! But how do we create a data item that is exactly of this type?
Well, we've already seen this with other types: use the type name as a function. So we automatically get a type-constructor function.

In [None]:
custom_obj = CustomType()
custom_obj.custom_attr

### Storing information in a custom type

So far so good, we can create a custom type, which is nice, but it would be even better if instances of that type could store some information, otherwise they are not very useful.

The fact is we can already store anything in it, because by default a Python instance keeps its attributes in a dict (available as `__dict__`), and we can put anything into it. The difference is that instead of square-bracket indexing we use the dot operator to "index".


In [None]:
custom_obj.flower = "peony"
custom_obj.value = 998712

print(custom_obj.flower, custom_obj.value)

Why did they do it this way? Perhaps to resemble the syntax used in object-oriented programming in other languages. But if you know that from another language, don't be misled: Python's object model works differently. Attributes are looked up dynamically at run time, and nothing is declared in advance or enforced as private. There is some similarity, but several guarantees you may be used to from languages such as Java or C++ are missing here.

Anyway it's much easier to write `custom_obj.flower` than `custom_obj["flower"]`. Of course a `dict` key could be any hashable value, e.g. Chinese characters, a string with spaces or even a float. With the dot syntax you give up that flexibility: an attribute can only be a name that is valid as a variable identifier.
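To see the dict behind an instance, the built-in `vars()` can be used. The sketch below assumes an ordinary class without `__slots__`, and repeats the `CustomType` definition so that it runs on its own:

```python
class CustomType:
  custom_attr = 42

custom_obj = CustomType()
custom_obj.flower = "peony"

# vars() returns the instance's own attribute dictionary
print(vars(custom_obj))

# custom_attr lives on the type, not on the instance
print("custom_attr" in vars(custom_obj))   # False
print("custom_attr" in vars(CustomType))   # True

# only valid identifiers can be written with the dot syntax
print("flower".isidentifier())     # True
print("my flower".isidentifier())  # False
```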

So we can store information in our object. But it's not a good idea to write into the object directly as we did above. It's better if all information in the object can be written only via functions associated with the type (methods), so only data the type knows about is present.

After all, this is not a `dict`. It's not intended for stuffing random things into it. The purpose is to give a format to some abstraction — for example we could make a `Point` type with `x` and `y` attributes, or a `Person` type that has a name and can do a few things, like greet.

It's not good if the rest of the program randomly puts things into our point or person. The safe approach is to create such attributes only inside methods.

In [None]:
class Person:
  def set_name(self, name): # self here is the data instance
    self.name = name

  def greet(self):
    print(f"Hello, {self.name}")

someone = Person()
someone.set_name("Béla") # try commenting out this line
someone.greet()

Hello, Béla


Here we created a `Person` type. The person type has two attached functions (member functions or methods) that only make sense for the Person type. One is `set_name`, which gives a name to the person, the other is `greet`. First we created a Person instance, labeled it (`someone`), then named and greeted it using the methods.

Both methods receive the instance itself as the first parameter; this is indicated by the `self` parameter (it could be named differently, but conventionally `self` is used). This is how the method can put data into the object or read from it.

But what would happen if we didn't name the person and tried to greet?
We would of course get an error, because greeting uses a data attribute (`name`) that doesn't exist yet!
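The failure can be observed without crashing the notebook by catching the exception. This is only a sketch; the names `nameless` and `error_message` are made up for the illustration:

```python
class Person:
  def set_name(self, name):
    self.name = name

  def greet(self):
    print(f"Hello, {self.name}")

nameless = Person()   # set_name was never called

try:
  nameless.greet()
except AttributeError as err:
  # the instance has no "name" attribute yet
  error_message = str(err)
  print("Greeting failed:", error_message)
```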

We can avoid this by assigning a value when the data is created (initializing it).

In [None]:
# type with an initializer (initialization)

class Person:
  def __init__(self, name="unknown"):
    self.name = name

  def greet(self):
    print(f"Hello, {self.name}")

someone = Person()
kati = Person("Katika")

someone.greet()
kati.greet()

`__init__` is a special function (in Python names wrapped with double underscores are always special; avoid defining such functions unless you mean to). Its special property is that when we use the type name as a function (`Person("Katika")`) this function is called: it receives a new instance of the type as its first parameter (which goes into `self`) and the rest of the arguments as the remaining parameters (in this case `"Katika"` goes into `name`). Because of this implicit, hidden behavior, such functions are informally called "magic methods" (or "dunder methods"); the Python documentation calls them special methods.
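What happens during `Person("Katika")` can be spelled out by hand in two steps. This is a simplified sketch of the mechanism, not something you would normally write; the name `manual` is only illustrative:

```python
class Person:
  def __init__(self, name="unknown"):
    self.name = name

# The usual way: use the type name as a function
kati = Person("Katika")

# Roughly the same, written out step by step
manual = Person.__new__(Person)     # step 1: create an empty instance
Person.__init__(manual, "Katika")   # step 2: initialize it; manual goes into self

print(kati.name, manual.name)
```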


We can define other special names to customize operations for our type (class). For example we can define addition or multiplication so that `+` or `*` mean whatever we want for our type. Remember that for `float` the `+` operator computes an arithmetic sum, while for `str` it concatenates. Similarly we can define what operators mean for our types. This is called operator overloading, and it is one form of polymorphism.

In [None]:
class Vector:
  def __init__(self, x, y):
    self.x = x
    self.y = y

  # define addition of two vectors
  def __add__(self, other_vector):
    return Vector(self.x + other_vector.x, self.y + other_vector.y)

  # multiplication behaves differently for scalar and vector
  def __mul__(self, a):
    if isinstance(a, (int, float)):
      # scalar: scale both components
      return Vector(self.x * a, self.y * a)
    elif isinstance(a, Vector):
      # vector: component-wise product
      return Vector(self.x * a.x, self.y * a.y)
    else:
      # an operand of the wrong type is a TypeError rather than a ValueError
      raise TypeError("Can only multiply Vector by Vector or by a scalar.")

  # this magic function affects how the type is displayed when printed
  def __str__(self):
    return f"[{self.x}->{self.y}]"


######### Usage ############

v1 = Vector(34, 89)
v2 = Vector(11, 2)

print(v1 + v2)
print(v1 * v2)
print(v1 * 2)

v3 = v1 * v2 + v1 * 6 + v1 * v1
print(v3)
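One case the `Vector` type does not cover is a scalar on the left side, as in `2 * v1`. A possible extension is sketched below using the `__rmul__` special method. The class is a trimmed-down copy, so that the block runs on its own, and returning `NotImplemented` for unsupported operands is a choice made for this sketch:

```python
class Vector:
  def __init__(self, x, y):
    self.x = x
    self.y = y

  def __mul__(self, a):
    if isinstance(a, (int, float)):
      return Vector(self.x * a, self.y * a)
    # tell Python we cannot handle this operand, so it can try the other side
    return NotImplemented

  # called for `a * vector` when `a` itself cannot multiply by a Vector
  def __rmul__(self, a):
    return self * a

  def __str__(self):
    return f"[{self.x}->{self.y}]"

v = Vector(34, 89)
print(v * 2)   # [68->178]
print(2 * v)   # [68->178]
```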



## Inheritance

One useful feature of types is inheritance. When defining our type we can specify which existing type's capabilities it should inherit. So if there is already a type that's almost what we need but missing some capability, we can simply derive a new type from it that has everything the original had plus whatever we add. You can call it a subtype.

Many sources call these the base class and the derived class; that is where the `class` keyword comes from, but we'll stick to the term type.

In [None]:
# The NameStr type is just like str...
class NameStr(str):
  # you can even make it greet:
  def greet(self):
    print(f"Hello {self}!")


n = NameStr("Károly")

# the new method (member function) also works
n.greet()

# but we can also use the old (inherited from str) capabilities:
print(n.upper())
print(n + '_' + n)
print(n * 5)
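One detail worth checking: the inherited `str` methods return plain `str` objects, not `NameStr` ones, so their results cannot greet. A short sketch (the name `shouted` is only illustrative):

```python
class NameStr(str):
  def greet(self):
    print(f"Hello {self}!")

n = NameStr("Károly")

print(isinstance(n, str))          # True: a NameStr is also a str
print(type(n).__name__)            # NameStr

shouted = n.upper()
print(type(shouted).__name__)      # str: the result is a plain string
print(hasattr(shouted, "greet"))   # False

# wrap the result again if the subtype is needed
NameStr(shouted).greet()
```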


## Enum

Sometimes we create a type just to distinguish a small number of discrete variants. For example, whether someone is male or female, or the days of the week. Such a type is best derived from the `Enum` (enumeration) type.

In [None]:
from enum import Enum

class Gender(Enum):
  Male = 1
  Female = 2
  Other = 3

class Day(Enum):
  Monday = 'Mon'
  Tuesday = 'Tue'
  Wednesday = 'Wed'
  Thursday = 'Thu'
  Friday = 'Fri'
  Saturday = 'Sat'
  Sunday = 'Sun'

class Person:
  def __init__(self, name="unknown", gender=Gender.Other, birthday=None):
    self.name = name
    self.gender = gender
    self.birthday = birthday


petya = Person("Péter", Gender.Male, Day.Wednesday)
kati = Person("Kati", Gender.Female, Day.Tuesday)

print(petya.gender)
print(kati.birthday)

print("born on the same day?", petya.birthday == kati.birthday)


Obviously we could have used plain numbers to distinguish genders, or strings (or numbers) for the days, but that would hurt readability. Clarity is often useful. With an `Enum` we can be sure a person's birthday is one of the seven defined days and not a "magicday" or "two-monday". If we require that a person's birthday must be of type `Day` (which we could enforce with a simple `if` statement), then we can't get it wrong: the program raises an error when we try.

Try to modify the code above so that `__init__` refuses to create a person whose birthday is not of type `Day`!
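Independently of the exercise, the way an `Enum` rules out invalid variants can be checked directly. The sketch below uses a shortened `Day` type with only two members, to keep it brief:

```python
from enum import Enum

class Day(Enum):
  Monday = 'Mon'
  Tuesday = 'Tue'

# members can be looked up by value or by name
print(Day('Mon'))
print(Day['Tuesday'])
print(Day.Monday.name, Day.Monday.value)   # Monday Mon

# each member exists exactly once, so `is` comparison works
print(Day('Mon') is Day.Monday)            # True

# an unknown value is rejected with a ValueError
rejected = False
try:
  Day('magicday')
except ValueError as err:
  rejected = True
  print("rejected:", err)
```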

## Dataclass

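Python is dynamically typed, meaning it's not mandatory to declare the types of variables (labels) in advance. You can attach any type to any label.
Sometimes, however, you want to state which type belongs where: for example, that a person's name should be a string and not a float.

In such cases you can create classes with the `@dataclass` decorator, which lets you declare the fields and their types in one place and generates `__init__`, `__repr__` and `__eq__` for you. Note that the type annotations are not checked at run time: they document the intended format and can be verified by a static type checker, but enforcing them while the program runs requires an explicit check.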

In [None]:
from dataclasses import dataclass

@dataclass
class PersonData:
  name: str
  age: int
  height: float
  dependent: bool


dorka = PersonData("Dorka", 18, 1.81, False)
karcsi = PersonData("Sommer Károly", 11, 1.11, True)
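A plain dataclass does not check the annotated types when an instance is created. If run-time checking is wanted, one option is to add it in `__post_init__`, which the generated `__init__` calls at the end. The sketch below is one possible approach; the class names `PlainPerson` and `CheckedPerson` are made up for the illustration:

```python
from dataclasses import dataclass

@dataclass
class PlainPerson:
  name: str
  age: int

# accepted without complaint: annotations are not checked at run time
odd = PlainPerson(3.14, "eighteen")

@dataclass
class CheckedPerson:
  name: str
  age: int

  # runs right after the generated __init__ has stored the fields
  def __post_init__(self):
    if not isinstance(self.name, str):
      raise TypeError("name must be a str")
    if not isinstance(self.age, int):
      raise TypeError("age must be an int")

dorka = CheckedPerson("Dorka", 18)
print(dorka)                                 # uses the generated __repr__
print(dorka == CheckedPerson("Dorka", 18))   # generated __eq__ compares fields: True

rejected = False
try:
  CheckedPerson(3.14, 18)
except TypeError as err:
  rejected = True
  print("rejected:", err)
```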

## A more complex example:

Below we create a new list type that can be indexed as if walking around a clock face: after the last element the first follows again, so it can return an element for arbitrarily large index values:

In [None]:
class ModList(list):
  def __getitem__(self, n):
    return list.__getitem__(self, n % len(self))

ml = ModList([1,2,3,4,5])

print(ml[0])
print(ml[7])
print(ml[39457])


The `__getitem__` magic method is responsible for square-bracket indexing, so when `v[something]` is used, the `__getitem__` method of the type of `v` is called with two parameters: the first is the instance itself and the second is the something.

Note that in our `__getitem__` implementation we simply used the built-in list's `__getitem__` — we don't care how it does it, only that it does it — we just took the modulus of the index with the list length.
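The translation from square brackets to a `__getitem__` call can be written out explicitly. In this sketch the names `with_brackets` and `spelled_out` are only illustrative:

```python
class ModList(list):
  def __getitem__(self, n):
    return list.__getitem__(self, n % len(self))

ml = ModList([1, 2, 3, 4, 5])

with_brackets = ml[7]
# the same call written out: the instance first, then the index
spelled_out = type(ml).__getitem__(ml, 7)

# 7 % 5 == 2, so both read the element at index 2
print(with_brackets, spelled_out)   # 3 3
```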

What would happen if we indexed a ModList instance with a slice instead of a simple number? (e.g. `ml[0:4]`)

Why?


## Challenge

1. Create a Team type that represents a group of people. It should keep track of who is in the team (their names). The addition operator should be defined for teams so that adding two teams produces their union (everyone who is in either team). It should be possible to ask whether a given person is a member of the team.
2. Every team should have a leader, which can be queried. When two teams are merged, the leader of the first team should be the leader of the new team. If the leader is removed from the team, someone else should become the leader.
3. For every member record when they joined the team! So store the current date at the time of addition.

So it should be usable like this:
```
management = Team("Béla", "Kati")
secretariat = Team("Szilvi", "Kati")
sales = Team("Eszter", "Norbi", "Réka")

backoffice = management + secretariat
everyone = backoffice + sales

secretariat.has_member("Norbi") # --> False
everyone.has_member("Norbi") # --> True
```

