# Deep dive into lazy iterators in Python
### Definition
**Lazy evaluation** means deferring a computation until its value is actually required. It can save a large amount of memory because intermediate results are never materialized in full. This is in contrast to **eager evaluation**, where every expression is evaluated as soon as it is encountered.

For instance, consider a large dataset on which you want to run a chain of computations. The eager approach (as in Pandas) executes every intermediate step as it is written, and each intermediate result takes up memory. The lazy approach (as in Polars' lazy API) records the instructions needed to reach the desired result, optimizes them, and only executes them upon a specific call (e.g. ```df.collect()```).

### Iterables, Iterators, Generators.
**Iterables** are objects that implement the ```__iter__()``` method, which returns an iterator. Python has various built-in iterables like lists, sets, dicts, tuples and strings. These container types hold all of their elements in memory at once, so they are not ideal when memory efficiency matters; other iterables, such as generators, produce their elements on demand.

Categories of built-in iterables:
- sequence: An ordered iterable that supports access by integer index via ```__getitem__()``` and defines ```__len__()``` to return its length. This includes ```list``` and ```tuple```.
- set objects: Unordered iterables with unique elements only. Includes ```set``` and ```frozenset```. A set is like a dictionary with keys but no values, i.e. the ```__getitem__(key)``` and ```__setitem__(key, value)``` methods do not exist.
- mapping: An iterable that has ```__getitem__()``` and ```__len__()``` but is indexed by hashable keys (located via their hash) rather than by integers as in sequences. Includes ```dict``` and ```defaultdict```.
- file objects: Objects instantiated by ```open(file_name,open_mode)```.
- generators and generator expressions: More on them below.

Sidenote on **slice** objects: When a sequence is indexed with a range via the ```sequence[i:j:k]``` notation, a ```slice``` object is created and passed to ```__getitem__()```. It does not store the indices themselves, only the attributes ```start```, ```stop``` and ```step```, corresponding to i, j, and k.
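
As a small illustration of the sidenote, a slice object can also be built by hand and passed to a sequence. The variable names here are made up for the sketch:

```python
# A slice object built explicitly behaves like the i:j:k notation.
letters = ["a", "b", "c", "d", "e", "f"]
every_other = slice(1, 5, 2)            # same as letters[1:5:2]

picked = letters[every_other]
print(picked)                           # ['b', 'd']
print(every_other.start, every_other.stop, every_other.step)  # 1 5 2

# indices() resolves the slice against a concrete sequence length.
resolved = slice(None, None, -1).indices(len(letters))
print(resolved)
```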

**Iterators** are objects which implement the ```__next__()``` method (and an ```__iter__()``` method that returns the iterator itself). You get an iterator from an iterable by calling ```iter(your_iterable)```. The most common place this happens is in for loops, as seen below:
```python
lst = [1,2,3]
for x in lst:
    print(x)

"""
Under the hood:
it = iter(lst)
next(it)
next(it)
next(it)
"""
```

When an iterable is turned into an iterator ```it=iter(your_iterable)```, calling ```__next__()``` returns the next element until it raises ```StopIteration```. Then you have to call ```it = iter(your_iterable)``` again as the original ```it``` is no longer usable.
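
A minimal sketch of this exhaustion behaviour, using a throwaway list:

```python
lst = [1, 2, 3]
it = iter(lst)

seen = [next(it), next(it), next(it)]   # [1, 2, 3]

exhausted = False
try:
    next(it)                            # nothing left to return
except StopIteration:
    exhausted = True

# The list itself is untouched; only the iterator is spent.
leftover = list(it)                     # []
fresh = list(iter(lst))                 # [1, 2, 3]
print(seen, exhausted, leftover, fresh)
```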

A note on memory usage:
- An **eager iterator** uses precomputed data like a list, tuple, string as reference so that the iteration is just a reading of stored values. This means we have an iterable that stores all the data and an iterator which stores a reference and index.
- A **lazy iterator** or **generator iterator** computes the values on demand and saves memory when working with large or infinite data. 
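
To make the memory difference concrete, here is a rough sketch. Note that ```sys.getsizeof``` only measures the container object itself, not the integers it refers to, which is still enough to see the gap:

```python
import sys

n = 100_000
eager = [i * i for i in range(n)]       # all values stored up front
lazy = (i * i for i in range(n))        # values produced on demand

eager_size = sys.getsizeof(eager)       # grows with n
lazy_size = sys.getsizeof(lazy)         # small and independent of n
print(eager_size, lazy_size)

it = iter(eager)                        # eager iterator: a reference plus a position
print(next(it), next(it))               # 0 1
print(next(lazy), next(lazy))           # 0 1
```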

**Generator functions** are functions which return a **generator iterator**. They use ```yield``` instead of ```return``` to produce values one at a time. To use a generator function, you first call it to get the iterator, then repeatedly call ```next()``` on the iterator to get the values out one by one. This is done either via a for loop or manually.

```python
# Generator function defined with a for loop
def squares_for(n):
    for i in range(n):
        yield i*i

# Generator function defined with a while loop
def squares_while(n):
    i = 0
    while i < n:
        yield i*i
        i += 1

# Infinite generator
def squares_inf():
    i = 0
    while True:
        yield i*i
        i += 1
```

When a ```yield``` statement is reached, the function hands the value back to the caller and pauses, remembering its execution state (i.e. all the local variables and their values, and where it stopped). The generator only continues when the next value is requested.

You can also nest generator functions inside one another. This is shown in the example below:
```python
# Generator function defined with another generator
def gen1():
    yield 1
    yield 2

def gen2():
    yield "a"
    yield "b"

def combined():
    for x in gen1():
        yield x
    for y in gen2():
        yield y

list(combined())  # [1, 2, 'a', 'b']

"""
Implementation using the "yield from" expression
def combined():
    yield from gen1()
    yield from gen2()
"""
```

A **generator expression** makes a lazy iterator using list-comprehension-like syntax, except you use (...) instead of [...]. A regular list comprehension builds the entire list in memory, while a generator expression simply returns a generator iterator that produces the values on demand.
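
A short side-by-side sketch of the two forms (the names are illustrative only):

```python
nums = range(1, 6)

squares_list = [x * x for x in nums]    # list comprehension: built immediately
squares_gen = (x * x for x in nums)     # generator expression: nothing computed yet

print(squares_list)                     # [1, 4, 9, 16, 25]
total = sum(squares_gen)                # values are consumed one at a time
print(total)                            # 55
remaining = list(squares_gen)           # the generator is now exhausted
print(remaining)                        # []
```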

# Note on memoization

**Caching** or memoization refers to the storage of frequently accessed/expensive computations and data. This data is known as the **cache**. During a call for data (e.g. a function call), there will be a **cache lookup**, followed by either a **cache hit** or **cache miss** (in which case you run the computation/fetch of data).

The cache is limited in size, so when it fills up it uses an **eviction policy** to get rid of existing data. One important policy is LRU. **LRU** stands for least recently used: when the cache is full, the entry that has gone unused for the longest time is removed. Python's ```functools``` module provides the built-in decorator ```@lru_cache(maxsize=n)``` for functions which are called frequently with repeated arguments (the arguments must be hashable). It memoizes the results of function calls and is often used for recursion and expensive computations.
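
As a sketch of the decorator in use, here is the classic recursive Fibonacci. The function name and the ```maxsize``` value are arbitrary choices for the example:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n):
    """Naive recursive Fibonacci; the cache removes the repeated work."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(30)
print(result)                   # 832040

info = fib.cache_info()         # named tuple: hits, misses, maxsize, currsize
print(info)
```

Each distinct argument is a cache miss exactly once; every repeated call after that is a cache hit.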


# Deep dive into threads and processes

### Threads and thread-safety

A **thread** is a single independent sequence of execution within a **process**. Threads in the same process share that process's memory. Multiple threads in the same process are not guaranteed to run in the same order every time.

Because they share memory, multiple threads in the same process face issues of corruption and race conditions. **Corruption** occurs when data is modified unintentionally (e.g. ```x += 1``` run from several threads at once). A **race condition** occurs when the outcome depends on the unsynchronized timing of threads, making results nondeterministic. Code is **thread-safe** when access by multiple threads at the same time cannot cause corruption or race conditions.
- Under the hood, CPython has the **GIL** (global interpreter lock), a mutex (mutual exclusion lock) that makes memory management thread-safe by letting only one thread execute bytecode at a time. Part of the memory management done in Python is reference counting: keeping track of the number of references to every object and freeing its memory only when this count reaches 0. Without the GIL, concurrent updates could corrupt these counts, so memory may be freed too early (i.e. objects get deleted while still in use), causing later access by other threads to crash.
- However, the code and logic you write may still not be thread-safe. For instance, ```counter += 1``` requires loading, adding and storing. If 2 threads do this operation at close enough times, instead of load 1 -> store 2 -> load 2 -> store 3, we can get load 1 -> load 1 -> store 2 -> store 2, losing an increment and corrupting the ```counter``` variable. The steps still execute one bytecode at a time, but the interpreter can switch threads between them. To make code thread-safe, we can use a lock/mutex (```threading.Lock()```) among other threading primitives.
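
A minimal sketch of the lock-based fix for the ```counter += 1``` example. The thread count and iteration count are arbitrary:

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        with lock:              # only one thread at a time runs the load/add/store
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # wait for every thread to finish

print(counter)                  # 40000
```

Without the ```with lock:``` line the final value is not guaranteed, since increments from different threads can overwrite each other.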

Usually, you run a process on a single thread which executes all instructions one at a time. **Multithreading** is the idea of having multiple threads of a single process run **concurrently** (still subject to the GIL). This doesn't necessarily speed up CPU-bound code, since the GIL forces only one thread to run Python bytecode at a time. In fact it can slow things down because of the cost of switching between threads (interleaving execution). It does help with I/O-bound work, because a thread releases the GIL while it waits on I/O.

**Free threading** is the idea of removing the GIL entirely so that threads can actually run simultaneously, i.e. multiple threads can be executing bytecode at the same time. This is where big speedups can occur, but it also exposes many more thread-safety issues.

### Processes, parallelism and multiprocessing
A **process** is an instance of a running program that has its own memory space (variables, heap, stack), system resources and execution state separate from other processes. It consists of one or more threads executing together. By default, running a simple ```.py``` file starts one process with a single main thread. Additional threads can be created using the ```threading``` library.

**Multiprocessing** is the idea of setting up multiple processes for a single program (```.py``` file). Python's ```multiprocessing``` library is the built-in implementation of this functionality. It allows for **parallelism**, whereby bytecode from multiple processes is executed at the same time, provided that we have multiple CPU cores (each process has its own interpreter and its own GIL). If we have only one core, however, we end up with concurrent execution, since a single CPU core can only execute one instruction stream at a time. One important thing, even if running on one core, is that multiprocessing is "safer" than multithreading with respect to corruption and race conditions, because processes do not share memory by default.

# Strings in Python
### Text sequence objects
```str``` objects are iterables known as text sequences. They support most of the same sequence dunder methods as lists: ```__getitem__()```, ```__len__()```, ```__contains__()```, etc. (```reversed(s)``` also works, through ```__getitem__()``` and ```__len__()```, although ```str``` does not define ```__reversed__()``` itself). The primary difference from a list is that strings are immutable, so you can't change them in place: ```s[i] = x``` raises a ```TypeError```.

### Built-in string functions
There are many string functions. These can be split into those to do with casing (capitalization), creating strings with variables, boolean checks, formatting/modifying strings, indexing/searching, and converting between list and string form.

- Capitalization:
    - ```str.lower()``` : Turns everything lowercase
    - ```str.upper()``` : Turns everything uppercase
    - ```str.capitalize()``` : First letter capitalized and the rest lowercase
    - ```str.casefold()``` : A more aggressive form of lower() intended for caseless comparison (e.g. "ß" becomes "ss").
    - ```str.swapcase()``` : Inverts the cases.
    - ```str.title()``` : Converts the string into titlecase.
- Adding variables:
    - ```str.format(*args, **kwargs)``` : The string can contain literal text or replacement fields delimited by {}. Each replacement field holds either the numeric index of a positional argument or the name of a keyword argument.
    - ```str.format_map(mapping)``` : Same as ```format(**mapping)```, but the mapping is used directly, so it can be a subclass of dict that specifies what happens when a key is missing via ```__missing__(self, key)```.
- Boolean checks:
    - Alphabet and numbers:
        - ```str.isalnum()``` : Checks if the string is made up of only alphanumeric characters and is non empty.
        - ```str.isalpha()``` : Checks if the string is made up of only the alphabet and is non empty.
        - ```str.isnumeric()``` : Checks if the string is made up of only numeric characters (including fractions, Roman numeral characters, Chinese numerals, etc.) and is non empty.
        - ```str.isdigit()``` : Checks if the string is made up of only digits and is non empty. Digits include the decimal characters as well as things like superscripted numbers.
        - ```str.isdecimal()``` : Checks if the string is made up of only decimal characters (the regular 0-9 and their equivalents in other scripts) and is non empty.
    - Others:
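        - ```str.islower()``` : Checks if all cased characters in the string are lowercase and there is at least one cased character.
        - ```str.isupper()``` : Checks if all cased characters in the string are uppercase and there is at least one cased character.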
        - ```str.isspace()``` : Checks if the string is only whitespace
        - ```str.istitle()``` : Checks if the string is in titlecase
- Formatting and modification of strings:
    - ```str.center(width, character)``` : Centers the string, padding with character (a space by default) if the width is larger than the string length.
    - ```str.lstrip([chars])``` : Removes leading characters for as long as they appear in chars. Defaults to removing whitespace.
    - ```str.rstrip([chars])``` : Same as lstrip but removes trailing characters.
    - ```str.removeprefix(prefix)``` : Removes prefix from the start of the string if the string starts with it.
    - ```str.removesuffix(suffix)``` : Removes suffix from the end of the string if the string ends with it.
    - ```str.strip([chars])``` : Removes all leading and trailing characters in chars. Defaults to just removing whitespace.
    - ```str.replace(substr1, substr2)``` : Returns a copy with all instances of substr1 replaced by substr2.
- Indexing and searching:
    - ```str.count(substr, start, end)``` : Finds the number of non-overlapping occurrences of substr from start to end.
    - ```str.startswith(prefix, start, end)``` : Returns True or False depending on whether the string from start to end starts with prefix.
    - ```str.endswith(suffix, start, end)``` : Returns True or False depending on whether the string from start to end ends with suffix.
    - ```str.find(substr, start, end)``` : Returns the lowest index at which substr is found, otherwise -1.
    - ```str.index(substr, start, end)``` : Same thing as find but raises a ```ValueError``` if not found.
    - ```str.rfind(substr, start, end)``` : Returns the highest index at which substr is found, otherwise -1.
    - ```str.rindex(substr, start, end)``` : Same thing as rfind but raises a ```ValueError``` if not found.
- Conversions to and from lists:
    - ```str.join(iterable)``` : Takes an iterable of strings and concatenates all of them with str in between.
    - ```str.split(sep, maxsplit=-1)``` : Returns a list of the substrings delimited by sep. If sep is not given, it splits on runs of whitespace. With maxsplit = -1 (the default) there is no limit on the number of splits.
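
A few of the methods above in action, as a quick sketch. The sample strings are arbitrary, and ```removesuffix``` assumes Python 3.9 or newer:

```python
s = "  Hello, World!  "

stripped = s.strip()                            # 'Hello, World!'
lowered = stripped.lower()                      # 'hello, world!'
stem = "report.csv".removesuffix(".csv")        # 'report'
centered = "abc".center(7, "*")                 # '**abc**'

count = "banana".count("an")                    # 2
first, last = "banana".find("na"), "banana".rfind("na")   # 2 4
missing = "banana".find("x")                    # -1, whereas index("x") raises ValueError

parts = "a,b,c".split(",")                      # ['a', 'b', 'c']
joined = "-".join(parts)                        # 'a-b-c'
formatted = "{} is {age}".format("Alice", age=30)   # 'Alice is 30'

print(stripped, lowered, stem, centered, count, first, last, missing, joined, formatted)
```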

### Format specifiers in f-strings
f-strings are string literals of the form ```f"..."``` that let you embed arbitrary expressions in replacement fields without calling ```.format()```. To include literal { or } characters, double them: ```f"{{}}"```. Another way of formatting strings is printf-style formatting with the ```%``` operator (e.g. ```"%d items" % n```).

When it comes to displaying numerical expressions, we sometimes need to limit the number of decimal places. A format specifier does this by rounding (not truncating) to the requested precision:
```python
from math import sqrt

# Using a colon and the .Nf format specifier, we round sqrt(2) to 5 decimal places.
f"{sqrt(2):.5f}"  # '1.41421'
```

### String operation quirks

Multiplying strings by integers:
- Returns the string repeated n times for n > 0
- Returns an empty string for n <= 0.

### Time and datetime objects
Python's datetime library contains various objects and functions to manipulate dates and times. 

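Objects are either **aware**, meaning they carry timezone information, or **naive**, meaning they don't. ```date``` objects are always naive, while ```time``` and ```datetime``` objects can be either.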

Below are the objects available that can store time related information:

1. ```date``` : Represents calendar information, i.e. contains ```date.year```, ```date.month``` and ```date.day``` values. 
Some relevant functions:
- Getting objects:
    - ```date.today()``` : Returns the current date.
    - ```date.weekday()``` : Returns the weekday of the date as an integer, 0-indexed from Monday.
    - ```date.isoweekday()``` : Returns the weekday of the date as an integer, 1-indexed from Monday.
    - ```date.isocalendar()``` : Returns a tuple of (year, week, weekday).
    - ```date.isoformat()``` : Returns YYYY-MM-DD string.
    - ```date.ctime()``` : Returns a string representing the date, e.g. ```'Sat Mar 30 00:00:00 2024'```.
    - ```date.strftime(format)``` : Returns a string based on the ```format```. The format codes are listed in the formatting guide further below.
- Forming a date object from strings and numbers:
    - ```date.fromtimestamp(timestamp)``` : Converts a POSIX timestamp to a (local) date object. A POSIX timestamp is the number of seconds that have passed since January 1st 1970 midnight UTC.
    - ```date.fromordinal(ordinal)``` : Converts a proleptic Gregorian ordinal (an integer, where January 1st of year 1 is day 1) to a date object.
    - ```date.fromisoformat(string)``` : Converts from a YYYY-MM-DD string (e.g. 2024-03-30) or a YYYYMMDD string (e.g. 20240330, accepted from Python 3.11) to a date object.
    - ```date.strptime(string, format)``` : Parses a string into a date object based on the ```format``` specified. This is only available in recent Python versions; otherwise use ```datetime.strptime(string, format).date()```.
- Others:
    - ```date.replace(year, month, day)``` : Allows you to replace the year, month or day in the date object with another to return a new date object. Defaults to no changes if no inputs are given.


2. ```time``` : Represents time of day information, containing ```time.hour```, ```time.minute```, ```time.second``` and ```time.microsecond``` values. Hours range from 0 to 23, minutes and seconds from 0 to 59, and microseconds from 0 to 999999.

Some relevant functions:
- Getting objects:
    - ```time.replace(hour, minute, second, microsecond, tzinfo)``` : Returns a new time object with the specified attributes replaced.
    - ```time.isoformat()``` : Outputs a string in HH:MM:SS.ffffff format, or HH:MM:SS if microseconds is 0, followed by the UTC offset (+HH:MM) if the object is aware.
    - ```time.strftime(format)``` : Outputs a string representing the time in the form format.
- Forming a time object from strings:
    - ```time.fromisoformat(string)``` : Takes in strings of the form HH:MM:SS or THH:MM:SS or THHMMSS or HH:MM:SS.ffffff or HH:MM:SS,ffffff or HH:MM:SS+HH:MM and returns a time object.

3. ```datetime``` : This object combines both ```time``` and ```date```. Thus it has ```datetime.year```, ```datetime.month```, ```datetime.day```, ```datetime.hour```, ```datetime.minute```, ```datetime.second```, ```datetime.microsecond``` and ```datetime.tzinfo``` attributes.

Some relevant functions:
- Getting objects:
    - ```datetime.tzinfo``` : The timezone of the datetime object (an attribute, ```None``` for naive objects).
    - ```datetime.date()``` : Returns the corresponding date object.
    - ```datetime.time()``` : Returns the corresponding time object without timezone specification.
    - ```datetime.timetz()``` : Returns the corresponding time object with timezone specification.
    - ```datetime.timestamp()``` : Returns the POSIX float of the datetime.
    - ```datetime.replace(year,...,microsecond,tzinfo)``` : Makes a new datetime object with the same attributes except for those specified to be changed.
    - ```datetime.astimezone(tz)``` : Returns a new datetime object representing the same instant in time, expressed in the local time of tz.
    - ```datetime.today()``` : Returns the current local date and time without timezone info.
    - ```datetime.now(tz)``` : Returns the current date and time, but you can specify the timezone.
    - ```datetime.utcnow()``` : Returns the current UTC date and time as a naive object. Deprecated since Python 3.12 in favor of ```datetime.now(timezone.utc)```.
    - ```datetime.weekday()``` : Returns the weekday of the datetime as an integer, 0-indexed from Monday.
    - ```datetime.isoweekday()``` : Returns the weekday of the datetime as an integer, 1-indexed from Monday.
    - ```datetime.isocalendar()``` : Returns a tuple of (year, week, weekday).
    - ```datetime.isoformat(sep="T", timespec)``` : Returns a YYYY-MM-DDTHH:MM:SS.ffffff string, excluding the fractional part if microseconds is 0.
    - ```datetime.ctime()``` : Returns a string representing date and time without timezone info.
    - ```datetime.strftime(format)``` : Returns a string representing the datetime according to format.
- Forming a datetime object from strings:
    - ```datetime.fromtimestamp(timestamp, tz)``` : Converts from a POSIX timestamp to a datetime object.
    - ```datetime.utcfromtimestamp(timestamp)``` : Converts from a POSIX timestamp to a naive datetime object corresponding to UTC time. Deprecated since Python 3.12 in favor of ```datetime.fromtimestamp(timestamp, timezone.utc)```.
    - ```datetime.fromisoformat(string)``` : Takes in a string of the form YYYY-MM-DD or YYYYMMDD or YYYY-MM-DDTHH:MM:SS (with time) or YYYY-MM-DDTHH:MM:SS.ffffff (with time up to microseconds) or YYYY-MM-DDTHH:MM:SS.ffffff+HH:MM (with time and timezone specification).
    - ```datetime.fromisocalendar(year, week, day)``` : Takes isocalendar values as inputs and returns a datetime object.
    - ```datetime.strptime(string, format)``` : Takes in a string and parses it according to format.
    - ```datetime.combine(date, time)``` : Takes a date and time object and outputs a datetime object.
- Some numerical operations:
    - +/- with a timedelta = shifted datetime
    - Difference between datetimes = timedelta
    - Equality and order comparisons work as you'd expect.

Note that in the functions above, if no timezone is specified the result is a naive object that represents the system's local time (except for the ```utc...``` variants, which return naive UTC values).
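
A short sketch tying a few of these together. The specific date and the fixed UTC+02:00 offset are arbitrary choices for illustration:

```python
from datetime import date, datetime, time, timedelta, timezone

d = date(2024, 3, 30)
print(d.weekday(), d.isoweekday())      # 5 6  (a Saturday)
print(d.isoformat())                    # 2024-03-30
print(d.replace(day=1))                 # 2024-03-01

naive = datetime.combine(d, time(14, 30))
aware = naive.replace(tzinfo=timezone.utc)
print(naive.tzinfo, aware.tzinfo)       # None UTC

# Same instant, expressed in a fixed UTC+02:00 offset.
plus_two = timezone(timedelta(hours=2))
shifted = aware.astimezone(plus_two)
print(shifted.isoformat())              # 2024-03-30T16:30:00+02:00
print(shifted == aware)                 # True

parsed = datetime.fromisoformat("2024-03-30T14:30:00")
print(parsed == naive)                  # True
```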

4. ```timedelta``` : Represents a duration. It can be constructed from ```weeks```, ```days```, ```hours```, ```minutes```, ```seconds```, ```milliseconds``` and ```microseconds``` arguments, but only stores the normalized ```days```, ```seconds``` and ```microseconds``` attributes. It represents the difference between two ```date``` objects or between two ```datetime``` objects. Durations can be positive or negative.

Operational functionality among floats,ints and timedeltas:
- Adding two deltas: Results in the direct sum of durations.
- Subtracting two deltas: Results in the difference between two deltas.
- Multiplying a delta by a number: If an integer, the duration is multiplied exactly; if a float, the result is rounded to the nearest microsecond.
- Division of two deltas: Returns a float representing the ratio of the durations.
- Division, floor division of a delta by a number: Returns the duration divided by the number, rounded to the nearest microsecond for true division and rounded down for floor division (which requires an integer).
- Modulus of two deltas: Returns the remainder as a timedelta.
- absolute value of delta: Converts it to a positive duration.

Sidenote: ```timedelta.total_seconds()``` is a function that converts the timedelta into just seconds as a float.
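
The operations above can be sketched as follows, with arbitrary durations:

```python
from datetime import datetime, timedelta

delta = timedelta(weeks=1, hours=36)     # normalized to days/seconds/microseconds
print(delta.days, delta.seconds)         # 8 43200
print(delta.total_seconds())             # 734400.0

start = datetime(2024, 1, 1, 9, 0)
end = start + delta
print(end)                               # 2024-01-09 21:00:00
print(end - start == delta)              # True

ratio = delta / timedelta(days=1)        # ratio of two durations
half = delta // 2                        # floor division by an integer
rest = delta % timedelta(days=3)         # remainder as a timedelta
print(ratio, half, rest)                 # 8.5 4 days, 6:00:00 2 days, 12:00:00
print(abs(-delta) == delta)              # True
```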
 
5. ```timezone``` : A concrete subclass of the abstract ```tzinfo``` class which represents timezones. It is defined by a fixed offset from UTC, given as a ```timedelta``` when constructing it (```timezone(offset, name=None)```). For more customization (e.g. daylight saving rules), you'd need to create your own subclass of tzinfo.
Some relevant functions:
- ```timezone.tzname(dt)``` : Returns the name given at construction; if none was given, returns a string built from the UTC offset, e.g. ```UTC+02:00```.
- ```timezone.utc``` : The UTC timezone

#### Guide to formatting in strptime and strftime
```.strftime(format)``` is used for converting an object to a string according to a format while ```.strptime(string, format)``` does the opposite by parsing the string into an object.

Below are the codes that are usable in the format string:
- For years we have ```%Y``` for the full 4 digit year like 0001 or 2025. There is also ```%y``` for the last two digits of the year, e.g. 25.
- For months we have ```%m```. This corresponds to 01,...,12. We also have ```%b``` for the abbreviated month name and ```%B``` for the full name.
- For weeks of the year, we have ```%U``` and ```%W```, which treat Sunday and Monday respectively as the first day of the week.
- For days we have ```%d``` for the day of the month, ```%a``` for the abbreviated weekday, ```%A``` for the full weekday name, ```%w``` for the weekday as a number, 0-indexed from Sunday. We also have ```%j``` for the day of the year from 001 to 366.
- For hours we have ```%H``` corresponding to the 24 hour clock and ```%I``` corresponding to the 12 hour clock. We also have ```%p``` for AM and PM.
- For minutes, seconds and microseconds we have ```%M```, ```%S``` and ```%f``` respectively.
- For the timezones we have ```%z``` for UTC offset form and ```%Z``` for the timezone name.
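
A small round-trip sketch with a few of these codes. Locale-dependent codes such as ```%a```, ```%b``` and ```%p``` are left out on purpose, since their output varies by system:

```python
from datetime import datetime

dt = datetime(2025, 1, 5, 14, 7, 9)               # a Sunday

full = dt.strftime("%Y-%m-%d %H:%M:%S")
print(full)                                       # 2025-01-05 14:07:09
short = dt.strftime("%y %j %w")
print(short)                                      # 25 005 0
twelve_hour = dt.strftime("%I:%M")
print(twelve_hour)                                # 02:07

parsed = datetime.strptime("2025-01-05 14:07:09", "%Y-%m-%d %H:%M:%S")
print(parsed == dt)                               # True
```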

# Deep dive into serialization in Python

### Defining serialization and its use cases
**Serialization** is when you convert in-memory objects into a bytes or text format that can be stored in a file, transmitted, and later reconstructed. **Deserialization** is the reverse of this process. When you "serialize" data, you are converting it into some format which can be written/read, one of the most common being CSV files.

### Basic string examples
One of the simplest ways of serializing an object or data would be to convert it into some standardised string format.
```python
# Example list
nums = [1, 2, 3, 4, 5]

# Serialize: list -> string
s = ",".join(map(str, nums))
print(s)  # "1,2,3,4,5"
```

There are plenty more examples of this being done on things like linked lists, trees and graphs on LeetCode.

### Text formats : JSON, XML and YAML
**JSON**, **XML** and **YAML** are all standard text-formats for serialization which can be used across languages. 

**JSON** stands for JavaScript Object Notation and uses key-value pairs, like a Python dict in order to represent objects/data. It is often used for web APIs and data storage/interchange.
```json
{
  "user": "Alice",
  "age": 30,
  "active": true
}
```
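
In Python, the standard ```json``` module handles this round trip. A minimal sketch using the same data as above:

```python
import json

user = {"user": "Alice", "age": 30, "active": True}

text = json.dumps(user)            # serialize: dict -> JSON string
print(text)                        # {"user": "Alice", "age": 30, "active": true}

restored = json.loads(text)        # deserialize: JSON string -> dict
print(restored == user)            # True
```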

**XML** stands for eXtensible Markup Language and uses tags and attributes to represent data. Its structure is similar to HTML code. It is often used for storing documents and on older systems.
```xml
<user id="123">
  <name>Alice</name>
  <age>30</age>
  <active>true</active>
</user>
```

**YAML** stands for YAML Ain't Markup Language and represents the same kind of nested data as JSON or XML, except it uses indentation instead of brackets or tags. It is often used for config files (like application settings for Flask, Django, Node.js), CI/CD, and infrastructure as code (like defining servers, networks, cloud resources etc. as code).
```yaml
user:
  id: 123
  name: Alice
  age: 30
  active: true
```

Sidenote: CI/CD stands for continuous integration and continuous deployment and defines what happens when you push/test/deploy code. By having a YAML file with specs, you can create and automate CI/CD pipelines.
One primary way this is done is through GitHub Actions, which parses the YAML workflow file and executes the steps it specifies. For example:
```yaml
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: echo "Hello World"
```

### CSV
**CSV** stands for comma-separated-values and is a common way of serializing tabular/table data. It follows the convention of the first row representing the headers and every subsequent row the entries, with commas as delimiters (i.e. the separators) in every row. Sometimes the delimiters can be ":" or ";" as well.
```csv
name,age,active
Alice,30,True
Bob,25,False
```
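
A sketch of writing and reading this table with the standard ```csv``` module. An in-memory ```io.StringIO``` buffer stands in for a real file here:

```python
import csv
import io

rows = [
    {"name": "Alice", "age": 30, "active": True},
    {"name": "Bob", "age": 25, "active": False},
]

buffer = io.StringIO()             # stands in for open("people.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "active"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())

buffer.seek(0)
restored = list(csv.DictReader(buffer))
print(restored[0])                 # every value comes back as a string
```

Note that CSV carries no type information, so numbers and booleans have to be converted back by hand after reading.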

### Pickling
**Pickling** is Python's built-in serialization method for converting Python objects into a byte stream. **Unpickling** is the corresponding deserialization process. One main drawback, however, is that unpickling is not secure: a malicious pickle can execute arbitrary code, so only unpickle data you trust.

The following are some common use cases for pickling in Python:
- Saving objects like ML models between script runs
- Caching objects locally

To pickle Python objects, we use the ```pickle``` module. The module provides the ability to define your own ```Pickler``` and ```Unpickler``` subclasses for customizing how you want to serialize and de-serialize your objects. By default, however, it provides the following functions:
- ```pickle.loads(data)``` : Takes the pickled representation (a bytes object) and returns the original object. To deserialize from a file, use ```pickle.load(file)``` with a file opened in binary mode.
- ```pickle.dumps(obj)``` : Returns the pickled representation of the object as a bytes object (e.g. b'\x80\x04\x95...'). To write to a file-like object instead, use ```pickle.dump(obj, file)```.
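
A minimal sketch of both pairs of functions. The dictionary is a stand-in for a real object such as a trained model, and ```io.BytesIO``` stands in for a file opened in binary mode:

```python
import io
import pickle

model = {"weights": [0.1, 0.2, 0.3], "epochs": 5}   # stand-in for a real object

blob = pickle.dumps(model)          # object -> bytes
print(type(blob))                   # <class 'bytes'>
print(pickle.loads(blob) == model)  # True

# dump/load work with binary file-like objects instead of bytes.
buffer = io.BytesIO()
pickle.dump(model, buffer)
buffer.seek(0)
restored = pickle.load(buffer)
print(restored == model)            # True
```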

# Some extra details about classes in Python
### Encapsulation
**Encapsulation** is the idea of classes in Python having attributes and methods and restricting direct access to some of them. This is done via naming conventions and special decorators.

By default, all attributes in Python are **public** and freely accessible outside of the class once an object is instantiated. To make an attribute **protected**, you can add an underscore (```_```) prefix to the name of the attribute. This doesn't stop anyone from accessing it but is a naming convention. To make an attribute **private**, you can add a double underscore prefix (```__```). Python then name-mangles it, which makes it inaccessible under its original name outside of the class.
```python
class Class:
    def __init__(self):
        self.public = 1
        self._protected = 1
        self.__private = 1

c = Class()
print(c.public) # Works
print(c._protected) # Works
try:
    print(c.__private) # Raises AttributeError
except AttributeError as e:
    print(e)
print(c._Class__private) # Works since Python renames private attributes to _Classname__private.
```

Given a private attribute of a class (and not wanting to use _Class__private), we can define functions with decorators in python known as **getters** and **setters**. A getter is a method with the ```@property``` decorator and can be as simple as returning ```self.__private```, allowing access to the private variable. A setter is a method with the ```@gettermethod.setter``` decorator and is meant for modifying the private variable ("gettermethod" is a placeholder for the name of the getter function).
```python
class Class:
    def __init__(self):
        self.__private = 1  # must be assigned, otherwise the getter raises AttributeError
    
    @property 
    def private(self): # Getter method
        return self.__private
    
    @private.setter
    def private(self,value): # Setter method
        self.__private = value
```

### Creating subclasses
A **subclass** is a class that inherits all the attributes and methods from a base class and can override them. Suppose we have class ```Class```. You create a subclass by doing ```class Subclass(Class)```. 
```python
class A:
    pass

class B(A):
    pass
```
**Multiple inheritance** is the idea of inheriting from multiple classes at once. The notation is ```class A(B,C)``` for a class A inheriting from both ```B``` and ```C```. Suppose both parent classes have the same method ```hello()```; then Python uses the **MRO** (method resolution order) to determine which method to call, respecting the following rules:
- Child overrides parent
- Respect parent order
- No class appears after any of its parents in the linearization (the order in which classes are searched).

I.e. if A has ```hello()``` defined inside it, then that one is called. If A doesn't have such a method, then Python looks at B, then C, and any other classes it inherits from in the order they are listed in the class definition.
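
A minimal sketch of these rules, reusing the ```A(B, C)``` layout from the text:

```python
class B:
    def hello(self):
        return "hello from B"

class C:
    def hello(self):
        return "hello from C"

class A(B, C):      # B is listed first, so it is searched before C
    pass

order = [cls.__name__ for cls in A.__mro__]
print(order)                    # ['A', 'B', 'C', 'object']
print(A().hello())              # hello from B
```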

```super()``` returns a proxy object which delegates method calls to the class next in the MRO, allowing you to call the parent's methods and attributes. For example, ```super().__init__()``` calls the ```__init__()``` method of the class that is next in the MRO and is often used in PyTorch neural nets.
There are two ways to create a proxy object using ```super()```:
```python
# Implicit/no arguments: available in Python 3 and up, only inside a method of a class
# It searches the MRO starting from the class after the one the method is defined in
super().method(*args)

# Explicit form: You specify where the super() should start searching from in the MRO using Class and then include an instance for object binding (i.e. so anything requiring self can then be called).
super(Class, instance).method(*args)
```
Some nuances about the explicit form super call:
```python
class A:
    def hello(self):
        print("Hello from A")

class B(A):
    def hello(self):
        super().hello()         # zero-arg: walks the MRO starting from the class after B and uses the first hello found (A.hello)
        super(B, self).hello()  # explicit: next in MRO after B (A.hello). Does the same thing as the zero-arg case

#______________
class A:
    def hello(self):
        print("Hello from A")

class B(A):
    pass  # no hello

class C(B):
    def hello(self):
        print("Hello from C")
        super().hello()  # starts after C: B has no hello, so we walk on until we find hello in A.
        super(B, self).hello()  # starts searching from the point after B, so it also finds A.hello.

C().hello()  # prints "Hello from C", then "Hello from A" twice
 
```

Some other class related methods:
- ```issubclass(cls,class)``` : Returns true if cls is class itself or a (direct or indirect) subclass of it. Note that the first argument is a class, not an instance.
- ```isinstance(obj,class)``` : Returns true if your object is an instantiation of class or one of its subclasses.
- ```Class.mro()``` : Returns a list of classes representing the order in which they are searched under the MRO (also available as the tuple ```Class.__mro__```). It is called on the class, not on an instance.

Sidenote: The "automatic" nature of how Python determines which implementations to use is called **dynamic dispatch** and said methods are called **polymorphic** methods. Generally, this automatic nature is referred to as **polymorphism**.

### Composition over inheritance principle
The **composition over inheritance** principle follows the idea that classes should prioritize a "has-a" instead of an "is-a" relationship. I.e. you instantiate class objects inside your would-be subclass rather than inherit where possible.
```python
class Engine:
    def start(self):
        print("Engine starting...")

class Car(Engine):  # Car inherits Engine
    def drive(self):
        print("Driving...")

c = Car()
c.start()  # Car inherits start() from Engine
c.drive()


class Engine:
    def start(self):
        print("Engine starting...")

class Car:
    def __init__(self):
        self.engine = Engine()  # Car has-a Engine

    def drive(self):
        self.engine.start()
        print("Driving...")

c = Car()
c.drive()

```

This principle just makes code more flexible, modular and maintainable.

### Abstract base classes and similarity with C++
**Abstract base classes** in Python are classes that are not able to be instantiated directly. They are made as subclasses of ```ABC``` from the ```abc``` module, with any method that must be implemented in the subclasses having an ```@abstractmethod``` decorator.
```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        # abstract must implement method
        pass

    def describe(self):
        # concrete method
        print("I am a shape")

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14 * self.radius ** 2
```
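Alternatively, you can use ```register()``` so you don't explicitly write out the inheritance from the ABC you define. This is done via ```ABCclass.register(Subclass)```, which makes ```Subclass``` a "virtual subclass": ```issubclass()``` and ```isinstance()``` checks against ```ABCclass``` succeed but, unlike with inheritance, none of the methods and attributes defined in ```ABCclass``` are inherited and the abstract methods are not enforced.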
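
A small sketch of ```register()``` in action. The ```Square``` class is made up for the example and deliberately does not inherit from ```Shape```:

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        ...

    def describe(self):
        return "I am a shape"

class Square:                       # note: no inheritance from Shape
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

Shape.register(Square)              # Square becomes a "virtual subclass"

sq = Square(2)
print(issubclass(Square, Shape), isinstance(sq, Shape))   # True True
print(hasattr(sq, "describe"))                             # False, nothing is inherited

try:
    Shape()                         # abstract classes cannot be instantiated
except TypeError as exc:
    print("TypeError:", exc)
```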

Sidenote comparison to cpp syntax: In cpp, we also have abstract classes. A method is marked with the ```virtual``` keyword to allow overriding, and it becomes the equivalent of an abstract method when declared pure virtual (```virtual void f() = 0;```). A class with at least one pure virtual function cannot be instantiated. However, plain virtual functions (as in the example below) do not need to be overridden. Additionally inheritance is a must, there is no ```.register()``` equivalent. Lastly, there is no MRO: an ambiguous call under multiple inheritance is a compile error, so inheritance needs to be carefully planned for any cpp subclasses.

```cpp
#include <iostream>
using namespace std;

class Base {
public:
    virtual void greet() {
        cout << "Hello from Base" << endl;
    }
};

class Derived : public Base { // inheritance
public:
    void greet() override { //overriding the virtual function
        cout << "Hello from Derived" << endl;
    }
};

int main() {
    Base* obj = new Derived();
    obj->greet();  // Prints "Hello from Derived"
}
```

Extra note: in cpp, dynamic dispatch only happens for virtual functions; calls to non-virtual functions are resolved at compile time from the static type.

# Extra python features
The ```help()``` function: Calling help() on any built-in functions/objects or user-defined ones retrieves and displays things like docstrings, type, attributes and methods, parameters and default values, etc. It also works on module objects by listing all the submodules, functions and classes available.

```inspect``` library: A more in-depth and specific version of the help() function's functionality which lets you do a lot more, like looking at source code. Some important functions:
- ```inspect.getsource(...)``` : gets you the source code as a string
- ```inspect.isfunction(...)``` : Looks at if the input is a function.
- ```inspect.isclass(...)``` : Looks at if the input is a class.
- ```inspect.signature(...)``` : Retrieves the parameters of a callable (names, defaults, annotations) as a ```Signature``` object.
- ```inspect.getdoc(...)``` : retrieves the docstring of code.
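
A quick sketch of a few of these calls on a made-up function. ```inspect.getsource``` is left out because it needs the code to live in a file that Python can read back:

```python
import inspect

def scale(value, factor=2.0):
    """Multiply value by factor."""
    return value * factor

sig = inspect.signature(scale)
print(sig)                                   # (value, factor=2.0)
print(list(sig.parameters))                  # ['value', 'factor']
print(sig.parameters["factor"].default)      # 2.0
print(inspect.getdoc(scale))                 # Multiply value by factor.
print(inspect.isfunction(scale), inspect.isclass(int))   # True True
```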