## About last lecture

- Checklists represent a **way to document research** while trading off content quality and clarity against workload

- They are still only recommendations, and they can be quite **time-consuming with barely any reward** (feeling depressed again?)

- Still, researchers should be **educated to be responsible**. Each output is a building block for the next ones.

## It's time to talk about coding!

<center>
<div>
<img src="../Images/Lecture-6/programming-skills.png" width="1800" alt='programming_skills'/>
</div>
</center>


## Why do we care about coding?

Whether you like it or not, your experimental setting might require you to do some coding.

Coding translates to: 

1. Transparency (*don't you dare do some cheap tricks!*)
2. Correctness (*your code should reflect your paper statements*) 
3. **Readability** (*please, don't make this a nightmare*)
4. Efficiency (*time is money*)
5. **Maintainability** (*I'm sure you'll re-use this code*)

## About this lecture

- Debugging
- General coding best practices
- Code documentation

*Your five-minutes-in-the-future self and your friends will appreciate it!*

# Debugging

<center>
<div>
<img src="../Images/Lecture-6/programming-meme.png" width="350"/>
</div>
</center>


### What are we going to see

- Using a debugger
- Type hints, annotations, typechecking
- Logging
- Assertions and Unit tests
- Profiling

## As always, let's first check your experience!

Time for another [Google Form](https://forms.gle/DSjiz1v85PfTtwuw5) (**10 mins**)

<center>
<div>
<img src="../Images/Lecture-6/qsn_coding.png" width="600" alt='Debugging'/>
</div>
</center>

### The debugging slice

Whether you've already realized it or you still have to, debugging usually takes **around 90%** of your work time

- 9% the idea
- 1% code writing
- 90% debugging

It is tiresome, boring, stressful, annoying $\rightarrow$ **we all know that!**

#### My point

- We **don't really need** to be <u>good</u> programmers to write correct code (*what do you mean by 'good'?*)

- This lecture is **not** a computer science 101 course about programming (*I'm far from being competent on this matter*)

- What I want to say is that you should learn **how to think** to tackle debugging

#### Debugging is like an investigation game where you have to find the culprit, and usually the culprit is you!

#### What are our weapons?

- [**Dynamic**] Debugger
- [**Static**] Type hints
- [**Dynamic**] Typechecking
- [**Dynamic**] Assertions
- [**Dynamic**] Logging
- [**Dynamic**] Unit tests
- [**Dynamic**] Profiling

All these methods, **combined**, allow us to better inspect our code to find unwanted behaviours/features (*bugs*)

### Using a Debugger

Long story short, we have a powerful tool to inspect our code

- Stop using ```print(...)``` for debugging

- Stop running your script piecewise

- Just use a **debugger**

- Just use a **debugger**

- Got it?

There are **many powerful IDEs** that support a debugger: PyCharm, Spyder, Visual Studio Code

#### Unless you are a skilled programmer (*why would you do that?*), just avoid programming with 
   - text editors like vim, sublimetext, nano, notepad++, etc.
   - notebooks (*unless you need to run real experiments, not all of us are rich and have personal GPUs*)

$\rightarrow$ *personal opinion*

#### What if we can't use an IDE (e.g., working on a remote server without GUI)?

Python has a neat feature that lets you run the Python Debugger [```pdb```](https://docs.python.org/3/library/pdb.html#debugger-commands) from the command line

$\rightarrow$ just run your Python script with ```python -i myscript.py``` for interactive mode and, once it crashes, type

```
    import pdb
    pdb.pm()   # <-- run post-mortem debugger
```

In [None]:
# crashing_app.py
SOME_VAR = 42

class SomeError(Exception):
    pass

def func():
    raise SomeError("Something went wrong...")

func()

In [None]:
!python3 -i crashing_app.py

Traceback (most recent call last):
  File "crashing_app.py", line 9, in <module>
    func()
  File "crashing_app.py", line 7, in func
    raise SomeError("Something went wrong...")
__main__.SomeError: Something went wrong...
>>> # We are now in the interactive shell
>>> import pdb
>>> pdb.pm()  # start Post-Mortem debugger
> .../crashing_app.py(7)func()
-> raise SomeError("Something went wrong...")
(Pdb) # Now we are in the debugger and can poke around 
      # and run some commands:
(Pdb) p SOME_VAR  # Print value of variable
42
(Pdb) l  # List surrounding code we are working with
  2
  3  	class SomeError(Exception):
  4  	    pass
  5
  6  	def func():
  7  ->	    raise SomeError("Something went wrong...")
  8
  9  	func()
[EOF]
(Pdb)  # Continue debugging... set breakpoints, step through the code..
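
When a script runs unattended on a remote server, the same post-mortem session can also be opened from inside the code. The following is only a minimal sketch under assumptions: the helper name `run_with_post_mortem` and the `DEBUG_ON_CRASH` environment variable are hypothetical choices for illustration; only `pdb.post_mortem` comes from the standard library.

```python
import os
import pdb
import sys


def run_with_post_mortem(func, *args, **kwargs):
    """Run func; on a crash, optionally open the post-mortem debugger."""
    try:
        return func(*args, **kwargs)
    except Exception:
        # DEBUG_ON_CRASH is a hypothetical switch chosen for this sketch
        if os.environ.get("DEBUG_ON_CRASH") == "1":
            pdb.post_mortem(sys.exc_info()[2])  # inspect the crashing frame
        raise  # always re-raise so the failure is never silenced


def func():
    raise ValueError("Something went wrong...")
```

With the variable unset, the helper behaves like a plain call, so it can stay in the code without affecting normal runs.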

### Type hints [[Read1](https://peps.python.org/pep-0483/), [Read2](https://bernat.tech/posts/the-state-of-type-hints-in-python/)]

If debugging is our dynamic way of inspecting code, type hints are our way to **statically analyze it**

In [None]:
from typing import Tuple, List

# With type hints
def parse_inputs(input_data: Tuple[str, int]) -> Tuple[List[int], int]:
    text, label = input_data
    text = preprocess_text(text=text)
    tokens = vocab(tokenizer(text))
    return tokens, label


# Without type hints
def parse_inputs(input_data):
    text, label = input_data
    text = preprocess_text(text=text)
    tokens = vocab(tokenizer(text))
    return tokens, label

#### Advantages

1. **Integrated** documentation with code $\rightarrow$ **much more readable** than docstrings
2. Accurate code **re-factoring** for IDEs
3. Allows code **auto-completion** for IDEs
4. Linters (included in IDEs) can tell **wrong function calls** based on type hints $\rightarrow$ warnings!!
5. We can define **compound types** like ```List[int]```

#### Disadvantages

1. We need at least ```Python 3.6``` (*reasonable, we are in 2025...*)
2. May have **conflicts with docstrings** depending on the tool used $\rightarrow$ check for plugins!
3. Minor added computation **overhead**
4. Forces us to import **all type dependencies**, even though they are not used at runtime at all
5. Compound types may require some **additional operations** by the interpreter

Points [**4-5**] are solved via postponed evaluation of annotations (*requires ```Python 3.7```*)

In [2]:
# w/o postponed evaluation
class A:
    def f(self) -> A: # NameError: name 'A' is not defined
        pass
    
# w/ postponed evaluation
# (in a real module, the __future__ import must be the first statement)
from __future__ import annotations

class A:
    def f(self) -> A:
        pass

#### An example

We can use ```mypy```, Python's reference static type checker, to check our type-hinted code

You may need to install it 

```pip install mypy```

In [None]:
from typing import Union


class CustomClass:
    def __init__(self, identifier: Union[str, int], name: str):
        self.identifier = identifier
        self.name = name


x = CustomClass(identifier=3.0, name='test')   # this is incorrect
print(x.identifiers)                           # this does not exist

In [None]:
!mypy mypy_example.py 

>>>> mypy_example.py:11: error: Argument "identifier" to "CustomClass" 
    has incompatible type "float"; expected "str | int"  [arg-type]
>>>> mypy_example.py:12: error: "CustomClass" has no attribute 
    "identifiers"; maybe "identifier"?  [attr-defined]
>>>> Found 2 errors in 1 file (checked 1 source file)

#### Types of type hints

- **Nominal types**: ```int```, ```float```, ```bool```, etc... (*all builtin types*)

- **Compound types**: ```List[int]```
    - We can also define type aliases for readability: ```CustomType = Union[List[int], Dict[str, str]]```

- **Compositional types**: ```Union[...]``` (*one of*), ```Intersection[...]``` (*each one*, not available in the ```typing``` module), ```Optional[...]``` (*can be None*)

- **Generic types**:

In [None]:
from typing import TypeVar, Generic, Iterable, Iterator

T = TypeVar('T')   # the string must match the variable name

class CustomClass(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value: T = value


# Module-level function: CustomClass is already defined when annotations are evaluated
def get_iterator(values: Iterable[CustomClass[int]]) -> Iterator[CustomClass[int]]:
    for value in values:
        yield value
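
To make the alias and compositional types above more concrete, here is a small sketch. The names `TokenIds`, `Example` and `first_token` are hypothetical and only serve as illustration.

```python
from typing import Dict, List, Optional, Union

# Type aliases: plain assignments that give a readable name to a compound type
TokenIds = List[int]
Example = Union[TokenIds, Dict[str, str]]  # one of the two types


def first_token(tokens: Optional[TokenIds]) -> Optional[int]:
    """Return the first token id, or None when there is nothing to return."""
    if not tokens:  # covers both None and an empty list
        return None
    return tokens[0]
```

The alias costs nothing at runtime: ```TokenIds``` is just another name for ```List[int]```.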

####  Proper function overloading

Suppose you have the following function

In [None]:
def do_stuff(x: Union[int, List[int]]) -> Union[int, List[int]]:
    ...

The type checker then assumes that every input/output combination is possible for ```do_stuff```
   - ```input x is int returns int```
   - ```input x is int returns List[int]```
   - ```input x is List[int] returns int```
   - ```input x is List[int] returns List[int]```
  

#### How to avoid this?

In [None]:
from typing import List, Union, overload

@overload
def do_stuff(x: int) -> int:
    ...
    
@overload
def do_stuff(x: List[int]) -> List[int]:
    ...


def do_stuff(x: Union[int, List[int]]) -> Union[int, List[int]]:
    ...

Actually, we **should rely** on patterns like ```functools.singledispatch``` or [```multipledispatch```](https://pypi.org/project/multipledispatch/)

$\rightarrow$ check [this interesting blog](https://martinheinz.dev/blog/50) about proper function overloading
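
As a rough idea of what the ```functools.singledispatch``` pattern looks like, here is a minimal sketch. The behaviour of ```do_stuff``` (doubling its input) is an assumption made up for this example, since the original function body is not shown.

```python
from functools import singledispatch
from typing import List


@singledispatch
def do_stuff(x):
    # Fallback used for every type without a registered implementation
    raise TypeError(f"Unsupported type: {type(x).__name__}")


@do_stuff.register(int)
def _(x: int) -> int:
    return x * 2  # hypothetical behaviour: double the number


@do_stuff.register(list)
def _(x: List[int]) -> List[int]:
    return [item * 2 for item in x]  # double each element
```

Unlike ```@overload```, which only informs the type checker, the dispatch here happens at runtime, based on the type of the first argument.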

#### What type hints don't do

- Runtime type inference $\rightarrow$ we need some libraries that leverage type hints
- No performance tuning (*by the interpreter*) $\rightarrow$ type hints are treated just like comments

#### Only type hinted code is type-checked!

#### Bonus: type hints and Sphinx for merging documentation

We can remove all typing information from docstrings and infer them from type hints

Sphinx has the plugin [```agronholm/sphinx-autodoc-typehints```](https://github.com/agronholm/sphinx-autodoc-typehints)

Then add the following extensions to Sphinx's ```conf.py```

```
extensions = ["sphinx.ext.autodoc", "sphinx_autodoc_typehints"]
```

### Typechecking

Type hints **do not** perform type checking at execution time!

$\rightarrow$ you may still **pass wrongly typed data** without raising any error
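
A tiny illustration of this point (the function ```double``` is a made-up example):

```python
def double(value: int) -> int:
    return value * 2


# The hint says int, but the interpreter does not enforce it:
# the string is happily "doubled" by repetition.
result = double("ab")
print(result)
```

A static checker such as ```mypy``` would flag this call, but plain execution does not.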

#### Typechecking leverages type hints to perform execution control

There exist several libraries for doing typechecking: [Enforce](https://pypi.org/project/enforce/), [Pydantic](https://pypi.org/project/pydantic/), [Pytypes](https://pypi.org/project/pytypes/), and [Typeguard](https://pypi.org/project/typeguard/) are among the most popular ones
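
To get a feeling for what these libraries do under the hood, here is a deliberately simplified sketch built only on the standard library. The decorator ```check_types``` is hypothetical, handles plain classes only (no ```List[int]```, ```Union```, etc.), and is not how any of the libraries above are actually implemented.

```python
import functools
import inspect
from typing import get_type_hints


def check_types(func):
    """Hypothetical decorator: check plain-class annotations at call time."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes (int, str, ...) are handled in this sketch
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(text: str, count: int) -> str:
    return text * count
```

Calling ```repeat("ab", "2")``` now raises a ```TypeError``` before the function body runs.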

### [An example with Pydantic](https://lyz-code.github.io/blue-book/coding/python/pydantic_functions/)

In [None]:
# Pydantic v1 API: in Pydantic v2, validate_arguments is deprecated in favour of validate_call
from pydantic import validate_arguments, ValidationError

@validate_arguments
def repeat(s: str, count: int, *, separator: bytes = b'') -> bytes:
    b = s.encode()
    return separator.join(b for _ in range(count))

a = repeat('hello', 3)
print(a)
#> b'hellohellohello'

b = repeat('x', '4', separator=' ')
print(b)
#> b'x x x x'

try:
    c = repeat('hello', 'wrong')
except ValidationError as exc:
    print(exc)
    """
    1 validation error for Repeat
    count
      value is not a valid integer (type=type_error.integer)
    """

### Logging [[Read1](https://martinheinz.dev/blog/24)]

**Not having any logs** from your application can make it very difficult to troubleshoot any bugs.

$\rightarrow$ replace ```print(...)``` with ```logging.info(...)```

In [None]:
import logging

logging.basicConfig(
    filename='application.log',
    level=logging.WARNING,  # messages below WARNING are discarded
    # Adjacent string literals are concatenated, no backslash needed
    format='[%(asctime)s] {%(pathname)s:%(lineno)d} '
           '%(levelname)s - %(message)s',
    datefmt='%H:%M:%S'
)

logging.info('Here AAAAAAAAAAH')  # not written: INFO < WARNING
logging.error("Some serious error occurred.")
logging.warning('Function you are using is deprecated.')

#### Decorating functions to avoid messing up their body

In [None]:
from functools import wraps, partial
import logging

# Helper function that attaches function as attribute of an object
def attach_wrapper(obj, func=None):  
    if func is None:
        return partial(attach_wrapper, obj)
    setattr(obj, func.__name__, func)
    return func

In [1]:
def log(level, message):  # Actual decorator
    def decorate(func):
        logger = logging.getLogger(func.__module__)  # Setup logger
        formatter = logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        handler = logging.StreamHandler()
        handler.setFormatter(formatter)
        logger.addHandler(handler)
        log_message = f"{func.__name__} - {message}"

        # Logs the message before executing the decorated function
        @wraps(func)
        def wrapper(*args, **kwargs):  
            logger.log(level, log_message)
            return func(*args, **kwargs)

        # Attaches "set_level" to "wrapper" as attribute
        @attach_wrapper(wrapper)
        # Function that allows us to set log level
        def set_level(new_level):  
            nonlocal level
            level = new_level

        # Attaches "set_message" to "wrapper" as attribute
        @attach_wrapper(wrapper)
        # Function that allows us to set message
        def set_message(new_message):  
            nonlocal log_message
            log_message = f"{func.__name__} - {new_message}"

        return wrapper
    return decorate

In [2]:
# Example Usage
@log(logging.WARNING, "example-param")
def somefunc(args):
    return args

somefunc("some args")
# Change log level by accessing internal decorator function
somefunc.set_level(logging.CRITICAL)
# Change log message by accessing internal decorator function
somefunc.set_message("new-message")   
somefunc("some args")

2023-05-03 10:08:54,430 - __main__ - CRITICAL - somefunc - new-message


'some args'

#### Overriding ```__repr__``` and ```__str__``` for debugging

It might be a **good idea to override** these methods to better inspect objects via logging

In [None]:
class Circle:
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius

    def __repr__(self):
        return f"Circle({self.x}, {self.y}, {self.radius})"

...
c = Circle(100, 80, 30)
repr(c)
# Circle(100, 80, 30)

### Assertions

In addition to typechecking, you may also want to **enforce specific sanity checks**: pre-conditions, post-conditions, invariants

- Correct batches
- Correct pre-processing
- etc...

$\rightarrow$ In the simplest form, we can insert assertions in our code: ```assert condition, message```
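
As an illustration, here is a sketch of a pre-condition and a post-condition on a batching function. The function ```make_batches``` is a hypothetical example, not taken from any library.

```python
from typing import List


def make_batches(samples: List[int], batch_size: int) -> List[List[int]]:
    # Pre-condition: a batch must contain at least one sample
    assert batch_size > 0, f"batch_size must be positive, got {batch_size}"

    batches = [samples[start:start + batch_size]
               for start in range(0, len(samples), batch_size)]

    # Post-condition: no sample is lost or duplicated
    assert sum(len(batch) for batch in batches) == len(samples), "samples lost"
    return batches
```

A wrong ```batch_size``` now fails immediately with a readable message instead of producing strange batches later on.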

#### Make sure you use assertions for debugging only!

Proper control flow (e.g., ```if-then-else```) is the **recommended way** for handling multiple behaviours

Assertions can be disabled via ```python -O myscript.py```; code that relies on assertions may then skip its sanity checks entirely
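
For checks that must always run, an explicit condition with an exception is the safer option. A minimal sketch, with a made-up function ```set_learning_rate```:

```python
def set_learning_rate(value: float) -> float:
    # Explicit check: still enforced when assertions are disabled
    if value <= 0:
        raise ValueError(f"Learning rate must be positive, got {value}")
    return value
```

The caller can also catch ```ValueError``` and react, which is awkward to do with an ```AssertionError```.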

$\rightarrow$ In practice, in research, **we are never doing production code**, thus we may not need to follow this advice slavishly

### Unit tests [[Read1](https://martinheinz.dev/blog/7)]

A better way to make use of assertions to define sanity checks is to define **proper unit tests**.

Python offers the built-in ```unittest``` library and the third-party ```pytest``` library to define unit tests
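
Before looking at ```pytest``` features, here is what a bare-bones test case looks like with the built-in ```unittest``` module. The function ```count_tokens``` is a hypothetical function under test.

```python
import unittest


def count_tokens(text: str) -> int:
    """Hypothetical function under test: count whitespace-separated tokens."""
    return len(text.split())


class CountTokensTest(unittest.TestCase):
    def test_counts_words(self):
        self.assertEqual(count_tokens("time is money"), 3)

    def test_empty_text_has_no_tokens(self):
        self.assertEqual(count_tokens(""), 0)
```

Saved in a file named, for instance, ```test_tokens.py```, the tests can be run with ```python -m unittest```.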

#### Testing exception raise

In [None]:
import pytest

def test_my_function():
    with pytest.raises(Exception, match='My Message') as e:
        my_function()
    # Check outside the "with" block: code after the raising call is never reached
    assert e.type is ValueError

#### Testing stdout and stderr

In [2]:
def test_my_function(capsys):
    my_function()  # function that prints stuff
    captured = capsys.readouterr()  # Capture output
    # Test stdout
    assert "Received invalid message ..." in captured.out  
    # Test stderr
    assert "Fatal error ..." in captured.err  

#### Patching objects

We can replace objects used in functions under test with ```mock.patch```

In [None]:
from unittest import mock

def test_my_function():
    # We are replacing 'method_of_class' to return None
    with mock.patch.object(SomeClass,
                           'method_of_class',
                           return_value=None) as mock_method:
        instance = SomeClass()
        instance.method_of_class('arg')

        # Passes silently, raises AssertionError otherwise
        mock_method.assert_called_with('arg')

def test_some_function():
    r = mock.Mock()
    r.content = b'{"success": true}'
    # Avoid doing actual GET request
    with mock.patch('requests.get', return_value=r) as get:
        some_function()  # Function that calls requests.get
        get.assert_called_once()
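
The snippets above rely on names that are not defined here (```SomeClass```, ```some_function```). For a version you can actually run, here is a self-contained sketch that patches a standard library function; ```describe_working_directory``` is a made-up function under test.

```python
import os
from unittest import mock


def describe_working_directory() -> str:
    return f"Working in {os.getcwd()}"


def test_describe_working_directory():
    # Replace os.getcwd only inside the "with" block
    with mock.patch("os.getcwd", return_value="/fake/dir") as fake_getcwd:
        assert describe_working_directory() == "Working in /fake/dir"
        fake_getcwd.assert_called_once()
```

Once the block exits, ```os.getcwd``` is restored, so other tests are not affected.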

#### Fixtures

If we have **multiple unit tests**, we can **share fixtures** (i.e., test setup functions) between them via ```conftest.py```

Pytest **automatically** discovers them

In [None]:
import os

import pytest

# Executed before each test function that requests this fixture
@pytest.fixture(scope='function')
def reset_sqlite_db(request):
    path = request.param  # Path to database file
    with open(path, 'w'): pass
    yield None
    os.remove(path)
    
# Define a test that first invokes the fixture above
@pytest.mark.parametrize('reset_sqlite_db',
                         ['/tmp/test_db.sql'],
                         indirect=True)
def test_send_message(reset_sqlite_db):
    ...  # Perform tests that access prepared SQLite database

Alternatively, you can store all fixtures in a dedicated ```fixtures.py``` file (which, unlike ```conftest.py```, must be imported explicitly).

### Cinnamon example

A quick glimpse at my 'stupid' library, where the unit tests take more lines of code than the library itself.

Total number of unittests: **78**

Check https://github.com/nlp-unibo/cinnamon/tree/main/tests

### Profiling [[Read1](https://martinheinz.dev/blog/13), [Read2](https://martinheinz.dev/blog/68), [Read3](https://martinheinz.dev/blog/83), [Read4](https://martinheinz.dev/blog/64)]

Debugging and typechecking are **not the only way** to find a bug.

$\rightarrow$ we can also **monitor employed resources** to gather insights about efficiency, memory leaks and potential bottlenecks

#### Using ```timeit```

In [None]:
# In the simplest form:
python -m timeit "7 + 28"
50000000 loops, best of 5: 5.72 nsec per loop

# Creates "x" variable before running the test
python -m timeit -s "x = range(10000)" "sum(x)"
2000 loops, best of 5: 160 usec per loop
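
The same measurement can be taken from inside Python code through the ```timeit``` module. A minimal sketch (the variable name ```elapsed_seconds``` is just an illustrative choice):

```python
import timeit

# Same statement and setup as the second command above, run from Python code
elapsed_seconds = timeit.timeit(
    stmt="sum(x)",
    setup="x = range(10000)",  # executed once, not timed
    number=1000,               # how many times stmt is executed
)
print(f"Average: {elapsed_seconds / 1000 * 1e6:.1f} usec per loop")
```

Note that ```timeit.timeit``` returns the total time for all ```number``` executions, hence the division.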

#### Using ```pyperf```

In [None]:
# Uses "-l" and "-n" instead of "-n" and "-r" respectively
python -m pyperf timeit -s "x = range(10000)" "sum(x)" -o result.json
.....................
Mean +- std dev: 157 us +- 12 us

python -m pyperf stats result.json

Total duration: 13.4 sec
Start date: 2022-09-09 12:36:27
End date: 2022-09-09 12:36:42
...

  0th percentile: 134 us (-9% of the mean) -- minimum
  5th percentile: 135 us (-9% of the mean)
 25th percentile: 137 us (-8% of the mean) -- Q1
 50th percentile: 141 us (-4% of the mean) -- median
 75th percentile: 161 us (+9% of the mean) -- Q3
 95th percentile: 163 us (+10% of the mean)
100th percentile: 168 us (+14% of the mean) -- maximum

In [None]:
python3 -m pyperf hist result.json

python3 -m pyperf hist result.json --bins 10
133 us: 13 ############################################
136 us: 14 #################################################
139 us:  6 ############################
143 us:  1 #####
146 us:  0 |
150 us:  0 |
153 us:  0 |
156 us:  7 #################################
160 us: 17 ######################################################
163 us:  0 |
167 us:  2 #########

#### Using ```cProfile```

In [None]:
!python -m cProfile -s time your_script.py
         1297 function calls (1272 primitive calls) in 11.081 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    3   11.079    3.693   11.079    3.693 slow_program.py:4(exp)
    1    0.000    0.000    0.002    0.002 {built-in method _imp.create_dynamic}
  4/1    0.000    0.000   11.081   11.081 {built-in method builtins.exec}
    6    0.000    0.000    0.000    0.000 {built-in method __new__ of type object at 0x9d12c0}
    6    0.000    0.000    0.000    0.000 abc.py:132(__new__)
   23    0.000    0.000    0.000    0.000 _weakrefset.py:36(__init__)
  245    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
    2    0.000    0.000    0.000    0.000 {built-in method marshal.loads}
   10    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap_external>:1233(find_spec)
  8/4    0.000    0.000    0.000    0.000 abc.py:196(__subclasscheck__)
   15    0.000    0.000    0.000    0.000 {built-in method posix.stat}
    6    0.000    0.000    0.000    0.000 {built-in method builtins.__build_class__}
    1    0.000    0.000    0.000    0.000 __init__.py:357(namedtuple)
   48    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap_external>:57(_path_join)
   48    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap_external>:59(<listcomp>)
    1    0.000    0.000   11.081   11.081 slow_program.py:1(<module>)
...

Just shows **function calls**!
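
If you only want to profile a specific portion of your code, ```cProfile``` can also be driven programmatically together with ```pstats```. A minimal sketch, where ```slow_sum``` is a hypothetical workload:

```python
import cProfile
import io
import pstats


def slow_sum(limit: int) -> int:
    """Hypothetical workload used only for this sketch."""
    return sum(number * number for number in range(limit))


profiler = cProfile.Profile()
profiler.enable()       # start collecting statistics
slow_sum(200_000)
profiler.disable()      # stop collecting statistics

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("tottime").print_stats(5)  # five most expensive entries
print(report.getvalue())
```

The report has the same columns as the command line output above, restricted to what ran between ```enable()``` and ```disable()```.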

#### Using ```line_profiler```

In [None]:
kernprof -l -v some-code.py  # This might take a while...

Wrote profile results to some-code.py.lprof
Timer unit: 1e-06 s

Total time: 13.0418 s
File: some-code.py
Function: exp at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           def exp(x):
     5         1          4.0      4.0      0.0      getcontext().prec += 2
     6         1          0.0      0.0      0.0      i, lasts, s, fact, num = 0, 0, 1, 1, 1
     7      5818       4017.0      0.7      0.0      while s != lasts:
     8      5817       1569.0      0.3      0.0          lasts = s
     9      5817       1837.0      0.3      0.0          i += 1
    10      5817       6902.0      1.2      0.1          fact *= i
    11      5817       2604.0      0.4      0.0          num *= x
    12      5817   13024902.0   2239.1     99.9          s += num / fact
    13         1          5.0      5.0      0.0      getcontext().prec -= 2
    14         1          2.0      2.0      0.0      return +s

#### Using ```pyheat```

In [None]:
!pyheat some-code.py --out image_file.png

<center>
<div>
<img src="../Images/Lecture-6/pyheat.png" width="1000"/>
</div>
</center>

#### Using decorators

In [None]:
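import time
from functools import wraps
from typing import Callable

def evaluate_time(
        func: Callable
):
    @wraps(func)  # keep the name and docstring of the decorated function
    def compute_time(
            *args,
            **kwargs
    ):
        start_time = time.perf_counter()    # <---
        func_result = func(*args, **kwargs)
        end_time = time.perf_counter()      # <---
        total_time = end_time - start_time
        # Adjacent f-strings are concatenated inside the parentheses
        print(f'Function {func.__module__} -- '
              f'{func.__name__} took {total_time} seconds')
        return func_result

    return compute_time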

@evaluate_time
def load_ibm2015_dataset(
        samples_amount: int = -1
):
    ...

#### Using ```memory_profiler``` and ```psutil```

In [None]:
python -m memory_profiler some-code.py
Filename: some-code.py

Line #    Mem usage    Increment  Occurrences   Line Contents
============================================================
    15   39.113 MiB   39.113 MiB            1   @profile
    16                                          def memory_intensive():
    17   46.539 MiB    7.426 MiB            1       small_list = [None] * 1000000
    18  122.852 MiB   76.312 MiB            1       big_list = [None] * 10000000
    19   46.766 MiB  -76.086 MiB            1       del big_list
    20   46.766 MiB    0.000 MiB            1       return small_list
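
If installing extra packages is not an option, the standard library module ```tracemalloc``` gives a coarser view of the same information. Note that this is a different tool from ```memory_profiler```: the sketch below only mirrors the allocations of the example above and reports totals, not per-line figures.

```python
import tracemalloc

tracemalloc.start()  # start tracing Python memory allocations

small_list = [None] * 1_000_000
big_list = [None] * 10_000_000
del big_list

# Both values are in bytes, measured since tracemalloc.start()
current_bytes, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"Current: {current_bytes / 2**20:.1f} MiB")
print(f"Peak:    {peak_bytes / 2**20:.1f} MiB")
```

The peak is much higher than the current usage because ```big_list``` was alive for a moment before being deleted.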

# Coding Best Practices

### What are we going to see

- Variable naming
- Comments
- Nesting
- Inheritance
- Abstraction
- Code optimization

## Oh no, not again!

Time for another [Google Form](https://forms.gle/M3ZYQSGQKWzMTTqaA) (**5 mins**)

<center>
<div>
<img src="../Images/Lecture-6/qsn_best-practices.png" width="600" alt='Best Practices'/>
</div>
</center>

All taken from the **wonderful YouTube channel** of [```CodeAesthetic```](https://www.youtube.com/@CodeAesthetic)

### Variable Naming [[Read1](https://www.youtube.com/watch?v=-J3wNP6u5YU&t=43s)]

We've talked about debugging, but can we avoid **feeding the 'internal demon' that makes us write errors**?

$\rightarrow$ Yes! F\*\*\*ing use a proper **naming convention**!

#### Why? Because we spend more time reading code than writing code!

#### Don'ts

- **Avoid** single letter variable names (*wtf?*)
- **Avoid** abbreviations (*may induce wrong interpretation*)
- **Don't put types** in variable names (e.g., ```my_output_str```, ```BaseModel```)

#### Do's

- **Put units** in variable names (e.g., ```float sleep_time_seconds```) $\rightarrow$ if we can use a **custom type**, it is much better
- Refactor code to **avoid general functions** like ```utilities``` $\rightarrow$ usually, these functions can be moved to specific classes
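
A small sketch of the first point, going from an unclear name to a unit in the name and finally to a custom type. All function names are made up for illustration; ```timedelta``` is the standard library type.

```python
import time
from datetime import timedelta


# Unclear: is the value in seconds, milliseconds, minutes?
def wait(t):
    time.sleep(t)


# Better: the unit is part of the name
def wait_for(sleep_time_seconds: float) -> None:
    time.sleep(sleep_time_seconds)


# Best: a dedicated type carries the unit, so the name can stay short
def wait_until_ready(sleep_time: timedelta) -> None:
    time.sleep(sleep_time.total_seconds())
```

With the last version, the call site reads ```wait_until_ready(timedelta(seconds=2))``` and leaves no room for misinterpretation.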

### Comments

Writing comments is usually the best way of explaining the code, **increasing readability**

But **extremes** (i.e., 'no comments at all' or 'walls of text') can be **harmful**

Comments get 'bugs' like code (i.e., code is updated but not comments!) $\rightarrow$ <u>the cake is a lie</u>

#### Code should speak by itself

We should really comment on **WHY** that code is needed rather than on **WHAT** that code is doing
   - Code Documentation: how code is used (e.g., ```Sphinx```)
   - Code Comments: how code works
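
To see the difference between a WHY and a WHAT comment, consider this sketch of a numerically stable softmax (a standard trick, used here purely as an example):

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    # WHY: subtracting the maximum keeps exp() from overflowing on large
    # scores; the result is unchanged because softmax is shift-invariant.
    highest_score = max(scores)
    exponentials = [math.exp(score - highest_score) for score in scores]

    # A WHAT comment here ("sum the exponentials") would add nothing:
    # the code already says it.
    total = sum(exponentials)
    return [value / total for value in exponentials]
```

Without the first comment, a well-meaning colleague might 'simplify' the subtraction away and reintroduce the overflow.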

#### Should we always avoid comments? Hell no!

- In many cases, we might have **hard to grasp code** (especially when doing code optimization)
- If we are following a particular **reference** (e.g., formula, paper, github issue, etc...)

### Never a Nester

"*If you need more than 3 levels of indentation, you're screwed anyway, and should fix your program*" - from the Linux kernel coding style

Nesting means **adding inner code blocks** to your code (e.g., function, control flow, etc...)

In [None]:
def calculate(bottom: int, top: int):        # <--- Level 1
    if top > bottom:                         # <--- Level 2
        sum = 0
        for number in range(bottom, top):    # <--- Level 3
            if number % 2 == 0:              # <--- Level 4
                sum += number
        return sum
    else:
        return 0

#### What can we do?

1. **Extraction**: move a block to a separate function
2. **Inversion**: flip control flow to avoid nesting

#### Extraction

In [None]:
# We moved this part to a separate function
def filter_number(number: int):
    if number % 2 == 0:
        return number
    return 0

def calculate(bottom: int, top: int):       # <--- 1 
    if top > bottom:                        # <--- 2 
        sum = 0
        for number in range(bottom, top):   # <--- 3
            sum += filter_number(number)
        return sum
    else:
        return 0

#### Inversion

In [None]:
def filter_number(number: int):
    if number % 2 == 0:
        return number
    return 0

def calculate(bottom: int, top: int):                # <--- 1
    # We moved this control flow to the beginning
    if top <= bottom:                                # <--- 2
        return 0
                             
    sum = 0
    for number in range(bottom, top):                # <--- 2
        sum += filter_number(number)
    return sum


### Abstraction

When writing code, we usually learn that **code repetition** is **bad** and that **abstraction** is **good**

$\rightarrow$ We really need to understand the side-effect of abstraction: **coupling**

#### Worth it

- Many implementations with **complex construction**
- **Deferred execution** from creation

#### Not worth it

- Sharing member variables
- Not valuable for abstraction user

### Inheritance and Composition

Inheritance is one of the **most popular** features of OOP

Two functionalities

- **Re-use code** $\rightarrow$ extending functionalities
- **Abstraction** $\rightarrow$ hiding which concrete class a method is invoked on

In [None]:
from abc import ABC, abstractmethod

class Image(ABC):
    
    def resize(self, scale: float):
        ...
        
    def flip(self):
        ...
        
    @abstractmethod
    def save(self):
        pass

class PngImage(Image):
    
    def save(self):
        ...
        
class JpegImage(Image):
    
    def save(self):
        ...
        
# Exception!
class DrawableImage(Image):
    
    def save(self):
        raise Exception('Not supported')

In [None]:
from abc import ABC, abstractmethod

class Image(ABC):
    ...
    
class FileImage(Image):
    
    @abstractmethod
    def save(self):
        pass
    
class PngImage(FileImage):
    ...
    
class JpegImage(FileImage):
    ...
    
class DrawableImage(Image):
    ...

Such a refactoring can be **quite costly**!

#### What about Composition?

Inheritance **breaks down** as soon as we find an **exception** to our 'perfect' design (i.e., when we grab common terms to define a shared parent class)

$\rightarrow$ In these cases, we can rely on **Composition**!

In [None]:
from abc import ABC

class Image(ABC):
    
    def resize(self, scale: float):
        ...
        
    def flip(self):
        ...
        
class PngImage:
    
    def save(self, image: Image):
        ...
        
class JpegImage:
    
    def save(self, image: Image):
        ...
        
        
class DrawableImage:
    
    def __init__(self, image: Image):
        ...


#### Pros:

- **Reduces coupling** to re-used code
- **Adaptable** as new requirements come in

#### Cons:

- **Boilerplate code** to initialize internal types
- We may need to write **wrapper methods** to expose information from internal types
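
To make both the idea and its cons tangible, here is a runnable sketch of the composition approach. The bodies are placeholders invented for illustration, and ```PngWriter``` is a hypothetical name for a class that saves an image without being one.

```python
class Image:
    """Holds the data and the operations shared by every image."""

    def __init__(self, width: int, height: int):
        self.width = width
        self.height = height

    def resize(self, scale: float) -> None:
        self.width = int(self.width * scale)
        self.height = int(self.height * scale)


class PngWriter:
    def save(self, image: Image) -> str:
        # Placeholder for a real PNG encoding step
        return f"png:{image.width}x{image.height}"


class DrawableImage:
    """Wraps an Image instead of inheriting from it, so it needs no save()."""

    def __init__(self, image: Image):
        self.image = image  # boilerplate: store the internal type

    def resize(self, scale: float) -> None:
        # Wrapper method: forwards the call to the internal Image
        self.image.resize(scale)
```

```DrawableImage``` no longer has to raise 'Not supported' for ```save```, at the price of the forwarding method.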

### Code Optimization

**Don't** immediately focus on performance (i.e., code optimization), but rather on **velocity of implementation** and **ease of use**

$\rightarrow$ Only care about performance when **strictly needed**!

$\rightarrow$ Is it **worth** it?

$\rightarrow$ Does it lead to **less readable code**?

#### Example:

Don't worry if your training routine is taking a long time: focus first on its correctness and test your code with small controlled inputs.
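
What a 'small controlled input' can look like is sketched below: a toy training routine, ```fit_slope```, is checked on data for which the correct answer is known in advance. The routine and its default hyper-parameters are assumptions made up for this example.

```python
from typing import List


def fit_slope(xs: List[float], ys: List[float],
              steps: int = 50, learning_rate: float = 0.05) -> float:
    """Fit y = slope * x with plain gradient descent on the squared error."""
    slope = 0.0
    for _ in range(steps):
        gradient = sum(2 * (slope * x - y) * x
                       for x, y in zip(xs, ys)) / len(xs)
        slope -= learning_rate * gradient
    return slope


# Controlled input: the data follows y = 2 * x exactly, so we know the answer
learned_slope = fit_slope([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
assert abs(learned_slope - 2.0) < 1e-6
```

Only once this check passes does it make sense to worry about how fast the routine runs on the real dataset.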

## Documentation

- Code documentation
- Writing a proper README
- Controlled environments

### Code documentation

We have talked about type annotations and other recommendations for code readability.

However, we still miss a way to understand **HOW** to use such code

$\rightarrow$ Code documentation (a.k.a. **docstrings**)

In [None]:
def my_function(param1: int, param2: str):
    """
        This function takes ..., computes ... and then returns ...
        
        Args:
            :param param1: *Description of param1*
            :param param2: *Description of param2*
        
        Returns:
            Description of return value
    """
    ...

#### Different docstring styles

In [None]:
# Numpy docstring style
def my_function(param1: int, param2: str):
    """
        This function takes ..., computes ... and then returns ...
        
        Parameters
        ----------
        param1 : int
            Description of param1
        param2 : str
            Description of param2

        Returns
        -------
        int
            Description of return value
    """
    ...

#### Tools for automatic web documentation

There exist several tools that can **automatically parse documented code** to build a documentation website of your project

Among the many, [```Sphinx```](https://www.sphinx-doc.org/en/master/) is one of the most used for Python

$\rightarrow$ If you are building small libraries, then you **must build** the corresponding web documentation for ease of use!

### Writing a proper README

During research, we often define a **code repository** for storing our experiments.

In case of paper acceptance, the code repository should be **made publicly available** and linked in the publication.

#### What should you do?

- Write a proper README file **to guide users** through your code repository

However, in many cases, ```README``` files are **partially informative**, **outdated**, and **sometimes useless**.

#### A recommended README template for research

1. Project title

2. Authors

3. Link to publication (if available)

4. Abstract/Project description

5. Repository Overview

6. Instructions to perform experiments
      1. Prerequisites
      2. Scripts
    
7. FAQ

8. Contributors

9. How to cite

10. [Optional] Disclaimer

11. [Optional] License 

If your project has **multiple** subfolders/sub-projects, make sure to include a **dedicated README**!

### Controlled environments

There are many frameworks to build controlled coding environments, mainly focusing on delivering self-contained executions.

One of the most used frameworks is [Docker](https://www.docker.com/)

I recommend checking [Giovanni Ciatto's PhD course](https://disi.unibo.it/en/teaching/phd-programmes/computer-science-and-engineering/courses-of-the-phd-program-in-computer-science-and-engineering/phd-courses-2024-25-academic-year) on containerization and virtualization.

# Concluding Remarks

1. Catching bugs requires a **lot of effort**! Luckily, we have many weapons

2. Many best practices for debugging and code organization also **greatly increase readability**!

3. There are **several tips** concerning **code organization and structure** that we should apply depending on our scenario

4. There are many more, often **problem specific** and **very hard to generalize**

# Any questions?

<center>
<div>
<img src="../Images/Lecture-3/jojo-arrivederci.gif" width="1200" alt='JOJO_arrivederci'/>
</div>
</center>