## Step 1: Installing libraries

You only need to install required libraries when you're running the notebooks locally (e.g. on your own computer).

The only _required_ library is the toolkit library, which you'll use to write your formula. You can install it by running the following command in your terminal:

```bash
pip install pybox-toolkit
```

You can also install the library from within the Jupyter notebook by adding the following line to a code cell:

```
%pip install pybox-toolkit
```

Please only install the libraries on the list below, as they are guaranteed to be present in the website environment as well. Any other Python packages, apart from the standard library, may not be available when your formula runs on the website.

### List of available libraries

 - [`sympy`](https://www.sympy.org/en/index.html)

## Step 2: Importing libraries

Import the required libraries. At a minimum, you need to import the `toolkit` library, but you can also rely on `sympy` or the Python standard library.

The `toolkit` library acts as the interface between your Jupyter notebook and the website. It also provides some convenient features, such as physical bounds checking, SI units, and unit composition. These features are discussed later, as we encounter them.

```python
# Required import for any formula
import toolkit
# Importing the Python math library (https://docs.python.org/3/library/math.html)
# We will use the library to obtain constants, such as `math.pi` and `math.e`
import math
# Inside of pure formulas, we need to use the math functions provided by sympy
# This is explained later on
from sympy import sin, log
```

## Step 3: Using units

As the library is designed with engineering formulas in mind, it has support for units. You have three different options:

 - stick to the SI units provided by the toolkit library
 - create your own units
 - create new units by composition of existing units

### Using provided SI units 

Available units in the library are:

 - Meter
 - Second
 - Kilogram
 - Ampere
 - Kelvin
 - Mole
 - Candela

You can import them from the toolkit library using a simple import statement:

```python
# Replace "Meter" or "Kilogram" with your SI unit of choice
from toolkit.typing.si import Meter, Kilogram
```

### Creating new units

If you can't find the unit you need in the list of available SI units, you can create your own unit types.

A unit at its core is a Python class with two fields:

 - `units`: The unit symbol (e.g. "m" for meters)
 - `physical_range` (optional): Physical range in mathematical notation, e.g. `(-inf, 59]`. Default value if not provided is `(-inf, inf)`.

Your unit must extend the `Unit` base class. The first cell below defines the units used by the Gumbel example later in this guide; the second defines a new unit with the name "Example" and the symbol "x":

```python
from toolkit.typing import Unit

class Probability(Unit):
    """Probability"""
    units = "[-]"
    physical_range = "[0, 1]"

class RandomVariable(Unit):
    """Random variable with unknown units"""
    units = "[-]"
    physical_range = "(-inf, inf)"

class ParameterMu(Unit):
    """Distribution parameter mu with unknown units"""
    units = "[-]"
    physical_range = "(-inf, inf)"

class ParameterBeta(Unit):
    """Distribution parameter beta with unknown units"""
    units = "[-]"
    physical_range = "[0, inf)"
```

```python
class Example(Unit):
    """ Example unit """
    units = "x"
    physical_range = "(-inf, 59]"
```
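To make the `physical_range` notation more concrete, here is a small sketch of how an interval string such as `(-inf, 59]` could be read and checked. It is not the toolkit's own implementation: `parse_range` and `in_range` are hypothetical helpers that use only the Python standard library.

```python
import re

def parse_range(text):
    """Parse interval notation such as "(-inf, 59]" (hypothetical helper)."""
    match = re.fullmatch(
        r"\s*([\[(])\s*([^,]+?)\s*,\s*([^,]+?)\s*([\])])\s*", text
    )
    if match is None:
        raise ValueError(f"not an interval: {text!r}")
    opening, low, high, closing = match.groups()
    # float() understands "inf" and "-inf", so no special casing is needed
    return float(low), float(high), opening == "[", closing == "]"

def in_range(value, text="(-inf, inf)"):
    """Return True if value lies inside the interval described by text."""
    low, high, low_closed, high_closed = parse_range(text)
    above = value >= low if low_closed else value > low
    below = value <= high if high_closed else value < high
    return above and below

# A square bracket includes the bound, a round bracket excludes it
upper_bound_included = in_range(59, "(-inf, 59]")
lower_bound_excluded = in_range(0, "(0, 1]")
```

Here `upper_bound_included` is `True` and `lower_bound_excluded` is `False`, which mirrors the usual mathematical reading of closed and open intervals.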

## Step 4: Defining a formula

A formula is a Python function, annotated with the appropriate "function decorator". The decorator gives the toolkit information about the formula, such as the units of the formula output and the keywords/categories of the formula (used for categorization on the website).

There are two different decorators, `PureFormula` and `ImpureFormula`. They differ in the functionality they provide and the constraints they impose, which we will discuss later.

Both decorators require you to specify two fields:

 - `outputs`: a list of 2 element [tuples](https://www.programiz.com/python-programming/tuple) (_not a list of lists!_), where each tuple contains the name of an output variable and the (instantiated) unit of the variable (see the section about unit instantiation for more details).
 - `keywords`: a list of string labels that categorize a specific formula. The keywords are used on the website to display formulas in their corresponding categories. Note that a single formula can have more than one keyword, and the keywords are case-insensitive.

The function itself can accept any number of arguments, as long as they are type-annotated (specify the unit type for the specific argument). Check out the full formula examples to see how this should look in practice.

A function can return multiple values by returning a tuple instead of a single value. This works for both pure and impure formulas.

Finally, the function should contain a short "docstring". This is a string that comes immediately after the function signature (`def` statement), and starts & ends with three quote symbols. It should contain a short description of the formula.

That's a lot to take in at once, so here is a simple example!




```python
@toolkit.PureFormula(
    outputs = [("example_output", (Example**2)("This is covered later ;)"))],
    keywords = ["label 1", "label 2", "3 labels, are you crazy!?"]
)
def example_formula(example_input: Example("I Wonder why this is here")):
    """ This is the docstring """
    i = example_input / 2
    i += 5
    return i
```
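If decorators are new to you, the sketch below shows the general mechanism in plain Python: a decorator that takes `outputs` and `keywords` and stores them, together with the argument annotations and the docstring, on the function. This is only an illustration. `sketch_formula` is a hypothetical stand-in, and the real `PureFormula` and `ImpureFormula` decorators may work quite differently.

```python
import inspect

def sketch_formula(outputs, keywords):
    """Hypothetical stand-in for a formula decorator; not the toolkit's code."""
    def decorate(func):
        func.outputs = outputs
        # Keywords are case-insensitive, so store them in lower case
        func.keywords = [keyword.lower() for keyword in keywords]
        # The type annotations carry the unit of every argument
        func.argument_units = dict(func.__annotations__)
        func.description = inspect.getdoc(func)
        return func
    return decorate

@sketch_formula(outputs=[("doubled", "m")], keywords=["Demo"])
def double(length: "m"):
    """Double a length."""
    return 2 * length

doubled = double(3)
```

After decoration, `double` still behaves like a normal function, but it also carries the metadata that a tool could later read back (`double.keywords`, `double.argument_units`, `double.description`).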

### Unit instantiation

Both the output unit types and the function argument annotations require you to "instantiate" the unit. While it sounds scary, it simply means that you need to provide a short description for the argument which uses that particular unit.

Example unit instantiations:

```python
ProbabilityInstance = Probability("Probability from the Gumbel CDF, F(x).")
RandomVariableInstance = RandomVariable("Random variable with the Gumbel distribution, f(x) and F(x).")
ParameterMuInstance = ParameterMu("Distribution parameter mu with unknown units")
ParameterBetaInstance = ParameterBeta("Distribution parameter beta with unknown units")
```

### Pure formulas

"Pure" formulas are formulas which can be expressed as a single mathematical expression. They operate under stricter conditions than impure formulas, as they don't allow you to perform any form of flow control (no loops, `if` statements, etc.).

The expression within the function should return a [sympy](https://docs.sympy.org/latest/reference/index.html#reference) expression composed of input arguments (which act as sympy variables), constants, and sympy functions (e.g. `sympy.sin` or `sympy.sqrt`).
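The reason for using sympy functions instead of the `math` module can be seen in a short experiment. The sketch below uses only `sympy` and the standard library; the symbol `x` plays the role of a formula input, which is an assumption made for illustration.

```python
import math
import sympy

x = sympy.Symbol("x")  # stands in for a formula input

# sympy functions accept symbols and keep the expression symbolic
symbolic = sympy.sin(x) + sympy.sqrt(2)

# math functions need a plain number, so they reject a symbol
try:
    math.sin(x)
    math_accepts_symbols = True
except TypeError:
    math_accepts_symbols = False

# Numbers can be substituted into the symbolic expression afterwards
value_at_zero = symbolic.subs(x, 0)
```

Constants such as `math.pi` are ordinary numbers, so they can still be mixed into a sympy expression.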

While the constraints on the pure formulas may seem arbitrary and discouraging, they provide the website users with one major benefit: pure formulas can be "inverted", meaning that if inputs `a` and `b` result in output `c`, the expression can be **automatically** changed to express `b` from `a` and `c`.
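To give a feeling for what "inverting" means, here is a minimal sketch using `sympy.solve` directly. How the toolkit performs the inversion internally is not described here, so treat this as an illustration of the idea only; the formula `c = a * b + 2` is made up for the example.

```python
import sympy

a, b, c = sympy.symbols("a b c")

# Forward direction: inputs a and b give output c
forward = a * b + 2

# Inverted direction: express b from a and c
solutions = sympy.solve(sympy.Eq(c, forward), b)
b_from_a_and_c = solutions[0]

# With a = 3 and c = 14 the inverted expression recovers b
recovered_b = b_from_a_and_c.subs({a: 3, c: 14})
```

Because the formula is a single symbolic expression, no extra code is needed for the inverted direction. Loops or `if` statements would break this, which is why pure formulas forbid them.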

In the example below, we create a pure formula that computes the parameters of a Gumbel distribution from two points on its CDF. The formula has two outputs, `mu` and `beta`, with the descriptions "Parameter mu" and "Parameter beta".

The formula has four inputs: the probabilities `p_1` and `p_2`, and the values of the random variable `x_1` and `x_2` at which those probabilities occur.

The formula also contains a docstring, which is "Compute Gumbel distribution parameters from two points of the CDF."

```python
@toolkit.PureFormula(
    outputs = [("mu", ParameterMu("Parameter mu")),
               ("beta", ParameterBeta("Parameter beta"))],
    keywords = ["Probability", "Gumbel", "Distribution Parameters"]
)
def gumbel_2_points(p_1: Probability("Probability at Point 1"),
                    x_1: RandomVariable("Random variable at Point 1"),
                    p_2: Probability("Probability at Point 2"),
                    x_2: RandomVariable("Random variable at Point 2")):
    """Compute Gumbel distribution parameters from two points of the CDF."""
    beta = -(x_2 - x_1) / log(log(p_2) / log(p_1))
    mu = x_1 + beta * log(-log(p_1))
    return mu, beta
```

```python
gumbel_2_points(.192, 0, .545, 2) # Roughly for mu = 1, beta = 2 (graphed using demos)
```

```
[{'mu': 1.00165568577621, 'beta': 1.99958097378786}]
```
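The inputs `.192` and `.545` can be double-checked without the toolkit. The sketch below assumes the standard Gumbel CDF, F(x) = exp(-exp(-(x - mu) / beta)), evaluates it at x = 0 and x = 2 for mu = 1 and beta = 2, and then feeds the two points back into the same expressions as the formula above. The function names are hypothetical and only the `math` module is used.

```python
import math

def gumbel_cdf(x, mu, beta):
    """Standard Gumbel CDF, F(x) = exp(-exp(-(x - mu) / beta))."""
    return math.exp(-math.exp(-(x - mu) / beta))

def gumbel_from_two_points(p_1, x_1, p_2, x_2):
    """Same expressions as gumbel_2_points, written with math instead of sympy."""
    beta = -(x_2 - x_1) / math.log(math.log(p_2) / math.log(p_1))
    mu = x_1 + beta * math.log(-math.log(p_1))
    return mu, beta

# Two exact points on the CDF for mu = 1, beta = 2
p_1 = gumbel_cdf(0, mu=1, beta=2)
p_2 = gumbel_cdf(2, mu=1, beta=2)

# Recover the parameters from those two points
mu, beta = gumbel_from_two_points(p_1, 0, p_2, 2)
```

The exact probabilities round to 0.192 and 0.545, and the round trip returns mu = 1 and beta = 2 up to floating-point error. The small deviation in the output above comes from the rounded inputs.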

### Impure formulas

"Impure" formulas are declared in the same way as pure formulas, with the only difference being that you are allowed to use any Python code in the method body, including if statements, loops, etc. However, the expression cannot be inverted automatically, resulting in a worse user experience when compared to pure formulas.

Below is an example of an impure formula. The formula has one input argument, namely a choice between 4 different colours. The description of the `Choice` is "Colour we gossip about". It has one output, which has the description "Score for the chosen colour".

```python
@toolkit.ImpureFormula(
    outputs = [("score", (Unit)("Score for the chosen colour"))],
    keywords = ["Colours"]
)
def we_like_green(colour: Choice([ChoiceEntry("green"), ChoiceEntry("red"), ChoiceEntry("yellow"), ChoiceEntry("pink")], "Colour we gossip about")):
    """ We really like green, who knows why. """
    scores = {"green": 10000, "red": 5, "yellow": 2}

    if len(scores) < 2:
        return -3

    if colour not in scores:
        return -1
    return scores[colour]
```

As written, this cell fails with `NameError: name 'Choice' is not defined`. `Choice` and `ChoiceEntry` must be imported before the formula is declared.

## Step 5: Writing documentation

Every formula must be documented. The documentation uses a specific Markdown format, which must start with the following line:

```markdown
# Documentation: <function_name>
```

`function_name` is the exact name of the function you want to document. An example for the function above would be:

```markdown
# Documentation: we_like_green
```

Then, you can follow it up with an arbitrary number of key-value pairs. They are currently unused, but there may be a use-case for them in the future.

```markdown
**Key1**: value1

**Key2**: value2
```

After that, write the documentation as you would a normal Markdown document. LaTeX **is supported**.
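The layout described above is simple enough to check mechanically. The sketch below is one possible way to split a documentation cell into the function name, the key-value pairs, and the remaining body. `parse_documentation` is a hypothetical helper written for this guide, not the parser used by the website.

```python
import re

def parse_documentation(markdown):
    """Split a documentation cell into (function_name, metadata, body)."""
    lines = markdown.strip().splitlines()
    header = re.fullmatch(r"# Documentation: (\w+)", lines[0].strip())
    if header is None:
        raise ValueError("expected '# Documentation: <function_name>' first")

    metadata = {}
    body_start = 1
    for index, line in enumerate(lines[1:], start=1):
        if not line.strip():
            continue  # blank lines may separate the key-value pairs
        pair = re.fullmatch(r"\*\*(.+?)\*\*: (.*)", line.strip())
        if pair is None:
            body_start = index  # first line that is not a pair starts the body
            break
        metadata[pair.group(1)] = pair.group(2)
        body_start = index + 1

    body = "\n".join(lines[body_start:]).strip()
    return header.group(1), metadata, body

example = """# Documentation: we_like_green

**Key1**: value1

**Key2**: value2

Scores a colour. Green wins with $10^4$ points.
"""
name, metadata, body = parse_documentation(example)
```

For the example cell, `name` is `"we_like_green"`, `metadata` holds the two key-value pairs, and `body` is the free-form Markdown that follows them.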

## Step 6: Writing tests (recommended)

Tests for a function are a collection of functions within a test class. The class itself should be decorated with the `toolkit.test.set_function` decorator, with the function under test provided as the argument. Furthermore, the test class should extend the `toolkit.test.ToolkitTests` class.

Each individual test is a function in the test class, which takes only one argument, `self`. It should be decorated with `@toolkit.test.test`.

Within the function, you can call `self.documented_test` and provide two arguments:

 - `arguments`: Input arguments for the formula, expressed as a dictionary
 - `expected`: A dictionary of expected output values.

When you submit the notebook to the repository, the test cases are automatically validated, extracted, and uploaded to the website, next to the formula documentation. Toolkit tests work with [unittest](https://docs.python.org/3/library/unittest.html) behind the scenes. That means you can use methods from unittest, but those tests won't be documented on the website.

The example below assumes that a pure formula named `area_of_sphere`, with the input `radius` and the output `area`, has already been defined in the notebook.

```python
@toolkit.test.set_function(
    area_of_sphere
)
class AreaOfSphereTests(toolkit.test.ToolkitTests):
    @toolkit.test.test
    def radius1(self):
        self.documented_test(
            arguments = {"radius": 1},
            expected = {"area": 4 * math.pi}
        )

    @toolkit.test.test
    def radius2(self):
        # Outputs follow the form: [{"radius": ...}, {"radius": ...}, ...]
        calculated_radius = self.function(area = 4)[0]["radius"]
        self.assertTrue(calculated_radius < 2) # This won't be documented, but will run
```

### Running tests locally

You can run tests locally by calling `<function_name>.run_tests()`:

```python
area_of_sphere.run_tests()
```
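Since toolkit tests build on unittest, it can help to see the same two checks written with plain `unittest`. The sketch below does not use the toolkit at all: `sphere_area` is a hypothetical plain-Python stand-in that assumes the usual formula, area = 4 * pi * radius^2, and returns its output as a dictionary.

```python
import math
import unittest

def sphere_area(radius):
    """Plain-Python stand-in for the formula under test (hypothetical)."""
    return {"area": 4 * math.pi * radius**2}

class SphereAreaTests(unittest.TestCase):
    def test_radius1(self):
        # Same shape as documented_test: input dictionary, expected dictionary
        arguments = {"radius": 1}
        expected = {"area": 4 * math.pi}
        result = sphere_area(**arguments)
        for name, value in expected.items():
            self.assertAlmostEqual(result[name], value)

    def test_area4_gives_radius_below_2(self):
        # Inverse direction, solved by hand: radius = sqrt(area / (4 * pi))
        radius = math.sqrt(4 / (4 * math.pi))
        self.assertLess(radius, 2)

# Run the tests from a notebook cell without exiting the kernel
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SphereAreaTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Loading the suite explicitly avoids `unittest.main()`, which tries to read command-line arguments and exit the interpreter, and is therefore awkward inside a notebook.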

For now that's all you need to worry about! Before you know it, your notebooks too can be on the website :)